ArvindKumar for Data Engineer_Chennai_Sagent
No: +91-9025365326
Data Engineer Email: [email protected]
Summary
• Enthusiastic, persistent, and results-driven individual with 8.3 years of overall experience in complex
and heterogeneous IT environments, including around 4.5 years of experience in Python and Spark and
3.8 years of experience in Informatica PowerCenter.
• Experience in Apache Spark along with Python.
• Hands-on experience with AWS (Lambda, S3, and Glue).
• Used Python modules such as json and PySpark.
• Good experience in writing Spark applications using Python for big data processing.
• Excellent development experience in Python, Spark, and SQL, with hands-on experience with PySpark
DataFrames and Pandas DataFrames.
• Good working knowledge of the Agile methodology, with hands-on experience with JIRA.
• Good working experience with AWS services; used Python for data ingestion and PySpark for data
transformation.
Skills
Experience
Role: Data Engineer
Company: HCL
Duration: 01/2021 to Present
Client: Else, Oli
Project Description:
The purpose of the project was to analyze the behavior of employees in an organization. Data is
read from S3 and processed with Python and PySpark, using PyCharm and AWS Glue.
Used PySpark to perform ETL operations on raw data from the client.
Merged DataFrames, filtered out the desired results, and provided them to the ML team.
Wrote output to the S3 bucket in Parquet format.
Worked on transaction data, performing transformations to increase the value of the data
received.
Performed transformations and provided the results back to ML engineers for analysis.
Queried S3 and DynamoDB data in Athena using a crawler.
Collected data from multiple APIs, then combined and processed it using PySpark and Python.
Used Git and Terraform to implement the accepted POC build in a higher environment.
Role: Data Engineer
Company: Intellecto Global Services
Duration: 11/2019 to 12/2020
Client: ELSEVIER
Project Description:
The purpose of the project was to retrieve data from MongoDB and process it with Python (Pandas) and
PySpark, using PyCharm.
Established connectivity to MongoDB using PyCharm and PyMongo for seamless data access
and manipulation.
Developed Python-Pandas scripts to create and update collections within the MongoDB
database.
Implemented comparison and merging strategies for collections based on client-specific
conditions to ensure data integrity and consistency.
Participated actively in code reviews and deployment processes to maintain code quality and
deployment efficiency.
Coordinated closely with onshore partners to gather and understand project requirements,
facilitating continuous delivery and alignment with client expectations.
Project Description:
Worked on developing a health care record of patients and their medical history from the available
data records, which is fed to an AI system for auto-prescription. Data was retrieved through APIs and
processed with Pandas for data insertion, manipulation, and filtering.
Project Description:
The objective of this project was to handle the business logic between medical representatives and doctors
using Informatica PowerCenter.
Education
M.E. (Power Electronics and Drives), Easwari Engineering College, Chennai, 2012.