Manoj Reddy
Frisco, TX | 408-890-7768 | [email protected] | linkedin.com/in/manoj-kumar-reddy
Data Engineer | Azure Data Engineer | Big Data Engineer
Manoj Kumar is a seasoned Data Engineer with 8 years of experience driving efficiency and innovation through Big Data and Cloud solutions. His expertise lies in building and optimizing data pipelines, migrating data warehouses to the cloud, and ensuring data quality and performance.
A results-oriented Data Engineer with a passion for leveraging technology to solve complex problems. His proven track record of delivering efficient, scalable, and high-quality data solutions makes him an asset to any team.

Highlights:
• Boosted Processing Efficiency by 50%: Designed and implemented data pipelines on Azure Data Factory, slashing processing times and accelerating valuable insights.
• Delivered Scalable Cloud Solutions: Migrated on-premises data warehouses to Azure SQL Data Warehouse, resulting in significant cost savings and improved scalability for future growth.
• Enhanced Data Integrity: Implemented automated data validation processes, ensuring data quality and driving more reliable business decisions.
• Optimized Data Processing: Reduced Spark job latency and achieved a 40% improvement in query execution time through performance tuning techniques.

Beyond the Core
• DevOps Integration: Experience with CI/CD tools to automate software delivery and streamline development processes.
• Web Development Background: Possesses a foundational understanding of web development technologies (HTML, CSS, JavaScript) from previous experience.

TECHNICAL SKILLS:
Programming languages: Java, C/C++, Python, Scala, SQL
Big Data Components: MapReduce, HDFS, Spark, HBase, Hive, Sqoop, PySpark
Orchestration tools: Oozie, Airflow, ADF
AWS Services: S3, Athena, Lambda, Redshift, DynamoDB, EMR, etc.
Azure: ADLS, ADF, SQL Server, Databricks, Synapse, Logic Apps, Cosmos DB, etc.
Data warehouse: Hive, Synapse, Snowflake

Skills
Amazon Athena, Amazon Elastic MapReduce, Amazon Redshift, Amazon Web Services, Analysis, Analytical Skill, Apache Airflow, Apache Hadoop, Apache HBase, Apache Hive, Apache HTTP Server, Apache Kafka, Apache Oozie, Apache Spark, Apache Sqoop, Application Programming Interface, AWS Lambda, Azure Data Factory, Azure DevOps Server, Azure SQL Data Warehouse, Azure Storage, Batch Processing, Big Data, Boto3, Cascading Style Sheets, Cloud Computing, Cosmos DB, Cost Reduction, Creativity, Cross-Functional Skills, Data Access, Data Aggregation, Data Analysis, Database Optimization, Data Export, Data Extraction, Data Integration, Data Integrity, Data Lake, Data Management, Data Modeling, Data Pipelines, Data Processing, Data Storage, Data Structure, Data Validation, Data Warehouse, DevOps Operations, Diligence, DynamoDB, Eclipse, Electronic Health Record, Energetic, Enterprise Java Beans, ETL Tool, Exception Handling, Hadoop Cluster, Hadoop Distributed File System, Hard Working, High Availability, Hypertext Markup Language, Java, JavaScript, JavaScript Object Notation, JavaServer Pages, Jenkins, jQuery, MapReduce, Microsoft Azure, Microsoft SQL Server, MySQL, NoSQL, Optimization Techniques, Organizational Skills, Performance Tuning, Power BI, Prioritization, PySpark, Python, Python Script, Relational Database Management System, Routine Inspection, Scalability, Scala Programming Language, Service Management, Shell Script, Snowflake Software, SnowSQL, Software Optimization, Spring Framework, Spring MVC, SQL, SQL Azure, Streamlining Process, Team Development, Team Player, Teradata Database, Test-Driven Development, Troubleshooting, Unit Testing, User Interface, Web Application
Work Experience

Tiger Analytics
Santa Clara, CA, USA • 06/2023 - 04/2024
Data Engineer
• Enhanced data extraction efficiency by 50% by creating Spark applications using PySpark and Spark SQL, resulting in expedited data processing and cost savings.
• Designed and implemented end-to-end data pipelines on Azure Data Factory to extract data from various sources, transform it using Azure Databricks, and load it into Azure SQL Data Warehouse for analytics, resulting in a 50% reduction in processing time.
• Developed Python scripts and UDFs using both DataFrames and SQL in Spark for data aggregation (see the sketch after this section).
• Migrated on-premises data warehouses to Azure SQL Data Warehouse, achieving significant cost savings and improved scalability.
• Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
• Performance-tuned Spark applications by setting the right batch interval size, the correct level of parallelism, and appropriate memory configuration.
• Worked on data integration and storage technologies with Jupyter Notebook and MySQL.
• Collaborated with DevOps engineers to develop automated CI/CD and test-driven development pipelines using Azure, per client standards.
• Wrote extensive Spark SQL queries to transform data used by downstream models.
• Enhanced the quality of data insights through implementation of automated data validation processes.
• Migrated Teradata to Azure Delta Lake and created external tables in serverless Synapse.
• Performed debugging, data validation, and data clean-up analysis within large datasets.
• Optimized data storage, reducing storage costs by 20% and improving data retrieval speed by 15%, leading to more efficient data processing and cost savings for the company.
• Environments: Azure ADF, Databricks, Azure DevOps, Synapse, PySpark, Teradata, Snowflake, etc.
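The sketch below illustrates the kind of DataFrame/SQL aggregation with a Python UDF described above. It is a minimal, self-contained example; the table and column names (sales, region, amount) and the normalize_region UDF are illustrative assumptions, not project code.

# Minimal PySpark sketch of DataFrame/SQL aggregation with a UDF.
# Names (sales, region, amount, normalize_region) are illustrative only.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("aggregation-sketch").getOrCreate()

# Hypothetical input: one row per transaction.
sales = spark.createDataFrame(
    [("US-West", 120.0), ("us-east ", 75.5), ("US-West", 40.0)],
    ["region", "amount"],
)

# A simple UDF that normalizes region labels.
@F.udf(returnType=StringType())
def normalize_region(region):
    return region.strip().upper()

# DataFrame API aggregation.
by_region = (
    sales.withColumn("region", normalize_region("region"))
         .groupBy("region")
         .agg(F.sum("amount").alias("total_amount"),
              F.count("*").alias("txn_count"))
)

# Equivalent Spark SQL aggregation over a temp view.
sales.createOrReplaceTempView("sales")
by_region_sql = spark.sql(
    "SELECT upper(trim(region)) AS region, SUM(amount) AS total_amount "
    "FROM sales GROUP BY upper(trim(region))"
)

by_region.show()
by_region_sql.show()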
Bosch
06/2021 - 06/2023
Data Engineer
• Developed a scalable data warehouse using Azure Blob Storage, Data Lake, and Synapse to store and manage large volumes of data.
• Migrated and ingested data from other databases into Snowflake using Azure Data Factory as the ETL tool.
• Worked on Azure Cosmos DB as the final staging layer connected to Power BI.
• Experienced with SnowSQL and Snowpark.
• Monitored and optimized query performance in Snowflake.
• Worked on Azure SQL, data warehousing services such as Azure Synapse, and data modeling concepts.
• Used Python and Shell scripting to build pipelines.
• Reduced the latency of Spark jobs by tweaking Spark configurations and applying other performance and optimization techniques (a configuration sketch follows this section).
• Migrated Teradata to Azure Delta Lake and created external tables in serverless Synapse.
• Worked on data integration and storage technologies with Jupyter Notebook and MySQL.
• Performed extensive debugging, data validation, error handling, transformation, and data clean-up analysis within large datasets.
• Conducted regular monitoring and troubleshooting of Azure data solutions, ensuring high availability and data reliability.
• Environments: AWS ETL, AWS Lambda, Azure ADF, Databricks, Azure DevOps; Big Data tools: Teradata, Synapse, PySpark, Snowflake, Snowpark, etc.
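A minimal sketch of the kind of Spark configuration tuning referenced above. The specific property values (shuffle partitions, executor memory, broadcast threshold) are illustrative placeholders, not the settings actually used on the project.

# Illustrative Spark session configuration for latency tuning.
# All values are placeholders chosen for the sketch.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("latency-tuning-sketch")
    # Right-size shuffle parallelism to the cluster instead of the 200 default.
    .config("spark.sql.shuffle.partitions", "64")
    # Enable adaptive query execution so Spark coalesces small shuffle partitions.
    .config("spark.sql.adaptive.enabled", "true")
    # Executor memory and cores sized to the workload (placeholder values).
    .config("spark.executor.memory", "8g")
    .config("spark.executor.cores", "4")
    # Broadcast small dimension tables to avoid expensive shuffle joins.
    .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))
    .getOrCreate()
)

# Caching a frequently reused DataFrame is another common latency lever.
df = spark.range(1_000_000).cache()
print(df.count())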
Mu Sigma
11/2019 - 04/2021
Big Data Engineer
• Experienced in developing Spark applications using Spark SQL and Scala in Databricks for data extraction, transformation, and aggregation.
• Developed and optimized CI build jobs with Jenkins.
• Performed performance tuning on PySpark/SQL queries and data processing jobs, resulting in a 40% reduction in query execution time.
• Worked on Apache Kafka jobs scheduled with UC4 to handle batch data.
• Developed Python scripts and UDFs using both DataFrames and SQL in Spark for data aggregation.
• Used Python and Shell scripting to build pipelines.
• Created scripts in Python (Boto3) that integrated with the Amazon API to control service operations.
• Involved in writing Spark and Hive scripts to run on EMR using the different operators available in Airflow and scheduling them.
• Worked on integration of Spark Streaming and Apache Kafka.
• Explored Spark performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
• Implemented AWS Lambda with integration of DynamoDB, S3, etc.
• Used Spark SQL to load data into Hive tables and wrote queries to fetch data from those tables; implemented partitioning and bucketing in Hive (see the partitioning sketch after this section).
• Responsible for loading structured and semi-structured data into Hadoop by creating static and dynamic partitions.
• Worked on Spark user-defined functions (UDFs) using PySpark for external functions.
• Wrote extensive Hive queries to transform data used by downstream models.
• Environments: Python, PySpark, Scala, Hive, Apache Airflow, Jenkins, Logic Apps, Azure ADF, Databricks, AWS EMR, AWS Athena, AWS S3, AWS Lambda, etc.
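The sketch below shows static and dynamic partition loads into a Hive table with Spark SQL, as referenced above. The database objects (events_by_day, staging_events) and their columns are hypothetical examples, not the production schema.

# Minimal sketch of loading a partitioned Hive table with Spark SQL,
# showing both dynamic and static partition inserts. Names are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-partitioning-sketch")
    .enableHiveSupport()
    # Allow dynamic partition inserts.
    .config("hive.exec.dynamic.partition", "true")
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .getOrCreate()
)

# Hypothetical staging data registered as a temp view.
staging = spark.createDataFrame(
    [("u1", "click", "2021-01-15"), ("u2", "view", "2021-01-16")],
    ["user_id", "event_type", "event_date"],
)
staging.createOrReplaceTempView("staging_events")

# Partitioned target table.
spark.sql("""
    CREATE TABLE IF NOT EXISTS events_by_day (
        user_id STRING,
        event_type STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS ORC
""")

# Dynamic partition insert: Spark routes each row to its event_date partition.
spark.sql("""
    INSERT OVERWRITE TABLE events_by_day PARTITION (event_date)
    SELECT user_id, event_type, event_date FROM staging_events
""")

# Static partition insert: load one explicitly named partition.
spark.sql("""
    INSERT OVERWRITE TABLE events_by_day PARTITION (event_date = '2021-01-15')
    SELECT user_id, event_type FROM staging_events
    WHERE event_date = '2021-01-15'
""")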
TCS
03/2018 - 08/2019
Hadoop Developer
Responsibilities:
• Used Sqoop to import/export data between various RDBMS (Teradata, Oracle, etc.) and the Hadoop cluster.
• Developed and maintained data integration programs in Hadoop and RDBMS environments, with both RDBMS and NoSQL data stores, for data access and analysis.
• Worked on batch processing of data sources using Apache Spark
with Java.
• Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Java.
• Implemented solutions for ingesting data from various sources and processing data at rest utilizing Big Data technologies such as Hadoop, MapReduce frameworks, HBase, and Hive.
• Improved Hive performance through bucketing, partitioning, and other optimization techniques.
• Configured the Oozie workflows to manage independent jobs and
automate Shell, Hive, Sqoop, Spark, etc.
• Involved in populating the processed data from Spark/Hive into NoSQL HBase.
• Environments: Java, Spark (Java), Hive, Oozie, HBase, SQL, etc.
Value Labs
04/2016 - 02/2018
Java Developer
• Used HTML, CSS, JavaScript, and JSP pages for user interaction.
• Implemented and managed a SQL database used in the backend.
• Created a web application with Spring framework.
• Experienced in using the Apache web server.
• Solved problems using the combination of JavaScript, JSON, and
jQuery.
• Developed and implemented servlets and Java beans.
• Developed JSP pages and view- and controller-related files using the Spring MVC framework.
• Coordinated with the development team to identify automation opportunities and improve technical support for end users.
• Performed code optimization, conducted unit testing, and developed frameworks using object-oriented design principles.
• Worked effectively with cross-functional design teams to create
software solutions that elevated client-side experience.
• Environments: Java, Eclipse IDE, Spring MVC, SQL, etc.

Education

Bachelor of Technology (B.Tech) in Information Technology
National Institute of Technology, Srinagar, India