
Guru Charan G.

[email protected] || ☎: +1 475-319-3980

Connecticut || LinkedIn

Professional Overview:

Results-driven and visionary Data Engineer with extensive experience managing high-performing cross-functional teams and driving transformative data strategies. Skilled in cloud engineering, data pipelines, data migration, strategic planning, and ETL processes, with expertise across AWS, Azure, GCP, Snowflake, and Databricks. Extensive experience in AI/ML, improving data pipeline architectures, data integration, and data modeling, and implementing AI-driven projects to deliver impactful results.

Committed to leveraging deep expertise to advance AI and cloud engineering, while fostering continuous improvement and delivering
scalable, high-impact solutions in Big Data environments. Passionate about leveraging my expertise to contribute to the success of data-
driven projects, enhance operational efficiencies, and drive technological innovation across organizations.

Cloud Technologies & AI/ML Solutions:

 Experienced in designing and implementing scalable cloud data architectures and AI/ML solutions across AWS, Azure, and GCP, utilizing technologies such as Snowflake, BigQuery, and IBM services.
 Skilled in big data processing, data migration, and building reliable, distributed systems with a focus on performance optimization and data modeling.
 Expertise in driving data-driven decision making through machine learning integration, leveraging platforms such as SAP BusinessObjects, Teradata, and Oracle Data Visualization for actionable insights and improved system reliability.

Cloud Data Architecture & Data Migration:


 Proficient in designing and implementing robust cloud data architectures using Azure Synapse Analytics, AWS Redshift, Snowflake,
and BigQuery to support scalable data lakes and data warehousing solutions.
 Expertise in data migration, data modeling, and dimensional modeling, with a strong focus on data partitioning, data sharding, and
data syncing to ensure seamless data flow across platforms.
 Skilled in leveraging IBM Cloud and applying feature engineering to optimize performance and enhance data processing efficiency,
driving improved data management and analytics capabilities.

Data Preparation & ETL Process:

 Led the design and automation of ETL (Extract, Transform, Load) processes using tools such as Azure Data Factory, AWS Glue, Informatica, and IBM DataStage to process and transform data from various sources, with expertise in data extraction, data integration, data transformation, and data aggregation to prepare data for analysis (see the sketch after this section).
 Enhanced data quality through data cleansing, data validation, and data orchestration, ensuring data is both actionable and
accurate for decision-making.
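Illustrative sketch (not taken from any specific engagement below): a minimal Python ETL routine of the kind described above, assuming pandas and SQLAlchemy; the file name, column names, table, and connection string are placeholders.

# Minimal ETL sketch: extract a CSV, cleanse/validate it, and load it to a warehouse table.
# All names (claims.csv, analytics.claims_clean, the connection string) are illustrative.
import pandas as pd
from sqlalchemy import create_engine

def extract(path: str) -> pd.DataFrame:
    return pd.read_csv(path, parse_dates=["service_date"])

def transform(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates(subset=["claim_id"])       # data cleansing
    df = df.dropna(subset=["claim_id", "member_id"])   # data validation
    df["amount"] = df["amount"].clip(lower=0)          # basic sanity rule
    return df

def load(df: pd.DataFrame, table: str, conn_str: str) -> None:
    engine = create_engine(conn_str)
    df.to_sql(table, engine, schema="analytics", if_exists="append", index=False)

if __name__ == "__main__":
    frame = transform(extract("claims.csv"))
    load(frame, "claims_clean", "postgresql+psycopg2://user:pass@host/dw")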

Real-Time Data Processing & Streaming:


 Delivered real-time data processing and data streaming solutions using Apache Kafka, AWS Kinesis, Azure Event Hubs, and Google
Pub/Sub, enabling scalable, high-performance dataflow architectures to process data from IoT devices, transactional systems, and
external APIs.
 Managed data pipelines and implemented batch processing for large-scale data applications.
 Enhanced data pipeline monitoring and applied best practices for data performance optimization and data query optimization.
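Illustrative sketch: a minimal streaming consumer of the kind referenced above, assuming kafka-python; the topic, broker address, and event fields are placeholders.

# Minimal streaming-consumer sketch with kafka-python; topic/broker names are illustrative.
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "iot-device-events",                      # assumed topic name
    bootstrap_servers=["broker-1:9092"],
    group_id="dataflow-ingest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value
    # Downstream: validate, enrich, and hand off to the batch or real-time pipeline.
    print(event["device_id"], event.get("reading"))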

Big Data & Advanced Analytics:

 Led initiatives in big data environments using Apache Spark, Databricks, and GCP BigQuery to process and analyze vast amounts of structured and unstructured data, applying advanced analytics techniques such as data clustering, data aggregation, and regression analysis (see the sketch after this section).
 Enhanced data performance optimization and data query optimization strategies to improve processing speed and reduce costs,
leveraging SQL, Python, and Java for analytics and data transformation tasks.
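Illustrative sketch: a minimal PySpark aggregation of the kind described above; the input path and column names are placeholders.

# Minimal PySpark aggregation sketch; the input location and columns are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-aggregation").getOrCreate()

events = spark.read.parquet("s3a://example-bucket/events/")   # assumed location

daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "region")
    .agg(F.count("*").alias("event_count"),
         F.avg("amount").alias("avg_amount"))
)

daily.write.mode("overwrite").partitionBy("event_date").parquet("s3a://example-bucket/curated/daily/")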

Machine Learning & AI Solutions:

 Integrated machine learning models using platforms such as Azure Machine Learning, AWS SageMaker, IBM Watson Analytics, and Databricks, developing predictive models and deploying AI-driven solutions for data automation and decision-making.
 Worked closely with data scientists to implement artificial intelligence models, including computer vision and natural language
processing (NLP) algorithms, driving impactful business results.

Cloud Data Security & Compliance:


 Ensured data security by implementing robust encryption protocols, data backup and recovery mechanisms, and strong access
control policies.
 Utilized AWS KMS, Azure Key Vault, and IBM Guardium to ensure data is protected and compliant with industry standards such as GDPR and HIPAA (a minimal envelope-encryption sketch follows this section).
 Implemented data governance frameworks that track data lineage, manage metadata, and ensure data fault tolerance.
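Illustrative sketch: one common envelope-encryption pattern with AWS KMS, assuming boto3 and the cryptography package; the key alias and payload are placeholders and this is not the exact implementation used on any engagement below.

# Envelope-encryption sketch with AWS KMS (boto3) and Fernet; the key alias is an assumption.
import base64
import boto3
from cryptography.fernet import Fernet

kms = boto3.client("kms")

# Ask KMS for a fresh data key; keep only the encrypted copy alongside the data.
key = kms.generate_data_key(KeyId="alias/data-platform", KeySpec="AES_256")
fernet = Fernet(base64.urlsafe_b64encode(key["Plaintext"]))

ciphertext = fernet.encrypt(b"member_id=12345,diagnosis=...")
encrypted_data_key = key["CiphertextBlob"]        # store next to the ciphertext

# Decryption path: recover the plaintext data key from KMS, then decrypt the payload.
plain_key = kms.decrypt(CiphertextBlob=encrypted_data_key)["Plaintext"]
original = Fernet(base64.urlsafe_b64encode(plain_key)).decrypt(ciphertext)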

Data Integration & Automation:


 Implemented data integration using data APIs and streamlined data orchestration processes with Apache Cassandra, SQL, and
Teradata to ensure seamless data flow across various platforms.
 Emphasized data containerization for improved scalability and streamlined deployment of data solutions, enabling quicker time-to-
market.
 Led automation efforts to integrate and orchestrate data across distributed environments, improving overall system efficiency.

Data Visualization & Reporting:

 Developed comprehensive data visualization solutions using tools such as Power BI, Tableau, Google Analytics, Oracle Data Visualization, and SAP BusinessObjects, building interactive dashboards that allowed stakeholders to visualize key business metrics, track performance, and improve data-driven decision-making; applied data performance optimization to ensure fast and accurate reporting.
 Empowered teams to implement cloud data engineering best practices and enhanced systems architecture.
 Utilized tools such as JIRA, Power BI, and Tableau for project tracking, reporting, and data visualization.

Cloud Data Solutions & Automation:


 Managed cross-cloud data solutions, enabling seamless integration and workflow between multi-cloud environments.
 Implemented data mesh and data federation strategies to improve consistency and reliability in cloud-based data solutions,
ensuring scalability, reliability, and performance optimization by applying data load balancing, data partitioning, and data indexing.

DevOps for Data Engineering:


 Developed and maintained CI/CD pipelines for data pipeline automation and deployment, improving data pipeline monitoring and
reducing manual intervention.
 Implemented observability practices ensuring real-time monitoring, logging, and data validation, with a focus on data fault tolerance, high availability, and continuous integration of complex data workflows.

Cross-Functional & Strategic Collaboration:


 Collaborated with technical teams and business leaders to implement data-driven solutions that optimized operational workflows,
financial attribution, and supply chain processes.
 Utilized data visualization, analytics, and advanced machine learning algorithms to drive strategic insights, ensuring that all data
processes complied with internal data compliance regulations and best practices for data governance.

Tech Stack:

ETL Tools / OS : Informatica PowerCenter, DataStage, SSIS, SSRS, Ab Initio, Airflow, MuleSoft; Windows 9x/NT/2000/XP/Vista/7/8/10, UNIX, ELK stack, Splunk monitoring.

Database : Oracle 19c/12c, MS Access 2016, SQL Server 2019, SSIS, Sybase and DB2, Teradata r15, Hive 2.3, Impala,
Cassandra 3.11, Amazon Aurora, Google Cloud Spanner, Azure Cosmos DB, PostgreSQL, MariaDB.

Programming : Java, Python (NumPy, Pandas, Matplotlib), SQL, T-SQL, PL/SQL, Shell Scripting, R, C#, C++, HTML5, PowerShell, ASP, Visual Basic, XML, Angular.

Visualization / Data Science Tools : Tableau, Tableau Server, Tableau Reader, SAP BusinessObjects, SSIS, SSRS, Crystal Reports, Power BI, Advanced Excel, Machine Learning, Deep Learning Models, PCA, Data Science Pipelines, TensorFlow, PyTorch, Scikit-learn, Keras.

Data Modeling Tools : Erwin 9.7, ER/Studio, Star-Schema Modeling, Snowflake Schema Modeling, FACT and dimension tables,
ETL Informatica, Pivot Tables.

Clouds / Services : Data Lakes, Data Factory, SQL Data Warehouse, Data Lake Analytics, Databricks, Synapse, Blob Storage, Azure Monitor, Azure Application Insights, Virtual Machines and other Azure services, Azure Machine Learning, Azure Cognitive Services, Azure Kubernetes Service (AKS), Azure Synapse Analytics, Azure Functions, Salesforce CRM, Docker, Kubernetes, AWS (S3, EMR, EC2, Glue, Athena, IAM, Kinesis, VPC, DynamoDB, Redshift, Amazon RDS, Lambda, DMS, QuickSight, Elastic Load Balancing, Auto Scaling, CloudWatch, SNS, SQS, SageMaker, CloudTrail), GCP (BigQuery, Cloud AI Platform, Dataproc, Pub/Sub, Dataflow).

Applications : Toad for Oracle, Oracle SQL Developer, MS Word 2017, MS Excel 2016, MS PowerPoint 2017, Teradata r15.
Big Data : Hadoop 3.0, Spark 2.3, MongoDB 3.6, MapReduce, Sqoop, Kafka, Snowflake, Apache Flink, Apache Beam,
Apache Kafka Streams

Professional Experience:

CLOUD DATA ENGINEER | Python & ML | BCBSA, WALLINGFORD, CT JUNE 2023 – CURRENT
The project aimed to enhance the scalability, performance, and security of healthcare data systems by migrating on-premise data
warehouses, implementing real-time data streaming, and enabling advanced analytics to improve operational efficiency and compliance
within the healthcare organization.

 Designed and deployed cloud-based solutions leveraging Google BigQuery, Google Cloud Storage (GCS), Google Cloud SQL, and
Google Data Studio in Data Architecture, Data Modeling, and Data Governance while integrating Python-based microservices for
additional processing layers to ensure scalability, availability, and performance optimization for healthcare data systems, in line
with business objectives.
 Migrated large healthcare datasets from on-premise systems to Google Cloud environments, using Google Cloud Storage, BigQuery,
and Python frameworks to automate Data Extraction and Data Sourcing, ensuring the integrity and security of data during the
transition and improving the overall Data Migration process to streamline cloud adoption.
 Built, tested, and maintained Data Pipelines using Google Cloud Dataflow, Google Cloud Dataproc, and Google Cloud SQL, alongside
custom Python applications for Data Orchestration, real-time data processing, Data Validation, and Data Cleansing, to support
seamless Data Integration and Data Transformation.
 Designed and implemented Data Models for large-scale healthcare datasets, focusing on efficient Data Transformation using Google
Cloud-based frameworks and Python-driven solutions, along with Dimensional Modeling techniques and Feature Engineering to
support predictive analytics and better business decision-making.
 Developed complex SQL queries and stored procedures for Data Extraction, Data Aggregation, and Data Transformation, enhancing
performance with Google Cloud-based ETL tools and Python-based processing routines, ensuring the accuracy and integrity of
healthcare data, particularly during the Data Migration phase.
 Conducted Data Mining operations to extract actionable insights from clinical, claims, and operational data, utilizing Google Cloud
libraries and Python algorithms for Data Streaming and supporting decision-making for senior leadership, improving Data
Analytics capabilities.
 Enhanced Data Migration processes using automated workflows in Google Cloud Dataflow and Python, streamlining the movement
of data from legacy systems to cloud-based environments and ensuring minimal downtime during real-time data processing and
Data Syncing.
 Collaborated with management and stakeholders to understand business needs and created Data Validation methods and tools,
leveraging Google Cloud technologies and Python to improve Data Quality and optimize the Data Migration processes.
 Developed and implemented ETL (Extract, Transform, Load) strategies and processes using dbt, Google Cloud Dataflow, and
Python-based frameworks to ensure seamless Data Integration and Data Modeling across various healthcare systems.
 Designed, tested, and implemented Google Data Studio reports and dashboards for operational and clinical insights, incorporating
Google Cloud tools and Python to enable customized Data Visualization, delivering high-quality data visualizations for healthcare
stakeholders.
 Performed Data Analysis and profiling on raw healthcare datasets to identify and resolve inconsistencies, ensuring high-quality
data for reporting and analytics, with Google Cloud tools and Python applied during Data Cleansing and Data Mining tasks.
 Conducted advanced analytics using Python, Google Cloud Dataproc, Google Cloud AI Platform, and Google BigQuery, performing
prescriptive and predictive modeling to derive actionable insights for healthcare operations, and applying Machine Learning
Engineering practices.
 Integrated Google BigQuery with other Google Cloud services like Google Cloud Storage and Google Data Studio, as well as Python
applications, to create end-to-end Data Pipeline solutions that enhance performance, scalability, and Data Clustering for large
datasets.
 Enhanced Data Quality and Data Governance by implementing new data acquisition and validation techniques with Google Cloud
and Python, ensuring compliance with healthcare industry standards like HIPAA.
 Led cross-functional teams to manage Data Migration projects, ensuring smooth transitions from on-premise data warehouses to
Google Cloud-based architectures while ensuring Data Security, integrity, and reliability, with Google Cloud and Python integrations
to improve migration efficiency.
 Developed and optimized dbt models, including macros and testing frameworks, alongside Google Cloud-based optimization
routines and Python-driven processes, to ensure consistency and Data Quality during Data Transformation and Data Migration
tasks.
 Managed complex Data Analysis and Data Modeling tasks to support business intelligence initiatives, including segmentation
techniques for customer data and improving reporting accuracy, with the help of Google Cloud-driven automation and Python-
based Data Transformation.
 Ensured the performance and scalability of Google BigQuery and Google Cloud SQL environments for large-scale data processing
and analytics, utilizing Google Cloud-based performance tuning techniques and Python-based optimizations for Data Query
Optimization and Data Partitioning.
 Presented Data Migration status, Data Quality test results, and project progress in Agile sprint meetings, ensuring alignment with
business goals, with Google Cloud tools and Python tools used to streamline Data Reporting and performance tracking.
 Implemented automation using Python, Google Cloud, and Google BigQuery to streamline Data Transformation processes, enabling
advanced analytics on healthcare datasets and applying Batch Processing for high-volume data handling.
 Led initiatives to enhance Data Quality, Data Governance, and Data Security during migration, leveraging tools like Google Cloud
Key Management and integrating Google Cloud-based security solutions with Python-driven encryption for secure Data Processing.
 Conducted Data Segmentation, Data Mining, and Data Modeling to improve insights and reporting for clinical, financial, and
operational teams, leveraging Google Cloud-based algorithms and Python-based techniques to improve Data Analysis results and
apply Data Clustering for optimized querying.
 Built and deployed scalable ETL (Extract, Transform, Load) pipelines for real-time Data Ingestion using Google Cloud Dataflow,
Google Cloud Pub/Sub, and Python applications for enhanced Data Transformation and real-time data processing.
 Worked closely with business stakeholders and senior executives to ensure data solutions met healthcare objectives and complied
with industry regulations, utilizing Google Cloud-based validation tools and Python-driven processing to streamline Data
Compliance and reporting processes.
 Utilized Google Data Studio and Google BigQuery together, with Python-based custom components, to create dynamic dashboards
that helped senior leadership monitor KPIs and make more informed decisions regarding patient care, Data Reporting, and
operational improvements.
 Enhanced Data Migration strategies by creating custom tools for Data Extraction, Data Transformation, and Data Load (ETL)
processes, incorporating Google Cloud-based optimizations and Python-based solutions to reduce migration times and improve the
reliability of the migrated datasets.
 Led workshops for stakeholders on using Data Mining techniques to identify trends and patterns in healthcare data, driving data-
driven decisions and improving overall healthcare service delivery, with Google Cloud tools and Python algorithms enabling faster
Data Insights.
 Designed and optimized Data Modeling techniques to ensure that large volumes of healthcare data could be processed efficiently,
with Google Cloud and Python used for custom Data Transformation to meet reporting and analytical needs.
 Applied advanced Data Mining algorithms in Google Cloud and Python to uncover patterns in healthcare claims data, improving
operational efficiency by predicting resource allocation needs and optimizing Data Sharding for better data distribution.

Tools and Environments: Google Cloud (BigQuery, Cloud Storage, Cloud SQL, Dataflow, Dataproc, Pub/Sub, Composer, Cloud Monitoring, Cloud Trace, Cloud Audit Logs, Cloud Key Management, Cloud Functions, Cloud Build, Cloud Deploy, Cloud AI Platform, Data Studio) | Python | SQL | Spark | dbt | Apache Kafka | Power BI | Machine Learning | HIPAA | Parquet | JSON | Dimensional Modeling | Feature Engineering | ETL | Jenkins | Apache Iceberg | PowerShell |
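Illustrative sketch: a minimal google-cloud-bigquery load-and-query pattern of the kind used in this engagement; the project, bucket, dataset, and table names are placeholders.

# Minimal google-cloud-bigquery sketch; project, dataset, and table names are illustrative.
from google.cloud import bigquery

client = bigquery.Client(project="example-healthcare-project")

# Load cleansed records from Cloud Storage into a warehouse table.
load_job = client.load_table_from_uri(
    "gs://example-bucket/curated/claims/*.parquet",
    "analytics.claims_clean",
    job_config=bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.PARQUET,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    ),
)
load_job.result()   # wait for completion

# Aggregate for a Data Studio dashboard.
query = """
    SELECT service_month, SUM(paid_amount) AS total_paid
    FROM analytics.claims_clean
    GROUP BY service_month
    ORDER BY service_month
"""
for row in client.query(query).result():
    print(row.service_month, row.total_paid)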

Sr. DATA ENGINEER | BANK OF AMERICA, JERSEY CITY, NJ MAY 2021 – MAY 2023
The project focused on optimizing and securing the Bank's data infrastructure on GCP, improving real-time analytics capabilities while
ensuring compliance with financial regulations. It involved migrating legacy data systems, streamlining data pipelines, improving
performance, and reducing operational costs through advanced GCP services and automation.

 Led the migration of on-premise data warehouses to Google BigQuery, enhancing scalability, performance, and security, achieving a
reduction in operational costs, while applying data modeling principles for optimized query performance, with Python-based
optimization techniques.
 Designed and orchestrated end-to-end data pipelines using Google Cloud Dataflow, ensuring efficient data ingestion,
transformation, and loading (ETL) processes across various financial data sources, to enable seamless reporting and analysis with
SQL-based transformations, incorporating Python for custom data transformation logic.
 Conducted data mining tasks to extract actionable insights from large-scale financial datasets, leveraging Python and tools like
Scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, and StatsModels for exploratory data analysis (EDA), while utilizing Python for
large-scale data processing and integration.
 Managed and optimized large-scale datasets utilizing Google Cloud Storage (GCS) and Google Cloud Dataproc (HDFS), processing
petabyte-scale data with high efficiency and low latency for reliable financial data access, leveraging SQL queries and Python to
improve performance and scalability.
 Developed, tested, and implemented custom ETL solutions using Python, Scala, and Google Cloud Dataflow, focusing on complex
data transformations and integrating machine learning models for predictive insights into financial systems, improving data
modeling techniques.
 Enhanced Google Cloud Storage environments through lifecycle management policies and data tiering strategies, reducing storage
costs while maintaining accessibility and ensuring compliance with financial regulatory standards, aligned with data migration best
practices and utilizing Python for automation of these processes.
 Automated deployment and monitoring of data pipelines using Google Cloud Build and Google Cloud Deployment Manager,
leveraging CI/CD principles with YAML pipelines to ensure seamless and faster production updates, improving data migration
speed and efficiency, and incorporating Python in custom deployment scripts.
 Integrated Apache Airflow on Google Cloud Composer to automate and orchestrate complex data workflows, enabling efficient
scheduling and monitoring of ETL processes, improving the performance and reliability of data pipelines in the migration phase,
with Python used for custom task operators.
 Balanced workloads between Google BigQuery and dedicated clusters, applying data modeling techniques with Python-based
approaches for optimal performance and cost-efficiency across different types of queries and financial data workloads.
 Implemented comprehensive monitoring solutions using Google Cloud Monitoring, Cloud Trace, and Cloud Audit Logs, identifying
performance bottlenecks in data pipelines and databases to ensure proactive issue resolution, enhancing data migration processes
through Python-powered monitoring tools.
 Collaborated with data science teams to integrate Google AI Platform with existing data infrastructure, enabling advanced machine
learning workflows for financial data processing and analysis, leveraging data mining for better predictions.
 Developed and implemented data retention and archiving strategies using Google Cloud Storage and Google Cloud Coldline,
automating data lifecycle processes via Google Cloud Dataflow to meet compliance requirements and reduce storage costs,
supporting data migration goals.
 Enhanced data models in Google Cloud SQL and Google BigQuery, applying data modeling techniques like normalization, indexing,
and partitioning schemes to improve query performance for complex financial reporting and analysis, focusing on SQL query
optimization.
 Managed the integration of external financial data sources into Google Cloud using Google Cloud Functions and Google API Gateway,
automating data ingestion and processing from third-party financial services to ensure seamless data integration with internal
systems, accelerating data migration efforts.
 Implemented HA/DR (High Availability/Disaster Recovery) strategies for critical workloads using Google Cloud Storage Cross-
Region Replication, Google Cloud SQL High Availability configurations, and Google Cloud Filestore for data resilience, minimizing
downtime and data loss during outages, optimizing data migration resilience.
 Continuously optimized Google Cloud resource utilization using Google Cloud Cost Management and Google Cloud Advisor,
implementing cost-saving measures such as sustained use discounts, auto-scaling, and rightsizing of compute resources, achieving a
reduction in operational costs through efficient data migration management.
 Played a key role in applying data migration best practices, ensuring smooth transitions and minimized downtime during the shift
from on-premise to cloud-based systems, facilitating the adoption of scalable, secure, and compliant cloud infrastructure while
leveraging Google BigQuery for improved performance.
 Developed and deployed interactive Google Data Studio reports and dashboards for financial and operational data, allowing
stakeholders to visualize key metrics and trends with real-time data, helping decision-makers act on actionable insights quickly.
 Integrated Google Data Studio with Google BigQuery to deliver real-time data reporting capabilities, enabling financial analysts to
access up-to-date information on business performance directly from the cloud-based data warehouse.
 Utilized Google Data Studio to build parameterized reports, providing business users with the flexibility to drill down into specific
data sets, improving the ability to make informed decisions based on different segments of financial data.
 Enhanced Google Data Studio report performance by optimizing queries and applying data modeling techniques in Google
BigQuery, resulting in faster report generation and a better user experience for the financial and operational teams.
 Implemented security measures for Google Data Studio reports by integrating role-based access control (RBAC) using Identity and
Access Management (IAM), ensuring that sensitive financial data was only accessible to authorized users, and meeting regulatory
compliance requirements like HIPAA.

Tools and Environments: Google Cloud (BigQuery, Cloud Storage, Cloud SQL, Dataflow, Composer, Functions, Deployment Manager,
Pub/Sub, AI Platform, Monitoring, Trace, Cloud Logs, Coldline, Filestore) | Python | SQL | Scala | Apache Airflow | Scikit-learn | NumPy |
Pandas | Matplotlib | Seaborn | StatsModels | Cloud Dataproc (HDFS) | Google Data Studio | CI/CD (YAML pipelines) | Machine Learning |
SQL Query Optimization | dbt | Cloud Cost Management | Cloud Advisor |
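Illustrative sketch: a minimal Airflow DAG of the kind orchestrated on Cloud Composer in this engagement; the DAG id, schedule, and task bodies are placeholders.

# Minimal Cloud Composer / Airflow DAG sketch; DAG id, schedule, and callables are illustrative.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_to_gcs(**context):
    ...  # pull the day's extracts into Cloud Storage

def load_to_bigquery(**context):
    ...  # load the staged files into BigQuery and run dbt tests

with DAG(
    dag_id="daily_financial_etl",
    start_date=datetime(2022, 1, 1),
    schedule_interval="0 2 * * *",     # 02:00 daily
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_gcs", python_callable=extract_to_gcs)
    load = PythonOperator(task_id="load_to_bigquery", python_callable=load_to_bigquery)
    extract >> load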

DATA ENGINEER | CENTURYLINK, MIDDLEFIELD, CT JAN 2020 – OCT 2020


The implementation aimed to enhance customer data analytics, improve data governance, and deliver real-time insights for marketing
and customer service within the telecom industry. The project focused on optimizing data management practices to ensure compliance
with GDPR and CCPA regulations, while also supporting data migration strategies to modernize legacy systems. This initiative enabled
more accurate customer insights, improved decision-making, and streamlined operations, all while maintaining regulatory compliance
and enhancing the overall customer experience.

 Contributed to migrating large volumes of customer data from on-premises SQL Server databases to a hybrid cloud environment, leveraging AWS S3 for the data lake and GCP Cloud Storage for machine learning model storage, ensuring high availability, data consistency, and secure storage across both platforms, with Python-based solutions for data transfer and integration.
 Designed and implemented data warehousing solutions using AWS Redshift, optimizing complex queries for large datasets to
support advanced customer analytics and reporting, while leveraging GCP BigQuery for big data analytics on high-volume customer
interaction data, incorporating Python for query optimization.
 Developed, tested, and managed ETL pipelines using AWS Glue to ingest, transform, and load data from diverse sources into AWS
Redshift, while using GCP Dataflow for processing real-time data streams, enabling seamless data integration across the hybrid
environment with Python for custom transformation logic.
 Addressed the customer service team’s need for real-time insights into customer behavior by streaming data from mobile
applications and CRM systems via AWS Kinesis and GCP Dataflow, enabling immediate analysis and proactive resolution of
customer service issues, using Python to streamline data processing.
 Collaborated with the Data Governance team to implement data classification, encryption, and compliance measures in alignment
with GDPR and CCPA regulations using AWS Key Management Service (KMS) and GCP Cloud Key Management, enforcing robust
security policies across both platforms, with Python-based encryption methods.
 Set up automated data quality checks and monitoring using AWS CloudWatch and GCP Cloud Monitoring, ensuring data accuracy
and reliability across the hybrid platform, with Python used for writing custom monitoring scripts and automation.
 Leveraged AWS SageMaker for building predictive models and GCP AI Platform for deploying machine learning models, creating
dynamic customer segments based on behavior and purchasing history to drive targeted marketing campaigns and personalized
offers, incorporating Python for model integration.
 Integrated AWS Secrets Manager and GCP Secret Manager for secure management of keys and credentials, ensuring data encryption
at rest and in transit while adhering to CenturyLink's stringent security protocols, utilizing Python for secure credential
management workflows.
 Tuned performance for large-scale data processing by optimizing Apache Spark jobs on AWS EMR and GCP Dataproc, partitioning
strategies, and indexing within AWS Redshift and GCP BigQuery, resulting in faster query execution and more efficient data
analysis, using Python for optimization algorithms.
 Collaborated with cross-functional teams, including Data Science, Marketing, and IT Operations, to deliver data solutions aligned
with business objectives, following Agile methodology in a multi-cloud environment, with Python enabling seamless
communication between AWS and GCP components.
 Designed and implemented batch and streaming data ingestion processes using AWS Glue for batch data and AWS Kinesis, GCP
Dataflow, and GCP Pub/Sub for real-time data streams, enabling continuous flow of customer interaction data from multiple
sources.
 Developed a multi-tiered data lake architecture on AWS S3 and GCP Cloud Storage, segregating raw, cleansed, and curated data
layers to support different use cases, from basic reporting to advanced analytics and machine learning.
 Implemented CI/CD pipelines using AWS CodePipeline and Jenkins for automating the deployment of data pipelines within AWS,
and Google Cloud Build for managing deployments in GCP, ensuring rapid and reliable deployment of new features and updates.
 Collaborated with data scientists to build and deploy machine learning models within AWS SageMaker and GCP AI Platform, enabling advanced analytics such as churn prediction and personalized recommendations for CenturyLink customers.
 Integrated AWS Glue and GCP Cloud Data Catalog to catalog and classify data assets across the platforms, improving data
discoverability, lineage tracking, and enhancing the overall data governance framework.
 Configured role-based access control (RBAC) and managed identities for AWS and GCP resources, enforcing fine-grained access
controls and ensuring sensitive customer data is protected and accessed only by authorized personnel.
 Implemented a disaster recovery plan by leveraging AWS Backup, GCP Backup and Restore, and geo-redundant storage solutions,
ensuring critical customer data was replicated and recoverable, reducing recovery time and protecting business continuity.
 Developed complex data transformation workflows in AWS Glue and GCP Dataproc using PySpark to clean, enrich, and aggregate
large datasets, supporting downstream analytics and reporting requirements, incorporating Python for handling edge cases in data
transformation.
 Integrated external data sources (third-party demographic data and marketing lists) with internal customer data stored in AWS S3
and GCP Cloud Storage, providing a 360-degree view of customers to improve targeting and personalization efforts, utilizing Python
for ETL automation.
 Using AWS CloudWatch and GCP Cloud Monitoring, set up automated alerts for pipeline failures or performance degradation,
enabling proactive resolution of issues before they impacted downstream analytics or customer-facing applications, improving
operational efficiency with Python-based alert scripts.
 Enhanced cloud resource usage, reducing operational costs through efficient management of compute resources (AWS EC2, GCP
Compute Engine), storage tiers (AWS S3, GCP Cloud Storage), and data retention policies across both clouds.
 Integrated structured and unstructured data (call logs, customer feedback, web interactions) into a unified data lakehouse built on
AWS S3 and GCP Cloud Storage, enabling cross-platform queries and deeper insights into customer behavior and preferences, using
Python for managing cross-platform data flow.

Tools and Environments: AWS (S3, Glue, Redshift, Kinesis, CloudWatch, SageMaker, Secrets Manager, KMS, EMR, CodePipeline, EC2,
Backup) | GCP (Cloud Storage, BigQuery, Dataflow, AI Platform, Cloud Monitoring, Cloud Key Management, Cloud Pub/Sub, Dataproc,
Cloud Build, Cloud Data Catalog, Secret Manager, Compute Engine) | SQL | Apache Spark | PySpark | AWS Kinesis | ETL | CI/CD |
Jenkins | Machine Learning | GDPR | CCPA | RBAC | Data Lake |
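Illustrative sketch: a minimal boto3 producer for the AWS Kinesis streaming described in this engagement; the stream name and event shape are placeholders.

# Minimal AWS Kinesis producer sketch; stream name and event fields are illustrative.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

def publish_interaction(event: dict) -> None:
    """Push one CRM/mobile interaction onto the real-time stream."""
    kinesis.put_record(
        StreamName="customer-interactions",        # assumed stream name
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event["customer_id"]),    # keeps a customer's events ordered
    )

publish_interaction({"customer_id": 42, "channel": "mobile", "action": "plan_upgrade_view"})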

DATA ENGINEER | ADP, HYDERABAD, IND MAY 2016 – JUN 2019


The project aimed to optimize the existing data integration and analytics platform by automating ETL processes to enhance the management of workforce data, including payroll, employee records, and HR analytics. The focus was on improving data quality and performance and ensuring compliance with regulations such as GDPR. Additionally, the project supported data migration efforts to streamline the transfer of legacy data, enabling more efficient real-time data processing. This enhanced overall decision-making and operational efficiency in Human Capital Management (HCM), driving better insights and compliance across the organization.

 Led the migration of on-premises databases to AWS cloud solutions, significantly improving system uptime, scalability, and
reducing maintenance costs.
 Built a scalable and efficient data integration and analytics platform that collected, processed, and analyzed customer data from
various sources, enabling data migration strategies and enhancing decision-making processes to improve customer engagement.
 Designed and developed scalable ETL pipelines using Azure services such as Azure Data Factory and Azure Functions, together with Python-based scripts, to automate data extraction, transformation, and loading from multiple sources, including databases, APIs, and log files, supporting seamless data migration across systems.
 Integrated data from various internal and external sources into a central data lake on Azure Data Lake Storage, ensuring efficient
data storage, retrieval, and optimized partitioning strategies while facilitating smooth migration of legacy data.
 Developed and maintained data models in Azure Synapse Analytics, supporting efficient querying, reporting, and analytics for
business stakeholders, ensuring consistency in data migration efforts across platforms.
 Used Azure Data Factory and Apache Spark to perform large-scale data processing and transformation tasks, ensuring high data
quality, consistency, and accuracy throughout the data migration process.
 Implemented data validation and cleansing routines, ensuring the consistency of data across different sources and maintaining data
accuracy for downstream processes, including data migration to new platforms.
 Collaborated with data scientists and analysts to understand data needs and provide access to well-structured, high-quality data for
model development, reporting, and migration projects.
 Supported the development of machine learning models by providing timely and reliable data, ensuring accuracy and performance
in predictive analytics, as well as aiding in data migration and integration efforts for model retraining.
 Automated routine data tasks using Azure Functions and Python scripts, reducing manual intervention and improving operational
efficiency, especially for data migration and ongoing data management tasks.
 Enhanced data pipelines for performance and cost efficiency, leveraging services such as Azure Virtual Machines, Azure Synapse Analytics, and Azure Blob Storage, ensuring smooth and cost-effective data migration.
 Ensured security best practices by encrypting data at rest and in transit using Azure Key Vault, implementing Azure Active
Directory (AAD) policies to secure access, and maintaining network security through Azure Virtual Networks for data in transit,
including during migration.
 Ensured compliance with GDPR by implementing strict access controls, data retention policies, and audit logging through Azure
Monitor and Azure Purview, facilitating secure and compliant data migration processes.
 Set up monitoring and alerting using Azure Monitor to track data pipeline performance, implementing real-time failure recovery
mechanisms via Azure Logic Apps, minimizing disruption during data migration.
 Developed optimized schemas and data structures in Azure Synapse Analytics, improving query performance and reducing
redundant storage using Azure Data Lake to directly query stored data, simplifying migration from legacy systems.
 Automated scaling of Azure Virtual Machines using Virtual Machine Scale Sets, reducing infrastructure costs, while ensuring
scalability and availability for large-scale data migration projects.
 Architected ETL workflows with Azure Data Factory, leveraging dynamic data flows and PySpark for semi-structured data,
including handling inputs from Azure SQL Database, Cosmos DB, and on-premises databases, supporting complex data migration
scenarios.
 Integrated with external APIs using Azure API Management and Azure Functions, streamlining real-time data collection and
ingestion processes, and enhancing data migration from external platforms.
 Fine-tuned Azure Synapse Analytics for improved query performance by adjusting distribution, partitioning, and workload
management settings, optimizing OLAP queries, and ensuring the integrity of data during migration.
 Conducted regular security audits, anonymizing sensitive data to ensure compliance with GDPR and other regulatory frameworks,
protecting data during migration and throughout its lifecycle.
 Collaborated closely with business analysts and product managers, aligning data infrastructure solutions with business goals,
ensuring data accuracy for analytics, machine learning, and migration projects.
 Automated CI/CD pipelines for ETL workflows, optimizing code deployments and infrastructure updates using Terraform, reducing
manual effort and improving the reliability of data migration pipelines.

Tools and Environments: Azure (Data Factory, Functions, Data Lake Storage, Synapse Analytics, Blob Storage, Key Vault, AAD, Virtual Networks, Monitor, Logic Apps, Virtual Machines, SQL Database, Cosmos DB, API Management, Purview, Virtual Machine Scale Sets) | AWS | SQL | Python | Apache Spark | PySpark | Terraform | GDPR | ETL | Machine Learning | CI/CD | Azure DevOps | Data Migration | Data Integration | Azure Resource Management | Data Architecture |
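Illustrative sketch: a minimal validate-then-load step for the Azure Data Lake / Blob Storage ingestion described in this engagement, assuming pandas and azure-storage-blob; the container, path, and connection string are placeholders.

# Minimal Azure ingestion sketch with a basic validation step; all names are illustrative.
import pandas as pd
from azure.storage.blob import BlobServiceClient

def validate(df: pd.DataFrame) -> pd.DataFrame:
    assert df["employee_id"].notna().all(), "employee_id must not be null"
    return df.drop_duplicates(subset=["employee_id", "pay_period"])

df = validate(pd.read_csv("payroll_extract.csv"))

service = BlobServiceClient.from_connection_string("<connection-string>")
blob = service.get_blob_client(container="raw", blob="payroll/2019/06/payroll_extract.csv")
blob.upload_blob(df.to_csv(index=False), overwrite=True)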

DATABASE ENGINEER | HSBC, HYDERABAD, IND SEP 2013 – APR 2016


The project focused on designing and implementing scalable, secure, and high-performance database solutions for financial applications,
with an emphasis on high availability, regulatory compliance, and seamless integration with cloud platforms. It also involved the
migration of critical data to modern cloud environments, ensuring smooth transitions and enabling real-time data processing and
analytics for improved decision-making and operational efficiency.

 Designed and implemented relational databases using Oracle, Microsoft SQL Server, and IBM DB2 for high-volume applications,
optimizing database schemas for financial reporting and fast data retrieval in Investment Banking, Corporate Finance, Asset
Management, and Securities Trading.
 Led the migration of on-premises databases to AWS cloud solutions, significantly improving system uptime, scalability, and
reducing maintenance costs.
 Leveraged services such as AWS Lambda, Amazon Redshift, Amazon S3, Amazon Kinesis, and AWS Database Migration Service to facilitate seamless transitions.
 Implemented AWS DMS for smooth database migration to the cloud, ensuring data consistency and high availability using AWS services such as RDS Multi-AZ, Aurora Global Databases, AWS IAM, and AWS Glue, critical for Data Integration.
 Leveraged programming languages such as Python, SQL, Java, Scala, and R for building and optimizing Trading Algorithms, Quantitative Analysis, and Financial Risk Modeling solutions.
 Integrated advanced database encryption techniques to protect sensitive financial data, ensuring compliance with PCI-DSS, Basel
III, and RBI guidelines in line with Banking Regulation and Financial Compliance.
 Developed data governance policies to maintain data accuracy and consistency for financial audits and ensure protection from
unauthorized access in Risk Management and Trading Algorithms.
 Enhanced database schemas for banking applications, ensuring scalability, performance, and data integrity in high-transaction
environments, supporting Derivatives, Capital Markets, Quantitative Finance, and Equity Research.
 Implemented event-driven architecture using Apache Kafka for real-time data streaming, enhancing fraud detection capabilities
and reducing fraud response time in Risk Analytics and Investment Strategies.
 Led ETL processes using SSIS and Informatica, migrating large datasets into Teradata for improved reporting, and facilitating cloud
migration to AWS for enhanced scalability and real-time processing.
 Applied big data technologies such as Hadoop, Spark, and Kafka for fraud detection and customer behavior analysis, improving Risk Management and Quantitative Analysis in Hedge Funds.
 Collaborated cross-functionally with teams to align database solutions with organizational goals and regulatory standards,
supporting Mergers and Acquisitions (M&A), Debt Issuance, and IPO (Initial Public Offering) strategies in Private Equity, Structured
Finance, and Credit Risk.
 Enhanced performance using AWS EC2, Elastic Load Balancing, and integrated robust security measures in line with industry
regulations, ensuring encryption, access control, and threat protection for Trading Algorithms and Market Data.
 Designed high-availability architectures using Oracle RAC and SQL Server Always On to ensure continuous access to financial data
during system failures, critical for Stress Testing and Operational Risk management.
 Executed disaster recovery strategies with Veritas NetBackup and RMAN to ensure fast data restoration during system failures,
safeguarding Financial Compliance and Capital Adequacy.
 Utilized tools such as Oracle Enterprise Manager, SQL Diagnostic Manager, and Tableau to monitor and improve database performance, resolving bottlenecks in high-volume environments, crucial for Financial Data Management.
 Developed complex PL/SQL and T-SQL stored procedures, functions, and triggers for enhanced performance and rapid data
retrieval in Financial Risk Modeling, Portfolio Management, and Financial Analysis.
 Leveraged cloud services such as Amazon RDS, Aurora, Redshift, Lambda, and Amazon Managed Workflows for Apache Airflow to enhance database performance, scalability, and real-time data processing, resulting in a reduction in infrastructure costs.
 Applied Business Intelligence (BI) tools such as SAP BI, Cognos, and OBIEE to create Compliance Reporting and manage Financial Data Warehousing for Capital Markets and Financial Modeling applications.
 Led Parallel ETL pipelines, utilizing Informatica PowerCentre, Pentaho, and Apache Hive, ensuring efficient batch processing and
real-time data ingestion for improved Quantitative Finance and Investment Banking operations.
 Integrated OLAP Cubes for multidimensional analysis of Financial Data in Risk Analytics and Portfolio Management, providing
insights into Risk Modeling, Stress Testing, and Capital Adequacy.
 Ensured the use of Data Catalogs and Unit Cataloging for seamless data management and traceability in Data Governance, crucial
for maintaining accurate Securities Trading data and improving Financial Risk Modeling.

Tools and Environments: Java, J2EE| Spring | Hibernate | Oracle | SQL Server | MySQL | MongoDB | Kafka | AWS (RDS, Lambda, S3,
Redshift, Aurora) | Oracle RAC | Veritas NetBackup | Teradata | Hadoop | Spark | SQL | PL/SQL | T-SQL | AWS DMS | IAM | KMS | AWS
Shield.
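Illustrative sketch: calling a T-SQL stored procedure from Python via pyodbc, reflecting the stored-procedure work described in this engagement; the connection string, procedure name, and result columns are placeholders.

# Minimal pyodbc stored-procedure call; DSN, procedure, and columns are illustrative.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=example-host;DATABASE=Trading;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# Run a reporting procedure and fetch the result set.
cursor.execute("{CALL dbo.usp_daily_position_report (?)}", "2016-03-31")
for row in cursor.fetchall():
    print(row.instrument, row.net_position)

conn.close()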
EDUCATION:

Bachelor's in Computer Science, KLU University, TN, 2013

Master's in Computer Science, University of Bridgeport, 2020
