Anshul Yadav Resume ML

Uploaded by

RohitSeasotiya

ANSHUL YADAV

+14082426432 | [email protected] | San Jose, CA, USA | linkedin.com/in/yadavanshul36/ | github.com/yadavanshul


EDUCATION
San Jose State University August 2023 - May 2025
Master's, Data Science
• Coursework: Data Visualization, Mathematical Methods for Data Analytics, Database Systems and NoSQL, Big Data Technologies (Apache Spark/Hadoop MapReduce), Deep Learning, Data Mining.

Vellore Institute of Technology July 2019 - May 2023
Bachelor's, Computer Science
• Coursework: Data Structures, Data Optimization, Data Analytics using Python, Database Management Systems.
• Soft Skills: Communication, Time Management, Interpersonal Skills.

PROFESSIONAL EXPERIENCE
inn4smart solution Gurugram, Haryana, India
Intern Data Analyst August 2022 - August 2023
• Implemented an end-to-end data pipeline from user input to a Snowflake data warehouse, processing 10,000+ records daily to support visualization and EDA in Python.
• Optimized ETL workflows with PySpark and Apache Airflow, reducing processing time by 30% and enhancing data accuracy for analytics using data
models.
• Specialized in NLP and machine learning techniques to enhance user interaction with data-driven applications, implementing sentiment analysis models
to gauge consumer response with over 85% accuracy.
• Developed and tuned LLMs for chatbot applications in a smart home environment, improving the efficiency of natural language understanding.
• Developed NLP algorithms to automate the extraction and interpretation of unstructured data from diverse sources, enhancing data accuracy and speed
of insights in business intelligence tools.

Drabito Technologies Noida, UP, India
Summer Intern - Data Analyst April 2020 - August 2020
• Designed and implemented data workflows from MongoDB to a GCP data lake, processing 15,000+ records daily for NLP applications, from extraction through predictive modeling using REST APIs.
• Leveraged Python for natural language data visualization and preprocessing, enhancing machine learning model readiness for test automation.

PROJECTS & OUTSIDE EXPERIENCE
Comprehensive Crime Analysis and Reduction Strategy, Montgomery County, 16-22 San Jose, CA, USA
ML, Python August 2023 - December 2023
• Implemented data preprocessing techniques like PCA to capture 90% of the variance with 12 components, enhancing data analysis accuracy.
• Leveraged Python libraries such as Plotly, Matplotlib, and Seaborn for data visualization.
• Converted data to a date format for time-series analysis using Python, improving EDA for market trend identification by 50%, with open-source tools for data manipulation and SQL for data querying.
• Improved Naive Bayes accuracy from 65.24% to 92.29% and KNN from 61.52% to 88.99% through data cleaning, outlier handling, and feature selection
using Scikit-learn.

Coronary Heart Disease Data Analysis San Jose, CA, USA
Logistic Regression, Random Forest, Data Preprocessing August 2023 - October 2023
• Applied statistical modeling techniques, including removing duplicates, filling 10% of null values, standard scaling, and feature engineering.
• Performed exploratory data analysis and designed visualizations, including heat maps, scatter plots, and bubble charts, to uncover hidden correlations and trends within large datasets, enabling data-informed decision-making.
• Operationalized statistical modeling, such as logistic regression, with 72% accuracy and further improved results to 83% using a Voting Classifier,
incorporating Random Forest and feature scaling with Principal Component Analysis (PCA).
• Deployed predictive models to production, leveraging open-source tools and AWS services such as S3.

U.S. Climatological data analysis in Google Cloud Platform San Jose, CA, USA
MySQL, GCP, Airflow, ETL pipeline, KNN September 2023 - December 2023
• Automated ETL framework for U.S. Climatological data analysis using Apache Airflow, ensuring efficient data maintenance and updates with Archive
DAG and Daily DAG cycles.
• Integrated DAG cycle workflows with Google Cloud Platform, constructing a BigQuery star schema that streamlined real-time and historical data analytics, culminating in the effective visualization of key performance indicators.
• Analyzed data from 756 California weather stations using independent t-tests, Shapiro-Wilk normality tests, and data modeling like k-means clustering
to identify regional climate change trends.
• Utilized Python for data extraction, EDA, and analysis in the Google Cloud Platform project, demonstrating proficiency in data analytics, data
visualization, and machine learning techniques.

Technical Skills
• Programming Languages: Python, SQL, MySQL, R, Java.
• Machine Learning Libraries: Pandas, Parallel-Pandas, NumPy, Scikit-Learn, Modin, spaCy, NLTK, TensorFlow, PyTorch.
• Data Analysis: Machine Learning, NLP, LLM, Statistical Analysis, Reinforcement Learning, Statistical Models, Regression Analysis, Predictive Models, Bayesian Analysis.
• Data Engineering: AWS (Athena, Glue, Redshift), Google Data Studio, Databricks, GCP (BigQuery), Spark, Microsoft Excel (VLOOKUP and pivot tables), GitHub, Visual Studio.
• Data Visualization: Tableau, Looker, Power BI, MATLAB, Seaborn, Matplotlib, Plotly.

Certification
• Google Data Analytics, Google Analytics (Coursera) - (Data-driven decision making, MS Excel, Machine Learning).
• Microsoft Certified: Azure Fundamentals.
• IBM Data Analyst, IBM (Coursera) - (Data Visualization, Predictive Models).
• Data Science and Advanced Analytics, VIT-AP & Advanced Data Analytics Tools, VIT-AP - (NLP, Optimization Techniques).
• Tableau Essential Training, LinkedIn Learning - (Dashboard Development, Tableau Public, Advanced Calculations, Trend Analysis).
