“Ayush is a scrappy data scientist that has made significant contributions to the LeanTaaS team. He puts his hand up to take on challenging problems, digs into them and comes up with quick solutions with scalability and reliability in mind. For example, he has rewritten logic he has inherited from others and scaled algorithms we had issues with. He has also created a robust data quality framework and rebuilt a critical module in the LeanTaaS product that is now used by thousands of users without issues. There have been instances where others have given up on core problems we wanted to innovate on and Ayush came up with approaches to solve them. I am confident that when we give him a problem he will do what it takes to figure it out and get the job done. He is a key contributor to the team and an asset to the company.”
Ayush Parolia
Santa Clara, California, United States
About
Highly skilled and results-driven Principal Data Scientist with 10 years of experience in…
Publications
-
Adapting Event Embedding for Implicit Discourse Relation Recognition
Association for Computational Linguistics (ACL)
Courses
-
Artificial Intelligence
CS59000
-
Cryptography
CS 555
-
Natural Language Processing
CS59000
-
Statistical Machine Learning
CS57800
Projects
-
Sentiment Analysis on Yelp reviews
Implemented a Naïve Bayes classification algorithm and applied it to a sample of reviews from the Yelp data set to predict characteristics of the reviews based on the words they contain. We also used various feature selection methods, including unigram, bigram, and n-gram features and an entropy-based method. Tech used: Python, SciPy, NumPy
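A minimal sketch of the kind of classifier described above, assuming a multinomial Naïve Bayes model with add-one smoothing over unigram and bigram features. The class name, the helper extract_features, and the toy reviews are hypothetical illustrations, not the project's actual code or data, and the entropy-based feature selection is not shown.

```python
import math
from collections import Counter, defaultdict


def extract_features(text, use_bigrams=True):
    """Lowercased unigrams, plus adjacent-word bigrams when requested."""
    tokens = text.lower().split()
    features = list(tokens)
    if use_bigrams:
        features += [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return features


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # feature counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = extract_features(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        feats = [f for f in extract_features(text) if f in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for feat in feats:
                score += math.log((self.feature_counts[label][feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up reviews (not Yelp data), for illustration only.
train_texts = [
    "great food and friendly staff",
    "terrible service and cold food",
    "friendly staff great place",
    "cold rude terrible",
]
train_labels = ["positive", "negative", "positive", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great friendly place"))
```

Features never seen in training are skipped at prediction time, which is one common convention; other treatments of unseen words are equally valid.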
-
Digit Recognizer
Implemented a Convolutional Neural Network (CNN) to classify handwritten digits (0-9), using training data from the MNIST database of handwritten digits. We also compared the accuracy of the CNN with other models such as Logistic Regression and SVM (after extracting features from the images). Tech used: Python, SciPy, NumPy, scikit-learn, TensorFlow
-
Cascade Structure Prediction for Handwriting Recognition
Used the Cascade Structure Prediction technique for handwriting recognition on a dataset of 6877 handwritten words. Our cascade consists of 4 Markov models of increasing order. Pixels are used as features for the lowest-order model, and 2-, 3-, and 4-grams of letters are the features for the three higher-order models, respectively. Tech used: Python, NumPy, scikit-learn, SciPy
-
File verification using Merkle Tree (Information Security)
Developed a client-server simulation in which the client uploads a file to the server and later sends a challenge to the server to verify the file. The client stores only the master hash (the root of the Merkle tree); to validate the file, it sends a challenge, and the server must provide the path siblings corresponding to the challenged node. Tech used: Java
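The challenge-response flow above can be sketched roughly as follows. This is an illustrative Python version (the project itself was written in Java), assuming SHA-256 as the hash, a file split into fixed blocks, duplication of the last node on odd-sized levels, and a server that returns the challenged block together with its sibling path. The function names build_levels, prove, and verify are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return all tree levels, leaf hashes first and the root level last."""
    level = [sha256(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last node (an assumption)
            level = level + [level[-1]]
            levels[-1] = level
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Server side: collect the sibling hash at each level from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the node at this level
        index //= 2
    return path


def verify(root, block, index, path):
    """Client side: recompute the root from the block and its sibling path."""
    node = sha256(block)
    for sibling in path:
        # Even index: node is the left child; odd index: the right child.
        node = sha256(node + sibling) if index % 2 == 0 else sha256(sibling + node)
        index //= 2
    return node == root


# The client keeps only `root`; the server keeps the blocks and `levels`.
blocks = [b"block-0", b"block-1", b"block-2", b"block-3", b"block-4"]
levels = build_levels(blocks)
root = levels[-1][0]

challenge = 3                      # client picks a block index
path = prove(levels, challenge)    # server answers with the sibling hashes
print(verify(root, blocks[challenge], challenge, path))
```

Because the client holds only the root hash, its storage stays constant regardless of file size, while each proof contains one hash per tree level.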
-
Twitter Part-Of-Speech tagging
Designed a Part-of-Speech tagger for tweets using a dataset of 1827 tagged tweets. Our tagger is a Maximum Entropy Markov Model (MEMM) that incorporates various local features (such as capitalization, Twitter orthography, and phonetic normalization) in a log-linear model. Tech used: Python, SciPy, NumPy
-
Automatically predicting implicit relations in text (NLP Project)
We implemented and analyzed RNN (Recursive Neural Net) based approaches for automatically predicting implicit relations in text. Predicting discourse relations has potential applications in NLP tasks such as text summarization and conversational systems.
-
Automatically solve high-school level Physics word problems (NLP Project)
Developing a system that automatically solves high-school-level physics word problems. Using various NLP techniques, we try to find the best set of formulae for solving a given problem. Guide: Prof. Dan Goldwasser, Purdue University
-
Framework for multi-agent robotic system
Developed a framework for multi-agent robotic systems that provides an environment for the execution and distribution of mobile agents in a robotic system. The framework consists of four components: Transmitter, Receiver, Interpreter, and Engine. It provides an advanced API that allows programmers to create self-replicating programs that mutate and evolve.
Technologies used: Java
-
Web-based ER and Relational Modeling Tool
A web-based GUI to create ER models using HTML forms. The tool also converts ER models to their corresponding relational forms and normalizes them up to 2NF; the result can be downloaded as an image or SQL script.
Technologies used: PHP, AJAX, JavaScript, C++, Dia, DOT Graphical Modeling Language
-
YouTube Crawler
Developed a YouTube crawler that crawls YouTube's video network and gathers user feedback on the videos. It uses hashing and a BFS algorithm to make crawling faster. The crawler was implemented in Java. Performed quantitative and qualitative analysis on the social networking data (comments, likes, etc.) retrieved.
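As a rough illustration of combining BFS with hashing, the sketch below traverses a small in-memory graph and uses a hash set to skip videos already seen. It is written in Python rather than the project's Java, makes no network requests, and the names bfs_crawl, get_related, and the toy graph are hypothetical stand-ins for whatever the real crawler used.

```python
from collections import deque


def bfs_crawl(start, get_related, max_videos=100):
    """Visit videos breadth-first, starting from `start`.

    `get_related` is a caller-supplied function that returns the IDs
    linked from a given video (hypothetical; a real crawler would
    fetch and parse pages here).
    """
    visited = {start}        # hash set: O(1) average membership checks
    queue = deque([start])   # FIFO queue gives breadth-first order
    order = []
    while queue and len(order) < max_videos:
        video = queue.popleft()
        order.append(video)
        for neighbor in get_related(video):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order


# Toy link graph standing in for the video network.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"], "d": []}
print(bfs_crawl("a", lambda v: graph.get(v, [])))
```

Marking a video as visited when it is enqueued, rather than when it is dequeued, keeps the same video from being queued twice.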
Languages
-
English
-
Hindi
Recommendations received
1 person has recommended Ayush