“Had the chance to host Abhijith as a Speaker during my time at TECH I.S. The depth of knowledge and the meticulous detail that Abhijith displayed as a Speaker at the Data Science festival were immensely satisfying, which only goes on to showcase his subject matter expertise as an excellent Data Scientist. Best wishes with your future endeavours.”
Abhijith Asok
Bothell, Washington, United States
10K followers
500+ connections
About
Data scientist with experience across the corporate, social and research arms of data…
Activity
-
In this piece, Debashis Bandyopadhyay and I discuss how the traditionally US-dominated global public health research landscape is evolving due…
Liked by Abhijith Asok
-
Concepts I would master if I were interviewing for an Applied Scientist position in 2025: 1. 𝐑𝐀𝐆 Most popular enterprise use case of Generative…
Liked by Abhijith Asok
-
Ever run into numerical errors while doing matrix computations? Recently, I was using Mahalanobis distance to determine how similar a sentence is…
Liked by Abhijith Asok
Experience
Education
-
-
-
Activities and Societies: Department of Journalism and Media Affairs, Mime Club, SPREE
-
-
Activities and Societies: I was a Founding Team Member of 'The Graduate Consultancy' (now GradMinds) during this time in London, a global collaboration of four of us from India, the United States, the United Kingdom and Azerbaijan.
-
-
Activities and Societies: Singing, Dancing, Writing
Licenses & Certifications
-
Kaggle Python Tutorial on Machine Learning
DataCamp
Credential ID 699ca7a30169c1ee14217d4d8495f6a53788f6a5
Volunteer Experience
Publications
-
FAVOR: functional annotation of variants online resource and annotator for variation across the human genome
Nucleic Acids Research
FAVOR provides a comprehensive multi-faceted variant functional annotation online portal that summarizes and visualizes findings of all possible nine billion single nucleotide variants (SNVs) across the genome. It allows for rapid variant-, gene- and region-level queries of variant functional annotations.
-
Women’s strategies addressing sexual harassment and assault on public buses: an analysis of crowdsourced data
Crime Prevention and Community Safety: An International Journal (Springer)
This paper uses crowdsourced data on women’s self-reports of harassment and assault on public buses in India. The data provide a basis to identify the strategies that women use to respond to and manage this everyday threat. The study examines 137 accounts of assault collected by a crowdsourced platform, in which women describe keeping silent (n = 27), fleeing (n = 38), or resisting (n = 72) such an assault. Findings show that confronting incidents in the moment by “making a scene” and “engaging the crowd” works well in the closed, shared-space setting of a crowded public bus. The study concludes by asserting crowdmapping as a multi-faceted tool: it allows women to be aware of potentially dangerous locales, empowers them to report incidents to help keep others safe, and provides a source of data to advise on best practices for navigating street harassment and assault in public buses.
-
Generalized Approach to Linear Data Transformation
Proceedings of the IEEE International Conference on Data Science and Engineering
This paper presents a generalized approach for the simple linear data transformation, Y=bX, through an integration of multidimensional coordinate geometry, vector space theory and polygonal geometry. The scaling is performed by adding an additional ‘Dummy Dimension’ to the n-dimensional data, which helps plot two-dimensional component-wise straight lines on pairs of dimensions. The end result is a set of scaled extensions of observations in any of the 2^n spatial divisions, where n is the total number of applicable dimensions/dataset variables, created by shifting the hyperplane in n dimensions along the ‘Dummy Axis’. The derived scaling factor was found to be dependent on the coordinates of the common point of origin for diverging straight lines and the plane of extension, chosen on and perpendicular to the ‘Dummy Axis’, respectively. This result indicates the geometrical interpretation of a linear data transformation and hence, opportunities for a more informed choice of the factor ‘b’, based on a better choice of these coordinate values.
-
Public transport — another hotspot for sexual harassment
YourStory
According to the analysis carried out by Safecity to identify the reports pertaining to public transportation spaces, an alarming one-fifth of all the data collected are incidents that happen in a public transportation space of some kind. Although this data contained reports from multiple countries, barring a negligible percentage, the data entirely contains incidents collected from various parts of India. The team went ahead to split up the incidents on the basis of common modes of transport and some general terms related to transport using keyword separations.
Patents
-
Composite Risk Score for Cloud Software Deployments
Filed 18/671736
-
Intelligent Table Suggestion and Conversion for Text
Filed 17/524646
-
Computing System for Determining Quality of Virtual Machine Telemetry Data
Filed 17/064685
Courses
-
Algebra
-
-
Applied Linear Algebra and Big Data
APMTH 120
-
Applied Machine Learning
BST 263
-
Basics of Statistical Inference
BST 222
-
Computer Programming
-
-
Computing for Big Data
BST 262
-
Data Science 2 (Neural Networks and Deep Learning)
BST 261
-
Data Structures and Algorithms
CS 124
-
Epidemiology Methods
-
-
Graphs and Networks
-
-
Introduction to Data Science
BST 260
-
Introduction to Social and Biological Networks
BST 267
-
Operations Research
-
-
Optimisation
-
-
Probability and Statistics
-
-
Real Analysis
-
-
Reproducible Data Science
BST 270
Projects
-
Predicting ICD-9 codes from doctor's discharge notes
-
The project took doctor's discharge notes in text form and tried to predict the ICD-9 diagnosis codes from them. Word embeddings were created using the novel fastText model, and the modelling was done using a Recurrent Neural Network architecture composed primarily of GRU units. Only the top 5 ICD-9 codes and their text documents were used due to time and resource constraints, but the approach could be extended without much additional effort.
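The restriction to the top 5 codes can be pictured with a minimal sketch. It assumes that "top" means most frequent and that each note carries a single code; the function name `keep_top_k_codes` and the sample records are hypothetical, not taken from the project.

```python
from collections import Counter


def keep_top_k_codes(records, k=5):
    """Keep only (note, code) pairs whose code is among the k most frequent.

    Assumption: each record is a (note_text, icd9_code) pair with one code
    per note. The real project data may be structured differently.
    """
    counts = Counter(code for _, code in records)
    top_codes = {code for code, _ in counts.most_common(k)}
    return [(note, code) for note, code in records if code in top_codes]


# Hypothetical toy records, for illustration only.
records = [
    ("note a", "401.9"),
    ("note b", "428.0"),
    ("note c", "401.9"),
    ("note d", "250.00"),
    ("note e", "428.0"),
    ("note f", "401.9"),
]

# With k=2, the least frequent code ("250.00") is dropped.
subset = keep_top_k_codes(records, k=2)
```

Raising `k` would be the natural way to extend the subset once more time and compute are available.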
-
Predicting GPS location from Wi-Fi data
-
The project aimed to look at the Wi-Fi data collected from smartphones (Wi-Fi ID, signal strength, accuracy etc.) and use those as predictors of an individual's GPS location. In the realm of digital phenotyping, data from smartphones can be used to predict a person's mental state in advance, to enable faster response systems to anxiety attacks, depression etc. However, since grabbing the GPS data is more challenging than other kinds of data, there are usually far fewer data points for GPS coordinates than for the other kinds of data. Therefore, we created a basic model using artificial neural networks that took in Wi-Fi data and churned out GPS coordinates. The project was put on hold after initial modelling as we await more data.
-
Predicting virality of Mashable articles
-
Media companies like Mashable produce tens of thousands of articles per year, all with varying degrees of virality. The virality of the content produced is key to a media company’s profitability. An accurate model that could predict parameters that increase the virality of an article, specifically, the number of social shares it receives, would be extremely valuable.
We started with a base dataset containing meta-data of nearly 40,000 unique Mashable blog articles over the past 5 years. The meta-data includes 61 different attributes ranging from metrics like word counts to sentiment analysis. This dataset is hosted on the Machine Learning Repository from the Center for Machine Learning and Intelligent Systems at the University of California Irvine.
In addition to the meta-data, we were also interested in the actual article data, so we used a webscraper (using ‘rvest’) to collect the actual article titles, date published, author and article content. We joined these values into the base dataset.
The problem was first attempted using advanced regression but soon shifted to tree-based models, with the final model being an extreme gradient boosting model.
While we expected our task of predicting the number of shares for an article to be challenging, we now realize it is even more challenging than initially thought. For example, two articles discuss very popular tech products, Xbox and iPhone, and have similar values for key metrics identified in the Variable Importance table above. However, they have drastically different numbers of shares: 900 vs 197,000.
That said, we learned that it is possible to create a good model that utilizes a number of predictors to determine the number of shares of a given article.
The final boosted model had an MAE of 2724.32. In other words, the model's predicted share counts for Mashable articles differ from the actual counts by just over 2,700 shares on average, in either direction.
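As a rough illustration of what a mean absolute error (MAE) measures, the sketch below computes it by hand. The share counts are made-up values, not figures from the project, and the helper `mean_absolute_error` is written here for illustration rather than taken from the project's code.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical share counts, for illustration only.
actual_shares = [1000, 1500, 20000, 3200]
predicted_shares = [1500, 1300, 13000, 2900]

# Absolute errors are 500, 200, 7000 and 300.
mae = mean_absolute_error(actual_shares, predicted_shares)

# The largest single error is a different quantity from the MAE.
max_error = max(abs(a - p) for a, p in zip(actual_shares, predicted_shares))
```

In this toy data the MAE is 2000 while the largest single error is 7000, which shows that an MAE describes the typical miss and places no upper bound on the error for any one article.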
Honors & Awards
-
Karuna Majumdar Fellowship
Dr. Hasi Majumdar Venkatachalam MD
The Fellowship fund was created by Dr. Hasi Majumdar Venkatachalam MD as a monetary assistance to students from India.
-
Winner, Sales Forecasting Challenge, ZS India
ZS Associates
I was the solo winner across ZS India, in a team challenge of ~50 registered teams, to build models to forecast the sales for 183 different pharmaceutical products. The winning solution used an enhanced variant of ARIMA.
-
Winner, Insurance Premium Prediction Challenge, ZS Pune
ZS Associates
I was the solo winner at the Pune office for a national company-wide team prediction challenge to predict Medical Insurance Premiums, based on a wide variety of variables of multiple data types. The winning solution utilized generalized non-linear modelling, coupled with smoothing splines.
-
Finalist
Atlantic Council / StartupDosti
Finalist at the Indo-Pak Business Plan competition. Finale held April 24-30, 2014, in Thailand.
-
Finalist
Global Cooperative Challenge / SENStation
Finalist at the GCC Business Plan Competition, organised by SENStation.
-
Best Intern
MusicPerk
Test Scores
-
GRE
Score: 329
Quant : 170 / 170 (97th percentile)
Verbal : 159 / 170 (82nd percentile)
AWA : 4.5 / 6 (82nd percentile)
-
TOEFL
Score: 117
Reading : 30 / 30
Listening : 30 / 30
Speaking : 27 / 30
Writing : 30 / 30
Languages
-
Malayalam
Native or bilingual proficiency
-
English
Full professional proficiency
-
Hindi
Professional working proficiency
-
Tamil
Elementary proficiency
Recommendations received
5 people have recommended Abhijith
More activity by Abhijith
-
As many of you know, athletics has been a part of my life since I was a boy. I'm passionate about living my faith and fitness brand and I often get…
Liked by Abhijith Asok
-
Last month, I wrapped up an incredible chapter in Saarbrücken, Germany, at the INM-Leibniz Institute for New Materials, where I had the opportunity…
Liked by Abhijith Asok