Mehul Marvania

San Francisco Bay Area
13K followers 500+ connections

About

• Senior Machine Learning Engineer at Guidewire.

• Master of Science: Sept 2014 to…

Experience

  • Guidewire Software

    San Francisco Bay Area

  • -

    Greater Boston Area

  • -

    Greater Boston Area

  • -

    Greater Boston Area

  • -

    Boston, United States

  • -

    Boston

  • -

    Hazira, Gujarat

  • -

    Surat,India

Education

  • Harvard Extension School

    -

    MapReduce and Spark 2.0, Spark ML, PySpark, TensorFlow, Data Processing (AWS Kinesis and Kafka), Natural Language Processing

  • -

Projects

  • Big Data Analytics on Stock Trading Data (Spark 2.0, Hadoop, PySpark)

    - Present

    • Developed a Python program based on the Spark Streaming and Structured Streaming APIs to update the number of stocks traded every 3 seconds. Wrote a bash script to stream the stock data into a Hadoop directory and eliminated a race condition.
    • Implemented faster stateful stream processing in Apache Spark Streaming.

  • Data Warehouse and Business Intelligence for Contoso Retail

    • Created a 5.8 GB data warehouse in MS SQL Server for a fictional company called 'Contoso Retail'.
    • Integrated data for sales applications using the ETL tools Talend and SSIS.
    • Implemented error handling, load statistics, slowly changing dimensions, and currency conversion for the data integration process using Talend 6.1 and SSIS.
    • Used bulk output connections instead of standard output connections to reduce the data integration duration by 80%.
    • Analyzed data to answer specific business problems by creating BI reports and dashboards using Tableau, QlikSense, and Excel PowerPivot.

  • CNN - COVID19 detection system using X-Rays

    -

    Built a sequential CNN model to detect COVID-19 infection in patients from X-ray images.

  • Natural Language Processing - document summarization

    -

    Implemented document summarization techniques (LDA and KL) in Python, achieving up to an 80% match with the actual summaries.

  • Generation of Cellular Automata (Java)

    -

    • Developed an applet that generated cellular automaton patterns according to the input selected by the user.
    • Demonstrated multi-threading and concurrent programming techniques that let the user play, pause, and stop the thread, with the pattern generated on screen responding accordingly.
    • Explored the Java Swing library and used it to provide a layer of abstraction between the code structure and the Swing-based graphical user interface (GUI).

  • Machine Learning projects (R and Python)

    -

    1. Survival prediction of Titanic passengers (Kaggle.com)
     • Used random forest and bagging ensemble algorithms to predict the survival of Titanic passengers. Worked in Python (NumPy and Pandas packages) as well as R to carry out data analysis. Achieved 81% accuracy.
    2. Filtering mobile phone spam messages – Naive Bayes
     • Filtered spam messages using the Naive Bayes algorithm in R with the text mining package 'tm'.
     • Achieved 83% accuracy in filtering spam messages.

  • Collecting, Storing and Retrieving historical Metoffice UK weather data (R and MongoDB)

    -

    • Built my own web scraper in R to scrape historical weather data (years 1885-2015) from the UK Government website.
    • Used MongoDB to store the large dataset, added indexes, and retrieved data from the database using queries.
    • Applied various data wrangling techniques in R to clean the data and make it suitable for further analysis.

  • Design of database to monitor and predict the spread of the recent Ebola outbreak (MySQL)

    -

    • Created a relational database in MySQL to study and predict the spread of Ebola virus disease.
    • Normalized the database to 3NF.
    • Wrote and executed PL/SQL queries to create tables, load data, and retrieve information at different levels.
    • Denormalized the model and added various indexes, improving query response time by 20%.
    • Generated test data for 1000 reported cases using R and MS Excel.
    • Used the data to test various database objects, such as stored procedures, triggers, views, and indexes.
    • Created various users and assigned them privileges to improve the security of the database.
    • Proposed a backup schedule.

  • Forecasting stock movement using R (Time Series Analysis)

    -

    • Used sentiment analysis to identify the effect of news and events on the stock movement trend.
    • Predicted the Apple Inc. (AAPL) time series using ARIMA and linear regression in RStudio.

  • Hypothesis Statistical Data analysis for Profit Margins (Probability and Statistics)

    -

    Helped a manufacturing organization identify its most profitable product by performing statistical analysis on profit margins.
    Used Minitab to analyze the last 2 years' data and forecast the profit for the next year.

  • Performance Analysis Methodology for Parabolic Dish Solar Concentrators (Indian Ministry of New and Renewable Energy- Sponsored)

    -

    Suggested and reinforced two improvements that increased the efficiency of the solar system by 1.2%.
    Compiled the past 11 years' solar resource data and prepared an efficiency model in MS Excel to predict the performance of the system.
    Tested the viability of the dish concentrator technology, which was being piloted for the first time in India.
    Analyzed and conducted a cost evaluation for a biomass-fueled boiler system worth $50K.

  • Performance Analysis Methodology for Parabolic Dish Solar Concentrators for Process Heating Using Thermic Fluid

    -

    A research project by final-year B.E. (Mechanical Engineering) students on developing a performance analysis methodology for an MNRE-sponsored parabolic dish solar concentrator system used to heat thermic fluid for process heating at Universal Medicap Ltd., Vadodara.


Recommendations received

4 people have recommended Mehul
