About
I am a scientist, a creator, and a builder.
I believe in data, innovation, and…
Experience
Education
Volunteer Experience
Publications
-
Machine Learning and Rule-based Approaches to Assertion Classification
Journal of the American Medical Informatics Association (JAMIA)
The authors study two approaches to assertion classification. One of these approaches, Extended NegEx (ENegEx), extends the rule-based NegEx algorithm to cover alter-association
assertions; the other, Statistical Assertion Classifier (StAC), presents a machine learning solution to assertion
classification. The StAC models that are developed on discharge summaries can be successfully applied to
radiology reports. These models benefit the most from words found in the ± 4 word window of the target and
can outperform ENegEx.
Patents
-
Context-sensitive salient keyword unit surfacing for multi-language survey comments
Issued US US20200311203A1
Courses
-
Entrepreneurship Essentials
Harvard HBX
-
Foundations of User Experience (UX) Design
Coursera
-
Information Retrieval and Web Search
CS276 (Stanford)
-
Introduction to Innovation and Entrepreneurship
XMS&E100 (Stanford SCPD)
-
Introduction to Statistics
EDP 857615 (Berkeley)
-
Machine Learning
CS229 (Stanford)
-
Mining Massive Data Sets
CS246 (Stanford)
-
Modern Applied Statistics: Data Mining
STATS315B (Stanford)
-
Statistical Methods in Finance
STATS240P (Stanford)
Projects
-
Predicting Type II Diabetes Diagnosis from EHR Data
-
This project applies several machine learning techniques (gradient boosted trees, neural networks, and support vector machines) to EHR data to predict type II diabetes diagnoses. The data was supplied by Practice Fusion through their Kaggle project.
Languages
-
Mandarin
Native or bilingual proficiency
-
English
Native or bilingual proficiency