  • Tallahassee, Florida

notabelardoriojas/README.md

Hello! You've found your way to my GitHub portfolio

Here you'll find a bunch of Data Science projects that showcase my skills!

Environmental Sound Classification 🔊🎧

  • Investigating Different Spectrograms and Audio Augmentation Methods on Convolutional Learning: Medium, GitHub
  • Investigated the use of four different types of audio spectrograms on training: Mel Spectrograms, Mel Frequency Cepstral Coefficients, Tempograms, and Chromagrams.
  • Investigated stacking spectrograms on top of each other to produce 2D, 3D, and 4D images for training.
  • Investigated the use of different audio augmentations on training, and different proportions of audio augmentation on the training set.
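The idea of augmenting only a proportion of the training set can be sketched in plain Python. This is a minimal illustration, not the code from the project: the README does not say which augmentations were used, so the two transforms here (additive noise and a circular time shift) and the function names `add_noise`, `time_shift`, and `augment_proportion` are assumptions.

```python
import random


def add_noise(signal, noise_level, rng):
    """Return a copy of the signal with uniform noise added to every sample."""
    return [x + rng.uniform(-noise_level, noise_level) for x in signal]


def time_shift(signal, shift):
    """Circularly shift the signal to the right by `shift` samples."""
    if not signal:
        return []
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]


def augment_proportion(dataset, proportion, seed=0):
    """Augment a random `proportion` of (signal, label) pairs.

    The augmented copies are appended to the original data, so labels are
    preserved and the original clips are left untouched.
    Each signal must contain at least two samples.
    """
    rng = random.Random(seed)
    n_aug = round(len(dataset) * proportion)
    chosen = rng.sample(range(len(dataset)), n_aug)
    augmented = []
    for i in chosen:
        signal, label = dataset[i]
        # Pick one of the two hypothetical transforms at random.
        if rng.choice(["noise", "shift"]) == "noise":
            new_signal = add_noise(signal, 0.05, rng)
        else:
            new_signal = time_shift(signal, rng.randrange(1, len(signal)))
        augmented.append((new_signal, label))
    return dataset + augmented


# Toy data: ten short "clips" with made-up labels, half of them augmented.
toy_dataset = [([0.1 * i, 0.2, -0.3, 0.4], i % 2) for i in range(10)]
expanded = augment_proportion(toy_dataset, proportion=0.5)
```

Varying `proportion` between 0 and 1 is one way to compare how much augmentation helps before it starts to hurt.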

Pokemon vs Sklearn: Predicting 50,000 Battles with 8 Different Classifiers 👾🤖

  • Engineered features from a dataset of 800 Pokemon and 50,000 battles to predict the winner using 8 different classifiers
  • Classifiers tested: Logistic Regression, Decision Tree, Random Forest, XGBoost, Gaussian Naive Bayes, Support Vector Machines, Stochastic Gradient Descent Classifier, K-Nearest Neighbor Classifier
  • Used Hyperopt to tune the hyperparameters of the best-performing classifier (XGBoost), reaching 98.5% test accuracy

Artwork Style Prediction: Using Vision Transformers with Shifted Patch Tokenization and Locality Self Attention 🖼

  • Classified 7,000+ art pieces into five classes: drawing, painting, iconography, engraving, and sculpture.
  • Used a Vision Transformer adapted for training on small datasets by implementing shifted patch tokenization and locality self-attention, which force the attention module to focus more on inter-token relations.
  • Reached 82% test accuracy with this method.
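The core of locality self-attention can be shown on a tiny score matrix. This is a sketch in plain Python rather than the project's model code: it masks each token's relation to itself (the diagonal) and divides the scores by a temperature before the softmax. In a real model the temperature would be a learned parameter; here it is an ordinary argument, and the function name is an assumption.

```python
import math


def locality_self_attention_weights(scores, temperature):
    """Softmax each row of a square score matrix after masking its diagonal.

    scores[i][j] is the raw similarity between token i (query) and token j
    (key). The matrix must be at least 2 x 2.
    """
    weights = []
    for i, row in enumerate(scores):
        # Diagonal masking: drop the token's relation to itself so that
        # all attention mass goes to the other tokens.
        scaled = [s / temperature for j, s in enumerate(row) if j != i]
        # Numerically stable softmax over the remaining entries.
        peak = max(scaled)
        exps = [math.exp(s - peak) for s in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        # Put an explicit zero back on the diagonal.
        probs.insert(i, 0.0)
        weights.append(probs)
    return weights


toy_scores = [
    [5.0, 1.0, 1.0],
    [0.0, 9.0, 0.0],
    [1.0, 2.0, 3.0],
]
soft = locality_self_attention_weights(toy_scores, temperature=1.0)
sharp = locality_self_attention_weights(toy_scores, temperature=0.5)
```

Even though the diagonal holds the largest score in every row of `toy_scores`, it receives zero weight, and the lower temperature in `sharp` concentrates the remaining weight on the strongest inter-token relation.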

Data Visualization in R (various projects) 👨🏽‍💻📈

  • Projects:
      • Age differences between male and female Olympic gymnasts who did or did not earn a medal, and how the age distribution changed over the years.
      • Is GPA related to student income, the father’s educational level, or the student’s perception of what an ideal diet is? Visualization and ANOVA.
      • How do taxa differ in expected gestation length, litter size, age of the mother and father at conception, and weight? PCA.
  • Homework assignments:
      • Chicken weights vs type of feed
      • Highway fuel economy versus number of cylinders in cars, and the distribution of each car’s city fuel economy by class and type of drive train, with boxplots and ridgelines
      • Popularity of college majors time series (growth vs decline) and Texas housing data pie charts
      • Linear trendlines for animal vore types (carnivore, herbivore, etc.), weight vs amount of sleep

Abstract Data Types/Data Structures 💻💾 (Private Repo, see projects document for access)

  • Impossible Hangman: Full game implementation, impossible to win because after each guess the computer keeps the word list that maximizes the number of possible words.
  • Impossible Boggle: Full game implementation with Tries and Depth First Search. Guess words in an N-by-N grid of letters; the computer prints out all possible words.
  • Treaps: Self-balancing Binary Search Tree that maintains both BST properties and heap properties using randomly assigned priorities.
  • B+ Trees: Implementation of a B+ tree, an m-ary tree well suited to storing large amounts of data on disk, with traversal that minimizes the number of disk reads.
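As an illustration of the treap idea, here is a minimal insertion sketch in Python. The original repository is private, so this is not its code: the names `Node`, `insert`, `in_order`, and `is_max_heap` are assumptions, and only insertion is shown (no deletion or search).

```python
import random


class Node:
    def __init__(self, key, priority):
        self.key = key
        self.priority = priority
        self.left = None
        self.right = None


def rotate_right(node):
    """Lift the left child above `node`, keeping BST order."""
    left = node.left
    node.left = left.right
    left.right = node
    return left


def rotate_left(node):
    """Lift the right child above `node`, keeping BST order."""
    right = node.right
    node.right = right.left
    right.left = node
    return right


def insert(node, key, rng=random):
    """Insert `key` as in a BST, then rotate to restore the max-heap on priorities."""
    if node is None:
        return Node(key, rng.random())
    if key < node.key:
        node.left = insert(node.left, key, rng)
        if node.left.priority > node.priority:
            node = rotate_right(node)
    elif key > node.key:
        node.right = insert(node.right, key, rng)
        if node.right.priority > node.priority:
            node = rotate_left(node)
    return node  # duplicate keys are ignored


def in_order(node):
    """Return the keys in sorted order if the BST property holds."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def is_max_heap(node):
    """Check that no child has a higher priority than its parent."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.priority > node.priority:
            return False
    return is_max_heap(node.left) and is_max_heap(node.right)


# Sorted input would degenerate a plain BST into a linked list;
# random priorities keep the treap balanced in expectation.
rng = random.Random(1)
root = None
for key in range(200):
    root = insert(root, key, rng)
```

The rotations only ever swap a parent with one child, which is why they can fix the heap order without disturbing the in-order sequence of keys.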

Tableau Sales Insights Dashboard 💸 📈

  • Used SQL to extract, transform, and load data from a database of sales transactions, customer accounts, markets, and products for a fictional online store, AlitQ.
  • Revenue analysis: revenue by market, sales quantity by market, top 5 products and customers, and revenue by year.
  • Profit analysis: Profit by market, profit trend, customers table, customer type (E-commerce vs brick and mortar).
  • Fully interactive: filter by market, year, quarter, month, customer type, etc.

COVID-19 OWID Dataset SQL and Tableau Analysis 😷📈

  • Used MySQL to create views from the COVID-19 Dataset supplied by Our World In Data
  • Built an interactive dashboard in Tableau to visualize the results.
  • Fully Vaccinated % by country over time, highest infection rate, total cases by country over time, deaths by continent, global numbers.
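A view such as "highest infection rate" might look like the sketch below. To keep it runnable without a database server it uses Python's built-in `sqlite3` instead of MySQL, and the table name `covid_cases`, its columns, and the four rows are made-up placeholders, not the real OWID schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical table; the real OWID export has many more columns.
    CREATE TABLE covid_cases (
        location TEXT,
        continent TEXT,
        date TEXT,
        population INTEGER,
        total_cases INTEGER
    );

    -- Toy rows for illustration only.
    INSERT INTO covid_cases VALUES
        ('A', 'X', '2021-01-01', 1000, 10),
        ('A', 'X', '2021-01-02', 1000, 50),
        ('B', 'X', '2021-01-01', 2000, 20),
        ('B', 'X', '2021-01-02', 2000, 40);

    -- One row per location: peak case count and share of population infected.
    CREATE VIEW highest_infection_rate AS
    SELECT location,
           population,
           MAX(total_cases) AS highest_case_count,
           MAX(total_cases) * 100.0 / population AS percent_infected
    FROM covid_cases
    GROUP BY location, population;
    """
)

rows = conn.execute(
    "SELECT location, highest_case_count, percent_infected "
    "FROM highest_infection_rate ORDER BY percent_infected DESC"
).fetchall()
```

Keeping each dashboard metric in its own view means a visualization tool can read a small, pre-aggregated result instead of the full table.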

Data Cleaning Exercise - Nashville Housing Market Dataset 🧼🏡

  • Straightforward data cleaning exercise on data from the Nashville housing market.

ArcGIS Map 🗺🌀

  • Used ArcGIS to generate a map of hurricane evacuation routes and median household income by ZIP code in Leon County
  • Are our current hurricane evacuation measures underserving low-income communities? Decide for yourself on my ArcGIS web app.

Alteryx Exercise 👨🏽‍💻🏋🏽

  • Used Alteryx to clean a dataset of Sales Opportunities

Musical Genre Classification 🎧 🎨

  • Used Convolutional Neural Networks in TensorFlow to predict musical genres from audio files with multilabel classification
  • Created and cleaned a dataset of 9,000+ ethically sourced audio files using the Spotify, Deezer and StreamRip APIs.
  • Studied the effects of audio data augmentation, sample length, type of spectrogram, and use of delta features as channels embedded in the image on test and validation accuracy.
  • Made a cool little video showing off what this model can do on some of my favorite songs.

School Projects and Assignments 📚

  • Double majored in Scientific Computing and Statistics; graduated from Florida State University in Spring '22. Go Noles!
  • 2018: Computational Thinking, Introduction to Scientific Computing
  • 2019: Discrete Algorithms, Programming for Scientific Applications, Symbolic and Numerical Computations
  • 2020: Continuous Algorithms I and II, Data Mining, Introduction to Deep Learning
  • 2021: Computational Evolutionary Biology
  • 2022: Applied Machine Learning, High Performance Computing

Popular repositories

  1. music-genre-classification (Public)

    Using CNNs to predict music genres from audio files.

    Python 1

  2. Environmental-Sound-Classification (Public)

    Testing different spectrograms and augmentation methods on convolutional learning for audio classification.

    Jupyter Notebook 1

  3. notabelardoriojas (Public)

    Config files for my GitHub profile.

  4. Covid19-SQL-Tableau-Analysis (Public)

    Data exploration using MySQL on the Our World in Data COVID-19 Dataset. Data Visualization done in Tableau Desktop.

  5. Tableau-Sales-Insights (Public)

  6. Rapper-Face-Classification (Public)

    Predicting rappers' faces from a picture.

    Jupyter Notebook