Presentation Slides
CLASSIFICATION
CHIH-WEI CHANG & BENJAMIN DORAN
OVERVIEW
• Dataset
• What is the data
• Feature extraction
• Models
• Types
• Variability of data
• Overfitting
• Accuracy
MODELS
• We tested:
• Naive Bayes
• Random forest
• SVM
• Deep Neural Networks
• Recurrent Neural Networks
• Convolutional Neural Networks
• Libraries used:
• scikit-learn
• TensorFlow
• Librosa (for handling .wav files)
The UrbanSound Dataset
• Created by Justin Salamon, Christopher Jacoby, and Juan Pablo Bello
• Contains 8732 labeled sound excerpts (shorter than 4 s) of real field-recorded urban
sounds from 10 classes: (1) air conditioner, (2) car horn, (3) children playing, (4) dog
bark, (5) drilling, (6) engine idling, (7) jackhammer, (8) gun shot, (9) siren, and (10)
street music.
• Our 3 approaches:
Approach 1: Extract “characteristics” from each sound clip, so that the number of
characteristics is independent of the original data shape.
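One way to picture Approach 1 is the sketch below. It is an illustration under assumptions, not the method used in the slides: the per-frame matrix would normally come from a feature extractor such as Librosa, but random numbers stand in for it here, and the choice of 40 coefficients and of mean plus standard deviation as the summary statistics is hypothetical. The function name `summarize_over_time` is also made up for this example.

```python
import numpy as np

def summarize_over_time(frames: np.ndarray) -> np.ndarray:
    """Collapse a (n_coefficients, n_frames) matrix into one fixed-length vector.

    Assumption: mean and standard deviation over the time axis are used as
    the "characteristics"; the slides do not say which statistics were used.
    """
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

# Stand-in data: two clips with the same number of coefficients (40, assumed)
# but different numbers of time frames, as happens with clips of different length.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(40, 50))
long_clip = rng.normal(size=(40, 173))

short_vec = summarize_over_time(short_clip)
long_vec = summarize_over_time(long_clip)

# Both vectors have 80 entries regardless of clip length.
print(short_vec.shape, long_vec.shape)
```

Because the time axis is summarized away, every clip yields a vector of the same length, which is what lets a single 2D feature matrix be built for all clips.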
• scikit-learn only accepts 2D input, so we had to flatten features such as MFCCs,
which produce additional rows per sample.
• Neither scikit-learn nor TensorFlow could take variable-sized input directly, so we often
needed zero padding.
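The two constraints above can be sketched together as follows. This is a minimal illustration, assuming each clip is already a (n_coefficients, n_frames) NumPy array; the helper name `pad_and_flatten` and the target length `n_frames` are hypothetical, and truncating clips longer than the target is a choice made for this sketch rather than something stated in the slides.

```python
import numpy as np

def pad_and_flatten(clips, n_frames):
    """Zero-pad (or truncate) each clip to n_frames, then flatten to one row.

    clips: list of arrays shaped (n_coefficients, frames_i), where frames_i varies.
    Returns a 2D array shaped (n_clips, n_coefficients * n_frames).
    """
    n_coeffs = clips[0].shape[0]
    out = np.zeros((len(clips), n_coeffs * n_frames))
    for i, matrix in enumerate(clips):
        trimmed = matrix[:, :n_frames]             # drop frames past the target length
        padded = np.zeros((n_coeffs, n_frames))    # zeros fill any missing frames
        padded[:, :trimmed.shape[1]] = trimmed
        out[i] = padded.ravel()                    # one flat row per clip
    return out

# Toy example: 2 coefficients per frame, clips with 3 and 5 frames.
clips = [np.ones((2, 3)), np.ones((2, 5))]
X = pad_and_flatten(clips, n_frames=4)
print(X.shape)  # (2, 8)
```

The result is a single 2D matrix with one row per clip, the shape that scikit-learn estimators expect for their input.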
ACCURACY
• const_MFCC: Random Forest 52%, SVM 55%, NB 20%
• feature193: Random Forest 61%, SVM 60%, NB 24%
OVERFITTING