
Reg. No: 210097
Name: Sehrish Rafique
Submitted To: Dr. Ehatisham
Date: December 20th, 2021

Implementing Bag of Visual words for Object Recognition


Task Scenario
Given an input image containing a cup, saucer, bottle, etc., the task is to recognize which
objects are contained in the image.
The approach follows four simple steps:
- Determine the image features for each labelled image
- Construct a visual vocabulary by clustering the features, followed by frequency analysis
- Classify images based on the generated vocabulary
- Obtain the most likely class for a query image

Dataset
The dataset contains images of four types of objects:

- Soccer Ball
- Accordion
- Dollar Bill
- Motorbike

Implementation
Let’s begin with a few introductory concepts required for the Bag of Visual Words model. We shall cover four parts:
- Clustering
- Bag of Visual Words Model
- Generating Vocabulary
- Training and testing

We begin with k-means clustering. Suppose there are n objects that are to be divided into K
clusters. The input is a set of feature vectors X = {x1, x2, ..., xn}. The goal is to assign each
point in the scatter cloud to a cluster so that the total distance between each point and its
assigned centroid is minimized.
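Formally, with centroids μ1, ..., μK and cluster assignments S = {S1, ..., SK}, the k-means objective is the within-cluster sum of squared distances:
```
\underset{S}{\arg\min} \; \sum_{k=1}^{K} \sum_{x_i \in S_k} \lVert x_i - \mu_k \rVert^2
```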
Code Snippet
```
import numpy as np
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# create a sample dataset to test KMeans, using make_blobs
n_samples = 1000
n_features = 5
n_clusters = 3

# X: generated samples, shape (n_samples, n_features)
# y: integer labels for the true cluster membership of each sample
X, y = make_blobs(n_samples=n_samples, n_features=n_features, centers=n_clusters)

# perform KMeans clustering; ret holds the predicted cluster of each sample
ret = KMeans(n_clusters=n_clusters).fit_predict(X)
print(ret)

# plot the first two feature dimensions, before and after clustering
_, ax = plt.subplots(2)
ax[0].scatter(X[:, 0], X[:, 1])
ax[0].set_title("Initial Scatter Distribution")
ax[1].scatter(X[:, 0], X[:, 1], c=ret)
ax[1].set_title("Colored Partition denoting Clusters")
plt.show()
```
Every image has certain discernible features, patterns from which humans decide what the
perceived object is.

Output: two scatter plots, "Initial Scatter Distribution" vs. "Colored Partition denoting Clusters" (figure omitted).


Bag of Visual Words
This is a supervised learning model: there is a training set and a testing set.

It has the following modules:

- Bag.py → Contains the main functions.
- Helpers.py → Contains the helper classes: ImageHelpers (colour-scheme conversion and feature detection), FileHelper, and BOVHelpers.

FileHelper returns a dictionary pairing each object name with a list of all of its images, i.e.
key = object_name and value = list of images, along with the total image count.

```
import cv2
from glob import glob

class FileHelper:
    def getFiles(self, path):
        """
        - returns a dictionary of all files,
          having key => value as objectname => list of images
        - returns total number of files
        """
        imlist = {}
        count = 0
        for each in glob(path + "*"):
            word = each.split("/")[-1]
            print(" #### Reading image category ", word, " ##### ")
            imlist[word] = []
            for imagefile in glob(path + word + "/*"):
                print("Reading file ", imagefile)
                # read each image in grayscale
                im = cv2.imread(imagefile, 0)
                imlist[word].append(im)
                count += 1
        return [imlist, count]
```
The primary function of ImageHelpers is to provide the SIFT features present in an image.
We require these image features to develop our vocabulary.

```
import cv2

class ImageHelpers:
    def __init__(self):
        # on OpenCV >= 4.4, SIFT is also available in the main module as
        # cv2.SIFT_create(); older builds use the contrib module as below
        self.sift_object = cv2.xfeatures2d.SIFT_create()

    def gray(self, image):
        # convert a BGR image to grayscale
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        return gray

    def features(self, image):
        # detect SIFT keypoints and compute their 128-dimensional descriptors
        keypoints, descriptors = self.sift_object.detectAndCompute(image, None)
        return [keypoints, descriptors]
```
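A minimal usage sketch (the file name sample.jpg is a hypothetical placeholder, not part of the assignment code):

```
import cv2

im_helper = ImageHelpers()
# read a test image (hypothetical path) and extract its SIFT features
image = cv2.imread("sample.jpg")
gray = im_helper.gray(image)
kp, des = im_helper.features(gray)
print(len(kp), "keypoints; descriptor array shape:", des.shape)
```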

How to develop the visual vocabulary?

A visual word is simply anything that can be used to describe an image. Thus, an image becomes a
combination of visual words (which are essentially features).

We define this structure as a histogram. Essentially, a histogram is just a measure of the frequency
of occurrence of a particular item. Here, in our case, we describe each image as a histogram of
features: how often each word of the total vocabulary occurs in the image the computer is looking at.
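As a sketch of this idea, given hypothetical cluster assignments for the features of one image, the histogram is just a count of how often each visual word index occurs:

```
import numpy as np

n_clusters = 5  # vocabulary size, assumed for illustration
# hypothetical visual-word index assigned to each feature of one image
assignments = np.array([0, 2, 2, 4, 1, 2, 0])
# histogram: frequency of each visual word in the image
histogram = np.bincount(assignments, minlength=n_clusters)
print(histogram)  # -> [2 1 3 0 1]
```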

Linking vocabulary and clustering:

Using SIFT, we detect and compute features inside each image. SIFT returns an m×128-dimensional
array, where m is the number of features extracted. Similarly, for multiple images, say 1000 images,
we obtain a list

feature_0, feature_1, ..., feature_N

where each feature_i is an array of dimension m×128 (m varies from image to image).
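Before clustering, these per-image arrays must be stacked into a single (total features)×128 array. The trainModel method below calls a helper named formatND for this; its body is not shown in this report, but a minimal sketch of what it presumably does is:

```
import numpy as np

def formatND(descriptor_list):
    # vertically stack the per-image (m_i x 128) descriptor arrays
    # into one (sum of m_i) x 128 array, ready for k-means
    return np.vstack(descriptor_list)
```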

Developing Vocabulary

Each cluster denotes a particular visual word. Every image can be represented as a combination of
multiple visual words. The best method is to generate a histogram that contains the frequency of
occurrence of each visual word. Thus the vocabulary comprises a set of histograms encompassing
the descriptions of all images.
```

def developVocabulary(self, n_images, descriptor_list, kmeans_ret=None):
    # one histogram row per image, one column per visual word (cluster)
    self.mega_histogram = np.array(
        [np.zeros(self.n_clusters) for i in range(n_images)])
    old_count = 0
    for i in range(n_images):
        l = len(descriptor_list[i])
        for j in range(l):
            # look up the cluster assigned to the j-th feature of image i
            if kmeans_ret is None:
                idx = self.kmeans_ret[old_count + j]
            else:
                idx = kmeans_ret[old_count + j]
            self.mega_histogram[i][idx] += 1
        old_count += l
    print("Vocabulary Histogram Generated")

```

As seen, the inputs are n_images, i.e. the total number of images, and descriptor_list, which
contains the feature descriptor arrays (the full stacked-up list of features discussed above). Our
histogram is therefore of size n_images×n_clusters, thereby defining each image in terms of the
generated vocabulary.

Training the machine to understand the images using SVM

The following method drives the training of the entire bag of visual words model.

```

def trainModel(self):
    # read files and prepare the file lists
    self.images, self.trainImageCount = self.file_helper.getFiles(self.train_path)

    # extract SIFT features from each image
    label_count = 0
    for word, imlist in self.images.items():
        self.name_dict[str(label_count)] = word
        print("Computing Features for ", word)
        for im in imlist:
            # cv2.imshow("im", im)
            # cv2.waitKey()
            self.train_labels = np.append(self.train_labels, label_count)
            kp, des = self.im_helper.features(im)
            self.descriptor_list.append(des)
        label_count += 1

    # stack all descriptors and perform clustering
    bov_descriptor_stack = self.bov_helper.formatND(self.descriptor_list)
    self.bov_helper.cluster()
    self.bov_helper.developVocabulary(n_images=self.trainImageCount,
                                      descriptor_list=self.descriptor_list)

    # show vocabulary trained
    # self.bov_helper.plotHist()

    # standardize the histograms and train the SVM classifier
    self.bov_helper.standardize()
    self.bov_helper.train(self.train_labels)

```
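The cluster, standardize, and train helpers of BOVHelpers are not reproduced in this report. Since recognize() below relies on self.bov_helper.kmeans_obj.predict, self.bov_helper.scale.transform, and self.bov_helper.clf.predict, and the model uses k-means plus an SVM, a plausible minimal sketch of these helpers is as follows (attribute names such as descriptor_vstack are assumptions, not taken from the original code):

```
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def cluster(self):
    # run k-means on the stacked descriptor array to learn the visual words
    self.kmeans_obj = KMeans(n_clusters=self.n_clusters)
    self.kmeans_ret = self.kmeans_obj.fit_predict(self.descriptor_vstack)

def standardize(self):
    # scale each visual-word column to zero mean and unit variance
    self.scale = StandardScaler().fit(self.mega_histogram)
    self.mega_histogram = self.scale.transform(self.mega_histogram)

def train(self, train_labels):
    # fit an SVM on the standardized vocabulary histograms
    self.clf = SVC()
    self.clf.fit(self.mega_histogram, train_labels)
    print("Training completed")
```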

The following method recognizes a single image; it can also be used on its own.

```

def recognize(self, test_img, test_image_path=None):
    # extract SIFT features from the test image
    kp, des = self.im_helper.features(test_img)
    print(des.shape)

    # generate the vocabulary histogram for the test image
    vocab = np.array([[0 for i in range(self.no_clusters)]])

    # locate the nearest cluster for each visual word (feature) present
    # in the image; test_ret holds the nearest cluster index per feature
    test_ret = self.bov_helper.kmeans_obj.predict(des)

    for each in test_ret:
        vocab[0][each] += 1
    print(vocab)

    # scale the features
    vocab = self.bov_helper.scale.transform(np.atleast_2d(vocab))

    # predict the class of the image
    lb = self.bov_helper.clf.predict(vocab)
    # print("Image belongs to class :", self.name_dict[str(int(lb[0]))])
    return lb

```

The following snippet tests the trained classifier: it reads all images from the testing path and
uses the BOVHelpers predict() function (via recognize()) to obtain the class of each image.

```

self.testImages, self.testImageCount = self.file_helper.getFiles(self.test_path)

predictions = []
for word, imlist in self.testImages.items():
    print("processing ", word)
    for im in imlist:
        print(im.shape)
        # recognize() returns the predicted label array for this image
        cl = self.recognize(im)
        print(cl)
        predictions.append({
            'image': im,
            'class': cl,
            'object_name': self.name_dict[str(int(cl[0]))]
        })

```

For evaluation and plotting:


```

def predict(self, iplist):
    predictions = self.clf.predict(iplist)
    return predictions

def plotHist(self, vocabulary=None):
    print("Plotting histogram")
    if vocabulary is None:
        vocabulary = self.mega_histogram

    # total frequency of each visual word across all images
    x_scalar = np.arange(self.n_clusters)
    y_scalar = np.array([abs(np.sum(vocabulary[:, h], dtype=np.int32))
                         for h in range(self.n_clusters)])
    print(y_scalar)

    plt.bar(x_scalar, y_scalar)
    plt.xlabel("Visual Word Index")
    plt.ylabel("Frequency")
    plt.title("Complete Vocabulary Generated")
    plt.xticks(x_scalar + 0.4, x_scalar)
    plt.show()

```
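The confusion matrix shown in the output is not computed by the code above. A minimal sketch of how it could be produced from the test predictions, assuming sklearn is available and that the ground-truth label of each test image is collected during the test loop (both assumptions, not in the original code):

```
from sklearn.metrics import confusion_matrix

classes = ['Accordion', 'Dollar Bill', 'Motorbike', 'Soccer Ball']
# hypothetical example lists; in practice, collect true_labels while
# iterating the test set and predicted_labels from each recognize() call
true_labels = ['Accordion', 'Motorbike', 'Soccer Ball', 'Dollar Bill']
predicted_labels = ['Accordion', 'Motorbike', 'Soccer Ball', 'Soccer Ball']
cm = confusion_matrix(true_labels, predicted_labels, labels=classes)
print(cm)
```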

Output: sample detections ("Accordion detected", "Bike detected"), the vocabulary histogram, and the confusion matrix (figures omitted).
