BOW Assignment 210097
No: 210097
Name: Sehrish Rafique
Submitted To: Dr. Ehatisham
Date: December 20th, 2021
Dataset
The dataset contains the images of 4 types of objects:
- Soccer Ball
- Accordion
- Dollar Bill
- Motorbike
Implementation
Let’s begin with a few introductory concepts required for the Bag of Words model. We shall cover four parts:
- Clustering
- Bag of Visual Words Model
- Generating Vocabulary
- Training and testing
Beginning with KMeans clustering: suppose there are n objects that are to be divided into K clusters. The input is a set of feature vectors X = {x1, x2, ..., xn}. The goal is to minimize the distance between each point in the scatter cloud and its assigned centroid; formally, KMeans minimizes the within-cluster sum of squares J = Σ_k Σ_{xi ∈ Ck} ||xi − μk||², where μk is the centroid of cluster Ck.
Code Snippet
```
import numpy as np
from matplotlib import pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
# create a dataset sample space that will be used
# to test KMeans. Use function : make_blobs
n_samples = 1000
n_features = 5
n_clusters = 3
X, y = make_blobs(n_samples=n_samples, n_features=n_features, centers=n_clusters)
# X: generated samples, array of shape [n_samples, n_features]
# y: integer labels for cluster membership of each sample, shape [n_samples]
# performing KMeans clustering
ret = KMeans(n_clusters = n_clusters).fit_predict(X)
print(ret)
__, ax = plt.subplots(2)
ax[0].scatter(X[:,0], X[:,1])
ax[0].set_title("Initial Scatter Distribution")
ax[1].scatter(X[:,0], X[:,1], c=ret)
ax[1].set_title("Colored Partition denoting Clusters")
# plt.scatter
plt.show()
```
Every image has certain discernible features and patterns from which humans decide what the perceived object is.
Output: the initial scatter distribution and the colored partition denoting the clusters.
FileHelper returns a dictionary with key = object_name and value = the list of all images for that object, along with the total number of images.
```
from glob import glob
import cv2

class FileHelper:

    def getFiles(self, path):
        """
        - returns a dictionary of all files
          having key => value as objectname => list of images
        - returns total number of files.
        """
        imlist = {}
        count = 0
        for each in glob(path + "*"):
            word = each.split("/")[-1]
            print("#### Reading image category", word, "####")
            imlist[word] = []
            for imagefile in glob(path + word + "/*"):
                print("Reading file", imagefile)
                im = cv2.imread(imagefile, 0)
                imlist[word].append(im)
                count += 1
        return [imlist, count]
```
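A minimal usage sketch (the "dataset/train/" path below is a placeholder; any directory with one sub-folder per object class works):
```
file_helper = FileHelper()
# "dataset/train/" is a placeholder; it must contain one sub-folder per object class
images, train_image_count = file_helper.getFiles("dataset/train/")
print(train_image_count, "images across", len(images), "classes:", list(images.keys()))
```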
ImageHelpers’ primary function is to provide the SIFT features present in an image.
We require these image features to develop our vocabulary.
```
class ImageHelpers:
    def __init__(self):
        # in OpenCV >= 4.4, SIFT is available directly as cv2.SIFT_create()
        self.sift_object = cv2.xfeatures2d.SIFT_create()
```
Using SIFT, we detect and compute the features inside each image. For a single image, SIFT returns an m×128-dimensional array, where m is the number of features extracted. Similarly, for multiple images, say 1000 images, we stack all of their descriptors into one large array of features (feature 0, feature 1, ..., feature N).
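As a rough, illustrative sketch of this step (the function and variable names below are not taken from the report), the descriptors of all training images could be extracted and stacked like so:
```
import cv2
import numpy as np

def extract_and_stack(images):
    """Extract SIFT descriptors per image and stack them into one N x 128 array."""
    sift = cv2.SIFT_create()  # cv2.xfeatures2d.SIFT_create() on older OpenCV builds
    descriptor_list = []
    for im in images:
        keypoints, descriptors = sift.detectAndCompute(im, None)  # descriptors: m x 128
        if descriptors is not None:
            descriptor_list.append(descriptors)
    # vertical stack: every feature from every image, ready for KMeans clustering
    return descriptor_list, np.vstack(descriptor_list)
```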
Developing Vocabulary
Each cluster denotes a particular visual word. Every image can be represented as a combination of multiple visual words. The best method is to generate a histogram that contains the frequency of occurrence of each visual word. Thus the vocabulary comprises a set of histograms encompassing the descriptors of all images.
```
def developVocabulary(self, n_images, descriptor_list, kmeans_ret=None):
    # one histogram row per image, one bin per visual word (cluster)
    self.mega_histogram = np.array([np.zeros(self.n_clusters) for _ in range(n_images)])
    old_count = 0
    for i in range(n_images):
        l = len(descriptor_list[i])
        for j in range(l):
            # use the stored clustering result unless one is passed in explicitly
            if kmeans_ret is None:
                idx = self.kmeans_ret[old_count + j]
            else:
                idx = kmeans_ret[old_count + j]
            self.mega_histogram[i][idx] += 1
        old_count += l
```
As seen, the inputs are n_images, i.e. the total number of images, and descriptor_list, which contains the feature descriptor arrays (the full stacked-up list of features discussed above). Our histogram is therefore of size n_images × n_clusters, thereby defining each image in terms of the generated vocabulary.
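As a side note, an equivalent way to fill one image's row of the histogram is np.bincount over that image's cluster assignments; the toy values below are illustrative only:
```
import numpy as np

# toy example: 2 images, 3 visual words (clusters)
n_clusters = 3
# cluster index assigned to each descriptor, image by image
assignments_per_image = [
    np.array([0, 2, 2, 1]),      # image 0 has 4 descriptors
    np.array([1, 1, 0, 0, 0]),   # image 1 has 5 descriptors
]
mega_histogram = np.array([
    np.bincount(a, minlength=n_clusters) for a in assignments_per_image
])
print(mega_histogram)  # shape (n_images, n_clusters)
# [[1 1 2]
#  [3 2 0]]
```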
This method contains the entire module required for training the bag of visual words model.
```
def trainModel(self):
    # read the training files: dictionary {object_name: [images]} and total count
    self.images, self.trainImageCount = self.file_helper.getFiles(self.train_path)
    # extract SIFT features from each image
    label_count = 0
    for word, imlist in self.images.items():
        self.name_dict[str(label_count)] = word
        print("Computing features for", word)
        for im in imlist:
            # cv2.imshow("im", im)
            # cv2.waitKey()
            self.train_labels = np.append(self.train_labels, label_count)
            # self.im_helper is the ImageHelpers instance: SIFT keypoints and m x 128 descriptors
            kp, des = self.im_helper.sift_object.detectAndCompute(im, None)
            self.descriptor_list.append(des)
        label_count += 1
    # perform clustering over the stacked descriptors
    bov_descriptor_stack = self.bov_helper.formatND(self.descriptor_list)
    self.bov_helper.cluster()
    self.bov_helper.developVocabulary(n_images=self.trainImageCount,
                                      descriptor_list=self.descriptor_list)
    # self.bov_helper.plotHist()
    self.bov_helper.standardize()
    self.bov_helper.train(self.train_labels)
```
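The standardize() and train() helpers called above are not shown in the report. A plausible minimal sketch, assuming sklearn's StandardScaler and a linear SVM (the actual classifier choice may differ), is:
```
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


class BOVHelpers:
    # ...clustering and developVocabulary methods as shown above...

    def standardize(self):
        # scale each visual-word bin to zero mean and unit variance
        self.scale = StandardScaler().fit(self.mega_histogram)
        self.mega_histogram = self.scale.transform(self.mega_histogram)

    def train(self, train_labels):
        # fit a classifier on the standardized histogram of every training image
        self.clf = SVC(kernel="linear")
        self.clf.fit(self.mega_histogram, train_labels)
```
The recognize() method below then reuses the fitted scale and clf objects to classify a single test image.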
```
def recognize(self, test_img):
    """Generate the visual-word histogram for a test image and predict its class."""
    kp, des = self.im_helper.sift_object.detectAndCompute(test_img, None)
    print(des.shape)
    # find the nearest visual word (cluster) for each descriptor in the image
    test_ret = self.bov_helper.kmeans_obj.predict(des)
    # build the image's histogram over the vocabulary
    vocab = np.zeros((1, self.bov_helper.n_clusters))
    for each in test_ret:
        vocab[0][each] += 1
    print(vocab)
    # scale with the same scaler used in training, then classify
    vocab = self.bov_helper.scale.transform(np.atleast_2d(vocab))
    lb = self.bov_helper.clf.predict(vocab)
    return lb
```
This method tests the trained classifier. It reads all images from the testing path and uses the BOVHelpers.predict() function to obtain the class of each image.
```
def testModel(self):
    self.testImages, self.testImageCount = self.file_helper.getFiles(self.test_path)
    predictions = []
    for word, imlist in self.testImages.items():
        for im in imlist:
            print(im.shape)
            cl = self.recognize(im)
            print(cl)
            predictions.append({
                'image': im,
                'class': cl,
                'object_name': self.name_dict[str(int(cl[0]))]
            })
    return predictions
```
BOVHelpers.predict() simply delegates to the trained classifier, and plotHist() visualizes the frequency of each visual word in the generated vocabulary:
```
def predict(self, iplist):
    predictions = self.clf.predict(iplist)
    return predictions

def plotHist(self, vocabulary=None):
    print("Plotting histogram")
    if vocabulary is None:
        vocabulary = self.mega_histogram
    x_scalar = np.arange(self.n_clusters)
    y_scalar = np.array([abs(np.sum(vocabulary[:, h], dtype=np.int32))
                         for h in range(self.n_clusters)])
    print(y_scalar)
    plt.bar(x_scalar, y_scalar)
    plt.ylabel("Frequency")
    plt.show()
```
Output:
- Accordion detected
- Bike detected
Histogram: frequency of each visual word in the generated vocabulary.
Confusion Matrix: classification results across the four object classes.
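The confusion matrix above is not accompanied by code in the report; one way it could be produced from the predictions list, assuming the true label of each test image is the folder (object) name it was read from, is:
```
from sklearn.metrics import confusion_matrix

def build_confusion_matrix(true_labels, predictions, class_names):
    # predictions: list of dicts built in testModel(), each carrying an 'object_name'
    predicted_labels = [p['object_name'] for p in predictions]
    return confusion_matrix(true_labels, predicted_labels, labels=class_names)

# usage sketch with the four dataset classes (true_labels assumed collected during testing)
# cm = build_confusion_matrix(true_labels, predictions,
#                             ["Soccer Ball", "Accordion", "Dollar Bill", "Motorbike"])
```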