
LAB MANUAL

COMPUTER VISION
LAB
SEMESTER 8TH

Prepared By
Prof. Vijaya Chaturvedi Prof. Jharna Chopra

Shri Shankaracharya Technical Campus


Department of Computer Science & Engineering
LAB MANUAL

COMPUTER VISION LAB


SEMESTER 8TH

PREPARED AS PER THE SYLLABUS PRESCRIBED BY

CHHATTISGARH SWAMI VIVEKANAND TECHNICAL UNIVERSITY, BHILAI

List of DOs & DON’Ts.


DOs:

▪ Remove your shoes outside the laboratory.

▪ Come to the lab prepared for the experiment to be performed.

▪ Take help from the Manual / Work Book for preparation of the experiment.

▪ For any abnormal working of the machine consult the Faculty In-charge/ Lab
Assistant.
▪ Shut down the machine and switch off the power supply after performing the
experiment.

▪ Maintain silence and proper discipline in the lab.

▪ Enter your machine number in the Login register.

DON’Ts :

▪ Do not bring any magnetic material in the lab.

▪ Do not eat or drink anything in the lab.

▪ Do not tamper with the instruments in the lab and do not disturb their settings.

LIST OF EXPERIMENTS

AS PER SYLLABUS PRESCRIBED BY THE UNIVERSITY


Program / Semester: B.Tech (VIII Sem)    Branch: Computer Science & Engineering
Subject: Computer Vision Laboratory    Course Code: D022821(022)
Total / Minimum-Pass Marks (End Semester Exam): 40 / 20
L: 0   T: 0   P: 2   Credits: 1

Course Objectives:
1. To be able to use Python for Image handling and processing.
2. To perform Geometric transformations and compute the homography matrix in Python.
3. To be able to perform perspective transformation, edge detection, line detection and
corner detection.
4. To be able to implement SIFT, SURF and HOG in Python.

Write programs to perform following activities:


1. Perform basic Image Handling and Processing operations on the image.
2. Geometric Transformation
3. Compute Homography Matrix
4. Perspective Transformation
5. Camera Calibration
6. Compute Fundamental Matrix
7. Edge Detection, Line Detection and Corner Detection
8. SIFT Feature descriptor
9. SURF and HOG feature descriptor
10. Project based on Computer Vision Applications.

Recommended Books:
1. Programming Computer Vision with Python, Jan Erik Solem, O'Reilly Media, ISBN:
9781449316549.
2. Practical Machine Learning for Computer Vision: End-to-End Machine
Learning for Images, Valliappa Lakshmanan, O'Reilly Media, ISBN:
9391043836.

EXPERIMENT-1

AIM: Perform basic Image Handling and Processing operations on the image.

How To Work With Images
The Python Imaging Library (PIL) is a 3rd party Python package that adds image processing
capabilities to your Python interpreter. It allows you to process photos and do many common image
file manipulations. The current version of this software is in Pillow, which is a fork of the original
PIL to support Python 3. Several other Python packages, such as wxPython and ReportLab, use
Pillow to support loading many different image file types. You can use Pillow for several use cases
including the following:
• Image processing
• Image archiving
• Batch processing
• Image display via Tkinter
In this article, you will learn how to do the following with Pillow:

• Opening Images
• Cropping Images
• Using Filters
• Adding Borders
• Resizing Images
As you can see, Pillow can be used for many types of image processing. The images used in this
article are some that the author has taken himself. They are included with the code examples on
Github. See the introduction for more details.

Now let’s get started by installing Pillow!

Installing Pillow
Installing Pillow is easy to do with pip. Here is how you would do it after opening a terminal or
console window:
python -m pip install pillow

Now that Pillow is installed, you are ready to start using it!

Opening Images
Pillow lets you open and view many different file types. For a full listing of the image file types that
Pillow supports, see the following:

• https://pillow.readthedocs.io/en/stable/handbook/image-file-formats.html
You can use Pillow to open and view any of the file types mentioned in the “fully supported formats”
section at the link above. The viewer is made with Tkinter and works in much the same way as
Matplotlib does when it shows a graph.

To see how this works, create a new file named open_image.py and enter the following code:
# open_image.py

from PIL import Image


image = Image.open('jellyfish.jpg')
image.show()
Here you import Image from the PIL package. Then you use Image.open() to open up an image. This will
return a PIL.JpegImagePlugin.JpegImageFile object that you can use to learn more about your image.
When you run this code, you will see a window similar to the following:

This is pretty handy because now you can view your images with Python without writing an entire
graphical user interface. You can use Pillow to learn more about your images as well. Create a new
file named get_image_info.py and add this code to it:
# get_image_info.py

from PIL import Image

def get_image_info(path):
    image = Image.open(path)
    print(f'This image is {image.width} x {image.height}')

    exif = image._getexif()
    print(exif)

if __name__ == '__main__':
    get_image_info('ducks.jpg')

Here you get the width and height of the image using the image object. Then you use
the _getexif() method to get metadata about your image. EXIF stands for “Exchangeable image file
format” and is a standard that specifies the formats for images, sound, and ancillary tags used by
digital cameras. The output is pretty verbose, but you can learn from that data that this particular
photo was taken with a Sony 6300 camera with the following settings: “E 18-200mm F3.5-6.3 OSS
LE”. The timestamp for the photo is also in the Exif information.
However, the Exif data can be altered if you use photo editing software to crop, apply filters or do
other types of image manipulation. This can remove part or all of the Exif data. Try running this
function on some of your own photos and see what kinds of information you can extract!
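The raw _getexif() dictionary is keyed by numeric tag IDs, which are hard to read. As a small optional sketch (the filename 'ducks.jpg' and the helper name print_readable_exif are just placeholders, not part of the article's code), you could map those IDs to names with the ExifTags module that ships with Pillow:

# exif_tags.py

from PIL import Image, ExifTags

def print_readable_exif(path):
    image = Image.open(path)
    exif = image._getexif() or {}
    # Map numeric EXIF tag IDs to human-readable names
    for tag_id, value in exif.items():
        tag_name = ExifTags.TAGS.get(tag_id, tag_id)
        print(f'{tag_name}: {value}')

if __name__ == '__main__':
    print_readable_exif('ducks.jpg')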

Another fun bit of information that you can extract from the image is its histogram data. The
histogram of an image is a graphical representation of its tonal values. It shows you the brightness of
the photo as a list of values that you could graph. Let’s use this image as an example:
To get the histogram from this image you will use the image’s histogram() method. Then you will use
Matplotlib to graph it out. To see one way that you could do that, create a new file
named get_histogram.py and add this code to it:
# get_histogram.py

import matplotlib.pyplot as plt

from PIL import Image

def get_image_histogram(path):
    image = Image.open(path)
    histogram = image.histogram()
    plt.hist(histogram, bins=len(histogram))
    plt.xlabel('Histogram')
    plt.show()

if __name__ == '__main__':
    get_image_histogram('butterfly.jpg')
When you run this code, you open the image as before. Then you extract the histogram from it and
pass the list of values to your Matplotlib object where you call the hist() function. The hist() function
takes in the list of values and the number of equal-width bins in the range of values.
When you run this code, you will see the following graph:

This graph shows you the tonal values in the image that were mentioned earlier. You can try passing
in some of the other images included on Github to see different graphs or swap in some of your own
images to see their histograms.
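If you would rather see the tonal distribution per color channel instead of one combined list, here is a small optional sketch (the filename 'butterfly.jpg' and the function name plot_channel_histograms are placeholders). For an RGB image, histogram() returns 768 counts, 256 for each of the red, green, and blue bands, which you can slice apart and plot:

# channel_histograms.py

import matplotlib.pyplot as plt

from PIL import Image

def plot_channel_histograms(path):
    image = Image.open(path).convert('RGB')
    histogram = image.histogram()  # 768 values: 256 per band (R, G, B)
    for i, color in enumerate(('red', 'green', 'blue')):
        counts = histogram[i * 256:(i + 1) * 256]
        plt.plot(range(256), counts, color=color, label=color)
    plt.xlabel('Pixel value')
    plt.ylabel('Count')
    plt.legend()
    plt.show()

if __name__ == '__main__':
    plot_channel_histograms('butterfly.jpg')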

Now let’s discover how you can use Pillow to crop images!
Cropping Images
When you are taking photographs, all too often the subject of the photo will move or you didn’t
zoom in far enough. This results in a photo where the focus of the image isn’t really front-and-center.
To fix this issue, you can crop the image to that part of the image that you want to highlight.

Pillow has this functionality built-in. To see how it works, create a file named cropping.py and add the
following code to it:
# cropping.py

from PIL import Image

def crop_image(path, cropped_path):
    image = Image.open(path)
    cropped = image.crop((40, 590, 979, 1500))
    cropped.save(cropped_path)

if __name__ == '__main__':
    crop_image('ducks.jpg', 'ducks_cropped.jpg')
The crop_image() function takes in the path of the file that you wish to crop as well as the path to the
new cropped file. You then open() the file as before and call crop(). This method takes the beginning
and ending x/y coordinates that you are using to crop with. You are creating a box that is used for
cropping.
Let’s take this fun photo of ducks and try cropping it with the code above:

Now when you run the code against this, you will end up with the following cropped image:
The coordinates you use to crop with will vary with the photo. In fact, you should probably change
this code so that it accepts the crop coordinates as arguments. You can do that yourself as a little
homework. It takes some trial and error to figure out the crop bounding box to use. You can use a
tool like Gimp to help you by drawing a bounding box with Gimp and noting the coordinates it gives
you to try with Pillow.
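One possible way to do that homework, sketched below only as an illustration (the script name cropping_args.py and the use of argparse are assumptions, not the article's own code), is to read the four box coordinates from the command line:

# cropping_args.py

import argparse

from PIL import Image

def crop_image(path, cropped_path, box):
    image = Image.open(path)
    # box is (left, upper, right, lower)
    cropped = image.crop(box)
    cropped.save(cropped_path)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Crop an image')
    parser.add_argument('input_path')
    parser.add_argument('output_path')
    parser.add_argument('box', nargs=4, type=int,
                        help='left upper right lower')
    args = parser.parse_args()
    crop_image(args.input_path, args.output_path, tuple(args.box))

You could then run it, for example, as: python cropping_args.py ducks.jpg ducks_cropped.jpg 40 590 979 1500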

Now let’s move on and learn about applying filters to your images!

Using Filters
The Pillow package has several filters that you can apply to your images. These are the current filters
that are supported:

• BLUR
• CONTOUR
• DETAIL
• EDGE_ENHANCE
• EDGE_ENHANCE_MORE
• EMBOSS
• FIND_EDGES
• SHARPEN
• SMOOTH
• SMOOTH_MORE
Let’s use the butterfly image from earlier to test out a couple of these filters. Here is the image you
will be using:
Now that you have an image to use, go ahead and create a new file named blur.py and add this code to
it to try out Pillow’s BLUR filter:
# blur.py

from PIL import Image
from PIL import ImageFilter

def blur(path, modified_photo):
    image = Image.open(path)
    blurred_image = image.filter(ImageFilter.BLUR)
    blurred_image.save(modified_photo)

if __name__ == '__main__':
    blur('butterfly.jpg', 'butterfly_blurred.jpg')

To actually use a filter in Pillow, you need to import ImageFilter. Then you pass in the specific filter
that you want to use to the filter() method. When you call filter(), it will return a new image object. You
then save that file to disk.
This is the image that you will get when you run the code:
That looks kind of blurry, so you can count this as a success! If you want it to be even blurrier, you
could run the blurry photo back through your script a second time.
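If a single pass is not blurry enough, one option (a small sketch, not the article's own code; the filenames are placeholders) is to loop the BLUR filter several times, or to use Pillow's ImageFilter.GaussianBlur, whose radius you can increase:

# blur_more.py

from PIL import Image, ImageFilter

def blur_strong(path, modified_photo, passes=3):
    image = Image.open(path)
    # Apply the BLUR filter several times for a stronger effect
    for _ in range(passes):
        image = image.filter(ImageFilter.BLUR)
    image.save(modified_photo)

def gaussian_blur(path, modified_photo, radius=5):
    image = Image.open(path)
    # GaussianBlur takes a radius that controls the blur strength
    image.filter(ImageFilter.GaussianBlur(radius)).save(modified_photo)

if __name__ == '__main__':
    blur_strong('butterfly.jpg', 'butterfly_very_blurred.jpg')
    gaussian_blur('butterfly.jpg', 'butterfly_gaussian.jpg')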

Of course, sometimes you take photos that are slightly blurry and you want to sharpen them up a bit.
Pillow includes that as a filter you can apply as well. Create a new file named sharpen.py and add this
code:
# sharpen.py

from PIL import Image
from PIL import ImageFilter

def sharpen(path, modified_photo):
    image = Image.open(path)
    sharpened_image = image.filter(ImageFilter.SHARPEN)
    sharpened_image.save(modified_photo)

if __name__ == '__main__':
    sharpen('butterfly.jpg', 'butterfly_sharper.jpg')

Here you take the original butterfly photo and apply the SHARPEN filter to it before saving it off.
When you run this code, your result will look like this:
Depending on your eyesight and your monitor’s quality, you may or may not see much difference
here. However, you can rest assured that it is slightly sharper.

Now let’s find out how you can add borders to your images!

Adding Borders
One way to make your photos look more professional is to add borders to them. Pillow makes this
pretty easy to do via their ImageOps module. But before you can do any borders, you need an image.
Here is the one you’ll be using:
Now that you have a nice image to play around with, go ahead and create a file named border.py and
put this code into it:
# border.py

from PIL import Image, ImageOps

def add_border(input_image, output_image, border):
    img = Image.open(input_image)
    if isinstance(border, int) or isinstance(border, tuple):
        bimg = ImageOps.expand(img, border=border)
    else:
        raise RuntimeError('Border is not an integer or tuple!')
    bimg.save(output_image)

if __name__ == '__main__':
    in_img = 'butterfly_grey.jpg'
    add_border(in_img, output_image='butterfly_border.jpg',
               border=100)

The add_border() function takes in 3 arguments:


• input_image – the image you want to add a border to
• output_image – the image with the new border applied
• border – the amount of border to apply in pixels
In this code, you tell Pillow that you want to add a 100 pixel border to the photo that you pass in.
When you pass in an integer, that integer is used for the border on all four sides. The default color of
the border is black. The key method here is expand(), which takes in the image object and the border
amount.
When you run this code, you will end up with this lovely result:
You can pass in a tuple of values to make the border different widths. For example, if you passed
in (10, 50), that would add a 10-pixel border on the left and right sides of the images and a 50-pixel
border to the top and bottom. Try doing that with the code above and re-running it. If you do, you’ll
get the following:

Isn’t that nice? If you want to get really fancy, you can pass in different values for all four sides of
the image. But there probably aren’t very many use-cases where that makes sense.
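For completeness, here is a minimal sketch of that four-sided case (the filenames are placeholders); a 4-tuple border passed to ImageOps.expand() is interpreted as (left, top, right, bottom) widths in pixels:

# four_sided_border.py

from PIL import Image, ImageOps

img = Image.open('butterfly_grey.jpg')

# A 4-tuple border is read as (left, top, right, bottom) widths in pixels
bimg = ImageOps.expand(img, border=(10, 25, 50, 100))
bimg.save('butterfly_uneven_border.jpg')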

Having a black border is nice and all, but sometimes you’ll want to add a little pizazz to your picture.
You can change that border color by passing in the fill argument to expand(). This argument takes in a
named color or an RGB color.
Create a new file named colored_border.py and add this code to it:
# colored_border.py

from PIL import Image, ImageOps

def add_border(input_image, output_image, border, color=0):
    img = Image.open(input_image)
    if isinstance(border, int) or isinstance(border, tuple):
        bimg = ImageOps.expand(img,
                               border=border,
                               fill=color)
    else:
        msg = 'Border is not an integer or tuple!'
        raise RuntimeError(msg)
    bimg.save(output_image)

if __name__ == '__main__':
    in_img = 'butterfly_grey.jpg'
    add_border(in_img,
               output_image='butterfly_border_red.jpg',
               border=100,
               color='indianred')

Now your add_border() function takes in a color argument, which you pass on to the expand() method.
When you run this code, you’ll see this for your result:

That looks pretty nice. You can experiment around with different colors or apply your own favorite
color as the border.

The next item on your Pillow tour is to learn how to resize images!

Resizing Images
Resizing images with Pillow is fairly simple. You will be using the resize() method which takes in a
tuple of integers that are used to resize the image. To see how this works, you’ll be using this lovely
shot of a lizard:

Now that you have a photo, go ahead and create a new file named resize_image.py and put this code in
it:
# resize_image.py

from PIL import Image

def resize_image(input_image_path, output_image_path, size):
    original_image = Image.open(input_image_path)
    width, height = original_image.size
    print(f'The original image size is {width} wide x {height} '
          f'high')

    resized_image = original_image.resize(size)
    width, height = resized_image.size
    print(f'The resized image size is {width} wide x {height} '
          f'high')

    resized_image.show()
    resized_image.save(output_image_path)

if __name__ == '__main__':
    resize_image(
        input_image_path='lizard.jpg',
        output_image_path='lizard_small.jpg',
        size=(800, 400),
    )

Here you pass in the lizard photo and tell Pillow to resize it to 800 x 400. When you run this code,
the output will tell you that the original photo was 1191 x 1141 pixels before it resizes it for you.
The result of running this code looks like this:

Well, that looks a bit odd! Pillow doesn’t actually do any scaling when it resizes the image. Instead,
Pillow will stretch or contort your image to fit the values you tell it to use.

What you want to do is scale the image. To make that work, you need to create a new file
named scale_image.py and add some new code to it. Here’s the code you need:
# scale_image.py

from PIL import Image

def scale_image(input_image_path, output_image_path,
                width=None, height=None):
    original_image = Image.open(input_image_path)
    w, h = original_image.size
    print(f'The original image size is {w} wide x {h} '
          'high')

    if width and height:
        max_size = (width, height)
    elif width:
        max_size = (width, h)
    elif height:
        max_size = (w, height)
    else:
        # No width or height specified
        raise ValueError('Width or height required!')

    # Image.ANTIALIAS in older Pillow releases; it was renamed to Image.LANCZOS
    original_image.thumbnail(max_size, Image.LANCZOS)
    original_image.save(output_image_path)

    scaled_image = Image.open(output_image_path)
    width, height = scaled_image.size
    print(f'The scaled image size is {width} wide x {height} '
          'high')

if __name__ == '__main__':
    scale_image(
        input_image_path='lizard.jpg',
        output_image_path='lizard_scaled.jpg',
        width=800,
    )

This time around, you let the user specify both the width and height. If the user specifies a width, a
height, or both, then the conditional statement uses that information to create a max_size. Once it has
the max_size value calculated, you pass that to thumbnail() and save the result. If the user specifies both
values, thumbnail() will maintain the aspect ratio correctly when resizing.
When you run this code, you will find that the result is a smaller version of the original image and
that it now maintains its aspect ratio.

Wrapping Up
Pillow is very useful for working with images using Python. In this article, you learned how to do the
following:

• Open Images
• Crop Images
• Use Filters
• Add Borders
• Resize Images
You can do much more with Pillow than what is shown here. For example, you can do various image
enhancements, like changing the contrast or brightness of an image. Or you could composite multiple
images together.
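As a small optional illustration of those enhancements (not covered in the article above; the filenames and factors are placeholders), Pillow's ImageEnhance module exposes Contrast and Brightness classes whose enhance() factor scales the effect:

# enhance.py

from PIL import Image, ImageEnhance

image = Image.open('butterfly.jpg')

# A factor of 1.0 returns the original image; larger values strengthen the effect
more_contrast = ImageEnhance.Contrast(image).enhance(1.5)
more_contrast.save('butterfly_contrast.jpg')

brighter = ImageEnhance.Brightness(image).enhance(1.3)
brighter.save('butterfly_bright.jpg')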

Working with Images in Python


PIL is the Python Imaging Library which provides the python interpreter with image editing capabilities. It
was developed by Fredrik Lundh and several other contributors. Pillow is the friendly PIL fork and an easy to
use library developed by Alex Clark and other contributors. We’ll be working with Pillow.
Installation:
• Linux: On linux terminal type the following:
pip install Pillow
Installing pip via terminal:
sudo apt-get update
sudo apt-get install python-pip
• Windows: Download the appropriate Pillow package according to your python version. Make sure to
download according to the python version you have.
We’ll be working with the Image module here, which provides a class of the same name and a lot of
functions to work on our images. To import the Image module, our code should begin with the following line:
from PIL import Image
Operations with Images:
• Open a particular image from a path:

# img = Image.open(path)
# On successful execution of this statement,
# an object of Image type is returned and stored in the img variable

try:
    img = Image.open(path)
except IOError:
    pass

# Use the above statement within a try block, as it can
# raise an IOError if the file cannot be found,
# or the image cannot be opened.

• Retrieve size of image: The instances of the Image class that are created have many attributes; one of its
useful attributes is size.

from PIL import Image

filename = "image.png"

with Image.open(filename) as image:
    width, height = image.size

# Image.size gives a 2-tuple from which the width and height can be obtained

• Some other attributes are: Image.width, Image.height, Image.format, Image.info etc.


• Save changes in image: To save any changes that you have made to the image file, we need to give path as
well as image format.

img.save(path, format)

# format is optional, if no format is specified,

#it is determined from the filename extension

• Rotating an Image: The rotate() method needs the angle as a parameter to get the image rotated.

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # Angle given
        img = img.rotate(180)

        # Saved in the same relative location
        img.save("rotated_picture.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()


Note: There is an optional expand flag available as one of the argument of the rotate method, which if set
true, expands the output image to make it large enough to hold the full rotated image.
As seen in the above code snippet, I have used a relative path where my image is located in the same
directory as my python code file, an absolute path can be used as well.
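As a short optional example of that expand flag (a sketch in the same style as the snippets above; picture.jpg is the same placeholder filename), rotating by 45 degrees with expand=True keeps the corners from being clipped:

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # expand=True enlarges the output canvas so the
        # corners of the rotated image are not cut off
        rotated = img.rotate(45, expand=True)
        rotated.save("rotated_expanded_picture.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()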
• Cropping an Image: Image.crop(box) takes a 4-tuple (left, upper, right, lower) pixel coordinate, and
returns a rectangular region from the used image.

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")
        width, height = img.size

        # Integer division keeps the box coordinates as whole pixels
        area = (0, 0, width // 2, height // 2)
        img = img.crop(area)

        # Saved in the same relative location
        img.save("cropped_picture.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()


• Resizing an Image: Image.resize(size)- Here size is provided as a 2-tuple width and height.

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")
        width, height = img.size

        # Integer division, since resize() expects integer dimensions
        img = img.resize((width // 2, height // 2))

        # Saved in the same relative location
        img.save("resized_picture.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()

• Pasting an image on another image: The second argument can be a 2-tuple (specifying the top left
corner), or a 4-tuple (left, upper, right, lower) – in this case the size of pasted image must match the size of
this box region, or None which is equivalent to (0, 0).

from PIL import Image

def main():
    try:
        # Relative Path
        # Image on which we want to paste
        img = Image.open("picture.jpg")

        # Relative Path
        # Image which we want to paste
        img2 = Image.open("picture2.jpg")

        img.paste(img2, (50, 50))

        # Saved in the same relative location
        img.save("pasted_picture.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()

# An additional argument for an optional image mask is also available.

• Getting a Histogram of an Image: This will return a histogram of the image as a list of pixel counts, one
for each pixel value in the image. (A histogram of an image is a graphical representation of the tonal distribution
in a digital image. It records the brightness values contained in the image and plots the number
of pixels for each brightness value. It helps in choosing the exposure settings.)
from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # Getting histogram of image
        print(img.histogram())
    except IOError:
        pass

if __name__ == "__main__":
    main()

• Transposing an Image: This feature gives us the mirror image of an image

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # Transposing image
        transposed_img = img.transpose(Image.FLIP_LEFT_RIGHT)

        # Save transposed image
        transposed_img.save("transposed.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()


• Split an image into individual bands: Splitting an image in RGB mode creates three new images, each
containing a copy of one of the original bands.

from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # Splitting the image
        print(img.split())
    except IOError:
        pass

if __name__ == "__main__":
    main()


• tobitmap: Converting an image to an X11 bitmap (a plain-text binary image format). It returns a string
containing an X11 bitmap; it can only be used for mode “1” images, i.e. 1-bit-per-pixel black and white images.
from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")
        print(img.mode)

        # tobitmap() only works for mode "1" images, so convert first
        img = img.convert("1")

        # Converting image to bitmap
        print(img.tobitmap())
        print(type(img.tobitmap()))
    except IOError:
        pass

if __name__ == "__main__":
    main()

• Creating a thumbnail: This method creates a thumbnail of the image that is opened. It does not return a
new image object; it makes an in-place modification to the currently opened image object itself. If you do not
want to change the original image object, create a copy and then apply this method. This method also
computes an appropriate size so as to maintain the aspect ratio of the image according to the size passed.
from PIL import Image

def main():
    try:
        # Relative Path
        img = Image.open("picture.jpg")

        # In-place modification
        img.thumbnail((200, 200))

        img.save("thumb.jpg")
    except IOError:
        pass

if __name__ == "__main__":
    main()


EXPERIMENT-2
AIM: Geometric transformations of an image in Python
Geometric transformations of images are used to transform the image by changing
its size, position or orientation. It has many applications in the fields of Machine
Learning and Image Processing.

For instance, consider a Machine Learning based project for detecting emotions such
as anger, sadness, and happiness from a given set of images. The database consists of
images at different scales and orientations, but the model needs a uniform
set of images. Therefore, it is necessary to apply geometric transformations to the
images to bring them into a consistent format.

In this experiment, we will apply geometric transformations to an image using the Pillow library in Python,
covering three basic transformations:

• Rotation
• Scaling
• Translation

We will also learn how to combine these transformations to perform composite
transformations of the image.

• Importing Library and Reading Image


In the first step, we are going to import Pillow and read the image. Pillow is a
Python-based library that provides basic tools for opening, saving, and manipulating
images. We import the matplotlib.pyplot library to plot the images in Python. We use
the open() function to read the image from the location specified as a parameter to
the function.

from PIL import Image


import matplotlib.pyplot as plt
image = Image.open(r"lenna.png")
plt.imshow(image)

Output:
Getting the size and mode of Image
The properties of the above-created image object such as size and mode are used
to get the size and color model of the given image. We get the size in terms of width
and height. The color model, in this case, is RGB. RGB stands for red, green, and
blue channels of the given image.

size=image.size

mode=image.mode

print(f"The size of Image is: {size}")

print(f"The mode of Image is: {mode}")

Output:
The size of Image is: (220, 220)
The mode of Image is: RGB

Rotation of Image
For rotating an image, we first take the angle as user input to determine the angle
by which the image should be rotated. Then we use the rotate() function to
rotate the image counterclockwise by the specified angle in degrees. We then
plot the rotated image as the output. In the code below, we have rotated
the image by 90 degrees.

angle=int(input("Enter angle:"))

image = image.rotate(angle)
plt.imshow(image)

Output:

Scaling of Image
For scaling an image, we try to increase or decrease the size of the image. To scale
an image we make use of resize() function in Python. The resize function takes a
tuple containing the width and height of the image as parameters. The image is then
resized to this newly mentioned width and height. In the below-mentioned code, we
have doubled the width and height of the image.

(width,height)=(image.width*2,image.height*2)

img_resize = image.resize((width,height))

plt.imshow(img_resize)

print(f"New size of image: {img_resize.size}")

Output:
Translation of Image
Image translation is changing the position of an image by a specified shift in x and y
directions. To translate an image we make use of the transform() function in Python.
The syntax of the transform function is mentioned below.

image_object.transform(size, method, data)

where size=size of the output image


method= method of transformation of the image
data=data given as an input to the transformation method

In the below-mentioned code, the method used for transformation is AFFINE. Affine
Transformation is used to transform the image while preserving parallel lines in input
and output images. The input data to the affine method is a six-element tuple
(a,b,c,d,e,f) which represents an affine transformation matrix

Initially, we take the values x and y as input, which represent the shifts along the x and y axes
respectively. For every output pixel (x, y), the AFFINE method samples the input image at
(a*x + b*y + c, d*x + e*y + f); the user-supplied shifts are assigned to the c and f variables.

x=int(input("Enter pixels for x axis shift:"))


y=int(input("Enter pixels for y axis shift:"))
a=1
b=0
c=x
d=0
e=1
f=y
image = image.transform(image.size, Image.AFFINE, (a, b, c, d, e, f))

plt.imshow(image)

Output

Composite Transformation of Image


We can apply multiple geometric transformations to perform a composite
transformation of the image. In the code below, we have combined the
scaling and rotation of the image. We first halve the width and height of the
image, and then rotate it by 50 degrees clockwise. To rotate an image
clockwise we specify a negative angle.

(width, height) = (round(image.width / 2), round(image.height / 2))

img_resize = image.resize((width, height))

im1 = img_resize.rotate(-50)

plt.imshow(im1)

Output:
EXPERIMENT-3
AIM: Compute Homography Matrix.
What is Homography ?
Consider two images of a plane (top of the book) shown in Figure 1. The red
dot represents the same physical point in the two images. In computer vision
jargon we call these corresponding points. Figure 1. shows four corresponding
points in four different colors — red, green, yellow and orange.
A Homography is a transformation ( a 3×3 matrix ) that maps the points in
one image to the corresponding points in the other image.

Figure 1: Two images of a 3D plane (top of the book) are related by a homography.

Now since a homography is a 3×3 matrix we can write it as

    H = [ h00  h01  h02 ]
        [ h10  h11  h12 ]
        [ h20  h21  h22 ]

Let us consider the first set of corresponding points — (x1, y1) in the first
image and (x1', y1') in the second image. Then, the homography H maps
them in the following way (up to a scale factor, since the coordinates are homogeneous):

    [ x1' ]       [ x1 ]
    [ y1' ] = H * [ y1 ]
    [  1  ]       [  1 ]
Homography examples using OpenCV –
Image Alignment
The above equation is true for ALL sets of corresponding points as long as
they lie on the same plane in the real world. In other words you can apply the
homography to the first image and the book in the first image will get aligned
with the book in the second image! See Figure 2.

Figure 2 :
One image of a 3D plane can be aligned with another image of the same
plane using Homography

But what about points that are not on the plane ? Well, they will NOT be
aligned by a homography as you can see in Figure 2. But wait, what if there
are two planes in the image ? Well, then you have two homographies — one
for each plane.
Homography examples using OpenCV –
Panorama
In the previous section, we learned that if a homography between two images
is known, we can warp one image onto the other. However, there was one big
caveat. The images had to contain a plane ( the top of a book ), and only the
planar part was aligned properly. It turns out that if you take a picture of any
scene ( not just a plane ) and then take a second picture by rotating the
camera, the two images are related by a homography!

In other words you can mount your camera on a tripod and take a picture.
Next, pan it about the vertical axis and take another picture. The two images
you just took of a completely arbitrary 3D scene are related by a homography.
The two images will share some common regions that can be aligned and
stitched and bingo you have a panorama of two images. Is it really that easy ?
Nope! (sorry to disappoint) A lot more goes into creating a good panorama,
but the basic principle is to align using a homography and stitch intelligently so
that you do not see the seams. Creating panoramas will definitely be part of a
future post.

How to calculate a Homography ?


To calculate a homography between two images, you need to know at least 4
point correspondences between the two images. If you have more than 4
corresponding points, it is even better. OpenCV will robustly estimate a
homography that best fits all corresponding points. Usually, these point
correspondences are found automatically by matching features like SIFT or
SURF between the images, but in this post we are simply going to click the
points by hand.
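As a hedged sketch of that robust estimation (the extra point pairs below are made-up placeholders, not measured coordinates), cv2.findHomography() can be given the cv2.RANSAC flag and a reprojection threshold so that outlier correspondences are ignored:

import cv2
import numpy as np

# Hypothetical correspondences: the first four pairs are the book corners
# used below, the remaining pairs are made-up extra matches
pts_src = np.array([[141, 131], [480, 159], [493, 630], [64, 601],
                    [200, 300], [350, 420]], dtype=np.float64)
pts_dst = np.array([[318, 256], [534, 372], [316, 670], [73, 473],
                    [280, 390], [420, 520]], dtype=np.float64)

# RANSAC ignores correspondences whose reprojection error
# exceeds the threshold (5.0 pixels here)
h, status = cv2.findHomography(pts_src, pts_dst, cv2.RANSAC, 5.0)

print(h)       # the estimated 3x3 homography
print(status)  # 1 for inliers, 0 for rejected outliers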
Homography examples using OpenCV Python

Images in Figure 2. can also be generated using the following Python code.
The code below shows how to take four corresponding points in two images
and warp image onto the other.

#!/usr/bin/env python

import cv2
import numpy as np

if __name__ == '__main__':

    # Read source image.
    im_src = cv2.imread('book2.jpg')
    # Four corners of the book in source image
    pts_src = np.array([[141, 131], [480, 159], [493, 630], [64, 601]])

    # Read destination image.
    im_dst = cv2.imread('book1.jpg')
    # Four corners of the book in destination image.
    pts_dst = np.array([[318, 256], [534, 372], [316, 670], [73, 473]])

    # Calculate Homography
    h, status = cv2.findHomography(pts_src, pts_dst)

    # Warp source image to destination based on homography
    im_out = cv2.warpPerspective(im_src, h, (im_dst.shape[1], im_dst.shape[0]))

    # Display images
    cv2.imshow("Source Image", im_src)
    cv2.imshow("Destination Image", im_dst)
    cv2.imshow("Warped Source Image", im_out)

    cv2.waitKey(0)
Applications of Homography
The most interesting application of Homography is undoubtedly making
panoramas ( a.k.a image mosaicing and image stitching ). Panoramas will be
the subject of a later post. Let us see some other interesting applications.

EXPERIMENT-4

AIM: Perspective Transformation – Python OpenCV


In Perspective Transformation, we can change the perspective of a given image or video to get
better insights into the required information. In Perspective Transformation, we need to
provide the points on the image from which we want to gather information by changing the
perspective. We also need to provide the points inside which we want to display our image. Then,
we get the perspective transform from the two given sets of points and warp it with the original
image.
We use cv2.getPerspectiveTransform and then cv2.warpPerspective

cv2.getPerspectiveTransform method
Syntax: cv2.getPerspectiveTransform(src, dst)

Parameters:
• src: Coordinates of quadrangle vertices in the source image.
• dst: Coordinates of the corresponding quadrangle vertices in the destination image.
cv2.warpPerspective method
Syntax: cv2.warpPerspective(src, M, dsize)
Parameters:
• src: Source image.
• M: 3×3 perspective transformation matrix obtained from cv2.getPerspectiveTransform.
• dsize: Size (width, height) of the output image.
Python code explaining Perspective Transformation:

# import necessary libraries

import cv2
import numpy as np

# Turn on Laptop's webcam
cap = cv2.VideoCapture(0)

while True:

    ret, frame = cap.read()

    # Locate points of the documents
    # or object which you want to transform
    pts1 = np.float32([[0, 260], [640, 260],
                       [0, 400], [640, 400]])
    pts2 = np.float32([[0, 0], [400, 0],
                       [0, 640], [400, 640]])

    # Apply Perspective Transform Algorithm
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    result = cv2.warpPerspective(frame, matrix, (500, 600))

    # Display the original and transformed frames
    cv2.imshow('frame', frame)    # Initial Capture
    cv2.imshow('frame1', result)  # Transformed Capture

    if cv2.waitKey(24) == 27:
        break

cap.release()
cv2.destroyAllWindows()

Output:
How to apply Perspective Transformations
on an image using OpenCV Python?
In Perspective Transformation, the straight lines remain straight even after the transformation. To apply a
perspective transformation, we need a 3×3 perspective transformation matrix. We need four points on the
input image and corresponding four points on the output image.
We apply the cv2.getPerspectiveTransform() method to find the transformation matrix. Its syntax is as
follows −
M = cv2.getPerspectiveTransform(pts1,pts2)
where,
• pts1 − An array of four points on the input image and
• pts2 − An array of corresponding four points on the output image.
The Perspective Transformation matrix M is a numpy array. We pass M to the cv2.warpPerspective() function as
an argument to compute the perspective transformation. Its syntax is −
cv2.warpPerspective(img,M,(cols,rows))
Where,
• img − The image to be transformed.
• M − Perspective transformation matrix defined above.
• (cols, rows) − Width and height of the image after transformation.
To apply Perspective Transformation on an image, you can follow the steps given below −
Steps
Import the required library. In all the following Python examples, the required Python library is OpenCV.
Make sure you have already installed it.
import cv2
Read the input image using cv2.imread() function. Pass the full path of the input image.
img = cv2.imread('warning_wall.jpg')
Define pts1 and pts2. pts1 is an array of four points on the input image and pts2 is an array of
corresponding four points on the output image.
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[100,50],[300,0],[0,300],[300,300]])
Compute the perspective transform matrix M using cv2.getPerspectiveTransform(pts1, pts2) function.
M = cv2.getPerspectiveTransform(pts1,pts2)
Transform the image using the cv2.warpPerspective() method, passing the perspective transform matrix as
argument. cols and rows are the desired width and height of the image after transformation.
dst = cv2.warpPerspective(img,M,(cols,rows))
Display the transformed image.
cv2.imshow("Transformed Image", dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
Let's look at some examples for a better understanding of how it is done.
We will use this image as the input file for the following examples.
Example 1
In this example, we perform Perspective Transformation on the input image. We set the width and height
of the output image the same as the input image.
# import required libraries

import cv2

import numpy as np

# read the input image

img = cv2.imread('warning_wall.jpg')

# find the height and width of image

# width = number of columns, height = number of rows in image array

rows,cols,ch = img.shape

# define four points on input image

pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])

# define the corresponding four points on output image

pts2 = np.float32([[100,50],[300,0],[0,300],[300,300]])

# get the perspective transform matrix

M = cv2.getPerspectiveTransform(pts1,pts2)

# transform the image using perspective transform matrix

dst = cv2.warpPerspective(img,M,(cols, rows))

# display the transformed image

cv2.imshow('Transformed Image', dst)

cv2.waitKey(0)

cv2.destroyAllWindows()

Output
On execution, this Python program will produce the following output window −
The above output image is obtained after the Perspective Transformation on the input image.
Example 2
In this example, we perform Perspective Transform on the input image. We set the width and height of the
output image as (600, 350). It is different from the width and height of the input image.
import cv2

import numpy as np

import matplotlib.pyplot as plt

img = cv2.imread('warning_wall.jpg',0)

rows,cols = img.shape

pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])

pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])

M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(600,350))

plt.subplot(121),plt.imshow(img, cmap='gray'),plt.title('Input')
plt.subplot(122),plt.imshow(dst, cmap='gray'),plt.title('Output')

plt.show()

Output
On execution, it will produce the following output window −

The left image is the input image and the right image is the output image after Perspective Transformation.

EXPERIMENT-5

AIM: Camera Calibration with Python – OpenCV


A camera is an integral part of several domains like robotics and space exploration, where it
plays a major role. It helps to capture each and every moment and is useful for many kinds of analysis.
In order to use the camera as a visual sensor, we should know the parameters of the
camera. Camera Calibration is nothing but estimating the parameters of a camera, parameters
about the camera are required to determine an accurate relationship between a 3D point in the
real world and its corresponding 2D projection (pixel) in the image captured by that calibrated
camera.
We need to consider both internal parameters like focal length, optical center, and radial
distortion coefficients of the lens etc., and external parameters like rotation and translation of the
camera with respect to some real world coordinate system.

Required libraries:
• OpenCV library in python is a computer vision library, mostly used for image processing, video
processing, and analysis, facial recognition and detection, etc.
• Numpy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object and tools for working with these arrays.
Camera Calibration can be done in a step-by-step approach:
• Step 1: First define real world coordinates of 3D points using known size of checkerboard
pattern.
• Step 2: Different viewpoints of check-board image is captured.
• Step 3: findChessboardCorners() is a method in OpenCV and used to find pixel
coordinates (u, v) for each 3D point in different images
• Step 4: Then calibrateCamera() method is used to find camera parameters.

It takes our calculated (threedpoints, twodpoints, grayColor.shape[::-1], None, None) as
parameters and returns a list whose elements are the Camera Matrix, Distortion Coefficients,
Rotation Vectors, and Translation Vectors.
The Camera Matrix helps to transform 3D object points to 2D image points, the Distortion
Coefficients model the lens distortion, and the Rotation and Translation vectors describe the
position and orientation of the camera in the world.

# Import required modules
import cv2
import numpy as np
import os
import glob

# Define the dimensions of checkerboard
CHECKERBOARD = (6, 9)

# stop the iteration when specified
# accuracy, epsilon, is reached or
# specified number of iterations are completed.
criteria = (cv2.TERM_CRITERIA_EPS +
            cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Vector for 3D points
threedpoints = []

# Vector for 2D points
twodpoints = []

# 3D points real world coordinates
objectp3d = np.zeros((1, CHECKERBOARD[0]
                      * CHECKERBOARD[1],
                      3), np.float32)
objectp3d[0, :, :2] = np.mgrid[0:CHECKERBOARD[0],
                               0:CHECKERBOARD[1]].T.reshape(-1, 2)
prev_img_shape = None

# Extracting path of individual image stored
# in a given directory. Since no path is
# specified, it will take current directory
# jpg files alone
images = glob.glob('*.jpg')

for filename in images:
    image = cv2.imread(filename)
    grayColor = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Find the chess board corners
    # If desired number of corners are
    # found in the image then ret = true
    ret, corners = cv2.findChessboardCorners(
        grayColor, CHECKERBOARD,
        cv2.CALIB_CB_ADAPTIVE_THRESH
        + cv2.CALIB_CB_FAST_CHECK +
        cv2.CALIB_CB_NORMALIZE_IMAGE)

    # If desired number of corners can be detected then,
    # refine the pixel coordinates and display
    # them on the images of checker board
    if ret == True:
        threedpoints.append(objectp3d)

        # Refining pixel coordinates
        # for given 2d points.
        corners2 = cv2.cornerSubPix(
            grayColor, corners, (11, 11), (-1, -1), criteria)

        twodpoints.append(corners2)

        # Draw and display the corners
        image = cv2.drawChessboardCorners(image,
                                          CHECKERBOARD,
                                          corners2, ret)

    cv2.imshow('img', image)
    cv2.waitKey(0)

cv2.destroyAllWindows()

h, w = image.shape[:2]

# Perform camera calibration by
# passing the value of above found out 3D points (threedpoints)
# and its corresponding pixel coordinates of the
# detected corners (twodpoints)
ret, matrix, distortion, r_vecs, t_vecs = cv2.calibrateCamera(
    threedpoints, twodpoints, grayColor.shape[::-1], None, None)

# Displaying required output
print(" Camera matrix:")
print(matrix)

print("\n Distortion coefficient:")
print(distortion)

print("\n Rotation Vectors:")
print(r_vecs)

print("\n Translation Vectors:")
print(t_vecs)

Input:

Output:
Camera matrix:
[[ 36.26378216 0. 125.68539168]
[ 0. 36.76607372 142.49821147]
[ 0. 0. 1. ]]

Distortion coefficient:
[[-1.25491812e-03 9.89269357e-05 -2.89077718e-03 4.52760939e-04
-3.29964245e-06]]

Rotation Vectors:
[array([[-0.05767492],
[ 0.03549497],
[ 1.50906953]]), array([[-0.09301982],
[-0.01034321],
[ 3.07733805]]), array([[-0.02175332],
[ 0.05611105],
[-0.07308161]])]

Translation Vectors:
[array([[ 4.63047351],
[-3.74281386],
[ 1.64238108]]), array([[2.31648737],
[3.98801521],
[1.64584622]]), array([[-3.17548808],
[-3.46022466],
[ 1.68200157]])]
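As an optional follow-up (not part of the prescribed experiment), the returned camera matrix and distortion coefficients can be used to undistort an image. The sketch below is assumed to be appended to the calibration script above, so it reuses the image, matrix and distortion variables computed there; the output filename is a placeholder.

# Continuing the calibration script: undistort one of the checkerboard images
# using the camera matrix and distortion coefficients found above
import cv2

h, w = image.shape[:2]

# Refine the camera matrix for this image size; alpha=1 keeps all source pixels
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(
    matrix, distortion, (w, h), 1, (w, h))

# Remove the lens distortion
undistorted = cv2.undistort(image, matrix, distortion, None, newcameramtx)

cv2.imwrite('calibresult.jpg', undistorted)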

EXPERIMENT-6

AIM: Python OpenCV: Epipolar Geometry.


The general setup of epipolar geometry. The gray region is the epipolar plane. The orange line is
the baseline, while the two blue lines are the epipolar lines.
Often in multiple view geometry, there are interesting relationships between the multiple cameras,
a 3D point, and that point’s projections in each of the camera’s image plane. The geometry that
relates the cameras, points in 3D, and the corresponding observations is referred to as the
epipolar geometry of a stereo pair.
The standard epipolar geometry setup involves two cameras observing the same 3D point P,
whose projection in each of the image planes is located at p and p’ respectively. The camera
centers are located at O1 and O2, and the line between them is referred to as the baseline. We
call the plane defined by the two camera centers and P the epipolar plane. The locations of
where the baseline intersects the two image planes are known as the epipoles e and e’. Finally,
the lines defined by the intersection of the epipolar plane and the two image planes are known as
the epipolar lines. The epipolar lines have the property that they intersect the baseline at the
respective epipoles in the image plane.

When the two image planes are parallel, then the epipoles e and e’ are located at infinity. Notice
that the epipolar lines are parallel to u axis of each image plane.
An interesting case of epipolar geometry is shown in Figure 4, which occurs when the image
planes are parallel to each other. When the image planes are parallel to each other, then the
epipoles e and e’ will be located at infinity since the baseline joining the centers O1, O2 is parallel
to the image planes. Another important byproduct of this case is that the epipolar lines are
parallel to an axis of each image plane. This case is especially useful and will be covered in
greater detail in the subsequent section on image rectification.
In real-world situations, however, we are not given the exact location of the 3D location P, but can
determine its projection in one of the image planes p. We also should be able to know the
camera’s locations, orientations, and camera matrices. What can we do with this knowledge?
With the knowledge of camera locations O1, O2 and the image point p, we can define the
epipolar plane. With this epipolar plane, we can then determine the epipolar lines. By definition,
P's projection into the second image p' must be located on the epipolar line of the second image.
Thus, a basic understanding of epipolar geometry allows us to create a strong constraint between
image pairs without knowing the 3D structure of the scene.

The setup for determining the essential and fundamental matrices, which help map points and
epipolar lines across views.
We will now try to develop seamless ways to map points and epipolar lines across views. If we
take the setup given in the original epipolar geometry framework (Figure 5), then we shall further
define M and M' to be the camera projection matrices that map 3D points into their respective 2D
image plane locations. Let us assume that the world reference system is associated with the first
camera, with the second camera offset first by a rotation R and then by a translation T. This specifies
the camera projection matrices to be:
M = K[I 0]    M' = K'[R T]
Now we find the Fundamental Matrix (F) and the Essential Matrix (E). The Essential Matrix contains the
information about the translation and rotation that describe the location of the second camera
relative to the first in global coordinates.
The Fundamental Matrix contains the same information as the Essential Matrix, together with
information about the intrinsics of both cameras, so that we can relate the two cameras in pixel
coordinates. (If we are using rectified images and normalize the points by dividing by the focal
lengths, F = E.) In simple words, the Fundamental Matrix F maps a point in one image to a line
(epiline) in the other image. It is calculated from matching points between both images. A
minimum of 8 such points is required to find the fundamental matrix (when using the 8-point
algorithm); more points are preferred, and RANSAC is used to get a more robust result.
So first we need to find as many possible matches between two images to find the fundamental
matrix. For this, we use SIFT descriptors with FLANN based matcher and ratio test.

import numpy as np
import cv2
from matplotlib import pyplot as plt

# Load the left and right images
# in gray scale
imgLeft = cv2.imread('image_l.png', 0)
imgRight = cv2.imread('image_r.png', 0)

# Detect the SIFT key points and
# compute the descriptors for the
# two images
sift = cv2.xfeatures2d.SIFT_create()
keyPointsLeft, descriptorsLeft = sift.detectAndCompute(imgLeft, None)
keyPointsRight, descriptorsRight = sift.detectAndCompute(imgRight, None)

# Create FLANN matcher object
FLANN_INDEX_KDTREE = 0
indexParams = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
searchParams = dict(checks=50)
flann = cv2.FlannBasedMatcher(indexParams, searchParams)

# Find the two nearest matches for each left descriptor
matches = flann.knnMatch(descriptorsLeft, descriptorsRight, k=2)

# Apply ratio test
goodMatches = []
ptsLeft = []
ptsRight = []

for m, n in matches:
    if m.distance < 0.8 * n.distance:
        goodMatches.append([m])
        # For knnMatch(descriptorsLeft, descriptorsRight), queryIdx indexes
        # the left image and trainIdx indexes the right image
        ptsLeft.append(keyPointsLeft[m.queryIdx].pt)
        ptsRight.append(keyPointsRight[m.trainIdx].pt)
Image Left
Image Right

Let’s now find the Fundamental Matrix.

ptsLeft = np.int32(ptsLeft)
ptsRight = np.int32(ptsRight)

F, mask = cv2.findFundamentalMat(ptsLeft,
                                 ptsRight,
                                 cv2.FM_LMEDS)

# We select only inlier points
ptsLeft = ptsLeft[mask.ravel() == 1]
ptsRight = ptsRight[mask.ravel() == 1]
Next, we find the epilines. Epilines corresponding to the points in the first image are drawn on the
second image, so mentioning the correct images is important here. We get an array of lines, so we
define a new function to draw these lines on the images.

def drawlines(img1, img2, lines, pts1, pts2):
    r, c = img1.shape
    img1 = cv2.cvtColor(img1, cv2.COLOR_GRAY2BGR)
    img2 = cv2.cvtColor(img2, cv2.COLOR_GRAY2BGR)

    for r, pt1, pt2 in zip(lines, pts1, pts2):
        color = tuple(np.random.randint(0, 255, 3).tolist())
        x0, y0 = map(int, [0, -r[2] / r[1]])
        x1, y1 = map(int, [c, -(r[2] + r[0] * c) / r[1]])
        img1 = cv2.line(img1, (x0, y0), (x1, y1), color, 1)
        img1 = cv2.circle(img1, tuple(pt1), 5, color, -1)
        img2 = cv2.circle(img2, tuple(pt2), 5, color, -1)

    return img1, img2

Now we find the epilines in both the images and draw them.
# Find epilines corresponding to points
# in right image (second image) and
# drawing its lines on left image
linesLeft = cv2.computeCorrespondEpilines(ptsRight.reshape(-1, 1, 2),
                                          2, F)
linesLeft = linesLeft.reshape(-1, 3)
img5, img6 = drawlines(imgLeft, imgRight,
                       linesLeft, ptsLeft,
                       ptsRight)

# Find epilines corresponding to
# points in left image (first image) and
# drawing its lines on right image
linesRight = cv2.computeCorrespondEpilines(ptsLeft.reshape(-1, 1, 2),
                                           1, F)
linesRight = linesRight.reshape(-1, 3)
img3, img4 = drawlines(imgRight, imgLeft,
                       linesRight, ptsRight,
                       ptsLeft)

plt.subplot(121), plt.imshow(img5)
plt.subplot(122), plt.imshow(img3)

plt.show()

Output:

EXPERIMENT-7

AIM: Python – Edge Detection using Pillow


Edge Detection, is an Image Processing discipline that incorporates mathematics methods to find
edges in a Digital Image. Edge Detection internally works by running a filter/Kernel over a Digital
Image, which detects discontinuities in Image regions like stark changes in brightness/Intensity
value of pixels. There are two forms of edge detection:
• Search Based Edge detection (First order derivative)
• Zero Crossing Based Edge detection (Second order derivative)

Some of the commonly known edge detection methods are:


• Laplacian Operator or Laplacian Based Edge detection (Second order derivative)
• Canny edge detector (First order derivative)
• Prewitt operator (First order derivative)
• Sobel Operator (First order derivative)
We would be implementing a Laplacian Operator in order to incorporate Edge detection in one of
our later examples. For this purpose, we will be using pillow library. To install the library, execute
the following command in the command-line :
pip install pillow

There are two ways in which we would be implementing Edge detection on our images. In the first
method we would be using an inbuilt method provided in the pillow library
(ImageFilter.FIND_EDGES) for edge detection. In the second one we would be creating a
Laplacian Filter using PIL.ImageFilter.Kernel(), and then would use that filter for edge detection.
LAPLACIAN KERNEL:-

SAMPLE IMAGE:-
Method 1:

from PIL import Image, ImageFilter

# Opening the image (R prefixed to string
# in order to deal with '\' in paths)
image = Image.open(r"Sample.png")

# Converting the image to grayscale, as edge detection
# requires input image to be of mode = Grayscale (L)
image = image.convert("L")

# Detecting Edges on the Image using the argument
# ImageFilter.FIND_EDGES
image = image.filter(ImageFilter.FIND_EDGES)

# Saving the Image Under the name Edge_Sample.png
image.save(r"Edge_Sample.png")

Output (Edge_Sample.png):
Explanation:-
Firstly we create an image object of our image using Image.open(). Then we convert the Image
color mode to grayscale, as the input to the Laplacian operator is in grayscale mode (in general).
Then we pass the image onto Image.filter() function by specifying ImageFilter.FIND_EDGES
argument, which in turns runs a edge detection kernel on top of the image. The output of the
above function results in an image with high intensity changes (edges) in shades of white, and
rest of the image in black color.
Method 2:
from PIL import Image, ImageFilter

img = Image.open(r"sample.png")

# Converting the image to grayscale, as Sobel Operator requires

# input image to be of mode Grayscale (L)

img = img.convert("L")

# Calculating Edges using the passed laplacian Kernel

final = img.filter(ImageFilter.Kernel((3, 3), (-1, -1, -1, -1, 8,

-1, -1, -1, -1), 1, 0))

final.save("EDGE_sample.png")

Output (EDGE_sample.png):
Explanation:-
Firstly we create an image object of our image using Image.open(). Then we convert the Image
color mode to grayscale, as the input to the Laplacian operator is in grayscale mode (in general).
Then we pass the image onto Image.filter() function by specifying our operator/Kernel inside the
function as an argument. The Kernel is specified by using ImageFilter.Kernel((3, 3), (-1, -1, -1, -1,
8, -1, -1, -1, -1), 1, 0)) which create a 3 X 3 Kernel (3 pixel Wide and 3 pixel long) with the
values (-1, -1, -1, -1, 8, -1, -1, -1, -1) (as stated in the Laplacian Kernel image). The 1 argument
(after the kernel) stands for the Scale value, which divides the final value after each kernel
operation, therefore we set that value to 1 as we don’t want any division to our final value. The 0
argument (after the Scale value) is the offset which is added after the division by the Scale value. We
have set that value to 0 as we don’t want any increment to the final intensity value after the
Kernel Convolution. The output of the above function results in an image with high intensity
changes (edges) in shades of white, and rest of the image in black color.

Addendum –
Both programs yield the same result. The reason is that the inbuilt
ImageFilter.FIND_EDGES filter uses a 3 x 3 Laplacian kernel/operator internally, which is why
we ended up with identical results. The benefit of using a kernel instead of relying on
inbuilt functions is that we can define kernels according to our needs, which may or may not be in
the library, such as kernels for blurring, sharpening, edge detection (using other
kernels), etc. Also, the Laplacian was chosen intentionally so that we can maintain consistency in
results.
Benefits of using the Laplacian:- Fast and decent results. Other common edge detectors like Sobel
(first order derivative) are more expensive computationally, as they require finding gradients in
two directions and then normalizing the results.
Drawbacks of using the Laplacian:- Convolving with a Laplacian kernel leads to a lot of noise in the
output. This issue is resolved by other edge detection methods such as the Sobel and Prewitt
operators, as they incorporate a Gaussian blur kernel, which reduces the noise obtained from
the input image. They also lead to more accurate edge detection, due to the higher computation
involved in finding them.
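
Since the main drawback of the plain Laplacian is noise amplification, a common remedy is to smooth the
image before applying the kernel (a Laplacian-of-Gaussian style pipeline). Below is a minimal sketch with
Pillow that reuses the same 3 x 3 kernel; the file name is a placeholder and the blur radius is an
illustrative choice.

from PIL import Image, ImageFilter

# Placeholder file name; use your own sample image
img = Image.open(r"sample.png").convert("L")

# Smooth first to suppress noise (illustrative radius)
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))

# Apply the same 3 x 3 Laplacian kernel as in Method 2
edges = blurred.filter(ImageFilter.Kernel((3, 3),
                                          (-1, -1, -1, -1, 8, -1, -1, -1, -1), 1, 0))

edges.save("LoG_sample.png")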

Python Program to detect the edges of an image


using OpenCV | Sobel edge detection method
The following program detects the edges of frames in a livestream video. The code below is written
for a Linux environment. Make sure that OpenCV is installed on your system before you
run the program.
Steps to download the requirements below:

• Run the following command on your terminal to install it from the Ubuntu or Debian
repository.

sudo apt-get install libopencv-dev python3-opencv
(on older Ubuntu/Debian releases the Python binding package was named python-opencv)



• OR, in order to download OpenCV from the official site, run the following command on your terminal:

bash install-opencv.sh

• Type your sudo password and you will have installed OpenCV.

Principle behind Edge Detection


Edge detection involves mathematical methods to find points in an image where the brightness of
pixel intensities changes sharply.

• The first thing we are going to do is find the gradient of the grayscale image, allowing us to
find edge-like regions in the x and y direction. The gradient is a multi-variable generalization of
the derivative. While a derivative can be defined on functions of a single variable, for functions
of several variables, the gradient takes its place.

• The gradient is a vector-valued function, as opposed to a derivative, which is scalar-valued.


Like the derivative, the gradient represents the slope of the tangent of the graph of the
function. More precisely, the gradient points in the direction of the greatest rate of increase of
the function, and its magnitude is the slope of the graph in that direction.
Note: In computer vision, transitioning from black-to-white is considered a positive slope, whereas
a transition from white-to-black is a negative slope.

# Python program for edge detection
# using OpenCV in Python
# using the Sobel edge detection
# and Laplacian methods

import cv2
import numpy as np

# Capture livestream video content from camera 0
cap = cv2.VideoCapture(0)

while True:

    # Take each frame
    _, frame = cap.read()

    # Convert to HSV for simpler calculations
    # (computed here for reference; the operators below are applied to the BGR frame)
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Calculation of Sobelx (horizontal gradients)
    sobelx = cv2.Sobel(frame, cv2.CV_64F, 1, 0, ksize=5)

    # Calculation of Sobely (vertical gradients)
    sobely = cv2.Sobel(frame, cv2.CV_64F, 0, 1, ksize=5)

    # Calculation of Laplacian
    laplacian = cv2.Laplacian(frame, cv2.CV_64F)

    cv2.imshow('sobelx', sobelx)
    cv2.imshow('sobely', sobely)
    cv2.imshow('laplacian', laplacian)

    # Exit on the Esc key
    k = cv2.waitKey(5) & 0xFF
    if k == 27:
        break

cv2.destroyAllWindows()

# Release the capture
cap.release()

Calculation of the derivative of an image


A digital image is represented by a matrix that stores the RGB/BGR/HSV(whichever color space
the image belongs to) value of each pixel in rows and columns.
The second derivative of an image is approximated by an operator called the Laplacian. In order to calculate
a Laplacian, you first need to calculate two first-order derivatives, called Sobel derivatives, each of
which takes into account the gradient variation in one direction: one horizontal, the other
vertical.
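
As a minimal illustration of this (assuming a still image named 'sample.png' in the working directory
rather than the camera stream used above), the two Sobel derivatives can be combined into a single
gradient magnitude image:

import cv2
import numpy as np

# Placeholder file name; any image readable by OpenCV works
img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE)

# First-order derivatives in the x and y directions (Sobel)
gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# The gradient magnitude approximates edge strength
magnitude = np.sqrt(gx ** 2 + gy ** 2)
magnitude = np.uint8(np.clip(magnitude, 0, 255))

cv2.imwrite('gradient_magnitude.png', magnitude)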

Python | Detect corner of an image using


OpenCV
OpenCV (Open Source Computer Vision) is a computer vision library that contains various
functions to perform operations on Images or videos. OpenCV library can be used to perform
multiple operations on videos.
Let’s see how to detect the corner in the image.
The cv2.goodFeaturesToTrack() method finds the N strongest corners in the image by the Shi-Tomasi
method. Note that the image should be a grayscale image. Specify the number of corners you
want to find and the quality level (a value between 0 and 1), which denotes the minimum quality
of corner below which every candidate is rejected. Then provide the minimum Euclidean distance
between detected corners.
Syntax : cv2.goodFeaturesToTrack (image, maxCorners, qualityLevel, minDistance[, corners[,
mask[, blockSize[, useHarrisDetector[, k]]]]])

Image before corner detection:

# import the required library


import numpy as np
import cv2
from matplotlib import pyplot as plt

# read the image


img = cv2.imread('corner1.png')

# convert image to gray scale image


gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detect corners with the goodFeaturesToTrack function.


corners = cv2.goodFeaturesToTrack(gray, 27, 0.01, 10)
corners = np.intp(corners)  # np.int0 in older NumPy versions

# we iterate through each corner,
# making a circle at each point that we think is a corner.
for i in corners:
    x, y = i.ravel()
    cv2.circle(img, (x, y), 3, 255, -1)

plt.imshow(img), plt.show()
Image after corner detection –
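
The syntax shown above also exposes a useHarrisDetector flag. As a variation on the listing above
(not part of the original program; the file name is a placeholder), the same call can be rerun with the
Harris corner response instead of Shi-Tomasi:

import cv2
import numpy as np

img = cv2.imread('corner1.png')   # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Same call as above, but using the Harris response (k is the Harris free parameter)
corners = cv2.goodFeaturesToTrack(gray, 27, 0.01, 10,
                                  useHarrisDetector=True, k=0.04)
corners = np.intp(corners)

for c in corners:
    x, y = c.ravel()
    cv2.circle(img, (x, y), 3, 255, -1)

cv2.imwrite('corners_harris.png', img)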

Line detection in python with OpenCV |


Houghline method
The Hough Transform is a method that is used in image processing to detect any shape, if that
shape can be represented in mathematical form. It can detect the shape even if it is broken or
distorted a little bit.
We will see how Hough transform works for line detection using the HoughLine transform
method. To apply the Houghline method, first an edge detection of the specific image is
desirable. For the edge detection technique go through the article Edge detection

Basics of Houghline Method


A line can be represented as y = mx + c or, in parametric form, as r = x·cosθ + y·sinθ, where r is the
perpendicular distance from the origin to the line, and θ is the angle formed by this perpendicular
and the horizontal axis, measured counter-clockwise (the direction depends on how you represent
the coordinate system; this representation is used in OpenCV).
So any line can be represented by these two terms, (r, θ).
Working of Houghline method:

• First it creates a 2D array or accumulator (to hold the values of the two parameters), and it is set to
zero initially.
• Let rows denote r and columns denote θ (theta).
• The size of the array depends on the accuracy you need. Suppose you want the accuracy of
angles to be 1 degree; then you need 180 columns (the maximum angle for a straight line is 180 degrees).
• For r, the maximum distance possible is the diagonal length of the image. So, taking one-pixel
accuracy, the number of rows can be the diagonal length of the image.
Example:
Consider a 100×100 image with a horizontal line at the middle. Take the first point of the line. You
know its (x, y) values. Now in the line equation, put the values θ = 0, 1, 2, …, 180 and check
the r you get. For every (r, θ) pair, you increment the value by one in the accumulator in its
corresponding (r, θ) cell. So now in the accumulator, the cell (50, 90) = 1 along with some other
cells.
Now take the second point on the line. Do the same as above. Increment the values in the cells
corresponding to the (r, θ) you got. This time, the cell (50, 90) = 2. We are actually voting on the (r, θ)
values. You continue this process for every point on the line. At each point, the cell (50, 90) will be
incremented or voted up, while other cells may or may not be voted up. This way, at the end, the
cell (50, 90) will have the maximum votes. So if you search the accumulator for maximum votes, you
get the value (50, 90), which says there is a line in this image at distance 50 from the origin and at an
angle of 90 degrees.
Everything explained above is encapsulated in the OpenCV function cv2.HoughLines(). It simply
returns an array of (r, θ) values, where r is measured in pixels and θ is measured in radians.

• Python

# Python program to illustrate HoughLine

# method for line detection

import cv2

import numpy as np

# Reading the required image in

# which operations are to be done.

# Make sure that the image is in the same

# directory in which this python program is

img = cv2.imread('image.jpg')

# Convert the img to grayscale


gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply edge detection method on the image

edges = cv2.Canny(gray, 50, 150, apertureSize=3)

# This returns an array of r and theta values

lines = cv2.HoughLines(edges, 1, np.pi/180, 200)

# The below for loop runs till r and theta values
# are in the range of the 2d array
for r_theta in lines:
    arr = np.array(r_theta[0], dtype=np.float64)
    r, theta = arr

    # Stores the value of cos(theta) in a
    a = np.cos(theta)

    # Stores the value of sin(theta) in b
    b = np.sin(theta)

    # x0 stores the value rcos(theta)
    x0 = a*r

    # y0 stores the value rsin(theta)
    y0 = b*r

    # x1 stores the rounded off value of (rcos(theta)-1000sin(theta))
    x1 = int(x0 + 1000*(-b))

    # y1 stores the rounded off value of (rsin(theta)+1000cos(theta))
    y1 = int(y0 + 1000*(a))

    # x2 stores the rounded off value of (rcos(theta)+1000sin(theta))
    x2 = int(x0 - 1000*(-b))

    # y2 stores the rounded off value of (rsin(theta)-1000cos(theta))
    y2 = int(y0 - 1000*(a))

    # cv2.line draws a line in img from the point (x1,y1) to (x2,y2).
    # (0,0,255) denotes the colour of the line to be
    # drawn. In this case, it is red.
    cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)

# All the changes made in the input image are finally
# written on a new image linesDetected.jpg
cv2.imwrite('linesDetected.jpg', img)

Elaboration of the function cv2.HoughLines(edges, 1, np.pi/180, 200):

1. First parameter: the input image should be a binary image, so apply thresholding or edge detection
before applying the Hough transform.
2. Second and third parameters are the r and θ (theta) accuracies respectively.
3. Fourth argument is the threshold, meaning the minimum number of votes a candidate needs to be
considered a line.
4. Remember, the number of votes depends upon the number of points on the line, so the threshold also
represents the minimum length of line that should be detected.
Alternate simpler method for directly extracting points:
• Python3

import cv2

import numpy as np

# Read image

image = cv2.imread('path/to/image.png')

# Convert image to grayscale

gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)

# Use canny edge detection

edges = cv2.Canny(gray,50,150,apertureSize=3)

# Apply HoughLinesP method to

# to directly obtain line end points

lines_list = []

lines = cv2.HoughLinesP(
    edges,             # Input edge image
    1,                 # Distance resolution in pixels
    np.pi/180,         # Angle resolution in radians
    threshold=100,     # Min number of votes for a valid line
    minLineLength=5,   # Min allowed length of a line
    maxLineGap=10      # Max allowed gap between line segments for joining them
)

# Iterate over points
for points in lines:
    # Extracted points nested in the list
    x1, y1, x2, y2 = points[0]

    # Draw the lines joining the points
    # on the original image
    cv2.line(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Maintain a simple lookup list of points
    lines_list.append([(x1, y1), (x2, y2)])

# Save the result image
cv2.imwrite('detectedLines.png', image)

Summarizing the process

• In an image analysis context, the coordinates of the point(s) of edge segments (i.e. X,Y ) in
the image are known and therefore serve as constants in the parametric line equation, while
R(rho) and Theta(θ) are the unknown variables we seek.
• If we plot the possible (r) values defined by each (theta), points in cartesian image space map
to curves (i.e. sinusoids) in the polar Hough parameter space. This point-to-curve
transformation is the Hough transformation for straight lines.
• The transform is implemented by quantizing the Hough parameter space into finite intervals or
accumulator cells. As the algorithm runs, each (X, Y) is transformed into a discretized (r, θ)
curve and the accumulator (2D array) cells which lie along this curve are incremented.
• Resulting peaks in the accumulator array represent strong evidence that a corresponding
straight line exists in the image.
Applications of Hough transform:
1. It is used to isolate features of a particular shape within an image.
2. Tolerant of gaps in feature boundary descriptions and is relatively unaffected by image noise.
3. Used extensively in barcode scanning, verification and recognition

EXPERIMENT-8
AIM: Introduction to SIFT (Scale-Invariant Feature Transform)

SIFT stands for Scale-Invariant Feature Transform and was first presented in
2004 by D. Lowe, University of British Columbia. SIFT is invariant to image
scale and rotation. The algorithm was patented, so older OpenCV releases shipped it
in the non-free (xfeatures2d) module; since the patent expired in 2020, recent
OpenCV versions include it in the main module.
Major advantages of SIFT are

• Locality: features are local, so robust to occlusion and clutter (no prior
segmentation)

• Distinctiveness: individual features can be matched to a large database of


objects

• Quantity: many features can be generated for even small objects

• Efficiency: close to real-time performance

• Extensibility: can easily be extended to a wide range of different feature types,


with each adding robustness

This is part of a 7-series Feature Detection and Matching. Other articles


included

• Introduction To Feature Detection And Matching

• Introduction to Harris Corner Detector

• Introduction to SURF (Speeded-Up Robust Features)

• Introduction to FAST (Features from Accelerated Segment Test)

• Introduction to BRIEF (Binary Robust Independent Elementary Features)

• Introduction to ORB (Oriented FAST and Rotated BRIEF)

The algorithm

SIFT is quite an involved algorithm. There are mainly four steps involved in the
SIFT algorithm. We will see them one-by-one.

• Scale-space peak selection: Potential location for finding features.

• Keypoint Localization: Accurately locating the feature keypoints.


• Orientation Assignment: Assigning orientation to keypoints.

• Keypoint descriptor: Describing the keypoints as a high dimensional vector.

• Keypoint Matching

Scale-space peak Selection

Scale-space

Real-world objects are meaningful only at a certain scale. You might see a sugar
cube perfectly well on a table, but if you look at the entire Milky Way, it simply
does not exist. This multi-scale nature of objects is quite common in nature.
And a scale space attempts to replicate this concept on digital images.

The scale space of an image is a function L(x,y,σ) that is produced from the
convolution of a Gaussian kernel(Blurring) at different scales with the input
image. Scale-space is separated into octaves and the number of octaves and
scale depends on the size of the original image. So we generate several octaves
of the original image. Each octave’s image size is half the previous one.

Blurring

Within an octave, images are progressively blurred using the Gaussian Blur
operator. Mathematically, “blurring” is referred to as the convolution of the
Gaussian operator and the image. Gaussian blur has a particular expression or
“operator” that is applied to each pixel. What results is the blurred image.

Blurred image: L(x, y, σ) = G(x, y, σ) * I(x, y)

G is the Gaussian Blur operator and I is an image. While x,y are the location
coordinates and σ is the “scale” parameter. Think of it as the amount of blur.
Greater the value, greater the blur.

Gaussian Blur operator: G(x, y, σ) = (1 / (2πσ²)) · e^(−(x² + y²) / (2σ²))

DOG(Difference of Gaussian kernel)

Now we use those blurred images to generate another set of images, the
Difference of Gaussians (DoG). These DoG images are great for finding out
interesting keypoints in the image. The difference of Gaussian is obtained as
the difference of Gaussian blurring of an image with two different σ, let it be σ
and kσ. This process is done for different octaves of the image in the Gaussian
Pyramid. It is represented in below image:
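
To make the idea concrete in code, here is a minimal sketch (not the full SIFT pyramid; the file
name and σ values are illustrative): blur the same image with two different σ values and subtract
the results.

import cv2
import numpy as np

# Placeholder file name
img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)

sigma, k = 1.6, 1.26   # illustrative base scale and multiplicative factor

# Two Gaussian-blurred versions of the same image
g1 = cv2.GaussianBlur(img, (0, 0), sigma)
g2 = cv2.GaussianBlur(img, (0, 0), k * sigma)

# The Difference of Gaussians approximates the Laplacian of Gaussian
dog = g2 - g1

dog_norm = cv2.normalize(dog, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imwrite('dog.png', dog_norm)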
Finding keypoints

Up till now, we have generated a scale space and used the scale space to
calculate the Difference of Gaussians. Those are then used to calculate
Laplacian of Gaussian approximations that are scale invariant.

One pixel in an image is compared with its 8 neighbors as well as 9 pixels in


the next scale and 9 pixels in the previous scale. This way, a total of 26 checks are
made. If it is a local extremum, it is a potential keypoint. It basically means that
the keypoint is best represented at that scale.
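
A toy NumPy sketch of the 26-neighbour check for a single pixel (purely illustrative; real
implementations run this over the whole DoG pyramid):

import numpy as np

# Three consecutive DoG levels stacked as a (3, H, W) array (toy data here)
dog = np.random.rand(3, 5, 5)

def is_extremum(dog, y, x):
    # 3 x 3 x 3 neighbourhood around the centre pixel of the middle level
    patch = dog[:, y - 1:y + 2, x - 1:x + 2]
    centre = dog[1, y, x]
    return centre == patch.max() or centre == patch.min()

print(is_extremum(dog, 2, 2))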

Keypoint Localization
The previous step produces a lot of keypoints. Some of
them lie along an edge, or they don't have enough contrast; in both cases, they
are not useful as features, so we get rid of them. The approach is similar to
the one used in the Harris Corner Detector for removing edge features. For low-contrast
features, we simply check their intensities.

They used Taylor series expansion of scale space to get a more accurate
location of extrema, and if the intensity at this extrema is less than a threshold
value (0.03 as per the paper), it is rejected. DoG has a higher response for
edges, so edges also need to be removed. They used a 2x2 Hessian matrix (H)
to compute the principal curvature.

Orientation Assignment

Now we have legitimate keypoints. They’ve been tested to be stable. We already


know the scale at which the keypoint was detected (it’s the same as the scale of
the blurred image). So we have scale invariance. The next thing is to assign an
orientation to each keypoint to make it rotation invariant.
A neighborhood is taken around the keypoint location depending on the scale,
and the gradient magnitude and direction is calculated in that region. An
orientation histogram with 36 bins covering 360 degrees is created. Let's say
the gradient direction at a certain point (in the “orientation collection region”)
is 18.759 degrees, then it will go into the 10–19-degree bin. And the “amount”
that is added to the bin is proportional to the magnitude of the gradient at that
point. Once you’ve done this for all pixels around the keypoint, the histogram
will have a peak at some point.
The highest peak in the histogram is taken and any peak above 80% of it is also
considered to calculate the orientation. It creates keypoints with same location
and scale, but different directions. It contributes to the stability of matching.

Keypoint descriptor

At this point, each keypoint has a location, scale, orientation. Next is to


compute a descriptor for the local image region about each keypoint that is
highly distinctive and as invariant as possible to variations such as changes in
viewpoint and illumination.

To do this, a 16x16 window around the keypoint is taken. It is divided into 16


sub-blocks of 4x4 size.

For each sub-block, 8 bin orientation histogram is created.


So 4 X 4 descriptors over 16 X 16 sample array were used in practice. 4 X 4 X 8
directions give 128 bin values. It is represented as a feature vector to form
keypoint descriptor. This feature vector introduces a few complications. We
need to get rid of them before finalizing the fingerprint.

1. Rotation dependence The feature vector uses gradient orientations. Clearly, if


you rotate the image, everything changes. All gradient orientations also change.
To achieve rotation independence, the keypoint’s rotation is subtracted from each
orientation. Thus each gradient orientation is relative to the keypoint’s
orientation.

2. Illumination dependence If we threshold numbers that are big, we can


achieve illumination independence. So, any number (of the 128) greater than 0.2
is changed to 0.2. This resultant feature vector is normalized again. And now you
have an illumination independent feature vector!
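
A tiny NumPy sketch of this clip-and-renormalize step (the 128-dimensional vector here is
random, purely for illustration):

import numpy as np

# A stand-in for a raw 128-dimensional SIFT descriptor
v = np.random.rand(128)

# Normalize to unit length
v = v / np.linalg.norm(v)

# Clip large components at 0.2 to reduce illumination effects
v = np.minimum(v, 0.2)

# Renormalize to unit length again
v = v / np.linalg.norm(v)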

Keypoint Matching

Keypoints between two images are matched by identifying their nearest


neighbors. But in some cases, the second closest-match may be very near to the
first. It may happen due to noise or some other reasons. In that case, the ratio
of closest-distance to second-closest distance is taken. If it is greater than 0.8,
they are rejected. It eliminates around 90% of false matches while discarding
only 5% of correct matches, as per the paper.
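
A minimal sketch of this ratio test using OpenCV's BFMatcher.knnMatch; here des1 and des2 are
assumed to be SIFT descriptors already computed for the two images:

import cv2

# des1, des2: assumed precomputed SIFT descriptors of two images
bf = cv2.BFMatcher(cv2.NORM_L2)
matches = bf.knnMatch(des1, des2, k=2)

# Keep a match only when the best distance is well below the second-best
good = [m for m, n in matches if m.distance < 0.8 * n.distance]
print("Good matches:", len(good))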

Implementation

I was able to implement SIFT using OpenCV (3.4). Here's how I did it:

SIFT (Scale-Invariant Feature Transform)


Import resources and display image
In [1]:
import cv2
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

# Load the image


image1 = cv2.imread('./images/face1.jpeg')

# Convert the training image to RGB


training_image = cv2.cvtColor(image1, cv2.COLOR_BGR2RGB)

# Convert the training image to gray scale


training_gray = cv2.cvtColor(training_image, cv2.COLOR_RGB2GRAY)

# Create test image by adding Scale Invariance and Rotational Invariance


test_image = cv2.pyrDown(training_image)
test_image = cv2.pyrDown(test_image)
num_rows, num_cols = test_image.shape[:2]

rotation_matrix = cv2.getRotationMatrix2D((num_cols/2, num_rows/2), 30, 1)


test_image = cv2.warpAffine(test_image, rotation_matrix, (num_cols,
num_rows))

test_gray = cv2.cvtColor(test_image, cv2.COLOR_RGB2GRAY)

# Display traning image and testing image


fx, plots = plt.subplots(1, 2, figsize=(20,10))

plots[0].set_title("Training Image")
plots[0].imshow(training_image)

plots[1].set_title("Testing Image")
plots[1].imshow(test_image)
Out[1]:
<matplotlib.image.AxesImage at 0x7fa8c84ed390>

Detect keypoints and Create Descriptor


In [2]:
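# Note: the patent on SIFT has expired; in OpenCV 4.4+ it lives in the main module,
# so cv2.SIFT_create() can be used instead of cv2.xfeatures2d.SIFT_create().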
sift = cv2.xfeatures2d.SIFT_create()
train_keypoints, train_descriptor = sift.detectAndCompute(training_gray,
None)
test_keypoints, test_descriptor = sift.detectAndCompute(test_gray, None)

keypoints_without_size = np.copy(training_image)
keypoints_with_size = np.copy(training_image)

cv2.drawKeypoints(training_image, train_keypoints, keypoints_without_size,


color = (0, 255, 0))

cv2.drawKeypoints(training_image, train_keypoints, keypoints_with_size,


flags = cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display image with and without keypoints size


fx, plots = plt.subplots(1, 2, figsize=(20,10))

plots[0].set_title("Train keypoints With Size")


plots[0].imshow(keypoints_with_size, cmap='gray')

plots[1].set_title("Train keypoints Without Size")


plots[1].imshow(keypoints_without_size, cmap='gray')

# Print the number of keypoints detected in the training image


print("Number of Keypoints Detected In The Training Image: ",
len(train_keypoints))

# Print the number of keypoints detected in the query image


print("Number of Keypoints Detected In The Query Image: ",
len(test_keypoints))
Number of Keypoints Detected In The Training Image: 302
Number of Keypoints Detected In The Query Image: 91

Matching Keypoints
In [5]:
# Create a Brute Force Matcher object.
bf = cv2.BFMatcher(cv2.NORM_L1, crossCheck = False)

# Perform the matching between the SIFT descriptors of the training image
and the test image
matches = bf.match(train_descriptor, test_descriptor)

# The matches with shorter distance are the ones we want.


matches = sorted(matches, key = lambda x : x.distance)

result = cv2.drawMatches(training_image, train_keypoints, test_gray,


test_keypoints, matches, test_gray, flags = 2)

# Display the best matching points


plt.rcParams['figure.figsize'] = [14.0, 7.0]
plt.title('Best Matching Points')
plt.imshow(result)
plt.show()

# Print total number of matching points between the training and query
images
print("\nNumber of Matching Keypoints Between The Training and Query Images:
", len(matches))

Number of Matching Keypoints Between The Training and Query Images: 302
EXPERIMENT-9

AIM: SURF and HOG feature descriptors.
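
SURF (Speeded-Up Robust Features) is a faster approximation of SIFT. The walkthrough below focuses
on HOG; for the SURF part of this experiment, a minimal detection sketch is given first. Note that
SURF sits in the non-free xfeatures2d module of opencv-contrib and must be enabled in your build;
the file name is a placeholder.

import cv2

img = cv2.imread('sample.png', cv2.IMREAD_GRAYSCALE)   # placeholder file name

# Create a SURF detector; the Hessian threshold controls how many keypoints are kept
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)

keypoints, descriptors = surf.detectAndCompute(img, None)
print("Number of SURF keypoints:", len(keypoints))

out = cv2.drawKeypoints(img, keypoints, None, (0, 255, 0),
                        cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imwrite('surf_keypoints.png', out)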

The Histogram of Oriented Gradients (HOG) is a feature descriptor used in computer
vision and image processing applications for the purpose of object detection. It is a
technique that counts occurrences of gradient orientation in a specific portion of an image or
region of interest.

In 2005, Dalal and Triggs published a research paper named Histograms of Oriented
Gradients for Human Detection. Since the release of this paper, HOG has been used in a lot of
object detection applications.

Here are the most important aspects of HOG:

• HOG focuses on the structure of the object. It extracts information about the
magnitude of the edges as well as the orientation of the edges.
• It uses a detection window of 64x128 pixels, so the image is first resized
to a (64, 128) shape.
• The window is then divided into small parts: 8x8 pixel cells (8x16 cells in total), and the
gradient magnitude and orientation of each part is calculated. The cells are grouped into blocks
of 2x2 cells with 50% overlap, so there are going to be 7x15 = 105 blocks in total.
• We take the 64 gradient vectors of each 8x8 pixel cell and put them into
a 9-bin histogram.

Below are the essential steps we take on HOG feature extraction:


Resizing the Image

As mentioned previously, if you have a wide image, then crop the image to the specific
part in which you want to apply HOG feature extraction, and then resize it to the
appropriate shape.

Calculating Gradients

Now, after resizing, we need to calculate the gradient in the x and y directions. The
gradient is simply the small change in intensity in each direction; to compute it, we convolve two
simple filters with the image.

The filter for calculating the gradient in the x-direction is [-1, 0, 1], and the filter for the
y-direction is its transpose.

#importing required libraries


from skimage.io import imread
from skimage.transform import resize
from skimage.feature import hog
from skimage import exposure
import matplotlib.pyplot as plt

# reading the image


img = imread('cat.jpg')
plt.axis("off")
plt.imshow(img)
print(img.shape)

Resizing the image:

# resizing image
resized_img = resize(img, (128*4, 64*4))
plt.axis("off")
plt.imshow(resized_img)
print(resized_img.shape)

Now we simply use hog() function from scikit-image library:

#creating hog features
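# Note: scikit-image 0.19+ replaces the multichannel=True argument used below
# with channel_axis=-1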


fd, hog_image = hog(resized_img, orientations=9, pixels_per_cell=(8, 8),
cells_per_block=(2, 2), visualize=True, multichannel=True)
plt.axis("off")
plt.imshow(hog_image, cmap="gray")

The hog() function takes 6 parameters as input:

• image: The target image you want to apply HOG feature extraction.
• orientations: Number of bins in the histogram we want to create, the original
research paper used 9 bins so we will pass 9 as orientations.
• pixels_per_cell: Determines the size of the cell, as we mentioned earlier, it
is 8x8.
• cells_per_block: Number of cells per block, will be 2x2 as mentioned
previously.
• visualize: A boolean whether to return the image of the HOG, we set it
to True so we can show the image.
• multichannel: We set it to True to tell the function that the last dimension is
considered as a color channel, instead of spatial.

Finally, if you want to save the images:

# save the images
plt.imsave("resized_img.jpg", resized_img)
plt.imsave("hog_image.jpg", hog_image, cmap="gray")
EXPERIMENT-10
AIM:Project based on computer vision Application.

Computer Vision is a field of Artificial Intelligence (AI) that focuses on


interpreting and extracting information from images and videos using various
techniques. It is an emerging and evolving field within AI. Computer Vision
applications have become an integral part of our daily lives, permeating various
aspects of our routines. These applications encompass a wide range of domains,
including reverse engineering, security inspections, image processing,
computer animation, autonomous navigation, and robotics.

In this article, we will be exploring some of the best Computer Vision projects.
These projects range from beginner-level to expert-level, catering to individuals
at different skill levels. Each Computer Vision project will provide you with
comprehensive guides, source codes, and datasets, enabling you to delve
straight into practical implementation and hands-on experience in the field of
computer vision.
What is Computer Vision?
Computer vision is a field of study within artificial intelligence (AI) that focuses
on enabling computers to interpret and extract information from images and
videos, in a manner similar to human vision. It involves developing algorithms
and techniques to extract meaningful information from visual inputs and make
sense of the visual world.
