IP Report Final
3. BACKGROUND:
● Currently, a prototype for the component recognition stage has been created, for which a custom dataset is trained using the YOLOv5 algorithm for five components. The model is able to provide results with 68% accuracy for 10 to 15 epochs, along with details in the form of graphs.
● The technique employed ensures that the requirements of the problem statement are met, and the outcome is a prototype that advances computer vision in the field of component recognition while also delivering higher production and an optimized cycle time.
5. SUMMARY:
6. BRIEF DESCRIPTION OF THE DRAWINGS:
7. DETAILED DESCRIPTION:
● As the output is discrete, the deep learning model used in this case addresses a classification problem. When tested, the trained model generates a confusion matrix based on the number of classes, because the output will likewise be discrete. This kind of learning is therefore supervised. YOLOv5 is an algorithm, pretrained on the COCO dataset, that can be trained on a custom dataset to identify objects in real time.
● Shortly after YOLOv4, the fifth iteration of this popular object detection algorithm was made public, this time by Glenn Jocher. YOLOv5 was the first YOLO version to use the PyTorch deep learning framework.
● The working of the YOLO algorithm: to train the model, it takes as input the image (the X train) and the labels of that image, created after drawing the bounding boxes (the Y train). Labels means the descriptors of the bounding box, as discussed in the text above, in CSV format. After this the algorithm starts its training. As the name "You Only Look Once" says, it makes a grid over the image to be tested and looks at each grid cell only once, checking whether the centroid of the required object's bounding box lies in that particular cell. If it does, it makes another label carrying the same four descriptors; if not, it does nothing. In this way it runs through each grid cell only once, and that is how it trains the model.
● As we can see in Fig. 1, the first grid cell contains no centroid of any bounding box, so its labels are NULL; for the fifth cell, the centroid of the dog's bounding box is present, so the label for that cell takes values according to the descriptors, and similarly for the cell containing the human's centroid. After this, all of the data is trained using a CNN, as we can see in Fig. 2, and we get the final output when data is given as input for testing. A small illustrative sketch of this grid assignment follows below.
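● The grid assignment described above can be illustrated with a minimal Python sketch; the grid size, helper name and example box below are assumptions for illustration only, not the actual YOLOv5 implementation.

    # Illustrative sketch of YOLO-style grid label assignment (hypothetical helper,
    # not the actual YOLOv5 training code).
    def assign_to_grid(boxes, S=7):
        """boxes: list of (class_id, x_center, y_center, w, h), normalized to [0, 1].
        Returns a dict mapping (row, col) cells to the box whose centroid lies inside."""
        grid = {}
        for class_id, xc, yc, w, h in boxes:
            col = min(int(xc * S), S - 1)  # column of the cell containing the centroid
            row = min(int(yc * S), S - 1)  # row of the cell containing the centroid
            grid[(row, col)] = (class_id, xc, yc, w, h)
        return grid

    # One box whose centroid falls in cell (3, 3); every other cell stays unlabeled (NULL).
    print(assign_to_grid([(0, 0.5, 0.5, 0.2, 0.3)]))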
● Creating a dataset: a dataset is created for two classes. For thirty epochs the model is able to deliver results with an accuracy of 70%.
● The dataset is created in such a way that it has two folders, namely labels and images, as illustrated in the sketch below.
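● As an illustration of this folder layout, the sketch below builds the images/labels directories and writes a minimal data.yaml of the kind YOLOv5 expects; the dataset root, split names and paths are assumptions for illustration.

    # Illustrative sketch: build the images/labels layout and a minimal data.yaml
    # (paths, splits and file locations are assumed for illustration).
    import os

    root = "custom_dataset"  # hypothetical dataset root
    for sub in ("images", "labels"):
        for split in ("train", "val"):
            os.makedirs(os.path.join(root, sub, split), exist_ok=True)

    with open(os.path.join(root, "data.yaml"), "w") as f:
        f.write(
            "train: custom_dataset/images/train\n"
            "val: custom_dataset/images/val\n"
            "nc: 2\n"
            "names: ['Mesh clamp', 'Feed Chute']\n"
        )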
● The labels for the given images are created with makesenseai.com
● The labels of the images are in YOLO format, i.e., .txt format; an example is given below.
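● Each line of such a YOLO-format .txt label holds a class index followed by the four normalized box descriptors; the parsing sketch below uses a made-up example line.

    # One label line per object: "<class_id> <x_center> <y_center> <width> <height>",
    # with all coordinates normalized to the image size (the values below are made up).
    line = "0 0.512 0.433 0.210 0.175"
    class_id, xc, yc, w, h = line.split()
    print(int(class_id), float(xc), float(yc), float(w), float(h))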
● Once the model is trained for the given number of epochs, it generates weights after passing through the neural network. These weights are then used to identify the object in a given input image, as sketched below.
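● A minimal sketch of how such trained weights could be used for identification, via the YOLOv5 PyTorch Hub interface; the weight path and image path below are assumptions for illustration.

    # Illustrative inference sketch using the YOLOv5 PyTorch Hub interface
    # (weight and image paths are assumed for illustration).
    import torch

    model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")
    results = model("test_image.jpg")  # run detection on one input image
    results.print()                    # class, confidence and bounding box per detection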
● Identification is also done for a set of different images, which are used for validation.
● Here zero refers to the Mesh clamp and one refers to the Feed Chute.
● For the given input image, the predicted output is shown with the confidence level of the prediction.
● F1 Score: The F1 score is defined as the harmonic mean of precision and recall. An F1
score is considered perfect when it's 1, while the model is a total failure when it's 0.
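● Expressed as a formula, F1 = 2 × (Precision × Recall) / (Precision + Recall).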
● No model can be perfect; for the components taken, the F1 curve is generated. This curve combines precision (PPV) and recall (TPR) in a single visualization. The higher the curve sits on the y-axis, the better the model performs.
● It is seen from the graph that the curve is higher for the Feed Chute, so the model performs better at identifying the Feed Chute.
● For this problem the classification is done for two classes, namely Mesh clamp and Feed Chute. A confusion matrix is generated for them; its generic layout is sketched below.
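● For reference, a generic two-class confusion matrix has the layout below; the cell entries are placeholders describing what each cell counts, not the actual values obtained for this model.

                            Predicted: Mesh clamp          Predicted: Feed Chute
    Actual: Mesh clamp      correctly identified           misclassified as Feed Chute
    Actual: Feed Chute      misclassified as Mesh clamp    correctly identified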
● Precision-Confidence curve: the precision is calculated as the ratio of the number of positive samples correctly classified to the total number of samples classified as positive (either correctly or incorrectly).
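● Expressed as a formula, Precision = TP / (TP + FP), where TP is the number of correctly classified positive samples and FP is the number of samples incorrectly classified as positive.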
● Recall-Confidence curve: the recall is calculated as the ratio of the number of positive samples correctly classified as positive to the total number of positive samples.
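● Expressed as a formula, Recall = TP / (TP + FN), where FN is the number of positive samples the model fails to detect.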
● Precision-Recall curve: a precision-recall (PR) curve shows the trade-off between the precision and recall values at different thresholds, as sketched below.
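● A minimal sketch of how such a curve can be computed by sweeping the confidence threshold, assuming scikit-learn is available; the labels and scores below are made up for illustration.

    # Illustrative precision-recall curve computation (labels and scores are made up).
    from sklearn.metrics import precision_recall_curve

    y_true  = [1, 0, 1, 1, 0, 1]                 # ground-truth class (1 = positive)
    y_score = [0.9, 0.4, 0.65, 0.8, 0.3, 0.55]   # predicted confidence for the positive class

    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    for p, r in zip(precision, recall):
        print(f"precision={p:.2f}, recall={r:.2f}")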
● Labels: for the classification of the two components, labels are made for a certain set of images. These labels are marked in graphs for the specific instances.
● It is seen that the Mesh clamp occurs in 25 instances and the Feed Chute in 15 instances.
● In the graph below it is seen that all of those instances are spread around, and they are then classified separately into the two classes.
● Results:
● Box_loss: The box loss represents how well the algorithm can locate the center of an object and how well the predicted bounding box covers the object. The graph shows the box loss over 30 epochs.
● Object_loss: The objectness loss represents the confidence of object presence. The graph shows the object loss over 30 epochs.
● Class_loss: The classification loss represents the confidence of the different classes present in the model. The graph shows the class loss over 30 epochs.
● The losses mentioned above are represented in the form of graphs for the images of both train and val.
● Precision is a metric that quantifies the number of correct positive predictions made.
● Recall is a metric that quantifies the number of correct positive predictions made out of all
positive predictions that could have been made.
● Mean average precision (mAP): AP (average precision) is a popular metric for measuring the accuracy of object detectors such as Faster R-CNN, SSD, etc. Average precision computes the average of the precision values over recall values from 0 to 1.
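● In standard terms, AP for one class is the area under its precision-recall curve, AP = ∫ p(r) dr taken over recall from 0 to 1, and mAP is the mean of the AP values over all classes.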
Teachable Machine:
● Here images of the two different classes are fed into the machine.
● Once the model is trained for the given set of epochs, it will generate the desired output.
● For the given model, no false positives were given as input, so the accuracy was high and the model was able to deliver the desired output.
● Accuracy per class and per epoch is generated and a confusion matrix is formed.
● Loss per epoch is also calculated.
8. CLAIMS:
● An object recognition technology which, when integrated with the painting manufacturing line, will ultimately boost production.
9. ABSTRACT: