A Guide to Building Your Custom Object Detection Model Using YOLOv3

Introduction

Object detection is a computer vision technique that localizes objects in an image and classifies them. Just as humans recognize what an object is and where it sits in a scene, we want computers to understand what is in an image and where it is located.

Fig 1: Source Google

As Fig 1 shows, object detection comprises classification and localization.

Computer vision is a field of artificial intelligence that uses machine learning and deep learning to let computers observe, recognize, and analyze objects in images and videos much as people do. Its use for automated visual inspection, remote monitoring, and automation is growing quickly.

In this blog, you will learn how to train a custom object detector and run detection with it using YOLOv3 (You Only Look Once, version 3). By the end, you should be able to implement your own custom object detection. I used Google Colab for training. For the demo, I used face mask detection, which is a two-class problem (With Mask or Without Mask). I have also listed the requirements to get started.

About YOLO

YOLO (You Only Look Once) was written by Joseph Redmon using a framework called Darknet. It is an open-source, state-of-the-art algorithm for real-time object detection. There are multiple versions of YOLO (V2, V3, V4, and V5). We will be using YOLOv3 because it is easy to train.

The initial version presented the overall architecture, the second iteration improved the design and used pre-defined anchor boxes to boost the bounding box proposal, and the third iteration further improved the model architecture and training procedure.

Step-by-Step Guide on Custom Object Detection Model

Here we will build a face mask detector using YOLOv3.

Step 0: Custom Dataset Creation and Labelling

You first have to collect images for custom training. Once the dataset is ready, I recommend the LabelImg tool for drawing bounding boxes and assigning class labels to the images.

Easy installation:

pip install labelImg

For more reference: https://github.com/tzutalin/labelImg

After installing:

Create a new folder “Train” and create a “class.txt” file in it.
In the class.txt file, list the class labels.

Example (0 and 1 are the class indices):

0 Mask Not Detected
1 Mask Detected

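Create an obj.data file with the content below:

classes = 2
train = data/train.txt
valid = data/train.txt
names = data/obj.names
backup = backup/

classes: the number of classes (here 2: person with mask and person without mask)
train: path to the file that lists the training images
valid: path to the file that lists the test images (here the training list is reused)
names: path to the file with the class names
backup: folder where the trained weights are saved

Create another folder “Images” under the “Train” folder.
Move all images (of all classes) into the “Images” folder.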
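If you would rather generate obj.data and obj.names from a script than type them by hand, here is a minimal Python sketch. The function name write_darknet_data_files and its parameters are my own choices for illustration; the file contents follow the obj.data example above, and obj.names is assumed to hold one class name per line.

```python
from pathlib import Path


def write_darknet_data_files(data_dir, class_names, backup_dir="backup/"):
    """Write obj.names and obj.data into data_dir (hypothetical helper)."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # obj.names: one class name per line; the line order defines the class index.
    names_path = data_dir / "obj.names"
    names_path.write_text("\n".join(class_names) + "\n")

    # obj.data: same keys as in the example above.
    lines = [
        f"classes = {len(class_names)}",
        "train = data/train.txt",
        "valid = data/train.txt",
        "names = data/obj.names",
        f"backup = {backup_dir}",
    ]
    data_path = data_dir / "obj.data"
    data_path.write_text("\n".join(lines) + "\n")
    return data_path, names_path
```

For the mask example, you could call write_darknet_data_files("data", ["Mask Not Detected", "Mask Detected"]).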

Now use the LabelImg tool to draw a bounding box around every object in the dataset.

Make sure you select the YOLO format and save the annotation files in the same folder as the images (“Train/Images”).
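To make the YOLO format concrete: each line of an annotation file holds a class index followed by the box center, width, and height, all normalized by the image size. The sketch below shows the conversion from pixel corner coordinates. The function name to_yolo_line is hypothetical, and LabelImg does this for you when you save in YOLO mode.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to one YOLO annotation line."""
    # Center of the box, normalized to the 0..1 range.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    # Box size, normalized to the 0..1 range.
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box from (100, 50) to (300, 250) in a 400x500 image, class index 1.
print(to_yolo_line(1, 100, 50, 300, 250, 400, 500))
```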

Upload the dataset to Google Drive or your GitHub account as a zip file.

Congrats, one big step has been completed.

Step 1: Cloning the Darknet repository for YOLO architecture

Here, we clone the Darknet repository, which contains the YOLOv3 architecture used for detection.

!git clone https://github.com/AlexeyAB/darknet.git

Step 2: Configuring the Makefile

Here, we make some changes to the Makefile so that Darknet is built with GPU and OpenCV support.

2.1 Change the directory to the darknet folder

2.2 Make sure a GPU is available

2.3 Change GPU and OPENCV from 0 to 1 in the Makefile

‘1’ enables the option.
!sed (stream editor) edits the file in place.
!cat (concatenate) prints the file so you can check the changes.
!cat Makefile
!make
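The Makefile change in step 2.3 can also be made from Python. The sketch below is one way to do it, under the assumption that the Makefile contains lines of the form GPU=0 and OPENCV=0. The function name enable_makefile_flags is hypothetical; run it before !make.

```python
import re
from pathlib import Path


def enable_makefile_flags(makefile_path, names=("GPU", "OPENCV")):
    """Switch NAME=0 to NAME=1 for each listed flag and return the new text."""
    path = Path(makefile_path)
    text = path.read_text()
    for name in names:
        # Anchor at the start of the line so that GPU does not match CUDNN etc.
        pattern = rf"^{re.escape(name)}=0[ \t]*$"
        text = re.sub(pattern, f"{name}=1", text, flags=re.MULTILINE)
    path.write_text(text)
    return text
```

Anchoring each pattern at the start of a line is a deliberate choice: it leaves other options such as CUDNN=0 untouched.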

Step 3: Download the pre-trained weights

We download pre-trained weights so that we can initialize the model with them and then train it on our dataset.
!wget https://pjreddie.com/media/files/darknet53.conv.74
Download the weights that match your cfg file. For YOLOv3, I used yolov3.cfg with the darknet53.conv.74 weights. To use YOLOv4, refer to the AlexeyAB GitHub page (https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects)

Step 4: Upload the dataset you have stored on GitHub or Google Drive

Make sure you have the files listed below:

obj.data
obj.names
dataset ( images )

Unzip the data files we zipped before

!unzip data/custom.zip -d data/ # adjust the path

(Screenshots: contents of obj.data and obj.names)

In addition to the files above, you also need train.txt, which lists the path of every training image, one per line. A separate validation list is optional.
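A train.txt file like the one described above can be generated with a few lines of Python. This is a minimal sketch: the function name write_train_list and the set of image extensions are my assumptions, so adjust them to your dataset.

```python
from pathlib import Path

# Assumed image extensions; extend this set if your dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def write_train_list(image_dir, list_path):
    """Write one image path per line to list_path and return the image count."""
    image_dir = Path(image_dir)
    paths = sorted(
        p for p in image_dir.iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )
    # The YOLO annotation .txt files in the same folder are skipped by the filter.
    Path(list_path).write_text("".join(f"{p.as_posix()}\n" for p in paths))
    return len(paths)
```

The paths are written exactly as built from image_dir, so pass the folder the same way Darknet should see it.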

Step 5: Configuring the Yolo cfg file

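We are now going to make some changes to the yolov3.cfg file in the darknet/cfg folder:

Change random from 1 to 0
Set max_batches = number of classes * 2000 (4000 for 2 classes)
Set steps to 80% and 90% of max_batches (3200,3600 for 2 classes)
Set filters = (classes + 5) x 3 in the convolutional layer before each [yolo] layer (21 for 2 classes)
Set batch to 32 and subdivisions to 8
Set the network size to width=416 height=416, or any other multiple of 32
Change the line classes=80 to your number of objects (e.g., 2)
To configure the cfg file, run the commands below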

!sed -i 's/batch=1/batch=32/g' cfg/yolov3.cfg
!sed -i 's/subdivisions=1/subdivisions=8/g' cfg/yolov3.cfg
!sed -i 's/random=1/random=0/g' cfg/yolov3.cfg
!sed -i 's/max_batches = 500200/max_batches = 4000/g' cfg/yolov3.cfg
!sed -i 's/steps=400000,450000/steps=3200,3600/g' cfg/yolov3.cfg
!sed -i 's/classes=80/classes=2/g' cfg/yolov3.cfg
!sed -i 's/filters=255/filters=21/g' cfg/yolov3.cfg
!cat cfg/yolov3.cfg
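The numbers in the commands above come from simple formulas, so for a different number of classes you can recompute them. The sketch below applies the rules from this step. The function name yolov3_cfg_values is hypothetical, and the steps are taken as 80% and 90% of max_batches, which matches the 3200,3600 used above.

```python
def yolov3_cfg_values(num_classes):
    """Compute the cfg values that depend on the number of classes."""
    max_batches = num_classes * 2000
    return {
        "classes": num_classes,
        # Three anchor boxes per scale, each with 4 box values, 1 objectness
        # score, and one score per class.
        "filters": (num_classes + 5) * 3,
        "max_batches": max_batches,
        # 80% and 90% of max_batches, using integer arithmetic.
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
    }


print(yolov3_cfg_values(2))
```

With 80 classes the same formula gives filters=255, which is the default value that the last sed command replaces.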

Step 6: Train and Test model

On Linux, use the command below:

(Screenshot of the training command)
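The exact training command is not preserved in the text, so the sketch below assembles one from the files prepared in the earlier steps. Treat the default paths as assumptions: data/obj.data and cfg/yolov3.cfg are the paths used by the detection commands in Step 8, and darknet53.conv.74 is the weights file downloaded in Step 3. The function name darknet_train_command is hypothetical.

```python
import shlex


def darknet_train_command(
    data_file="data/obj.data",    # assumed path, as used in Step 8
    cfg_file="cfg/yolov3.cfg",    # assumed path, as edited in Step 5
    weights="darknet53.conv.74",  # pre-trained weights from Step 3
    show_window=False,
):
    """Build the argument list for a Darknet training run."""
    cmd = ["./darknet", "detector", "train", data_file, cfg_file, weights]
    if not show_window:
        # Colab has no display, so suppress the chart window.
        cmd.append("-dont_show")
    return cmd


# Print the command so it can be pasted into a Colab cell after a leading "!".
print(shlex.join(darknet_train_command()))
```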

Step 7: When should I stop the training?

During training, the output shows the average loss, the IoU, and the current iteration number.
Keep an eye on the average loss. Once it stops decreasing and starts to increase, you should stop the training.
Every 100 iterations, the latest weights are saved to the darknet/backup folder, and every 1000 iterations a separate weights file for that iteration is stored there as well.
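The stopping rule above can be written as a small check over the average-loss values you have noted down. This is only a sketch of one possible rule: the function name loss_is_rising and the window of 3 consecutive increases are my assumptions, not something Darknet provides.

```python
def loss_is_rising(avg_losses, window=3):
    """Return True if the average loss increased in each of the last `window` steps."""
    if len(avg_losses) < window + 1:
        # Not enough readings yet to judge a trend.
        return False
    recent = avg_losses[-(window + 1):]
    # Compare each reading with the one before it.
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

A larger window makes the check less sensitive to short-term noise in the loss.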
So now, let’s check the accuracy of our weights using the mAP (mean average precision) metric.

For example, suppose you have 3 different weight files (from the 7000th, 8000th, and 9000th iterations):

darknet.exe detector map data/obj.data yolo-obj.cfg backup\yolo-obj_7000.weights

Then repeat with 8000 and 9000 in place of 7000.
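Rather than editing the command by hand for each checkpoint, you can generate one mAP command per weights file. The sketch below does that for the Linux form of the command. The function name map_commands and its default paths are assumptions modeled on the example above.

```python
def map_commands(
    iterations,
    data_file="data/obj.data",
    cfg_file="yolo-obj.cfg",
    weights_pattern="backup/yolo-obj_{}.weights",  # assumed naming scheme
    binary="./darknet",
):
    """Return one 'detector map' command string per checkpoint iteration."""
    return [
        f"{binary} detector map {data_file} {cfg_file} {weights_pattern.format(i)}"
        for i in iterations
    ]


for command in map_commands([7000, 8000, 9000]):
    print(command)
```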

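Choose the weights file with the highest mAP (mean average precision) or IoU (intersection over union).

Alternatively, pass the -map flag when training so that mAP is calculated during training (this example uses the YOLOv4 pre-trained weights; substitute your own cfg and weights files):

darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137 -map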

On Windows, use darknet.exe instead of !./darknet.

Step 8: Testing with input Images / Videos

Image Detection:

For Linux,

!./darknet detector test data/obj.data cfg/yolov3.cfg /content/weights/yolov3_1300.weights /content/darknet/data/image_test01.jpg -dont_show

Video Detection:

!./darknet detector demo data/obj.data cfg/yolov3.cfg /content/weights/yolov3_1300.weights -dont_show videoname -i 0 -out_filename me_06.avi -thresh 0.7

That’s it. Congratulations, you made it.

Sample Output

(Screenshot of the detection output)

Video Source Detection:

https://youtu.be/NcPq3xSN9vM

Conclusion

To build a custom object detector with YOLO, we created a dataset of people with and without masks and labeled it carefully using the LabelImg tool. We chose YOLOv3 as the architecture for fast detection. Finally, we trained and tested the model successfully in Google Colab.

GitHub Reference: https://github.com/Ganesh9100/Mask-Detection-YOLO_V3-

With more training data and different classes, the model can be used for many real-time applications.

FAQs

1. What is LabelImg?

ANS: – It is a graphical image annotation tool written in Python.
Installation: pip install labelImg

2. What is the YOLO cfg file?

ANS: – It is a configuration file that holds parameters such as batch, subdivisions, and decay.

3. What is Darknet?

ANS: – It is an open-source neural network framework written in C and CUDA that supports both CPU and GPU computation.

WRITTEN BY Ganesh Raj

Ganesh Raj V works as a Sr. Research Associate at CloudThat. He is a highly analytical, creative, and passionate individual experienced in Data Science, Machine Learning algorithms, and Cloud Computing. In a quest to learn and work with recent technologies, he strives to stay up to date on advanced technologies while solving problems efficiently and analytically.