
Object Detection (ML)

Extension Description
Create ML models for custom object detection in images.

Introduction

The Object Detection extension of the PictoBlox Machine Learning Environment is used to detect particular targets present in a given picture.

Tutorial on using Object Detection (ML) in Block Coding

Tutorial on using Object Detection (ML) in Python Coding

Image Classification vs Object Detection

Image classification and object detection are often confused. Let’s look at the difference between them.

Image Classification deals with categorizing images based on their characteristics. These characteristics are observed and extracted from the images by the training model. Once extracted, these characteristics can be used as a set of rules to classify previously unseen data. Let’s understand this with an example.

Observe these images:

Image 1:

Image 2:

Thanks to the human brain, it’s no big feat for us to make out that Image 1 is that of a cat and Image 2 is that of a dog. However, computers do not possess the intelligence we do, so we need to train them to recognize such images. The same is true for any scenario where we want the computer to categorize data.

Now observe this image:

Again, our human brain doesn’t break a sweat in making out that there are two animals in the picture: one cat and one dog. However, a computer trained to recognize cats and dogs separately will not be able to make complete sense of the image. This is where object detection comes into play.

The defining characteristic of Object Detection is that it takes the location of objects into consideration. Object Detection algorithms can detect not only the class of the target objects but also their location.

Hence, Object Detection is extremely useful when there are multiple objects present in an image.
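To make this concrete, below is a minimal sketch in plain Python of what a detection result typically looks like; the class names, scores, and coordinates are hypothetical example values.

# Each detection pairs a class label with a location, unlike image
# classification, which returns a single label for the whole image.
detections = [
    {"class": "cat", "score": 0.92, "box": (34, 50, 210, 260)},  # (x_min, y_min, x_max, y_max) in pixels
    {"class": "dog", "score": 0.88, "box": (230, 40, 470, 300)},
]

for d in detections:
    print(f"{d['class']} ({d['score']:.0%}) at {d['box']}")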

Opening the Object Detection Workflow

Alert: The Machine Learning Environment for model creation is available only in the desktop version of PictoBlox for Windows, macOS, or Linux. It is not available in the Web, Android, and iOS versions.

Follow the steps below:

  1. Open PictoBlox and create a new file.
  2. Select the appropriate Coding Environment.
  3. Select the “Open ML Environment” option under the “Files” tab to access the ML Environment.
  4. You’ll be greeted with the following screen.
    Click on “Create New Project”.
  5. A window will open. Type in a project name of your choice and select the “Object Detection” extension. Click the “Create Project” button to open the Object Detection window.
  6. You shall see the Object Detection workflow with two classes already made for you. Your environment is all set. Now it’s time to upload the data.

Adding Data to Project

There are 3 ways to add the images to the project:

  1. Webcam: You can take photos from the camera directly using this option.
  2. File Upload: You can upload the images from the local file system.
  3. Downloading from PictoBlox Database: This gives you the option of downloading pre-annotated images, as well as annotating images captured manually by the user. The images imported from the Database are already labeled for training.
Note: The best results are achieved by using a combination of downloaded images and images from the webcam. However, images from the webcam have to be labeled manually.

Bounding Box – Labeling Images

A bounding box is an imaginary rectangle that serves as a point of reference for object detection and creates a collision box for that object.

We draw these rectangles over images, outlining the object of interest within each image by defining its X and Y coordinates. This makes it easier for machine learning algorithms to find what they’re looking for, determine collision paths, and conserve valuable computing resources.
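In practice, a box can be stored either in pixels or as normalized values between 0 and 1. The exported model in the Python example at the end of this page returns boxes in normalized [y_min, x_min, y_max, x_max] form; below is a minimal sketch of converting such a box to pixel coordinates.

# Convert a normalized [y_min, x_min, y_max, x_max] bounding box
# (the format used by the exported model further below) to pixels.
def to_pixels(box, width, height):
    y_min, x_min, y_max, x_max = box
    return (int(x_min * width), int(y_min * height),
            int(x_max * width), int(y_max * height))

print(to_pixels([0.1, 0.2, 0.5, 0.6], width=640, height=480))
# -> (128, 48, 384, 240)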

Object detection has two components: object classification and object localization. In other words, to detect an object in an image, the computer needs to know what it is and where it is.

Take self-driving cars as an example. An annotator will draw bounding boxes around other vehicles and label them. This helps train an algorithm to understand what vehicles look like. Annotating objects such as vehicles, traffic signals, and pedestrians makes it possible for autonomous vehicles to maneuver busy streets safely. Self-driving car perception models rely heavily on bounding boxes to make this possible.

Bounding boxes are used in many such applications to train algorithms to identify patterns.

To create a bounding box in an image, click on the “Create Box” button and draw the rectangle. After the box is drawn, go to the “Label List” column, click on the edit button, and type in a name for the object under the bounding box. This name will become a class. Once you’ve entered the name, click on the tick mark to label the object.

Once you’ve labeled an object, its count is updated in the “Class Info” column. You can simply click on the class to classify another object under that label.

Options in Bounding Box:

  1. Auto Save: This option auto-saves the bounding boxes with their labels as they are created. When it is enabled, you do not need to save each image manually.
  2. Manual Save: This option disables auto-saving of the bounding boxes. When it is enabled, you have to save the image before moving on to the next image for labeling.
  3. Create Box: This option activates the cursor on the image so you can draw a bounding box. Once the box is created, you can label it in the Label List.
  4. Save Box: This option saves all the bounding boxes created under the Label List.
  5. File List: It shows the list of images available for labeling in the project.
  6. Label List: It shows the list of Labels created for the selected image.
  7. Image Info: It shows a summary of the images – how many are labeled and unlabeled.
  8. Class Info: It shows the summary of the classes with the total number of bounding boxes created for each class.
Note: If the images are imported from the PictoBlox Database, the images will be labeled for the users.

Analyzing Images

It is important that we analyze how the images are labeled. The Images tab allows you to analyze the images.

You can edit an image’s labels by clicking directly on the image.

Training the Model

In Object Detection, the model must locate and identify all the targets in the given image. This makes Object Detection a complex task to execute. Hence, the hyperparameters work differently in the Object Detection Extension.

Follow the process:

  1. Go to the “Train” tab. You should see the following screen:
  2. Click on the “Train New Model” button. Select the classes that need to be trained, and click on “Generate Dataset”. Once the dataset is generated, click “Next”.
  3. You shall see the training configurations. Observe the hyperparameters.
    1. Model name – The name of the model.
    2. Batch size – The number of training samples utilized in one iteration. The larger the batch size, the larger the RAM required.
    3. Number of iterations – The number of times your model will iterate through a batch of images.
    4. Number of layers – The number of layers in your model. Use more layers for large models.

      Note:  Hover your mouse over the question mark next to the hyperparameters to see their description.
  4. Specify your hyperparameters. If the numbers go out of range, PictoBlox will show a message. Click “Create”.
  5. Click “Start Training”. If the desired performance is reached, click “Stop”.

    Note: Training an Object Detection model is a time-consuming task; it might take a couple of hours to complete.
  6. After the training is completed, you’ll see four loss graphs under the “Graphs” panel. Click on the buttons to view each graph.
    1. Total Loss
    2. Regularization Loss
    3. Localization Loss
    4. Classification Loss
Note: You can train multiple models for the same dataset to see the performance and accuracy of the training.

Evaluating the Model

Now, let’s move to the “Evaluate” tab. You can view True Positives, False Negatives, and False Positives for each class here along with metrics like Precision and Recall.

  1. A true positive is an outcome where the model correctly predicts the positive class.
  2. A true negative is an outcome where the model correctly predicts the negative class.
  3. A false positive is an outcome where the model incorrectly predicts the positive class.
  4. A false negative is an outcome where the model incorrectly predicts the negative class.

Precision and recall are two numbers that together are used to evaluate the performance of object detection.

  1. Precision: Precision is calculated by dividing the true positives by everything that was predicted as positive (true positives + false positives).
  2. Recall (or True Positive Rate): Recall is calculated by dividing the true positives by everything that should have been predicted as positive (true positives + false negatives).

A perfect model has precision and recall both equal to 1.
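As a quick illustration, here is a minimal Python sketch of how the two metrics follow from the counts above; the numbers are hypothetical.

# Precision = TP / (TP + FP): of everything predicted positive, how much was right.
# Recall    = TP / (TP + FN): of everything actually positive, how much was found.
tp, fp, fn = 80, 10, 20  # hypothetical counts from an evaluation run

precision = tp / (tp + fp)  # 80 / 90  = ~0.89
recall = tp / (tp + fn)     # 80 / 100 = 0.80

print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")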

You can visualize the Precision and Recall in the Evaluation graphs of the class or the whole model:

You can select the individual class and look at the performance of the class:

Confidence Threshold & IoU

  1. The confidence score is the probability that an anchor box contains an object. It is usually predicted by a classifier.
  2. Intersection over Union (IoU) is defined as the area of the intersection divided by the area of the union of a predicted bounding box and a ground-truth box.
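Below is a minimal sketch of computing IoU for two axis-aligned boxes given as (x_min, y_min, x_max, y_max); the example boxes are illustrative.

# Intersection over Union (IoU) of two axis-aligned boxes,
# each given as (x_min, y_min, x_max, y_max).
def iou(box_a, box_b):
    # Intersection rectangle (empty if the boxes do not overlap)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    # Union = sum of the two areas minus the intersection
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)

print(iou((0, 0, 100, 100), (50, 50, 150, 150)))  # -> ~0.143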

Both the confidence score and IoU are used as the criteria that determine whether a detection is a true positive or a false positive.

  1. A detection is considered a true positive (TP) only if it satisfies three conditions (these rules are sketched in code after this list):
    1. confidence score > threshold;
    2. the predicted class matches the class of a ground truth;
    3. the predicted bounding box has an IoU greater than a threshold (e.g., 0.5) with the ground truth.
  2. Violating either of the latter two conditions makes a false positive (FP). If multiple predictions correspond to the same ground truth, only the one with the highest confidence score counts as a true positive, while the remaining ones are considered false positives.
  3. When the confidence score of a detection that is supposed to detect a ground truth is lower than the threshold, the detection counts as a false negative (FN).
  4. When the confidence score of a detection that is not supposed to detect anything is lower than the threshold, the detection counts as a true negative (TN).
Note: You can adjust the Confidence Threshold and IoU to make the model give more precise results.
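Putting these rules together, here is a minimal, illustrative sketch of classifying a single detection against a ground-truth box; the thresholds are example values, and this is not PictoBlox’s internal implementation.

CONF_THRESHOLD = 0.5  # example confidence threshold
IOU_THRESHOLD = 0.5   # example IoU threshold

def classify_detection(score, pred_class, gt_class, iou_value):
    # Apply the three conditions listed above to one detection.
    if score <= CONF_THRESHOLD:
        return "ignored (below confidence threshold)"
    if pred_class == gt_class and iou_value > IOU_THRESHOLD:
        return "true positive"
    return "false positive"

print(classify_detection(0.9, "cat", "cat", 0.7))  # -> true positive
print(classify_detection(0.9, "dog", "cat", 0.7))  # -> false positive (wrong class)
print(classify_detection(0.9, "cat", "cat", 0.3))  # -> false positive (low IoU)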

Testing the Model

The model can be tested with the following methods:

  1. Uploading Image:
  2. Webcam:

The model will return the detected objects along with the probability (confidence score) of each belonging to its class.

Note: The Confidence threshold can be changed to define the minimum prediction score to show the result.

The Model can be exported to 4 different forms:

  1. PictoBlox: The model can be exported to the blocks and functions, which we will see in the next section.
  2. TF Lite: It exports the model into the TF Lite package. TensorFlow Lite provides a set of tools that enables on-device machine learning by allowing developers to run their trained models on mobile, embedded, and IoT devices as well as computers. It supports platforms such as embedded Linux, Android, iOS, and microcontrollers (MCUs).
  3. TensorFlow.js: It exports the model into the TensorFlow.js package to run in the browser and in Node.js architecture.
  4. Frozen Graph: It exports the model as the TF Saved Model to run in a Python environment. A SavedModel contains a complete TensorFlow program, including trained parameters and computation. It does not require the original model building code to run.
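As an illustration of the TF Lite route, below is a minimal sketch of running inference on a TF Lite model in Python. The file name model.tflite is an assumption, and the exact input/output tensor layout depends on the exported model.

import numpy as np
import tensorflow as tf

# Load the exported TF Lite model ("model.tflite" is an assumed file name).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input matching the model's expected shape (e.g., 1x320x320x3).
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()

# Inspect each output tensor (typically boxes, classes, scores, count).
for out in output_details:
    print(out["name"], interpreter.get_tensor(out["index"]).shape)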

Export in Block Coding

Click on the “PictoBlox” button, and PictoBlox will load your model into the Block Coding Environment, provided you opened the ML Environment from Block Coding.

Export in Python Coding

Click on the “PictoBlox” button, and PictoBlox will load your model into the Python Coding Environment, provided you opened the ML Environment from Python Coding.

The following code appears in the Python Editor of the selected sprite.

####################imports####################
# Do not change

import cv2
import numpy as np
import tensorflow.compat.v2 as tf

# Do not change
####################imports####################

#Following are the model and video capture configurations
# Do not change

detect_fn = tf.saved_model.load("saved_model")

cap = cv2.VideoCapture(0)  # Using device's camera to capture video
font = cv2.FONT_HERSHEY_SIMPLEX
fontScale = 1
color_box = (50, 50, 255)
color_text = (255, 255, 255)
thickness = 2

class_list = [
    'Lion',
    'Zebra',
    'Elephant',
    'Tiger',
    'Panda',
]  # List of all the classes

# This is the while loop block; computations happen here
while True:

  ret, image_np = cap.read()  # Read Frame
  height, width, channels = image_np.shape  # Get height, width
  image_resized = cv2.resize(image_np,
                             (320, 320))  # Resize image to model input size
  input_tensor = tf.convert_to_tensor(image_resized)  # Convert image to tensor
  input_tensor = input_tensor[tf.newaxis,
                              ...]  # Expanding the tensor dimensions

  detections = detect_fn(input_tensor)  #Pass image to model

  num_detections = int(detections.pop('num_detections'))  #Postprocessing
  detections = {
      key: value[0, :num_detections].numpy()
      for key, value in detections.items()
  }
  detections['num_detections'] = num_detections
  detections['detection_classes'] = detections['detection_classes'].astype(
      np.int64)

  # Draw rectangle around each detected object
  for j in range(len(detections['detection_boxes'])):
    # Set minimum threshold to 0.3
    if (detections['detection_scores'][j] > 0.3):
      # Starting and end point of detected object
      starting_point = (int(detections['detection_boxes'][j][1] * width),
                        int(detections['detection_boxes'][j][0] * height))
      end_point = (int(detections['detection_boxes'][j][3] * width),
                   int(detections['detection_boxes'][j][2] * height))
      # Class name of detected object
      className = class_list[detections['detection_classes'][j] - 1]
      # Starting point of text
      starting_point_text = (int(
          detections['detection_boxes'][j][1] *
          width), int(detections['detection_boxes'][j][0] * height) - 5)
      # Draw rectangle and put text
      image_np = cv2.rectangle(image_np, starting_point, end_point, color_box,
                               thickness)
      image_np = cv2.putText(image_np, className, starting_point_text, font,
                             fontScale, color_text, thickness, cv2.LINE_AA)
  # Show image in new window
  cv2.imshow("Detection Window", image_np)

  if cv2.waitKey(25) & 0xFF == ord(
      'q'):  # Press 'q' to close the detection window
    break

cap.release()  # Stops taking video input
cv2.destroyAllWindows()  # Closes input window
Note: You can edit the code to add custom functionality according to your requirements.


Block Coding Examples

There are no block coding examples for the extension to show.

Python Coding Examples

There are no Python coding examples for the extension to show.