Skip to content

Latest commit

 

History

History

Detection

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 

ImageAI : Object Detection

---------------------------------------------------

Introducing Jarvis and TheiaEngine.

We the creators of ImageAI are glad to announce 2 new AI projects to provide state-of-the-art Generative AI, LLM and Image Understanding on your personal computer and servers.

Install Jarvis on PC/Mac to setup limitless access to LLM powered AI Chats for your every day work, research and generative AI needs with 100% privacy and full offline capability.

Visit https://jarvis.genxr.co to get started.

TheiaEngine, the next-generation computer Vision AI API capable of all Generative and Understanding computer vision tasks in a single API call and available via REST API to all programming languages. Features include

  • Detect 300+ objects ( 220 more objects than ImageAI)
  • Provide answers to any content or context questions asked on an image
    • very useful to get information on any object, action or information without needing to train a new custom model for every tasks
  • Generate scene description and summary
  • Convert 2D image to 3D pointcloud and triangular mesh
  • Semantic Scene mapping of objects, walls, floors, etc
  • Stateless Face recognition and emotion detection
  • Image generation and augmentation from prompt
  • etc.

Visit https://www.genxr.co/theia-engine to try the demo and join in the beta testing today.

---------------------------------------------------

TABLE OF CONTENTS

ImageAI provides very convenient and powerful methods to perform object detection on images and extract each object from the image. The object detection class supports RetinaNet, YOLOv3 and TinyYOLOv3. To start performing object detection, you must download the RetinaNet, YOLOv3 or TinyYOLOv3 object detection model via the links below:

  • RetinaNet (Size = 130 mb, high performance and accuracy, with longer detection time)
  • YOLOv3 (Size = 237 mb, moderate performance and accuracy, with a moderate detection time)
  • TinyYOLOv3 (Size = 34 mb, optimized for speed and moderate performance, with fast detection time)

Once you download the object detection model file, you should copy the model file to the your project folder where your .py files will be. Then create a python file and give it a name; an example is FirstObjectDetection.py. Then write the code below into the python file:

FirstObjectDetection.py

from imageai.Detection import ObjectDetection
import os

execution_path = os.getcwd()

detector = ObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath( os.path.join(execution_path , "yolov3.pt"))
detector.loadModel()
detections = detector.detectObjectsFromImage(input_image=os.path.join(execution_path , "image2.jpg"), output_image_path=os.path.join(execution_path , "image2new.jpg"), minimum_percentage_probability=30)

for eachObject in detections:
    print(eachObject["name"] , " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("--------------------------------")

Sample Result: Input Image Input Image Output Image Output Image

laptop  :  87.32235431671143  :  (306, 238, 390, 284)
--------------------------------
laptop  :  96.86298966407776  :  (121, 209, 258, 293)
--------------------------------
laptop  :  98.6301600933075  :  (279, 321, 401, 425)
--------------------------------
laptop  :  99.78572130203247  :  (451, 204, 579, 285)
--------------------------------
bed  :  94.02391314506531  :  (23, 205, 708, 553)
--------------------------------
apple  :  48.03136885166168  :  (527, 343, 557, 364)
--------------------------------
cup  :  34.09906327724457  :  (462, 347, 496, 379)
--------------------------------
cup  :  44.65090036392212  :  (582, 342, 618, 386)
--------------------------------
person  :  57.70219564437866  :  (27, 311, 341, 437)
--------------------------------
person  :  85.26121377944946  :  (304, 173, 387, 253)
--------------------------------
person  :  96.33603692054749  :  (415, 130, 538, 266)
--------------------------------
person  :  96.95255160331726  :  (174, 108, 278, 269)
--------------------------------

Let us make a breakdown of the object detection code that we used above.

from imageai.Detection import ObjectDetection
import os

execution_path = os.getcwd()

In the 3 lines above , we import the ImageAI object detection class in the first line, import the os in the second line and obtained the path to folder where our python file runs.

detector = ObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath( os.path.join(execution_path , "yolov3.pt"))
detector.loadModel()

In the 4 lines above, we created a new instance of the ObjectDetection class in the first line, set the model type to YOLOv3 in the second line, set the model path to the YOLOv3 model file we downloaded and copied to the python file folder in the third line and load the model in the fourth line.

detections = detector.detectObjectsFromImage(input_image=os.path.join(execution_path , "image2.jpg"), output_image_path=os.path.join(execution_path , "image2new.jpg"))

for eachObject in detections:
    print(eachObject["name"] , " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("--------------------------------")

In the 2 lines above, we ran the detectObjectsFromImage() function and parse in the path to our image, and the path to the new image which the function will save. Then the function returns an array of dictionaries with each dictionary corresponding to the number of objects detected in the image. Each dictionary has the properties name (name of the object), percentage_probability (percentage probability of the detection) and box_points (the x1,y1,x2 and y2 coordinates of the bounding box of the object).

Should you want to use the RetinaNet which is appropriate for high-performance and high-accuracy demanding detection tasks, you will download the RetinaNet model file from the links above, copy it to your python file's folder, set the model type and model path in your python code as seen below:

detector = ObjectDetection()
detector.setModelTypeAsRetinaNet()
detector.setModelPath( os.path.join(execution_path , "retinanet_resnet50_fpn_coco-eeacb38b.pth"))
detector.loadModel()

However, if you desire TinyYOLOv3 which is optimized for speed and embedded devices, you will download the TinyYOLOv3 model file from the links above, copy it to your python file's folder, set the model type and model path in your python code as seen below:

detector = ObjectDetection()
detector.setModelTypeAsTinyYOLOv3()
detector.setModelPath( os.path.join(execution_path , "tiny-yolov3.pt"))
detector.loadModel()

Object Detection, Extraction and Fine-tune

In the examples we used above, we ran the object detection on an image and it returned the detected objects in an array as well as save a new image with rectangular markers drawn on each object. In our next examples, we will be able to extract each object from the input image and save it independently.

In the example code below which is very identical to the previous object detction code, we will save each object detected as a seperate image.

from imageai.Detection import ObjectDetection
import os

execution_path = os.getcwd()

detector = ObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath( os.path.join(execution_path , "yolov3.pt"))
detector.loadModel()

detections, objects_path = detector.detectObjectsFromImage(input_image=os.path.join(execution_path , "image3.jpg"), output_image_path=os.path.join(execution_path , "image3new.jpg"), minimum_percentage_probability=30,  extract_detected_objects=True)

for eachObject, eachObjectPath in zip(detections, objects_path):
    print(eachObject["name"] , " : " , eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("Object's image saved in " + eachObjectPath)
    print("--------------------------------")

Input Image Output Images

dog motorcycle car bicycle person person person person person

Let us review the part of the code that perform the object detection and extract the images:

detections, objects_path = detector.detectObjectsFromImage(input_image=os.path.join(execution_path , "image3.jpg"), output_image_path=os.path.join(execution_path , "image3new.jpg"), minimum_percentage_probability=30,  extract_detected_objects=True)

for eachObject, eachObjectPath in zip(detections, objects_path):
    print(eachObject["name"] , " : " , eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("Object's image saved in " + eachObjectPath)
    print("--------------------------------")

In the above above lines, we called the detectObjectsFromImage() , parse in the input image path, output image path, and an extra parameter extract_detected_objects=True. This parameter states that the function should extract each object detected from the image and save it has a seperate image. The parameter is false by default. Once set to true, the function will create a directory which is the output image path + "-objects" . Then it saves all the extracted images into this new directory with each image's name being the detected object name + "-" + a number which corresponds to the order at which the objects were detected.

This new parameter we set to extract and save detected objects as an image will make the function to return 2 values. The first is the array of dictionaries with each dictionary corresponding to a detected object. The second is an array of the paths to the saved images of each object detected and extracted, and they are arranged in order at which the objects are in the first array.

And one important feature you need to know! You will recall that the percentage probability for each detected object is sent back by the detectObjectsFromImage() function. The function has a parameter minimum_percentage_probability, whose default value is 50 (value ranges between 0 - 100) , but it set to 30 in this example. That means the function will only return a detected object if it's percentage probability is 30 or above. The value was kept at this number to ensure the integrity of the detection results. You fine-tune the object detection by setting minimum_percentage_probability equal to a smaller value to detect more number of objects or higher value to detect less number of objects.

Custom Object Detection

The object detection model (RetinaNet) supported by ImageAI can detect 80 different types of objects. They include:

person,  bicycle,  car, motorcycle, airplane, bus, train,  truck,  boat,  traffic light,  fire hydrant, stop_sign,
parking meter,   bench,   bird,   cat,   dog,   horse,   sheep,   cow,   elephant,   bear,   zebra,
giraffe,   backpack,   umbrella,   handbag,   tie,   suitcase,   frisbee,   skis,   snowboard,
sports ball,   kite,   baseball bat,   baseball glove,   skateboard,   surfboard,   tennis racket,
bottle,   wine glass,   cup,   fork,   knife,   spoon,   bowl,   banana,   apple,   sandwich,   orange,
broccoli,   carrot,   hot dog,   pizza,   donot,   cake,   chair,   couch,   potted plant,   bed,
dining table,   toilet,   tv,   laptop,   mouse,   remote,   keyboard,   cell phone,   microwave,   oven,
toaster,   sink,   refrigerator,   book,   clock,   vase,   scissors,   teddy bear,   hair dryer,   toothbrush.

Interestingly, ImageAI allow you to perform detection for one or more of the items above. That means you can customize the type of object(s) you want to be detected in the image. Let's take a look at the code below:

from imageai.Detection import ObjectDetection
import os

execution_path = os.getcwd()

detector = ObjectDetection()
detector.setModelTypeAsYOLOv3()
detector.setModelPath( os.path.join(execution_path , "yolov3.pt"))
detector.loadModel()

custom_objects = detector.CustomObjects(car=True, motorcycle=True)
detections = detector.detectCustomObjectsFromImage(custom_objects=custom_objects, input_image=os.path.join(execution_path , "image3.jpg"), output_image_path=os.path.join(execution_path , "image3custom.jpg"), minimum_percentage_probability=30)

for eachObject in detections:
    print(eachObject["name"] , " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("--------------------------------")

Result

Let us take a look at the part of the code that made this possible.

custom_objects = detector.CustomObjects(car=True, motorcycle=True)
detections = detector.detectCustomObjectsFromImage(custom_objects=custom_objects, input_image=os.path.join(execution_path , "image3.jpg"), output_image_path=os.path.join(execution_path , "image3custom.jpg"), minimum_percentage_probability=30)

In the above code, after loading the model (can be done before loading the model as well), we defined a new variable custom_objects = detector.CustomObjects(), in which we set its car and motorcycle properties equal to True. This is to tell the model to detect only the object we set to True. Then we call the detector.detectCustomObjectsFromImage() which is the function that allows us to perform detection of custom objects. Then we will set the custom_objects value to the custom objects variable we defined.

Hiding/Showing Object Name and Probability

ImageAI provides options to hide the name of objects detected and/or the percentage probability from being shown on the saved/returned detected image. Using the detectObjectsFromImage() and detectCustomObjectsFromImage() functions, the parameters display_object_name and display_percentage_probability can be set to True of False individually. Take a look at the code below:

detections = detector.detectObjectsFromImage(input_image=os.path.join(execution_path , "image3.jpg"), output_image_path=os.path.join(execution_path , "image3new_nodetails.jpg"), minimum_percentage_probability=30, display_percentage_probability=False, display_object_name=False)

In the above code, we specified that both the object name and percentage probability should not be shown. As you can see in the result below, both the names of the objects and their individual percentage probability is not shown in the detected image.

Result

Image Input & Output Types

ImageAI supports 3 types of inputs which are file path to image file(default), numpy array of image and image file stream as well as 2 types of output which are image file(default) and numpy **array **. This means you can now perform object detection in production applications such as on a web server and system that returns file in any of the above stated formats.

To perform object detection with numpy array or file stream input, you just need to state the input type in the .detectObjectsFromImage() function or the .detectCustomObjectsFromImage() function. See example below.

detections = detector.detectObjectsFromImage(input_type="array", input_image=image_array , output_image_path=os.path.join(execution_path , "image.jpg")) # For numpy array input type
detections = detector.detectObjectsFromImage(input_type="stream", input_image=image_stream , output_image_path=os.path.join(execution_path , "test2new.jpg")) # For file stream input type

To perform object detection with numpy array output you just need to state the output type in the .detectObjectsFromImage() function or the .detectCustomObjectsFromImage() function. See example below.

detected_image_array, detections = detector.detectObjectsFromImage(output_type="array", input_image="image.jpg" ) # For numpy array output type

Documentation

We have provided full documentation for all ImageAI classes and functions. Find links below: