Types of Image Annotation | Object Detection, Classification, Segmentation

5 min read · Dec 22, 2022

In Data Labeling in a Nutshell, we mentioned that AI can find objects and make predictions only when it has been “trained” and “taught” with metadata. In this article, let’s discover 3 ubiquitous kinds of image labeling used to train AI: classification, object detection, and segmentation. They help a model recognize objects and “learn” the ground truth it needs to make predictions.

What is image annotation?

Image annotation is the process of categorizing or labeling an image, using text, annotation tools, or both, to mark the data features you want your model to recognize automatically.

When you annotate an image, you are creating metadata and adding it to a dataset.

Annotation can be simple or complex:

Simple image annotation: may involve tagging an image with a phrase that describes the objects in it. For instance, you might annotate an image of a car with the label “a red car”. This is called image classification or tagging.
Complex image annotation: can be used to count or track multiple objects or areas in an image. For instance, you might label the differences among the fish in a shoal and train your model to recognize the red one. The complexity of your annotation will vary with the complexity of your project.

Each kind of image annotation fits a different task. So, let’s discover 3 types of image annotation and find out how they differ.

3 types of image annotation

Classification annotation

Image classification is the annotation method that simply identifies the presence of similar objects in images across an entire dataset.

This process is sometimes simply called annotation or tagging, and it is the most basic type of data annotation. For example, annotators can tag an image of a town with labels such as “house”, “tree”, and “people”. But this method is quite limited: it only answers the question “Does the image contain a tree or not?” without identifying where the tree is in the picture.
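To make this concrete, here is a minimal sketch of what image-level tags can look like as data. The file names, the labels, and the `contains` helper are all made up for illustration; real annotation tools use their own formats.

```python
# Image-level tags: each image maps to a set of labels, with no location info.
# File names and labels are hypothetical examples.
annotations = {
    "town_001.jpg": {"house", "tree", "people"},
    "town_002.jpg": {"house", "car"},
}


def contains(image_name: str, label: str) -> bool:
    """Answer the only question classification supports: is the label present?"""
    return label in annotations.get(image_name, set())


print(contains("town_001.jpg", "tree"))  # True
print(contains("town_002.jpg", "tree"))  # False
```

Notice that nothing in this structure says where the tree is, which is exactly the limitation described above.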

Object Detection annotation

Object detection combines classification and localization to determine what objects are in an image or video and where they are.

In the picture below, the left image marks each member of the class (here, interstitial lung disease) with a bounding box.
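As a rough sketch, a bounding-box annotation can be stored as a class label plus the pixel coordinates of a rectangle. The `BoundingBox` class and the coordinates below are assumptions for illustration, not the format of any particular tool.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """One detection label: a class name plus a rectangle in pixel coordinates."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    @property
    def area(self) -> int:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


# Hypothetical annotations for a single image.
boxes = [
    BoundingBox("interstitial lung disease", 40, 60, 120, 160),
    BoundingBox("interstitial lung disease", 200, 80, 260, 150),
]

# Unlike image-level tags, boxes tell us how many objects there are and where.
counts = Counter(box.label for box in boxes)
print(counts["interstitial lung disease"])  # 2
print(boxes[0].area)  # 8000
```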

You can use a variety of techniques to perform object detection. Popular deep learning–based approaches using convolutional neural networks (CNNs), such as R-CNN and YOLO v2, automatically learn to detect objects within images.

You can choose from 2 key approaches to get started with object detection using deep learning:

Create and train a custom object detector: To train a custom object detector from scratch, you must design a network architecture that learns the features of the objects of interest. To train the CNN, you also need to assemble a sizable quantity of labeled data. A custom object detector can produce excellent results. Nevertheless, you must set up the CNN’s layers and weights yourself, which takes a lot of time and training data.
Use a pre-trained object detector: In this method, you speed up training by starting from a pre-trained network and then fine-tuning it for your application. This can give you faster results because the object detector has already been trained on thousands or even millions of images.
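The idea behind the second approach can be illustrated with a toy sketch. This is not an object detector: a fixed 2x2 matrix stands in for the pre-trained network, and a small logistic-regression "head" stands in for the part you fine-tune. All names, weights, and data are made up for illustration.

```python
import math

# Stand-in for a pre-trained backbone: its parameters are fixed and never updated.
# In a real detector this would be a CNN trained on a large image dataset.
PRETRAINED_WEIGHTS = [[0.5, -0.2], [0.1, 0.9]]


def backbone(pixels):
    """Frozen feature extractor: a fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * p for w, p in zip(row, pixels)))
            for row in PRETRAINED_WEIGHTS]


def predict(head, features):
    """Trainable head: logistic regression on top of the frozen features."""
    z = sum(w * f for w, f in zip(head["w"], features)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(samples, epochs=500, lr=0.5):
    """Update only the head; the backbone weights stay untouched."""
    head = {"w": [0.0, 0.0], "b": 0.0}
    for _ in range(epochs):
        for pixels, target in samples:
            features = backbone(pixels)
            error = predict(head, features) - target  # log-loss gradient w.r.t. z
            head["w"] = [w - lr * error * f for w, f in zip(head["w"], features)]
            head["b"] -= lr * error
    return head


# Tiny made-up dataset: 2-value "images" labeled 1 (object) or 0 (no object).
data = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], 0), ([0.1, 0.9], 0)]
head = fine_tune(data)
predictions = [round(predict(head, backbone(p))) for p, _ in data]
print(predictions)
```

Because only the three head parameters are trained, a handful of labeled samples is enough here. Training the backbone from scratch as well would need far more data, which is the trade-off described above.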

Segmentation

This method is the most advanced of the 3 because it requires assigning a class to each pixel of an object. It can be used in diverse fields, especially medical fields that work with large numbers of images, for instance ophthalmology and histopathology.

There are 3 types of Segmentation:

Semantic Segmentation annotation

This kind of annotation is used when you want to track the presence, location, and sometimes the size and shape of objects. Semantic segmentation is particularly useful for categorizing things that are difficult to count or track because they lack a definite size or shape.
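A semantic mask can be sketched as a grid with one class id per pixel. The class ids, names, and the tiny 4x6 "image" below are hypothetical, chosen only to show the idea.

```python
from collections import Counter

# Hypothetical class ids for a 4x6 image; every pixel gets exactly one class.
CLASS_NAMES = {0: "background", 1: "fish", 2: "water plant"}

semantic_mask = [
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 2],
    [0, 0, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
]

# Pixel area per class: the mask gives presence, location, and size,
# but it carries no per-object identity within a class.
pixel_counts = Counter(class_id for row in semantic_mask for class_id in row)
for class_id, count in sorted(pixel_counts.items()):
    print(CLASS_NAMES[class_id], count)
```

If two fish overlapped in this mask, they would merge into one region of class 1, so a semantic mask alone cannot count them.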

Instance Segmentation annotation

We can use this type to track the presence, location, count, size, and shape of objects in an image. This labeling type identifies the pixels that belong to each individual object in the image.
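As a minimal sketch, an instance mask can store one id per object instead of one id per class. The grid, the ids, and the labels below are made up for illustration.

```python
from collections import Counter

# Hypothetical instance ids for a 4x6 image: 0 is background,
# and each positive id marks the pixels of one individual object.
instance_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 2, 2, 0],
]
# Class of each instance (made-up labels).
instance_labels = {1: "fish", 2: "fish"}

# Size of each object in pixels, and the number of objects per class.
sizes = Counter(i for row in instance_mask for i in row if i != 0)
per_class = Counter(instance_labels[i] for i in sizes)

print(len(sizes))  # 2
print(sizes[1], sizes[2])  # 4 6
print(per_class["fish"])  # 2
```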

Panoptic Segmentation

It’s a combination of instance and semantic segmentation: it provides data labeled for both the background (semantic) and the individual objects (instance).
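One way to picture this combination is to give every pixel a pair of ids: a class id and an instance id. The sketch below assumes two small hand-made masks, and the (class, instance) pair encoding is chosen for illustration; it is not a standard file format.

```python
# Hypothetical 3x4 example. Class ids: 0 = water (background), 1 = fish.
semantic_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
# Instance ids: 0 = no individual object, 1 and 2 = two separate fish.
instance_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 0, 2],
]

# Panoptic label: every pixel carries both a class id and an instance id.
panoptic = [
    [(cls, inst) for cls, inst in zip(sem_row, inst_row)]
    for sem_row, inst_row in zip(semantic_mask, instance_mask)
]

# Distinct segments in the image: the background plus each fish.
segments = sorted({pair for row in panoptic for pair in row})
print(segments)  # [(0, 0), (1, 1), (1, 2)]
```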

Compare each type of image annotation

Here is a comparison, with an example, of each type of annotation. While classification helps us track the presence of an object in an image, object detection helps the annotator track the presence, location, and count of objects by drawing a rectangular box, the bounding box, around each one. From left to right, the types of image annotation go from the least to the most sophisticated.
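One way to see this ordering is that a richer annotation can be reduced to a simpler one, but not the other way around. The sketch below derives bounding boxes and image-level tags from a hypothetical instance mask; the function name and the data are assumptions for illustration.

```python
def derive_simpler_labels(instance_mask, instance_labels):
    """From per-pixel instance ids, recover bounding boxes and image-level tags."""
    boxes = {}
    for y, row in enumerate(instance_mask):
        for x, inst in enumerate(row):
            if inst == 0:  # 0 marks background in this sketch
                continue
            x_min, y_min, x_max, y_max = boxes.get(inst, (x, y, x, y))
            boxes[inst] = (min(x_min, x), min(y_min, y),
                           max(x_max, x), max(y_max, y))
    tags = {instance_labels[inst] for inst in boxes}
    return boxes, tags


mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 2, 2],
]
boxes, tags = derive_simpler_labels(mask, {1: "fish", 2: "fish"})
print(boxes)  # {1: (1, 0, 2, 1), 2: (3, 1, 4, 2)}
print(tags)   # {'fish'}
```

Going in the opposite direction, from a tag to a box or from a box to a pixel mask, is not possible without extra labeling work, which is why the more detailed annotation types cost more to produce.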

For the medical field, VinLab has introduced the top 5 free medical imaging annotation tools to maximize the productivity of your image labeling process and help you create quality datasets for machine learning.

Thanks for reading!

If you are looking for information about machine learning, artificial intelligence, or data, in general or in the medical field, follow us to learn more about these 3 keywords.

Contact

Email: info@vinlab.io

Twitter: https://twitter.com/VinLab_io

YouTube: https://www.youtube.com/@Vinlab-MedicalImageAnnotation

Open source project: https://github.com/vinbigdata-medical/vindr-lab