Usage

Simple example

Suppose you are in the following situation.

Example

You have 2 images:

  • Image 1

      • 2 ground truth bounding boxes: one belonging to class 0 and one to class 1;
      • 3 predicted bounding boxes, with labels [0, 1, 1] and scores [.88, .70, .80].

    # Image 1
    y_true = {
        "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
        "labels": [0, 1]
        }
    y_pred = {
        "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
        "labels": [0, 1, 1],
        "scores": [.88, .70, .80]
        }
    

  • Image 2

      • 2 ground truth bounding boxes, each belonging to class 0;
      • 3 predicted bounding boxes, with labels [0, 1, 0] and scores [.71, .54, .74].
    # Image 2
    y_true = {
        "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
        "labels": [0, 0]
        }
    y_pred = {
        "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
        "labels": [0, 1, 0],
        "scores": [.71, .54, .74]
        }
    

The mAP (Mean Average Precision) and mAR (Mean Average Recall) for the previous situation can be calculated as follows:

simple_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.5 | all | 100]': 0.2574257425742574,
 'mAP@[.5:.95 | all | 100]': 0.10297029702970294,
 'mAP@[.5:.95 | large | 100]': -1.0,
 'mAP@[.5:.95 | medium | 100]': 0.10297029702970294,
 'mAP@[.5:.95 | small | 100]': -1.0,
 'mAP@[.75 | all | 100]': 0.0,
 'mAR@[.5 | all | 100]': 0.25,
 'mAR@[.5:.95 | all | 100]': 0.1,
 'mAR@[.5:.95 | all | 10]': 0.1,
 'mAR@[.5:.95 | all | 1]': 0.1,
 'mAR@[.5:.95 | large | 100]': -1.0,
 'mAR@[.5:.95 | medium | 100]': 0.1,
 'mAR@[.5:.95 | small | 100]': -1.0,
 'mAR@[.75 | all | 100]': 0.0,
 'classes': [0, 1],
 'n_images': 2}
"""

Custom settings

By default, ODMetrics uses the COCO settings for iou_thresholds, recall_thresholds, max_detection_thresholds and area_ranges (see the ODMetrics.__init__() method).
Custom settings can be specified instead of the COCO defaults.
For example, if one is interested in an IoU threshold of 0.4 and a maximum detection threshold of 2:

custom_settings_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics(iou_thresholds=.4, max_detection_thresholds=2)
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.4 | all | 2]': 0.2574257425742574,
 'mAP@[.4 | large | 2]': -1.0,
 'mAP@[.4 | medium | 2]': 0.2574257425742574,
 'mAP@[.4 | small | 2]': -1.0,
 'mAR@[.4 | all | 2]': 0.25,
 'mAR@[.4 | large | 2]': -1.0,
 'mAR@[.4 | medium | 2]': 0.25,
 'mAR@[.4 | small | 2]': -1.0,
 'classes': [0, 1],
 'n_images': 2}
"""

class_metrics

The class_metrics option enables per-class metrics: each metric is reported both globally and for each individual class.

class_metrics_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics(class_metrics=True)
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.5 | all | 100]': 0.16831683168316827,
 'mAP@[.5:.95 | all | 100]': 0.06732673267326732,
 'mAP@[.5:.95 | large | 100]': -1.0,
 'mAP@[.5:.95 | medium | 100]': 0.06732673267326732,
 'mAP@[.5:.95 | small | 100]': -1.0,
 'mAP@[.75 | all | 100]': 0.0,
 'mAR@[.5 | all | 100]': 0.16666666666666666,
 'mAR@[.5:.95 | all | 100]': 0.06666666666666667,
 'mAR@[.5:.95 | all | 10]': 0.06666666666666667,
 'mAR@[.5:.95 | all | 1]': 0.06666666666666667,
 'mAR@[.5:.95 | large | 100]': -1.0,
 'mAR@[.5:.95 | medium | 100]': 0.06666666666666667,
 'mAR@[.5:.95 | small | 100]': -1.0,
 'mAR@[.75 | all | 100]': 0.0,
 'class_metrics': {0: {'AP@[.5 | all | 100]': 0.33663366336633654,
   'AP@[.5:.95 | all | 100]': 0.13465346534653463,
   'AP@[.5:.95 | large | 100]': -1.0,
   'AP@[.5:.95 | medium | 100]': 0.13465346534653463,
   'AP@[.5:.95 | small | 100]': -1.0,
   'AP@[.75 | all | 100]': 0.0,
   'AR@[.5 | all | 100]': 0.3333333333333333,
   'AR@[.5:.95 | all | 100]': 0.13333333333333333,
   'AR@[.5:.95 | all | 10]': 0.13333333333333333,
   'AR@[.5:.95 | all | 1]': 0.13333333333333333,
   'AR@[.5:.95 | large | 100]': -1.0,
   'AR@[.5:.95 | medium | 100]': 0.13333333333333333,
   'AR@[.5:.95 | small | 100]': -1.0,
   'AR@[.75 | all | 100]': 0.0},
  1: {'AP@[.5 | all | 100]': 0.0,
   'AP@[.5:.95 | all | 100]': 0.0,
   'AP@[.5:.95 | large | 100]': -1.0,
   'AP@[.5:.95 | medium | 100]': 0.0,
   'AP@[.5:.95 | small | 100]': -1.0,
   'AP@[.75 | all | 100]': 0.0,
   'AR@[.5 | all | 100]': 0.0,
   'AR@[.5:.95 | all | 100]': 0.0,
   'AR@[.5:.95 | all | 10]': 0.0,
   'AR@[.5:.95 | all | 1]': 0.0,
   'AR@[.5:.95 | large | 100]': -1.0,
   'AR@[.5:.95 | medium | 100]': 0.0,
   'AR@[.5:.95 | small | 100]': -1.0,
   'AR@[.75 | all | 100]': 0.0}},
 'classes': [0, 1],
 'n_images': 2}
"""

Warning

Enabling class_metrics has a performance impact.

extended_summary

The extended_summary argument of the ODMetrics.compute() method enables an extended summary with additional entries: IoU, AP (Average Precision), AR (Average Recall) and mean_evaluator (a Callable).

extended_summary_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred, extended_summary=True)
print(list(output.keys()))
"""
['mAP@[.5 | all | 100]',
 'mAP@[.5:.95 | all | 100]',
 'mAP@[.5:.95 | large | 100]',
 'mAP@[.5:.95 | medium | 100]',
 'mAP@[.5:.95 | small | 100]',
 'mAP@[.75 | all | 100]',
 'mAR@[.5 | all | 100]',
 'mAR@[.5:.95 | all | 100]',
 'mAR@[.5:.95 | all | 10]',
 'mAR@[.5:.95 | all | 1]',
 'mAR@[.5:.95 | large | 100]',
 'mAR@[.5:.95 | medium | 100]',
 'mAR@[.5:.95 | small | 100]',
 'mAR@[.75 | all | 100]',
 'classes',
 'n_images',
 'AP',
 'AR',
 'IoU',
 'mean_evaluator']
"""
In particular, mean_evaluator is a Callable that can be used to calculate metrics for any combination of the constructor settings that is not included in the default compute output. For example, with the standard COCO settings, the metric combination mAP@[.55 | medium | 10] is not included in the default compute output.

mean_evaluator_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred, extended_summary=True)
mean_evaluator = output["mean_evaluator"]
_metric = mean_evaluator(
    iou_threshold=.55,
    max_detection_threshold=10,
    area_range_key="medium",
    metrics="AP"
    )
print(_metric)
"""
{'mAP@[.55 | medium | 10]': 0.2574257425742574}
"""
For all arguments accepted by the mean_evaluator function, see the extended_summary parameter of the ODMetrics.compute() method.

IoU

The calculation of mAP and mAR makes use of IoU (Intersection over Union). The iou function can also be used standalone.

iou_example
from od_metrics import iou

y_true = [[25, 16, 38, 56], [129, 123, 41, 62]]
y_pred = [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]]

result = iou(y_true, y_pred, box_format="xywh")
print(result)
"""
array([[0.67655425, 0.        ],
       [0.        , 0.46192609],
       [0.        , 0.        ]])
"""
The result is an array with one row per predicted box and one column per ground truth box. The iou function also supports the COCO API iscrowd parameter; please refer to the iou source code.
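
As a rough illustration of iscrowd, here is a minimal sketch that assumes the flags are passed as a keyword argument with one 0/1 value per ground truth box, as in the COCO API; check the iou source for the exact signature.

iscrowd_example
from od_metrics import iou

y_true = [[25, 16, 38, 56], [129, 123, 41, 62]]
y_pred = [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]]

# Assumption: one flag per ground truth box; 1 marks a "crowd" region.
# Following the COCO API convention, a crowd ground truth is scored with
# intersection over the prediction area rather than the standard IoU.
iscrowd = [0, 1]

result = iou(y_true, y_pred, box_format="xywh", iscrowd=iscrowd)
print(result)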