
Usage

Try a live demo

Explore live OD-Metrics examples on Binder or on Google Colab.

Simple example

Consider a scenario with two images, Image 1 and Image 2, and the following annotations and predictions.

Image 1 contains:

  • 2 ground-truth bounding boxes, one for class 0 and one for class 1;
  • 3 predicted bounding boxes with labels [0, 1, 1] and scores [.88, .70, .80].
    # Image 1
    y_true = {
        "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
        "labels": [0, 1]
        }
    y_pred = {
        "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
        "labels": [0, 1, 1],
        "scores": [.88, .70, .80]
        }
    

Image 2 contains:

  • 2 ground-truth bounding boxes, both for class 0;
  • 3 predicted bounding boxes, with labels [0, 1, 0] and scores [.71, .54, .74].
    # Image 2
    y_true = {
        "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
        "labels": [0, 0]
        }
    y_pred = {
        "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
        "labels": [0, 1, 0],
        "scores": [.71, .54, .74]
        }
    

The mAP (Mean Average Precision) and mAR (Mean Average Recall) for this scenario are computed using OD-Metrics as follows.

simple_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.5 | all | 100]': 0.2574257425742574,
 'mAP@[.5:.95 | all | 100]': 0.10297029702970294,
 'mAP@[.5:.95 | large | 100]': -1.0,
 'mAP@[.5:.95 | medium | 100]': 0.10297029702970294,
 'mAP@[.5:.95 | small | 100]': -1.0,
 'mAP@[.75 | all | 100]': 0.0,
 'mAR@[.5 | all | 100]': 0.25,
 'mAR@[.5:.95 | all | 100]': 0.1,
 'mAR@[.5:.95 | all | 10]': 0.1,
 'mAR@[.5:.95 | all | 1]': 0.1,
 'mAR@[.5:.95 | large | 100]': -1.0,
 'mAR@[.5:.95 | medium | 100]': 0.1,
 'mAR@[.5:.95 | small | 100]': -1.0,
 'mAR@[.75 | all | 100]': 0.0,
 'classes': [0, 1],
 'n_images': 2}
"""

Custom settings

By default, OD-Metrics follows the MS-COCO [1] settings, including iou_thresholds, recall_thresholds, max_detection_thresholds, and area_ranges (see the ODMetrics.__init__() method).
Custom settings can replace the default configuration. For instance, to set an IoU threshold of 0.4 and a maximum detection threshold of 2:

custom_settings_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics(iou_thresholds=.4, max_detection_thresholds=2)
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.4 | all | 2]': 0.2574257425742574,
 'mAP@[.4 | large | 2]': -1.0,
 'mAP@[.4 | medium | 2]': 0.2574257425742574,
 'mAP@[.4 | small | 2]': -1.0,
 'mAR@[.4 | all | 2]': 0.25,
 'mAR@[.4 | large | 2]': -1.0,
 'mAR@[.4 | medium | 2]': 0.25,
 'mAR@[.4 | small | 2]': -1.0,
 'classes': [0, 1],
 'n_images': 2}
"""

class_metrics

The class_metrics option enables per-class metrics: each metric is reported globally and for each individual class.

Warning

Enabling class_metrics has a performance impact.

class_metrics_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics(class_metrics=True)
output = metrics.compute(y_true, y_pred)
print(output)
"""
{'mAP@[.5 | all | 100]': 0.16831683168316827,
 'mAP@[.5:.95 | all | 100]': 0.06732673267326732,
 'mAP@[.5:.95 | large | 100]': -1.0,
 'mAP@[.5:.95 | medium | 100]': 0.06732673267326732,
 'mAP@[.5:.95 | small | 100]': -1.0,
 'mAP@[.75 | all | 100]': 0.0,
 'mAR@[.5 | all | 100]': 0.16666666666666666,
 'mAR@[.5:.95 | all | 100]': 0.06666666666666667,
 'mAR@[.5:.95 | all | 10]': 0.06666666666666667,
 'mAR@[.5:.95 | all | 1]': 0.06666666666666667,
 'mAR@[.5:.95 | large | 100]': -1.0,
 'mAR@[.5:.95 | medium | 100]': 0.06666666666666667,
 'mAR@[.5:.95 | small | 100]': -1.0,
 'mAR@[.75 | all | 100]': 0.0,
 'class_metrics': {0: {'AP@[.5 | all | 100]': 0.33663366336633654,
   'AP@[.5:.95 | all | 100]': 0.13465346534653463,
   'AP@[.5:.95 | large | 100]': -1.0,
   'AP@[.5:.95 | medium | 100]': 0.13465346534653463,
   'AP@[.5:.95 | small | 100]': -1.0,
   'AP@[.75 | all | 100]': 0.0,
   'AR@[.5 | all | 100]': 0.3333333333333333,
   'AR@[.5:.95 | all | 100]': 0.13333333333333333,
   'AR@[.5:.95 | all | 10]': 0.13333333333333333,
   'AR@[.5:.95 | all | 1]': 0.13333333333333333,
   'AR@[.5:.95 | large | 100]': -1.0,
   'AR@[.5:.95 | medium | 100]': 0.13333333333333333,
   'AR@[.5:.95 | small | 100]': -1.0,
   'AR@[.75 | all | 100]': 0.0},
  1: {'AP@[.5 | all | 100]': 0.0,
   'AP@[.5:.95 | all | 100]': 0.0,
   'AP@[.5:.95 | large | 100]': -1.0,
   'AP@[.5:.95 | medium | 100]': 0.0,
   'AP@[.5:.95 | small | 100]': -1.0,
   'AP@[.75 | all | 100]': 0.0,
   'AR@[.5 | all | 100]': 0.0,
   'AR@[.5:.95 | all | 100]': 0.0,
   'AR@[.5:.95 | all | 10]': 0.0,
   'AR@[.5:.95 | all | 1]': 0.0,
   'AR@[.5:.95 | large | 100]': -1.0,
   'AR@[.5:.95 | medium | 100]': 0.0,
   'AR@[.5:.95 | small | 100]': -1.0,
   'AR@[.75 | all | 100]': 0.0}},
 'classes': [0, 1],
 'n_images': 2}
"""

extended_summary

The extended_summary option of the ODMetrics.compute() method enables an extended summary that adds IoU, AP (Average Precision), and AR (Average Recall) values, as well as mean_evaluator (a Callable), to the output.

extended_summary_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred, extended_summary=True)
print(list(output.keys()))
"""
['mAP@[.5 | all | 100]',
 'mAP@[.5:.95 | all | 100]',
 'mAP@[.5:.95 | large | 100]',
 'mAP@[.5:.95 | medium | 100]',
 'mAP@[.5:.95 | small | 100]',
 'mAP@[.75 | all | 100]',
 'mAR@[.5 | all | 100]',
 'mAR@[.5:.95 | all | 100]',
 'mAR@[.5:.95 | all | 10]',
 'mAR@[.5:.95 | all | 1]',
 'mAR@[.5:.95 | large | 100]',
 'mAR@[.5:.95 | medium | 100]',
 'mAR@[.5:.95 | small | 100]',
 'mAR@[.75 | all | 100]',
 'classes',
 'n_images',
 'AP',
 'AR',
 'IoU',
 'mean_evaluator']
"""
In particular, mean_evaluator is a Callable that can calculate metrics for any combination of settings, even those not included in the default compute output. For example, with the standard MS-COCO [1] settings, the metric combination mAP@[.55 | medium | 10] is not included in the default compute output, but it can be obtained with mean_evaluator after calling compute.

mean_evaluator_example
from od_metrics import ODMetrics

# Ground truths
y_true = [
    { # image 1
     "boxes": [[25, 16, 38, 56], [129, 123, 41, 62]],
     "labels": [0, 1]
     },
    { # image 2
     "boxes": [[123, 11, 43, 55], [38, 132, 59, 45]],
     "labels": [0, 0]
     }
    ]

# Predictions
y_pred = [
    { # image 1
     "boxes": [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]],
     "labels": [0, 1, 1],
     "scores": [.88, .70, .80]
     },
    { # image 2
     "boxes": [[64, 111, 64, 58], [26, 140, 60, 47], [19, 18, 43, 35]],
     "labels": [0, 1, 0],
     "scores": [.71, .54, .74]
     }
    ]

metrics = ODMetrics()
output = metrics.compute(y_true, y_pred, extended_summary=True)
mean_evaluator = output["mean_evaluator"]
_metric = mean_evaluator(
    iou_threshold=.55,
    max_detection_threshold=10,
    area_range_key="medium",
    metrics="AP"
    )
print(_metric)
"""
{'mAP@[.55 | medium | 10]': 0.2574257425742574}
"""
For a complete list of arguments accepted by the mean_evaluator function, refer to the extended_summary option in the ODMetrics.compute() method.

IoU

The calculation of mAP and mAR relies on IoU (Intersection over Union). You can use the standalone iou function from OD-Metrics.

iou_example
from od_metrics import iou

y_true = [[25, 16, 38, 56], [129, 123, 41, 62]]
y_pred = [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]]

result = iou(y_true, y_pred, box_format="xywh")
print(result)
"""
array([[0.67655425, 0.        ],
       [0.        , 0.46192609],
       [0.        , 0.        ]])
"""
The iou function supports the iscrowd parameter from the COCOAPI. For more details, refer to the iscrowd section.
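
As a rough sketch of how that might look, the call below marks the second ground-truth box as a crowd region. The exact shape of the argument is covered in the iscrowd section; here it is assumed to be one flag per ground-truth box, as in the COCOAPI.

iscrowd_example
from od_metrics import iou

y_true = [[25, 16, 38, 56], [129, 123, 41, 62]]
y_pred = [[25, 27, 37, 54], [119, 111, 40, 67], [124, 9, 49, 67]]

# Assumption: one iscrowd flag per ground-truth box, as in the COCOAPI;
# the second ground truth is treated as a crowd region.
result = iou(y_true, y_pred, box_format="xywh", iscrowd=[False, True])
print(result)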

References


  1. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common Objects in Context. In Computer Vision – ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6–12, 2014, Proceedings, Part V, pages 740–755. Springer, 2014.