📈 Metrics

Functional metrics

Various metrics derived from the confusion matrix, i.e. from counts of Type I (false positive) and Type II (false negative) errors.

References

https://en.wikipedia.org/wiki/Confusion_matrix

Example

import torch
import segmentation_models_pytorch as smp

# let's assume we have a multilabel prediction for 3 classes
output = torch.rand([10, 3, 256, 256])
target = torch.rand([10, 3, 256, 256]).round().long()

# first compute statistics for true positive, false positive, false negative and
# true negative "pixels"
tp, fp, fn, tn = smp.metrics.get_stats(output, target, mode='multilabel', threshold=0.5)

# then compute metrics with required reduction (see metric docs)
iou_score = smp.metrics.iou_score(tp, fp, fn, tn, reduction="micro")
f1_score = smp.metrics.f1_score(tp, fp, fn, tn, reduction="micro")
f2_score = smp.metrics.fbeta_score(tp, fp, fn, tn, beta=2, reduction="micro")
accuracy = smp.metrics.accuracy(tp, fp, fn, tn, reduction="macro")
recall = smp.metrics.recall(tp, fp, fn, tn, reduction="micro-imagewise")

Functions:

get_stats(output, target, mode[, ...])

Compute true positive, false positive, false negative, true negative 'pixels' for each image and each class.

fbeta_score(tp, fp, fn, tn[, beta, ...])

F-beta score

f1_score(tp, fp, fn, tn[, reduction, ...])

F1 score

iou_score(tp, fp, fn, tn[, reduction, ...])

IoU score or Jaccard index

accuracy(tp, fp, fn, tn[, reduction, ...])

Accuracy

precision(tp, fp, fn, tn[, reduction, ...])

Precision or positive predictive value (PPV)

recall(tp, fp, fn, tn[, reduction, ...])

Sensitivity, recall, hit rate, or true positive rate (TPR)

sensitivity(tp, fp, fn, tn[, reduction, ...])

Sensitivity, recall, hit rate, or true positive rate (TPR)

specificity(tp, fp, fn, tn[, reduction, ...])

Specificity, selectivity or true negative rate (TNR)

balanced_accuracy(tp, fp, fn, tn[, ...])

Balanced accuracy

positive_predictive_value(tp, fp, fn, tn[, ...])

Precision or positive predictive value (PPV)

negative_predictive_value(tp, fp, fn, tn[, ...])

Negative predictive value (NPV)

false_negative_rate(tp, fp, fn, tn[, ...])

Miss rate or false negative rate (FNR)

false_positive_rate(tp, fp, fn, tn[, ...])

Fall-out or false positive rate (FPR)

false_discovery_rate(tp, fp, fn, tn[, ...])

False discovery rate (FDR)

false_omission_rate(tp, fp, fn, tn[, ...])

False omission rate (FOR)

positive_likelihood_ratio(tp, fp, fn, tn[, ...])

Positive likelihood ratio (LR+)

negative_likelihood_ratio(tp, fp, fn, tn[, ...])

Negative likelihood ratio (LR-)

segmentation_models_pytorch.metrics.functional.get_stats(output, target, mode, ignore_index=None, threshold=None, num_classes=None)[source]

Compute true positive, false positive, false negative, true negative 'pixels' for each image and each class.

Parameters:
  • output (Union[torch.LongTensor, torch.FloatTensor]) –

    Model output with the following shapes and types, depending on the specified mode:

    'binary'

    shape (N, 1, …) and torch.LongTensor or torch.FloatTensor

    'multilabel'

    shape (N, C, …) and torch.LongTensor or torch.FloatTensor

    'multiclass'

    shape (N, …) and torch.LongTensor

  • target (torch.LongTensor) –

    Targets with the following shapes, depending on the specified mode:

    'binary'

    shape (N, 1, …)

    'multilabel'

    shape (N, C, …)

    'multiclass'

    shape (N, …)

  • mode (str) – One of 'binary' | 'multilabel' | 'multiclass'

  • ignore_index (Optional[int]) – Label to ignore during metric computation. Not supported for 'binary' and 'multilabel' modes. Defaults to None.

  • threshold (Optional[float, List[float]]) – Binarization threshold for output in 'binary' and 'multilabel' modes. Defaults to None.

  • num_classes (Optional[int]) – Number of classes; required only for 'multiclass' mode. Class values should be in the range 0..(num_classes - 1). If ignore_index is specified, it should lie outside this range, e.g. -1 or 255.

Raises:

ValueError – in case of misconfiguration.

Returns:

true_positive, false_positive, false_negative, true_negative tensors, each of shape (N, C).

Return type:

Tuple[torch.LongTensor]
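
A minimal sketch of 'multiclass' usage (the class count and shapes below are illustrative assumptions, not values prescribed by the library):

import torch
import segmentation_models_pytorch as smp

# class-index predictions and targets of shape (N, H, W), values in 0..num_classes-1
output = torch.randint(0, 5, (10, 256, 256))
target = torch.randint(0, 5, (10, 256, 256))

# 'multiclass' mode takes no threshold but requires num_classes
tp, fp, fn, tn = smp.metrics.get_stats(output, target, mode='multiclass', num_classes=5)
# each returned tensor has shape (N, C) = (10, 5)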

segmentation_models_pytorch.metrics.functional.fbeta_score(tp, fp, fn, tn, beta=1.0, reduction=None, class_weights=None, zero_division=1.0)[source]

F-beta score

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

  • beta (float) – Beta coefficient of the F-beta score; beta < 1 weights precision more heavily, beta > 1 weights recall more heavily. Defaults to 1.0.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix
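
As a hedged illustration of the beta and class_weights parameters, reusing the tp, fp, fn, tn statistics from the example at the top of this page (the weights below are arbitrary assumptions):

# beta < 1 weights precision more heavily, beta > 1 weights recall more heavily
f05_score = smp.metrics.fbeta_score(tp, fp, fn, tn, beta=0.5, reduction="micro")

# 'weighted' reduction averages per-class scores using the supplied class_weights
fw_score = smp.metrics.fbeta_score(
    tp, fp, fn, tn, beta=2, reduction="weighted", class_weights=[0.5, 0.3, 0.2]
)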

segmentation_models_pytorch.metrics.functional.f1_score(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

F1 score

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.iou_score(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

IoU score or Jaccard index

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix
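
A small sketch of an unreduced IoU computation, again reusing the statistics from the example at the top of this page:

# with reduction=None (or 'none') the score keeps its per-image, per-class shape
iou_per_image_class = smp.metrics.iou_score(tp, fp, fn, tn, reduction=None)
print(iou_per_image_class.shape)  # torch.Size([10, 3])

# a dataset-level per-class IoU can then be derived manually
iou_per_class = iou_per_image_class.mean(dim=0)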

segmentation_models_pytorch.metrics.functional.accuracy(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Accuracy

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.precision(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)

Precision or positive predictive value (PPV)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.recall(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)

Sensitivity, recall, hit rate, or true positive rate (TPR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.sensitivity(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Sensitivity, recall, hit rate, or true positive rate (TPR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.specificity(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Specificity, selectivity or true negative rate (TNR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.balanced_accuracy(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Balanced accuracy

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix
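
Balanced accuracy is conventionally defined as (TPR + TNR) / 2. A hedged sketch of that relationship, reusing the statistics from the example at the top of this page and assuming the library follows the textbook definition:

bacc = smp.metrics.balanced_accuracy(tp, fp, fn, tn, reduction="micro")
sens = smp.metrics.sensitivity(tp, fp, fn, tn, reduction="micro")
spec = smp.metrics.specificity(tp, fp, fn, tn, reduction="micro")
# under the textbook definition, balanced accuracy equals the mean of TPR and TNR
assert torch.isclose(bacc, (sens + spec) / 2)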

segmentation_models_pytorch.metrics.functional.positive_predictive_value(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Precision or positive predictive value (PPV)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.negative_predictive_value(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Negative predictive value (NPV)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.false_negative_rate(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Miss rate or false negative rate (FNR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.false_positive_rate(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Fall-out or false positive rate (FPR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.false_discovery_rate(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

False discovery rate (FDR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.false_omission_rate(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

False omission rate (FOR)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix

segmentation_models_pytorch.metrics.functional.positive_likelihood_ratio(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Positive likelihood ratio (LR+)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix
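
The positive likelihood ratio is conventionally TPR / FPR. A hedged sketch of that identity, reusing the statistics from the example at the top of this page and again assuming the textbook definition:

lr_plus = smp.metrics.positive_likelihood_ratio(tp, fp, fn, tn, reduction="micro")
tpr = smp.metrics.sensitivity(tp, fp, fn, tn, reduction="micro")
fpr = smp.metrics.false_positive_rate(tp, fp, fn, tn, reduction="micro")
# LR+ = TPR / FPR under the textbook definition
assert torch.isclose(lr_plus, tpr / fpr)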

segmentation_models_pytorch.metrics.functional.negative_likelihood_ratio(tp, fp, fn, tn, reduction=None, class_weights=None, zero_division=1.0)[source]

Negative likelihood ratio (LR-)

Parameters:
  • tp (torch.LongTensor) – tensor of shape (N, C), true positive cases

  • fp (torch.LongTensor) – tensor of shape (N, C), false positive cases

  • fn (torch.LongTensor) – tensor of shape (N, C), false negative cases

  • tn (torch.LongTensor) – tensor of shape (N, C), true negative cases

  • reduction (Optional[str]) –

    Defines how the metric is aggregated across classes and images:

    • 'micro'

      Sum true positive, false positive, false negative and true negative pixels over all images and all classes, then compute the score.

    • 'macro'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then average the per-label scores. This does not take label imbalance into account.

    • 'weighted'

      Sum true positive, false positive, false negative and true negative pixels over all images for each label, compute the score for each label separately, then take a weighted average of the per-label scores.

    • 'micro-imagewise'

      Sum true positive, false positive, false negative and true negative pixels for each image, compute the score for each image, then average the scores over the dataset. All images contribute equally to the final score, while class imbalance within each image is still taken into account.

    • 'macro-imagewise'

      Compute the score for each class of each image separately, average over labels within each image, then average the per-image scores over the dataset. Does not take label imbalance on each image into account.

    • 'weighted-imagewise'

      Compute the score for each class of each image separately, take a weighted average over labels within each image, then average the per-image scores over the dataset.

    • 'none' or None

      Same as 'macro-imagewise', but without any reduction.

    For the 'binary' case, 'micro' = 'macro' = 'weighted' and 'micro-imagewise' = 'macro-imagewise' = 'weighted-imagewise'.

    The prefixes 'micro', 'macro' and 'weighted' define how scores are aggregated across classes, while the 'imagewise' suffix defines how scores are aggregated across images.

  • class_weights (Optional[List[float]]) – List of class weights for metric aggregation, used when a 'weighted*' reduction is chosen. Defaults to None.

  • zero_division (Union[str, float]) – Value to return on zero division, i.e. when all predictions and labels are negative. If set to "warn", acts as 0 but also raises a warning. Defaults to 1.

Returns:

A scalar metric if a reduction other than 'none'/None is specified; otherwise a tensor of shape (N, C).

Return type:

torch.Tensor

References

https://en.wikipedia.org/wiki/Confusion_matrix