albumentations.core.bbox_utils


Utilities for handling bounding box operations during image augmentation. This module provides tools for processing bounding boxes in various formats (COCO, Pascal VOC, YOLO), converting between coordinate systems, normalizing and denormalizing coordinates, filtering boxes based on visibility and size criteria, and performing transformations on boxes to match image augmentations. It forms the core functionality for all bounding box-related operations in the albumentations library.

BboxParams class

BboxParams(
    format: Literal['coco', 'pascal_voc', 'albumentations', 'yolo'],
    label_fields: Sequence[Any] | None = None,
    min_area: float = 0.0,
    min_visibility: float = 0.0,
    min_width: float = 0.0,
    min_height: float = 0.0,
    check_each_transform: bool = True,
    clip: bool = False,
    filter_invalid_bboxes: bool = False,
    max_accept_ratio: float | None = None
)

Parameters for bounding box transforms.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| format | One of: 'coco', 'pascal_voc', 'albumentations', 'yolo' | - | Format of bounding boxes. 'coco': [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. 'pascal_voc': [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212]. 'albumentations': like pascal_voc but normalized to the [0, 1] range, e.g. [0.2, 0.3, 0.4, 0.5]. 'yolo': [x_center, y_center, width, height] normalized to the [0, 1] range, e.g. [0.1, 0.2, 0.3, 0.4]. |
| label_fields | One of: Sequence[Any], None | None | List of fields that are joined with boxes, e.g., ['class_labels', 'scores']. Default: None. |
| min_area | float | 0.0 | Minimum area of a bounding box. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0. |
| min_visibility | float | 0.0 | Minimum fraction of a bounding box's area that must remain visible for the box to be kept in the result. Should be in the [0.0, 1.0] range. Default: 0.0. |
| min_width | float | 0.0 | Minimum width of a bounding box in pixels or normalized units. Bounding boxes with width less than this value will be removed. Default: 0.0. |
| min_height | float | 0.0 | Minimum height of a bounding box in pixels or normalized units. Bounding boxes with height less than this value will be removed. Default: 0.0. |
| check_each_transform | bool | True | If True, performs checks for each dual transform. Default: True. |
| clip | bool | False | If True, clips bounding boxes to image boundaries before applying any transform. Default: False. |
| filter_invalid_bboxes | bool | False | If True, filters out invalid bounding boxes (e.g., boxes with negative dimensions or boxes where x_max < x_min or y_max < y_min) at the beginning of the pipeline. If clip=True, filtering is applied after clipping. Default: False. |
| max_accept_ratio | One of: float, None | None | Maximum allowed aspect ratio for bounding boxes. The aspect ratio is calculated as max(width/height, height/width), so it is always >= 1. Boxes with an aspect ratio greater than this value will be filtered out. For example, if max_accept_ratio=3.0, boxes with width:height or height:width ratios greater than 3:1 will be removed. Set to None to disable aspect ratio filtering. Default: None. |
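
The four values of `format` describe the same rectangle in different coordinate conventions. The following is a minimal NumPy sketch of that arithmetic, not the library's conversion code. It starts from the 'coco' example above; the image size (400 x 500) is a hypothetical value chosen only for illustration.

```python
import numpy as np

# Hypothetical image size, chosen only for this illustration.
height, width = 400, 500

# The 'coco' example from the parameter description:
# [x_min, y_min, width, height] in pixels.
coco = np.array([97.0, 12.0, 150.0, 200.0])

# 'pascal_voc': replace width/height with the bottom-right corner.
pascal_voc = np.array([coco[0], coco[1], coco[0] + coco[2], coco[1] + coco[3]])

# 'albumentations': pascal_voc corners divided by the image size.
albumentations_box = pascal_voc / np.array([width, height, width, height])

# 'yolo': normalized center point plus normalized width and height.
x_min, y_min, x_max, y_max = albumentations_box
yolo = np.array([(x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min])

print(pascal_voc.tolist())  # [97.0, 12.0, 247.0, 212.0]
```

The printed corners match the 'pascal_voc' example given for the same box in the `format` description.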

BboxProcessor class

BboxProcessor(
    params: BboxParams,
    additional_targets: dict[str, str] | None = None
)

Processor for bounding box transformations. This class handles the preprocessing and postprocessing of bounding boxes in an augmentation pipeline, including format conversion, validation, clipping, and filtering.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| params | BboxParams | - | Parameters that control bounding box processing. See BboxParams class for details. |
| additional_targets | One of: dict[str, str], None | None | Dictionary with additional targets to process. Keys are names of additional targets, values are their types. For example: {'bbox2': 'bboxes'} will handle 'bbox2' as another bounding box target. Default: None. |

bboxes_from_masks function

bboxes_from_masks(
    masks: np.ndarray
)

Create bounding boxes from binary masks (fast version).

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| masks | np.ndarray | - | Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask. |

Returns

  • np.ndarray: An array of bounding boxes with shape (N, 4), where each row is
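
As a rough illustration of deriving a box from a binary mask, the sketch below finds the first and last foreground row and column of a single (H, W) mask. It is not the library's implementation: the function name is hypothetical, and both the exclusive upper bounds and the sentinel box for an empty mask are assumptions made for this example.

```python
import numpy as np

def bbox_from_mask_sketch(mask: np.ndarray) -> np.ndarray:
    """Return [x_min, y_min, x_max, y_max] in pixels for one binary (H, W) mask."""
    rows = np.flatnonzero(mask.any(axis=1))  # indices of rows containing foreground
    cols = np.flatnonzero(mask.any(axis=0))  # indices of columns containing foreground
    if rows.size == 0:
        # Assumption: an empty mask is reported with a sentinel box.
        return np.array([-1, -1, -1, -1])
    # Assumption: x_max / y_max are exclusive (one past the last foreground pixel).
    return np.array([cols[0], rows[0], cols[-1] + 1, rows[-1] + 1])

mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 1:4] = 1  # foreground in rows 2..4 and columns 1..3
print(bbox_from_mask_sketch(mask).tolist())  # [1, 2, 4, 5]
```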

bboxes_to_mask function

bboxes_to_mask(
    bboxes: np.ndarray,
    image_shape: tuple[int, int]
)

Convert bounding boxes to a single mask.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| image_shape | tuple[int, int] | - | Image shape (height, width). |

Returns

  • np.ndarray: A numpy array of shape (height, width) with 1s where any bounding box is present.
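
The idea of painting every box into one shared mask can be sketched as follows. This is an illustrative version under the assumption that boxes are given as integer pixel corners [x_min, y_min, x_max, y_max]; the function name is hypothetical and the coordinate convention expected by the library function may differ.

```python
import numpy as np

def boxes_to_single_mask_sketch(bboxes: np.ndarray, image_shape: tuple) -> np.ndarray:
    """Paint all boxes (pixel corners, assumed) into one (height, width) mask."""
    height, width = image_shape
    mask = np.zeros((height, width), dtype=np.uint8)
    # Only the first four columns are coordinates; extra columns are ignored.
    for x_min, y_min, x_max, y_max in bboxes[:, :4].astype(int):
        mask[y_min:y_max, x_min:x_max] = 1  # overlapping boxes simply stay 1
    return mask

boxes = np.array([[1, 1, 3, 3], [2, 2, 5, 4]])
mask = boxes_to_single_mask_sketch(boxes, (5, 6))
print(int(mask.sum()))  # 9 (4 + 6 pixels, minus 1 shared pixel)
```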

calculate_bbox_areas_in_pixels function

calculate_bbox_areas_in_pixels(
    bboxes: np.ndarray,
    shape: ShapeType
)

Calculate areas for multiple bounding boxes. This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored. |
| shape | ShapeType | - | A tuple containing the height and width of the image (height, width). |

Returns

  • np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.

Example

>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])
>>> image_shape = (100, 100)
>>> areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)
>>> print(areas)
[1600. 3600.]

Notes

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.

check_bboxes function

check_bboxes(
    bboxes: np.ndarray
)

Check if bounding boxes are valid.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |

clip_bboxes function

clip_bboxes(
    bboxes: np.ndarray,
    shape: ShapeType
)

Clip bounding boxes to the image shape.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | ShapeType | - | The shape of the image/volume. For 2D: {'height': int, 'width': int}. For 3D: {'height': int, 'width': int, 'depth': int}. |

Returns

  • np.ndarray: A numpy array of bounding boxes with shape (num_bboxes, 4+).
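
Clipping amounts to limiting x coordinates to [0, width] and y coordinates to [0, height] while leaving any extra columns alone. The sketch below shows this on pixel-coordinate boxes, which is an assumption made for readability; the function name is hypothetical and the library function may expect a different coordinate convention.

```python
import numpy as np

def clip_boxes_sketch(bboxes: np.ndarray, shape: dict) -> np.ndarray:
    """Clip pixel-coordinate boxes (assumed) to the image given by a shape dict."""
    clipped = bboxes.astype(float)  # astype returns a copy; the input is untouched
    clipped[:, [0, 2]] = np.clip(clipped[:, [0, 2]], 0, shape["width"])   # x_min, x_max
    clipped[:, [1, 3]] = np.clip(clipped[:, [1, 3]], 0, shape["height"])  # y_min, y_max
    return clipped

boxes = np.array([[-10, 20, 50, 130, 7]])  # fifth column: hypothetical label id
print(clip_boxes_sketch(boxes, {"height": 100, "width": 80}).tolist())
# [[0.0, 20.0, 50.0, 100.0, 7.0]]
```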

convert_bboxes_from_albumentations function

convert_bboxes_from_albumentations(
    bboxes: np.ndarray,
    target_format: Literal['coco', 'pascal_voc', 'yolo'],
    shape: ShapeType,
    check_validity: bool = False
)

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max]. |
| target_format | One of: 'coco', 'pascal_voc', 'yolo' | - | Required format of the output bounding boxes. |
| shape | ShapeType | - | Image shape (height, width). |
| check_validity | bool | False | Check if all boxes are valid boxes. |

Returns

  • np.ndarray: An array of bounding boxes in the target format with shape (num_bboxes, 4+).
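
To make the three target formats concrete, here is a small NumPy sketch that converts normalized corner boxes to each of them. It is an illustration of the arithmetic only, not the library's code: the function name is hypothetical, and passing the shape as a (height, width) tuple is an assumption.

```python
import numpy as np

def from_albumentations_sketch(bboxes: np.ndarray, target_format: str, shape: tuple) -> np.ndarray:
    """Convert normalized [x_min, y_min, x_max, y_max, ...] rows to target_format."""
    if target_format not in {"coco", "pascal_voc", "yolo"}:
        raise ValueError(f"Unknown target format: {target_format}")
    height, width = shape
    out = bboxes.astype(float)  # copy; extra columns are carried through unchanged
    x_min, y_min, x_max, y_max = (out[:, i].copy() for i in range(4))
    if target_format == "yolo":
        # yolo stays normalized: center point plus width and height.
        out[:, 0], out[:, 1] = (x_min + x_max) / 2, (y_min + y_max) / 2
        out[:, 2], out[:, 3] = x_max - x_min, y_max - y_min
        return out
    # Both pixel formats start from denormalized corners.
    out[:, [0, 2]] *= width
    out[:, [1, 3]] *= height
    if target_format == "coco":
        # coco stores width and height instead of the bottom-right corner.
        out[:, 2] -= out[:, 0]
        out[:, 3] -= out[:, 1]
    return out

boxes = np.array([[0.2, 0.3, 0.4, 0.5]])
for fmt in ("pascal_voc", "coco", "yolo"):
    print(fmt, from_albumentations_sketch(boxes, fmt, (100, 200)).round(6).tolist())
```

For a 100 x 200 image, the box above corresponds to pixel corners of roughly (40, 30) and (80, 50), a 'coco' size of 40 x 20, and a 'yolo' center of (0.3, 0.4).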

convert_bboxes_to_albumentations function

convert_bboxes_to_albumentations(
    bboxes: np.ndarray,
    source_format: Literal['coco', 'pascal_voc', 'yolo'],
    shape: ShapeType,
    check_validity: bool = False
)

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of `(x_min, y_min, x_max, y_max)` e.g. `(0.15, 0.27, 0.67, 0.5)`.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| source_format | One of: 'coco', 'pascal_voc', 'yolo' | - | Format of the input bounding boxes. |
| shape | ShapeType | - | Image shape (height, width). |
| check_validity | bool | False | Check if all boxes are valid boxes. |

Returns

  • np.ndarray: An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

denormalize_bboxes function

denormalize_bboxes(
    bboxes: np.ndarray,
    shape: ShapeType | tuple[int, int]
)

Denormalize array of bounding boxes.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`. |
| shape | One of: ShapeType, tuple[int, int] | - | Image shape `(height, width)`. |

Returns

  • np.ndarray: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.

filter_bboxes function

filter_bboxes(
    bboxes: np.ndarray,
    shape: ShapeType,
    min_area: float = 0.0,
    min_visibility: float = 0.0,
    min_width: float = 1.0,
    min_height: float = 1.0,
    max_accept_ratio: float | None = None
)

Remove bounding boxes whose visible fraction falls below `min_visibility` or whose area in pixels is below the threshold set by `min_area`. Also clips the remaining boxes to the final image size.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | ShapeType | - | The shape of the image/volume. For 2D: {'height': int, 'width': int}. For 3D: {'height': int, 'width': int, 'depth': int}. |
| min_area | float | 0.0 | Minimum area of a bounding box in pixels. Default: 0.0. |
| min_visibility | float | 0.0 | Minimum fraction of area for a bounding box to remain. Default: 0.0. |
| min_width | float | 1.0 | Minimum width of a bounding box in pixels. Default: 1.0. |
| min_height | float | 1.0 | Minimum height of a bounding box in pixels. Default: 1.0. |
| max_accept_ratio | One of: float, None | None | Maximum allowed aspect ratio, calculated as max(width/height, height/width). Boxes with higher ratios will be filtered out. Default: None. |

Returns

  • np.ndarray: Filtered bounding boxes.
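
The size and aspect-ratio criteria can be illustrated with a boolean mask over the boxes. The sketch below is a simplified stand-in, not the library's implementation: it assumes pixel-coordinate boxes, uses a hypothetical function name, and omits `min_visibility`, which additionally needs each box's area before clipping.

```python
import numpy as np

def filter_boxes_sketch(bboxes, min_area=0.0, min_width=1.0, min_height=1.0, max_accept_ratio=None):
    """Keep pixel-coordinate boxes (assumed) that pass the size and ratio thresholds."""
    widths = bboxes[:, 2] - bboxes[:, 0]
    heights = bboxes[:, 3] - bboxes[:, 1]
    keep = (widths >= min_width) & (heights >= min_height) & (widths * heights >= min_area)
    if max_accept_ratio is not None:
        eps = 1e-12  # guards against division by zero for degenerate boxes
        ratios = np.maximum(widths / (heights + eps), heights / (widths + eps))
        keep &= ratios <= max_accept_ratio
    return bboxes[keep]

boxes = np.array([
    [10.0, 10.0, 60.0, 40.0],  # 50 x 30, ratio about 1.67: kept
    [0.0, 0.0, 100.0, 10.0],   # 100 x 10, ratio 10: dropped by max_accept_ratio=3.0
    [5.0, 5.0, 5.5, 30.0],     # 0.5 px wide: dropped by min_width=1.0
])
kept = filter_boxes_sketch(boxes, min_area=100.0, max_accept_ratio=3.0)
print(kept.tolist())  # [[10.0, 10.0, 60.0, 40.0]]
```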

mask_to_bboxes function

mask_to_bboxes(
    masks: np.ndarray,
    original_bboxes: np.ndarray
)

Convert masks back to bounding boxes.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| masks | np.ndarray | - | A numpy array of masks with shape (num_masks, height, width). |
| original_bboxes | np.ndarray | - | Original bounding boxes with shape (num_bboxes, 4+). |

Returns

  • np.ndarray: A numpy array of bounding boxes with shape (num_masks, 4+).

masks_from_bboxes function

masks_from_bboxes(
    bboxes: np.ndarray,
    shape: ShapeType | tuple[int, int]
)

Convert bounding boxes to masks.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | One of: ShapeType, tuple[int, int] | - | Image shape (height, width). |

Returns

  • np.ndarray: A numpy array of masks with shape (num_bboxes, height, width).

normalize_bboxes function

normalize_bboxes(
    bboxes: np.ndarray,
    shape: ShapeType | tuple[int, int]
)

Normalize array of bounding boxes.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`. |
| shape | One of: ShapeType, tuple[int, int] | - | Image shape `(height, width)`. |

Returns

  • np.ndarray: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
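
Normalization divides x coordinates by the image width and y coordinates by the image height; denormalization multiplies them back. The round trip below is a minimal sketch with hypothetical function names, assuming a `(height, width)` tuple for the shape and leaving extra columns untouched.

```python
import numpy as np

def normalize_sketch(bboxes: np.ndarray, shape: tuple) -> np.ndarray:
    """Pixel corners -> [0, 1] corners; columns past the first four are kept as-is."""
    height, width = shape
    out = bboxes.astype(float)  # copy, so the caller's array is not modified
    out[:, [0, 2]] /= width
    out[:, [1, 3]] /= height
    return out

def denormalize_sketch(bboxes: np.ndarray, shape: tuple) -> np.ndarray:
    """[0, 1] corners -> pixel corners; the inverse of normalize_sketch."""
    height, width = shape
    out = bboxes.astype(float)
    out[:, [0, 2]] *= width
    out[:, [1, 3]] *= height
    return out

pixel_boxes = np.array([[50.0, 20.0, 150.0, 80.0, 3.0]])  # last column: hypothetical label
normalized = normalize_sketch(pixel_boxes, (100, 200))
print(normalized.tolist())  # [[0.25, 0.2, 0.75, 0.8, 3.0]]
restored = denormalize_sketch(normalized, (100, 200))
```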

union_of_bboxes function

union_of_bboxes(
    bboxes: np.ndarray,
    erosion_rate: float
)

Calculate the union of bounding boxes. Boxes can be in albumentations or Pascal VOC format.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| bboxes | np.ndarray | - | List of bounding boxes. |
| erosion_rate | float | - | How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its volume. |

Returns

  • np.ndarray | None: A bounding box `(x_min, y_min, x_max, y_max)` or None if no bboxes are given or if
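
One way to picture the effect of `erosion_rate` is to shrink every box toward its center before taking the enclosing rectangle. The sketch below does that under a stated assumption: each side moves inward by `erosion_rate / 2` of the box's width or height, so a rate of 1.0 collapses a box to a point. The function name is hypothetical and the library's exact erosion rule may differ.

```python
import numpy as np

def union_sketch(bboxes: np.ndarray, erosion_rate: float):
    """Enclosing box of all (eroded) boxes, or None when no boxes are given."""
    if len(bboxes) == 0:
        return None
    widths = bboxes[:, 2] - bboxes[:, 0]
    heights = bboxes[:, 3] - bboxes[:, 1]
    # Assumption: each side moves inward by half of erosion_rate times the box size.
    dx = erosion_rate * widths / 2
    dy = erosion_rate * heights / 2
    x_min = np.min(bboxes[:, 0] + dx)
    y_min = np.min(bboxes[:, 1] + dy)
    x_max = np.max(bboxes[:, 2] - dx)
    y_max = np.max(bboxes[:, 3] - dy)
    return np.array([x_min, y_min, x_max, y_max])

boxes = np.array([[10.0, 10.0, 30.0, 30.0], [20.0, 20.0, 60.0, 40.0]])
print(union_sketch(boxes, 0.0).tolist())  # [10.0, 10.0, 60.0, 40.0]
print(union_sketch(boxes, 0.5).tolist())  # [15.0, 15.0, 50.0, 35.0]
```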