albumentations.core.bbox_utils


Utilities for handling bounding box operations during image augmentation. This module provides tools for processing bounding boxes in various formats (COCO, Pascal VOC, YOLO, cxcywh), converting between coordinate systems, normalizing and denormalizing coordinates, filtering boxes based on visibility and size criteria, and performing transformations on boxes to match image augmentations. It forms the core functionality for all bounding box-related operations in the albumentations library.

BboxParams class

Parameters for bounding box transforms.

Parameters

Name | Type | Default | Description
coord_format | Literal | - | Coordinate format of bounding boxes. Should be one of: 'coco': [x_min, y_min, width, height], e.g. [97, 12, 150, 200]; 'pascal_voc': [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212]; 'albumentations': like pascal_voc but normalized in [0, 1] range, e.g. [0.2, 0.3, 0.4, 0.5]; 'yolo': [x_center, y_center, width, height] normalized in [0, 1] range, e.g. [0.1, 0.2, 0.3, 0.4]; 'cxcywh': [x_center, y_center, width, height] in pixel coordinates, e.g. [50, 50, 40, 60].
label_fields | collections.abc.Sequence[typing.Any] or None | None | List of fields that are joined with boxes, e.g., ['class_labels', 'scores']. Default: None.
bbox_type | Literal | hbb | Bounding box type. 'hbb': axis-aligned boxes with 4 coords (default); 'obb': oriented boxes with angle as the 5th coord.
min_area | float | 0.0 | Minimum area of a bounding box. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0.
min_visibility | float | 0.0 | Minimum fraction of a bounding box's area that must remain visible for the box to be kept in the result. Should be in [0.0, 1.0] range. Default: 0.0.
min_width | float | 0.0 | Minimum width of a bounding box in pixels or normalized units. Bounding boxes with width less than this value will be removed. Default: 0.0.
min_height | float | 0.0 | Minimum height of a bounding box in pixels or normalized units. Bounding boxes with height less than this value will be removed. Default: 0.0.
check_each_transform | bool | True | If True, performs checks for each dual transform. Default: True.
filter_invalid_bboxes | bool | False | If True, filters out invalid bounding boxes (e.g., boxes with negative dimensions or boxes where x_max < x_min or y_max < y_min) at the beginning of the pipeline. If clip_bboxes_on_input=True, filtering is applied after clipping. Default: False.
max_accept_ratio | float or None | None | Maximum allowed aspect ratio for bounding boxes. The aspect ratio is calculated as max(width/height, height/width), so it's always >= 1. Boxes with aspect ratio greater than this value will be filtered out. For example, if max_accept_ratio=3.0, boxes with width:height or height:width ratios greater than 3:1 will be removed. Set to None to disable aspect ratio filtering. Default: None.
clip_bboxes_on_input | bool | False | If True, clips bounding boxes to image boundaries once at pipeline start (during preprocessing). Use this to fix invalid input data (e.g., YOLO coordinates like -1e-6). For OBB, clipping is lossy: boxes with corners outside [0, 1] become axis-aligned (angle=0). False is recommended for OBB when using Affine/rotation. Default: False.
clip_after_transform | bool | True | If True, clips bounding boxes to image bounds after each transform in the augmentation pipeline. If False, boxes may temporarily go outside [0, 1] bounds. This is different from `clip_bboxes_on_input`, which only runs once before the pipeline. When True: for HBB, clips (x_min, y_min, x_max, y_max) to [0, 1]; for OBB, clips all 4 rotated corners to [0, 1] and returns a wrapping axis-aligned bounding box (angle set to 0). Default: True.

Examples

>>> from albumentations.core.bbox_utils import BboxParams
>>> # Create BboxParams for COCO format with class labels
>>> bbox_params = BboxParams(
...     coord_format='coco',
...     label_fields=['class_labels'],
...     min_area=1024,
...     min_visibility=0.1
... )

>>> # Create BboxParams that clips and filters invalid boxes
>>> bbox_params = BboxParams(
...     coord_format='pascal_voc',
...     clip_bboxes_on_input=True,
...     filter_invalid_bboxes=True
... )
>>> # Create BboxParams that filters extremely elongated boxes
>>> bbox_params = BboxParams(
...     coord_format='yolo',
...     max_accept_ratio=5.0,  # Filter boxes with aspect ratio > 5:1
...     clip_bboxes_on_input=True
... )
>>> # Create BboxParams for OBB with clipping after transforms
>>> bbox_params = BboxParams(
...     coord_format='albumentations',
...     bbox_type='obb',
...     clip_after_transform=True,  # Clip all corners inside bounds
... )
>>> # Create BboxParams with lenient clipping (allows temporary excursions)
>>> bbox_params = BboxParams(
...     coord_format='yolo',
...     clip_bboxes_on_input=True,  # Fix input errors
...     clip_after_transform=False  # Allow boxes to go outside temporarily
... )
>>> # Create BboxParams for cxcywh (center + wh in pixels)
>>> bbox_params = BboxParams(
...     coord_format='cxcywh',
...     label_fields=['class_ids'],
... )
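
The coordinate formats accepted by coord_format encode the same rectangle in different ways. The following is a minimal sketch in plain Python of how they relate, not the library's implementation. The helper names and the 400x500 image shape are hypothetical and chosen only for illustration; the COCO and Pascal VOC numbers are the ones quoted in the parameter description.

```python
def coco_to_pascal_voc(box):
    """[x_min, y_min, width, height] -> [x_min, y_min, x_max, y_max], both in pixels."""
    x_min, y_min, width, height = box
    return [x_min, y_min, x_min + width, y_min + height]


def pascal_voc_to_albumentations(box, shape):
    """Pixel corners -> corners normalized to [0, 1]; shape is (height, width)."""
    img_h, img_w = shape
    x_min, y_min, x_max, y_max = box
    return [x_min / img_w, y_min / img_h, x_max / img_w, y_max / img_h]


def albumentations_to_yolo(box):
    """Normalized corners -> normalized [x_center, y_center, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [(x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min]


def yolo_to_cxcywh(box, shape):
    """Normalized center format -> the same layout in pixel coordinates."""
    img_h, img_w = shape
    cx, cy, w, h = box
    return [cx * img_w, cy * img_h, w * img_w, h * img_h]


image_shape = (400, 500)  # hypothetical (height, width)
coco_box = [97, 12, 150, 200]

voc_box = coco_to_pascal_voc(coco_box)  # [97, 12, 247, 212]
alb_box = pascal_voc_to_albumentations(voc_box, image_shape)
yolo_box = albumentations_to_yolo(alb_box)
cxcywh_box = yolo_to_cxcywh(yolo_box, image_shape)  # approximately [172, 112, 150, 200]
```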

BboxProcessor class

Processor for bounding box transformations. This class handles the preprocessing and postprocessing of bounding boxes during an augmentation pipeline, including format conversion, validation, clipping, and filtering.

Parameters

Name | Type | Default | Description
params | BboxParams | - | Parameters that control bounding box processing. See BboxParams class for details.
additional_targets | dict[str, str] or None | None | Dictionary with additional targets to process. Keys are names of additional targets, values are their types. For example: {'bbox2': 'bboxes'} will handle 'bbox2' as another bounding box target. Default: None.

Examples

>>> import albumentations as A
>>> from albumentations.core.bbox_utils import BboxProcessor
>>> # Process COCO format bboxes with class labels
>>> params = A.BboxParams(
...     coord_format='coco',
...     label_fields=['class_labels'],
...     min_area=1024,
...     min_visibility=0.1
... )
>>> processor = BboxProcessor(params)
>>>
>>> # Process multiple bbox fields
>>> params = A.BboxParams('pascal_voc')
>>> processor = BboxProcessor(
...     params,
...     additional_targets={'bbox2': 'bboxes'}
... )

bboxes_from_masks function

Create bounding boxes from binary masks (fast version).

Parameters

Name | Type | Default | Description
masks | ndarray | - | Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask.

Returns

  • np.ndarray: An array of bounding boxes with shape (N, 4), where each row is
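
As an illustration of the idea, a tight box can be read off a binary mask from the first and last non-empty row and column. This is a sketch only, not the library's code: the helper name is hypothetical, and both the pixel-coordinate output with an exclusive maximum and the -1 placeholder for empty masks are assumptions of this example.

```python
import numpy as np


def boxes_from_binary_masks(masks: np.ndarray) -> np.ndarray:
    """Hypothetical helper: one [x_min, y_min, x_max, y_max] pixel box per mask."""
    if masks.ndim == 2:
        masks = masks[np.newaxis]  # promote a single (H, W) mask to (1, H, W)
    boxes = np.full((len(masks), 4), -1, dtype=np.int32)
    for i, mask in enumerate(masks):
        rows = np.flatnonzero(mask.any(axis=1))  # indices of rows containing foreground
        cols = np.flatnonzero(mask.any(axis=0))  # indices of columns containing foreground
        if rows.size == 0:
            continue  # empty mask: keep the -1 placeholder (assumption of this sketch)
        # Maximum coordinates are exclusive here, so width = x_max - x_min.
        boxes[i] = (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)
    return boxes


mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:4, 2:7] = 1  # foreground in rows 1..3 and columns 2..6
boxes = boxes_from_binary_masks(mask)  # [[2, 1, 7, 4]]
```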

bboxes_to_mask function

Convert bounding boxes to a single mask.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).
image_shape | tuple | - | Image shape (height, width).

Returns

  • np.ndarray: A numpy array of shape (height, width) with 1s where any bounding box is present.
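
A rough sketch of the rasterization this describes, assuming the boxes are given as pixel [x_min, y_min, x_max, y_max] with integer-valued coordinates inside the image (the reference above does not state the coordinate system). The helper name is hypothetical.

```python
import numpy as np


def boxes_to_single_mask(bboxes: np.ndarray, image_shape: tuple) -> np.ndarray:
    """Hypothetical helper: union of all boxes as one (height, width) mask."""
    height, width = image_shape
    mask = np.zeros((height, width), dtype=np.uint8)
    # Only the first four columns are coordinates; extra columns are ignored.
    for x_min, y_min, x_max, y_max in bboxes[:, :4].astype(int):
        mask[y_min:y_max, x_min:x_max] = 1  # overlapping boxes simply stay 1
    return mask


bboxes = np.array([[1, 1, 4, 3], [3, 2, 6, 5]], dtype=np.float32)
mask = boxes_to_single_mask(bboxes, (6, 8))
# The boxes cover 6 and 9 pixels and share 1 pixel, so 14 pixels are set.
```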

calculate_bbox_areas_in_pixels function

Calculate areas for multiple bounding boxes. This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored.
shape | tuple | - | A tuple containing the height and width of the image (height, width).

Returns

  • np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.

Examples

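>>> import numpy as np
>>> from albumentations.core.bbox_utils import calculate_bbox_areas_in_pixels
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])
>>> image_shape = (100, 100)
>>> areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)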
>>> print(areas)
[1600. 3600.]

Notes

  • The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
  • The function preserves the input array and creates a copy for internal calculations.
  • The returned areas are in pixel units, not normalized.
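
The arithmetic behind these notes can be sketched in a few lines of NumPy. This is an illustration under the stated input format, not the library's implementation, and the helper name is hypothetical. The second box is deliberately invalid (x_max < x_min) to show how a negative area arises.

```python
import numpy as np


def areas_in_pixels(bboxes: np.ndarray, shape: tuple) -> np.ndarray:
    """Hypothetical helper: pixel areas of normalized [x_min, y_min, x_max, y_max] boxes."""
    height, width = shape
    widths = (bboxes[:, 2] - bboxes[:, 0]) * width    # normalized width -> pixels
    heights = (bboxes[:, 3] - bboxes[:, 1]) * height  # normalized height -> pixels
    return widths * heights                           # input array is not modified


bboxes = np.array([
    [0.1, 0.1, 0.5, 0.5],  # valid box
    [0.6, 0.2, 0.4, 0.8],  # invalid box: x_max < x_min
])
areas = areas_in_pixels(bboxes, (100, 200))  # approximately [3200, -2400]
```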

check_bboxes function

Check if bounding boxes are valid.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).

clip_bboxes function

Clip bounding boxes to the image shape.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).
shape | tuple | - | The shape of the image (height, width).

Returns

  • np.ndarray: A numpy array of bounding boxes with shape (num_bboxes, 4+).
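
For axis-aligned boxes in normalized coordinates, clipping amounts to clamping the four coordinates to [0, 1] while leaving any extra columns untouched. The sketch below assumes normalized input and uses a hypothetical helper name; it does not take the image shape, unlike the documented function.

```python
import numpy as np


def clip_normalized_boxes(bboxes: np.ndarray) -> np.ndarray:
    """Hypothetical helper: clamp the first four columns to [0, 1]."""
    clipped = bboxes.copy()  # leave the caller's array untouched
    clipped[:, :4] = np.clip(clipped[:, :4], 0.0, 1.0)
    return clipped


# The last column is an extra field (e.g. a class id) and must survive clipping.
bboxes = np.array([
    [0.2, 0.3, 1.2, 0.8, 7.0],
    [-0.1, 0.0, 0.5, 1.05, 3.0],
])
clipped = clip_normalized_boxes(bboxes)
# Rows become [0.2, 0.3, 1.0, 0.8, 7.0] and [0.0, 0.0, 0.5, 1.0, 3.0].
```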

clip_bboxes_geometry function

Clip bounding boxes based on actual geometry. This function provides geometry-aware clipping that works correctly for both HBB and OBB:

  • For HBB: clips (x_min, y_min, x_max, y_max) coordinates to [0, 1] (fast path).
  • For OBB: clips all 4 rotated corners and returns an axis-aligned wrapping box with angle=0.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | Array of bounding boxes in albumentations format (normalized). Shape: (N, 4+) for HBB or (N, 5+) for OBB.
shape | tuple | - | Image shape (height, width).
bbox_type | Literal | - | Either "hbb" or "obb".

Returns

  • np.ndarray: Clipped bounding boxes. For OBB, returns (N, 5+) with angle set to 0.

Examples

>>> import numpy as np
>>> from albumentations.core.bbox_utils import clip_bboxes_geometry
>>> # HBB - simple coordinate clipping
>>> hbb = np.array([[0.2, 0.3, 1.2, 0.8]])
>>> clipped = clip_bboxes_geometry(hbb, (100, 100), "hbb")
>>> # Result: [[0.2, 0.3, 1.0, 0.8]]

>>> # OBB - clips corners and returns wrapping HBB with angle=0
>>> obb = np.array([[0.2, 0.3, 1.2, 0.8, 45.0]])  # rotated 45 degrees
>>> clipped = clip_bboxes_geometry(obb, (100, 100), "obb")
>>> # Result: [[x_min, y_min, x_max, y_max, 0.0]] - angle reset to 0

Notes

For HBB, this is equivalent to clip_bboxes() (fast coordinate clipping). For OBB, clips the 4 rotated corners and returns the axis-aligned bounding box that wraps them, with angle set to 0 since the result is axis-aligned. cv2.minAreaRect is NOT used for clipping - only for actual rotations.

convert_bboxes_from_albumentations function

Convert bounding boxes from the format used by albumentations to a specified format.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max].
target_format | Literal | - | Required format of the output bounding boxes.
shape | tuple | - | Image shape (height, width).
bbox_type | Literal | - | Bounding box type; required for cxcywh OBB conversion.
check_validity | bool | False | Check if all boxes are valid boxes.

Returns

  • np.ndarray: An array of bounding boxes in the target format with shape (num_bboxes, 4+).

convert_bboxes_to_albumentations function

Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of `(x_min, y_min, x_max, y_max)` e.g. `(0.15, 0.27, 0.67, 0.5)`.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).
source_format | Literal | - | Format of the input bounding boxes.
shape | tuple | - | Image shape (height, width).
bbox_type | Literal | - | Bounding box type; required for cxcywh OBB conversion.
check_validity | bool | False | Check if all boxes are valid boxes.

Returns

  • np.ndarray: An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).

denormalize_bboxes function

Denormalize an array of bounding boxes, converting normalized coordinates to pixel coordinates.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
shape | tuple | - | Image shape `(height, width)`.

Returns

  • np.ndarray: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.

filter_bboxes function

Remove bounding boxes whose visible fraction is below `min_visibility` or whose area in pixels is below the threshold set by `min_area`. Also clips boxes to the final image size.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).
shape | tuple | - | The shape of the image (height, width).
bbox_type | Literal | - | Type of bounding boxes. Used for geometry-aware clipping. Required parameter, no default.
min_area | float | 0.0 | Minimum area of a bounding box in pixels. Default: 0.0.
min_visibility | float | 0.0 | Minimum fraction of area for a bounding box to remain. Default: 0.0.
min_width | float | 1.0 | Minimum width of a bounding box in pixels. Default: 1.0.
min_height | float | 1.0 | Minimum height of a bounding box in pixels. Default: 1.0.
max_accept_ratio | float or None | None | Maximum allowed aspect ratio, calculated as max(width/height, height/width). Boxes with higher ratios will be filtered out. Default: None.
clip_after_transform | bool | True | If True, clip bounding boxes to image bounds (HBB: coords, OBB: corners). If False, boxes may extend outside [0, 1]. Default: True.

Returns

  • np.ndarray: Filtered bounding boxes.
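
The size-based criteria can be sketched as a boolean mask over the boxes. This is a simplified illustration, not the library's implementation: the helper name is hypothetical, it assumes normalized axis-aligned input, and it leaves out min_visibility (which needs the pre-transform areas) and clipping.

```python
import numpy as np


def filter_by_size(bboxes, shape, min_area=0.0, min_width=1.0, min_height=1.0,
                   max_accept_ratio=None):
    """Hypothetical helper: keep boxes that pass area, width, height and ratio checks."""
    height, width = shape
    w = (bboxes[:, 2] - bboxes[:, 0]) * width   # box widths in pixels
    h = (bboxes[:, 3] - bboxes[:, 1]) * height  # box heights in pixels
    keep = (w * h >= min_area) & (w >= min_width) & (h >= min_height)
    if max_accept_ratio is not None:
        # max(w/h, h/w), guarded against division by zero for degenerate boxes.
        ratio = np.maximum(w, h) / np.maximum(np.minimum(w, h), 1e-6)
        keep &= ratio <= max_accept_ratio
    return bboxes[keep]


bboxes = np.array([
    [0.1, 0.1, 0.5, 0.5],    # 40 x 40 px: passes every check
    [0.0, 0.0, 0.9, 0.1],    # 90 x 10 px: aspect ratio 9 exceeds 3.0
    [0.2, 0.2, 0.205, 0.6],  # 0.5 x 40 px: narrower than min_width
    [0.3, 0.3, 0.4, 0.4],    # 10 x 10 px: area below min_area
])
kept = filter_by_size(bboxes, (100, 100), min_area=400, max_accept_ratio=3.0)
# Only the first box remains.
```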

mask_to_bboxes function

Convert masks back to bounding boxes.

Parameters

Name | Type | Default | Description
masks | ndarray | - | A numpy array of masks with shape (num_masks, height, width).
original_bboxes | ndarray | - | Original bounding boxes with shape (num_bboxes, 4+) for HBB or (num_bboxes, 5+) for OBB.
bbox_type | Literal | - | Type of bounding box - "hbb" for axis-aligned or "obb" for oriented. Default: "hbb".

Returns

  • np.ndarray: A numpy array of bounding boxes with shape (num_masks, 4+) for HBB

masks_from_bboxes function

Convert bounding boxes to masks.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+).
shape | tuple | - | Image shape (height, width).

Returns

  • np.ndarray: A numpy array of masks with shape (num_bboxes, height, width).

normalize_bboxes function

Normalize an array of bounding boxes, converting pixel coordinates to normalized coordinates.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
shape | tuple | - | Image shape `(height, width)`.

Returns

  • np.ndarray: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
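
Normalization and denormalization are inverse scalings of the four coordinate columns by the image width (x) and height (y). The round trip below is a sketch with hypothetical helper names, not the library's code; the 400x500 image shape is chosen only for illustration.

```python
import numpy as np


def normalize(bboxes: np.ndarray, shape: tuple) -> np.ndarray:
    """Hypothetical helper: pixel corners -> corners in [0, 1]."""
    height, width = shape
    out = bboxes.astype(float)  # astype returns a copy, so the input is preserved
    out[:, [0, 2]] /= width     # x_min, x_max
    out[:, [1, 3]] /= height    # y_min, y_max
    return out


def denormalize(bboxes: np.ndarray, shape: tuple) -> np.ndarray:
    """Hypothetical helper: corners in [0, 1] -> pixel corners."""
    height, width = shape
    out = bboxes.astype(float)
    out[:, [0, 2]] *= width
    out[:, [1, 3]] *= height
    return out


image_shape = (400, 500)  # (height, width)
pixel_boxes = np.array([[97, 12, 247, 212, 3]])  # last column is an extra field
normalized = normalize(pixel_boxes, image_shape)  # approx [[0.194, 0.03, 0.494, 0.53, 3.0]]
restored = denormalize(normalized, image_shape)   # back to the pixel values
```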

obb_to_polygons function

Convert oriented bounding boxes to corner polygons (vectorized). Same convention as cv2.minAreaRect/cv2.boxPoints for consistency with polygons_to_obb. Base rect corners [-w/2,-h/2], [w/2,-h/2], [w/2,h/2], [-w/2,h/2] rotated by angle and translated to center.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | Array of shape (N, >=5) where each row is [x_min, y_min, x_max, y_max, angle_deg, ...]. Coordinate-system agnostic. Additional columns beyond the first 5 are preserved but not used.

Returns

  • np.ndarray: Array of shape (N, 4, 2) containing the corner coordinates of each
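
The corner construction described above (base rectangle corners rotated by the angle, then translated to the center) can be sketched as follows. The helper name is hypothetical. Two things are assumptions of this example rather than statements about the library: that width and height are taken as x_max - x_min and y_max - y_min, and that the rotation uses x' = x cos(a) - y sin(a), y' = x sin(a) + y cos(a).

```python
import numpy as np


def obb_corners(bboxes: np.ndarray) -> np.ndarray:
    """Hypothetical helper: (N, 5+) rows [x_min, y_min, x_max, y_max, angle_deg] -> (N, 4, 2)."""
    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    angle = np.deg2rad(bboxes[:, 4])
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    w, h = x_max - x_min, y_max - y_min

    # Base corners [-w/2,-h/2], [w/2,-h/2], [w/2,h/2], [-w/2,h/2] for every box: (N, 4, 2).
    signs = np.array([[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]])
    base = signs[None] * np.stack([w, h], axis=1)[:, None, :]

    cos, sin = np.cos(angle)[:, None], np.sin(angle)[:, None]
    x = base[..., 0] * cos - base[..., 1] * sin + cx[:, None]  # rotate, then translate
    y = base[..., 0] * sin + base[..., 1] * cos + cy[:, None]
    return np.stack([x, y], axis=-1)


bboxes = np.array([[0, 0, 4, 2, 0], [0, 0, 4, 2, 90]], dtype=float)
corners = obb_corners(bboxes)
# With angle 0 the corners are (0, 0), (4, 0), (4, 2), (0, 2).
```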

polygons_to_obb function

Fit oriented bbox from corner polygons. Uses cv2.minAreaRect only to get the 4 corners (via boxPoints). From those corners we derive (w, h, angle) with our convention: width = edge more parallel to horizontal, angle in [-90, 90). This ensures obb_to_polygons and cv2.boxPoints produce visually correct results regardless of minAreaRect's internal (w,h,angle) representation. The function is coordinate-system agnostic - it preserves the input coordinate system.

Parameters

Name | Type | Default | Description
polygons | ndarray | - | Array of shape (N, 4, 2) with corners in any coordinate system.
extra_fields | numpy.ndarray or None | None | Optional array (N, M) to append after bbox coords + angle.

Returns

  • Array of OBB bounding boxes in the same coordinate system as input polygons.

union_of_bboxes function

Calculate the union of bounding boxes. Boxes can be in albumentations or Pascal VOC format.

Parameters

Name | Type | Default | Description
bboxes | ndarray | - | List of bounding boxes.
erosion_rate | float | - | How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its area.

Returns

  • np.ndarray | None: A bounding box `(x_min, y_min, x_max, y_max)` or None if no bboxes are given or if
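
One way to picture the effect of erosion_rate is to shrink every box symmetrically before taking the enclosing rectangle. The sketch below is an assumption about how such erosion could be applied, with a hypothetical helper name; it is not taken from the library source and only covers the "no boxes given" case for the None return.

```python
import numpy as np


def union_box(bboxes: np.ndarray, erosion_rate: float = 0.0):
    """Hypothetical helper: enclosing box of all (optionally shrunk) boxes, or None."""
    if len(bboxes) == 0:
        return None
    x_min, y_min, x_max, y_max = bboxes[:, 0], bboxes[:, 1], bboxes[:, 2], bboxes[:, 3]
    # Each side moves inward by half of erosion_rate times the box extent.
    shrink_x = 0.5 * erosion_rate * (x_max - x_min)
    shrink_y = 0.5 * erosion_rate * (y_max - y_min)
    return np.array([
        (x_min + shrink_x).min(),
        (y_min + shrink_y).min(),
        (x_max - shrink_x).max(),
        (y_max - shrink_y).max(),
    ])


bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.4, 0.3, 0.9, 0.7]])
plain = union_box(bboxes)                     # approx [0.1, 0.1, 0.9, 0.7]
eroded = union_box(bboxes, erosion_rate=0.5)  # approx [0.2, 0.2, 0.775, 0.6]
```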