albumentations.core.bbox_utils
Utilities for handling bounding box operations during image augmentation. This module provides tools for processing bounding boxes in various formats (COCO, Pascal VOC, YOLO, cxcywh), converting between coordinate systems, normalizing and denormalizing coordinates, filtering boxes based on visibility and size criteria, and performing transformations on boxes to match image augmentations. It forms the core functionality for all bounding box-related operations in the albumentations library.
Members
- class BboxParams
- class BboxProcessor
- function bboxes_from_masks
- function bboxes_to_mask
- function calculate_bbox_areas_in_pixels
- function check_bboxes
- function clip_bboxes
- function clip_bboxes_geometry
- function convert_bboxes_from_albumentations
- function convert_bboxes_to_albumentations
- function denormalize_bboxes
- function filter_bboxes
- function mask_to_bboxes
- function masks_from_bboxes
- function normalize_bboxes
- function obb_to_polygons
- function polygons_to_obb
- function union_of_bboxes
BboxParams class
Parameters for bounding box transforms.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| coord_format | Literal | - | Coordinate format of bounding boxes. Should be one of: - 'coco': [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. - 'pascal_voc': [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212]. - 'albumentations': like pascal_voc but normalized in [0, 1] range, e.g. [0.2, 0.3, 0.4, 0.5]. - 'yolo': [x_center, y_center, width, height] normalized in [0, 1] range, e.g. [0.1, 0.2, 0.3, 0.4]. - 'cxcywh': [x_center, y_center, width, height] in pixel coordinates, e.g. [50, 50, 40, 60]. |
| label_fields | One of: | None | List of fields that are joined with boxes, e.g., ['class_labels', 'scores']. Default: None. |
| bbox_type | Literal | hbb | Bounding box type. - 'hbb': axis-aligned boxes with 4 coords (default). - 'obb': oriented boxes with angle as the 5th coord. |
| min_area | float | 0.0 | Minimum area of a bounding box. All bounding boxes whose visible area in pixels is less than this value will be removed. Default: 0.0. |
| min_visibility | float | 0.0 | Minimum fraction of a bounding box's area that must remain visible for the box to be kept in the result. Should be in [0.0, 1.0] range. Default: 0.0. |
| min_width | float | 0.0 | Minimum width of a bounding box in pixels or normalized units. Bounding boxes with width less than this value will be removed. Default: 0.0. |
| min_height | float | 0.0 | Minimum height of a bounding box in pixels or normalized units. Bounding boxes with height less than this value will be removed. Default: 0.0. |
| check_each_transform | bool | True | If True, performs checks for each dual transform. Default: True. |
| filter_invalid_bboxes | bool | False | If True, filters out invalid bounding boxes (e.g., boxes with negative dimensions or boxes where x_max < x_min or y_max < y_min) at the beginning of the pipeline. If clip_bboxes_on_input=True, filtering is applied after clipping. Default: False. |
| max_accept_ratio | One of: | None | Maximum allowed aspect ratio for bounding boxes. The aspect ratio is calculated as max(width/height, height/width), so it's always >= 1. Boxes with aspect ratio greater than this value will be filtered out. For example, if max_accept_ratio=3.0, boxes with width:height or height:width ratios greater than 3:1 will be removed. Set to None to disable aspect ratio filtering. Default: None. |
| clip_bboxes_on_input | bool | False | If True, clips bounding boxes to image boundaries once at pipeline start (during preprocessing). Use this to fix invalid input data (e.g., YOLO coordinates like -1e-6). For OBB, clipping is lossy: boxes with corners outside [0, 1] become axis-aligned (angle=0). False is recommended for OBB when using Affine/rotation. Default: False. |
| clip_after_transform | bool | True | If True, clip bounding boxes to image bounds AFTER EACH TRANSFORM in the augmentation pipeline. If False, boxes may temporarily go outside [0, 1] bounds. This is different from `clip_bboxes_on_input` which only runs once before the pipeline. When True: for HBB, clips (x_min, y_min, x_max, y_max) to [0, 1]; for OBB, clips all 4 rotated corners to [0, 1] and returns a wrapping axis-aligned bounding box (angle set to 0). Default: True. |
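The `max_accept_ratio` rule above can be illustrated with a minimal plain-Python sketch. It is not the library's implementation: `aspect_ratio` and `keep_by_aspect_ratio` are hypothetical helper names, and the boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples.

```python
def aspect_ratio(width: float, height: float) -> float:
    """Symmetric aspect ratio max(w/h, h/w); always >= 1 for positive sizes."""
    return max(width / height, height / width)


def keep_by_aspect_ratio(boxes, max_accept_ratio):
    """Keep (x_min, y_min, x_max, y_max) boxes whose ratio is within the limit.

    Hypothetical helper for illustration only; None disables the filter.
    """
    if max_accept_ratio is None:
        return list(boxes)
    kept = []
    for x_min, y_min, x_max, y_max in boxes:
        w, h = x_max - x_min, y_max - y_min
        # Degenerate boxes have no defined ratio, so this sketch drops them.
        if w > 0 and h > 0 and aspect_ratio(w, h) <= max_accept_ratio:
            kept.append((x_min, y_min, x_max, y_max))
    return kept


boxes = [(0, 0, 100, 50), (0, 0, 400, 100), (10, 10, 20, 110)]
kept = keep_by_aspect_ratio(boxes, max_accept_ratio=3.0)
```

With a limit of 3.0, only the first box (ratio 2:1) survives; the 4:1 and 1:10 boxes are removed because the ratio is symmetric in width and height.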
Examples
>>> # Create BboxParams for COCO format with class labels
>>> bbox_params = BboxParams(
... coord_format='coco',
... label_fields=['class_labels'],
... min_area=1024,
... min_visibility=0.1
... )
>>> # Create BboxParams that clips and filters invalid boxes
>>> bbox_params = BboxParams(
... coord_format='pascal_voc',
... clip_bboxes_on_input=True,
... filter_invalid_bboxes=True
... )
>>> # Create BboxParams that filters extremely elongated boxes
>>> bbox_params = BboxParams(
... coord_format='yolo',
... max_accept_ratio=5.0, # Filter boxes with aspect ratio > 5:1
... clip_bboxes_on_input=True
... )
>>> # Create BboxParams for OBB with clipping after transforms
>>> bbox_params = BboxParams(
... coord_format='albumentations',
... bbox_type='obb',
... clip_after_transform=True, # Clip all corners inside bounds
... )
>>> # Create BboxParams with lenient clipping (allows temporary excursions)
>>> bbox_params = BboxParams(
... coord_format='yolo',
... clip_bboxes_on_input=True, # Fix input errors
... clip_after_transform=False # Allow boxes to go outside temporarily
... )
>>> # Create BboxParams for cxcywh (center + wh in pixels)
>>> bbox_params = BboxParams(
... coord_format='cxcywh',
... label_fields=['class_ids'],
... )
BboxProcessor class
Processor for bounding box transformations. This class handles the preprocessing and postprocessing of bounding boxes during an augmentation pipeline, including format conversion, validation, clipping, and filtering.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| params | BboxParams | - | Parameters that control bounding box processing. See BboxParams class for details. |
| additional_targets | One of: | None | Dictionary with additional targets to process. Keys are names of additional targets, values are their types. For example: {'bbox2': 'bboxes'} will handle 'bbox2' as another bounding box target. Default: None. |
Examples
>>> import albumentations as A
>>> # Process COCO format bboxes with class labels
>>> params = A.BboxParams(
... coord_format='coco',
... label_fields=['class_labels'],
... min_area=1024,
... min_visibility=0.1
... )
>>> processor = BboxProcessor(params)
>>>
>>> # Process multiple bbox fields
>>> params = A.BboxParams('pascal_voc')
>>> processor = BboxProcessor(
... params,
... additional_targets={'bbox2': 'bboxes'}
... )
bboxes_from_masks function
Create bounding boxes from binary masks (fast version).
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| masks | ndarray | - | Binary masks of shape (H, W) or (N, H, W) where N is the number of masks, and H, W are the height and width of each mask. |
Returns
- np.ndarray: An array of bounding boxes with shape (N, 4), where each row is
bboxes_to_mask function
Convert bounding boxes to a single mask.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| image_shape | tuple | - | Image shape (height, width). |
Returns
- np.ndarray: A numpy array of shape (height, width) with 1s where any bounding box is present.
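A minimal plain-Python sketch of the idea, using nested lists instead of a NumPy array. It is not the library's code: `boxes_to_mask` is a hypothetical name, and the sketch assumes integer pixel coordinates with exclusive `x_max`/`y_max`, which the text above does not specify.

```python
def boxes_to_mask(boxes, image_shape):
    """Rasterize boxes into one (height, width) mask of 0s and 1s.

    Assumption: boxes are (x_min, y_min, x_max, y_max, ...) in pixels,
    with x_max and y_max exclusive. Extra columns are ignored.
    """
    height, width = image_shape
    mask = [[0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max, *_ in boxes:
        # Clamp the box to the image so out-of-range boxes do not raise.
        x0, x1 = max(0, int(x_min)), min(width, int(x_max))
        y0, y1 = max(0, int(y_min)), min(height, int(y_max))
        for y in range(y0, y1):
            for x in range(x0, x1):
                mask[y][x] = 1
    return mask


mask = boxes_to_mask([(1, 1, 3, 3), (2, 2, 5, 4)], image_shape=(5, 6))
```

Overlapping boxes simply write 1 into the same cells, so the result marks where any box is present rather than how many boxes cover a pixel.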
calculate_bbox_areas_in_pixels function
Calculate areas for multiple bounding boxes. This function computes the areas of bounding boxes given their normalized coordinates and the dimensions of the image they belong to. The bounding boxes are expected to be in the format [x_min, y_min, x_max, y_max] with normalized coordinates (0 to 1).
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of shape (N, 4+) where N is the number of bounding boxes. Each row contains [x_min, y_min, x_max, y_max] in normalized coordinates. Additional columns beyond the first 4 are ignored. |
| shape | tuple | - | A tuple containing the height and width of the image (height, width). |
Returns
- np.ndarray: A 1D numpy array of shape (N,) containing the areas of the bounding boxes in pixels.
Examples
>>> import numpy as np
>>> bboxes = np.array([[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.8, 0.8]])
>>> image_shape = (100, 100)
>>> areas = calculate_bbox_areas_in_pixels(bboxes, image_shape)
>>> print(areas)
[1600. 3600.]
Notes
- The function assumes that the input bounding boxes are valid (i.e., x_max > x_min and y_max > y_min). Invalid bounding boxes may result in negative areas.
- The function preserves the input array and creates a copy for internal calculations.
- The returned areas are in pixel units, not normalized.
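The arithmetic behind the example can be sketched in plain Python. This is an illustration of the formula, not the library's implementation, and `bbox_areas_in_pixels` is a hypothetical name.

```python
def bbox_areas_in_pixels(boxes, shape):
    """Pixel areas of normalized (x_min, y_min, x_max, y_max, ...) boxes."""
    height, width = shape
    areas = []
    for x_min, y_min, x_max, y_max, *_ in boxes:
        # Scale each side to pixels, then multiply.
        areas.append((x_max - x_min) * width * (y_max - y_min) * height)
    return areas


areas = bbox_areas_in_pixels(
    [(0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.8, 0.8)], (100, 100)
)
# An inverted box (x_max < x_min) yields a negative area, as the notes warn.
inverted = bbox_areas_in_pixels([(0.5, 0.1, 0.1, 0.5)], (100, 100))
```

Up to floating-point rounding, `areas` matches the 1600 and 3600 pixel values shown in the example above.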
check_bboxes function
Check if bounding boxes are valid.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
clip_bboxes function
Clip bounding boxes to the image shape.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | tuple | - | The shape of the image (height, width). |
Returns
- np.ndarray: A numpy array of bounding boxes with shape (num_bboxes, 4+).
clip_bboxes_geometry function
Clip bounding boxes based on actual geometry. This function provides geometry-aware clipping that works correctly for both HBB and OBB:
- For HBB: clips (x_min, y_min, x_max, y_max) coordinates to [0, 1] (fast path)
- For OBB: clips all 4 rotated corners and returns axis-aligned wrapping box with angle=0
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | Array of bounding boxes in albumentations format (normalized). Shape: (N, 4+) for HBB or (N, 5+) for OBB. |
| shape | tuple | - | Image shape (height, width). |
| bbox_type | Literal | - | Either "hbb" or "obb". |
Returns
- np.ndarray: Clipped bounding boxes. For OBB, returns (N, 5+) with angle set to 0.
Examples
>>> # HBB - simple coordinate clipping
>>> hbb = np.array([[0.2, 0.3, 1.2, 0.8]])
>>> clipped = clip_bboxes_geometry(hbb, (100, 100), "hbb")
>>> # Result: [[0.2, 0.3, 1.0, 0.8]]
>>> # OBB - clips corners and returns wrapping HBB with angle=0
>>> obb = np.array([[0.2, 0.3, 1.2, 0.8, 45.0]]) # rotated 45 degrees
>>> clipped = clip_bboxes_geometry(obb, (100, 100), "obb")
>>> # Result: [[x_min, y_min, x_max, y_max, 0.0]] - angle reset to 0
Notes
For HBB, this is equivalent to clip_bboxes() (fast coordinate clipping). For OBB, clips the 4 rotated corners and returns the axis-aligned bounding box that wraps them, with angle set to 0 since the result is axis-aligned. cv2.minAreaRect is NOT used for clipping - only for actual rotations.
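The OBB path can be sketched in plain Python under several assumptions that are not confirmed by the text: a square image (so rotating in normalized space is meaningful), a counterclockwise-positive angle in the coordinate frame used here, and "clipping the corners" read as clamping each corner to [0, 1]. `obb_corners` and `clip_obb` are hypothetical names.

```python
import math


def obb_corners(x_min, y_min, x_max, y_max, angle_deg):
    """Corners of the base rect rotated by angle_deg about the box center."""
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    w, h = x_max - x_min, y_max - y_min
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    base = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in base
    ]


def clip_obb(box):
    """Clamp rotated corners to [0, 1]; return the wrapping box with angle 0."""
    corners = obb_corners(*box[:5])
    xs = [min(max(x, 0.0), 1.0) for x, _ in corners]
    ys = [min(max(y, 0.0), 1.0) for _, y in corners]
    # The result is axis-aligned, so the angle is reset to 0.
    return (min(xs), min(ys), max(xs), max(ys), 0.0) + tuple(box[5:])


clipped = clip_obb((0.2, 0.3, 1.2, 0.8, 45.0))
```

The original orientation cannot be recovered from `clipped`, which is the sense in which OBB clipping is lossy.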
convert_bboxes_from_albumentations function
Convert bounding boxes from the format used by albumentations to a specified format.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of albumentations bounding boxes with shape (num_bboxes, 4+). The first 4 columns are [x_min, y_min, x_max, y_max]. |
| target_format | Literal | - | Required format of the output bounding boxes. |
| shape | tuple | - | Image shape (height, width). |
| bbox_type | Literal | - | Bounding box type; required for cxcywh OBB conversion. |
| check_validity | bool | False | Check if all boxes are valid boxes. |
Returns
- np.ndarray: An array of bounding boxes in the target format with shape (num_bboxes, 4+).
convert_bboxes_to_albumentations function
Convert bounding boxes from a specified format to the format used by albumentations: normalized coordinates of top-left and bottom-right corners of the bounding box in the form of `(x_min, y_min, x_max, y_max)` e.g. `(0.15, 0.27, 0.67, 0.5)`.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| source_format | Literal | - | Format of the input bounding boxes. |
| shape | tuple | - | Image shape (height, width). |
| bbox_type | Literal | - | Bounding box type; required for cxcywh OBB conversion. |
| check_validity | bool | False | Check if all boxes are valid boxes. |
Returns
- np.ndarray: An array of bounding boxes in albumentations format with shape (num_bboxes, 4+).
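The per-format arithmetic can be sketched in plain Python for a single axis-aligned box. This is an illustration based on the format definitions given for `coord_format`, not the library's implementation; `to_albumentations` is a hypothetical name and the image sizes are made up for the example.

```python
def to_albumentations(box, source_format, shape):
    """Convert one HBB to normalized (x_min, y_min, x_max, y_max, ...)."""
    height, width = shape
    a, b, c, d = box[:4]
    extra = tuple(box[4:])
    if source_format == "albumentations":
        return (a, b, c, d) + extra
    if source_format == "yolo":
        # Already normalized: center + size -> corners.
        return (a - c / 2, b - d / 2, a + c / 2, b + d / 2) + extra
    if source_format == "coco":
        x_min, y_min, x_max, y_max = a, b, a + c, b + d
    elif source_format == "pascal_voc":
        x_min, y_min, x_max, y_max = a, b, c, d
    elif source_format == "cxcywh":
        x_min, y_min, x_max, y_max = a - c / 2, b - d / 2, a + c / 2, b + d / 2
    else:
        raise ValueError(f"Unknown format: {source_format}")
    # Pixel formats: divide x by width and y by height.
    return (x_min / width, y_min / height, x_max / width, y_max / height) + extra


shape = (400, 500)  # hypothetical (height, width)
from_coco = to_albumentations([97, 12, 150, 200], "coco", shape)
from_voc = to_albumentations([97, 12, 247, 212], "pascal_voc", shape)
from_cxcywh = to_albumentations([50, 50, 40, 60], "cxcywh", (100, 100))
from_yolo = to_albumentations([0.5, 0.5, 0.2, 0.4], "yolo", shape)
```

The COCO and Pascal VOC inputs describe the same box, so `from_coco` and `from_voc` come out identical.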
denormalize_bboxes function
Denormalize array of bounding boxes.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`. |
| shape | tuple | - | Image shape `(height, width)`. |
Returns
- np.ndarray: Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
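As a rough illustration (plain Python, not the library's code), denormalization multiplies x coordinates by the image width and y coordinates by the height, and normalization divides them back. `denormalize` and `normalize` are hypothetical single-box helpers.

```python
def denormalize(box, shape):
    """Normalized (x_min, y_min, x_max, y_max, ...) -> pixel coordinates."""
    height, width = shape
    x_min, y_min, x_max, y_max, *rest = box
    return (x_min * width, y_min * height, x_max * width, y_max * height, *rest)


def normalize(box, shape):
    """Pixel (x_min, y_min, x_max, y_max, ...) -> normalized coordinates."""
    height, width = shape
    x_min, y_min, x_max, y_max, *rest = box
    return (x_min / width, y_min / height, x_max / width, y_max / height, *rest)


shape = (200, 400)  # (height, width)
pixels = denormalize((0.25, 0.5, 0.75, 1.0, "cat"), shape)
round_trip = normalize(pixels, shape)
```

Extra columns such as labels pass through unchanged, matching the `...` in the signatures above.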
filter_bboxes function
Remove bounding boxes whose visible fraction of area is below `min_visibility` or whose area in pixels is below `min_area`. Also clips boxes to the final image size when `clip_after_transform` is True.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | tuple | - | The shape of the image (height, width). |
| bbox_type | Literal | - | Type of bounding boxes. Used for geometry-aware clipping. Required parameter, no default. |
| min_area | float | 0.0 | Minimum area of a bounding box in pixels. Default: 0.0. |
| min_visibility | float | 0.0 | Minimum fraction of area for a bounding box to remain. Default: 0.0. |
| min_width | float | 1.0 | Minimum width of a bounding box in pixels. Default: 1.0. |
| min_height | float | 1.0 | Minimum height of a bounding box in pixels. Default: 1.0. |
| max_accept_ratio | One of: | None | Maximum allowed aspect ratio, calculated as max(width/height, height/width). Boxes with higher ratios will be filtered out. Default: None. |
| clip_after_transform | bool | True | If True, clip bounding boxes to image bounds (HBB: coords, OBB: corners). If False, boxes may extend outside [0, 1]. Default: True. |
Returns
- np.ndarray: Filtered bounding boxes.
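A simplified plain-Python sketch of the area and visibility checks for axis-aligned boxes. It assumes visibility is the clipped area divided by the unclipped area, which is one reading of the description rather than a confirmed detail, and it omits the width, height, and aspect-ratio checks. `filter_boxes` is a hypothetical name.

```python
def filter_boxes(boxes, shape, min_area=0.0, min_visibility=0.0):
    """Clip normalized HBBs to [0, 1] and drop small or mostly hidden ones."""
    height, width = shape
    kept = []
    for x_min, y_min, x_max, y_max, *rest in boxes:
        full_area = (x_max - x_min) * (y_max - y_min)
        if full_area <= 0:
            continue  # Degenerate box: nothing to keep.
        cx_min, cy_min = max(x_min, 0.0), max(y_min, 0.0)
        cx_max, cy_max = min(x_max, 1.0), min(y_max, 1.0)
        clipped_area = max(cx_max - cx_min, 0.0) * max(cy_max - cy_min, 0.0)
        visibility = clipped_area / full_area
        area_px = clipped_area * width * height
        if visibility < min_visibility or area_px < min_area:
            continue
        kept.append((cx_min, cy_min, cx_max, cy_max, *rest))
    return kept


boxes = [
    (0.1, 0.1, 0.5, 0.5, "a"),    # fully visible, about 1600 px on 100x100
    (0.9, 0.2, 1.3, 0.6, "b"),    # only about a quarter inside the image
    (0.0, 0.0, 0.05, 0.05, "c"),  # fully visible but only about 25 px
]
strict = filter_boxes(boxes, (100, 100), min_area=100, min_visibility=0.5)
lenient = filter_boxes(boxes, (100, 100))
```

With the strict thresholds only box "a" remains; with the defaults all three are kept and box "b" is clipped at the right image edge.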
mask_to_bboxes function
Convert masks back to bounding boxes.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| masks | ndarray | - | A numpy array of masks with shape (num_masks, height, width). |
| original_bboxes | ndarray | - | Original bounding boxes with shape (num_bboxes, 4+) for HBB or (num_bboxes, 5+) for OBB. |
| bbox_type | Literal | - | Type of bounding box - "hbb" for axis-aligned or "obb" for oriented. Default: "hbb". |
Returns
- np.ndarray: A numpy array of bounding boxes with shape (num_masks, 4+) for HBB
masks_from_bboxes function
Convert bounding boxes to masks.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | A numpy array of bounding boxes with shape (num_bboxes, 4+). |
| shape | tuple | - | Image shape (height, width). |
Returns
- np.ndarray: A numpy array of masks with shape (num_bboxes, height, width).
normalize_bboxes function
Normalize denormalized bounding boxes.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | Denormalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`. |
| shape | tuple | - | Image shape `(height, width)`. |
Returns
- np.ndarray: Normalized bounding boxes `[(x_min, y_min, x_max, y_max, ...)]`.
obb_to_polygons function
Convert oriented bounding boxes to corner polygons (vectorized). Same convention as cv2.minAreaRect/cv2.boxPoints for consistency with polygons_to_obb. Base rect corners [-w/2,-h/2], [w/2,-h/2], [w/2,h/2], [-w/2,h/2] rotated by angle and translated to center.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | Array of shape (N, >=5) where each row is [x_min, y_min, x_max, y_max, angle_deg, ...]. Coordinate-system agnostic. Additional columns beyond the first 5 are preserved but not used. |
Returns
- np.ndarray: Array of shape (N, 4, 2) containing the corner coordinates of each
polygons_to_obb function
Fit oriented bbox from corner polygons. Uses cv2.minAreaRect only to get the 4 corners (via boxPoints). From those corners we derive (w, h, angle) with our convention: width = edge more parallel to horizontal, angle in [-90, 90). This ensures obb_to_polygons and cv2.boxPoints produce visually correct results regardless of minAreaRect's internal (w,h,angle) representation. The function is coordinate-system agnostic - it preserves the input coordinate system.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| polygons | ndarray | - | Array of shape (N, 4, 2) with corners in any coordinate system. |
| extra_fields | One of: | None | Optional array (N, M) to append after bbox coords + angle. |
Returns
- : Array of OBB bounding boxes in the same coordinate system as input polygons.
union_of_bboxes function
Calculate union of bounding boxes. Boxes could be in albumentations or Pascal VOC format.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| bboxes | ndarray | - | List of bounding boxes |
| erosion_rate | float | - | How much each bounding box can be shrunk, useful for erosive cropping. Set this in range [0, 1]. 0 will not be erosive at all, 1.0 can make any bbox lose its area. |
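One plausible reading of `erosion_rate`, sketched in plain Python: each box is shrunk toward its center by that fraction of its size before the enclosing box is taken. Whether the library erodes each box or the final union is an assumption here, and `union_of_boxes` is a hypothetical name.

```python
def union_of_boxes(boxes, erosion_rate=0.0):
    """Smallest box enclosing all (optionally eroded) input boxes, or None."""
    if not boxes:
        return None
    x1s, y1s, x2s, y2s = [], [], [], []
    for x_min, y_min, x_max, y_max, *_ in boxes:
        # Shrink each side by half of erosion_rate times the box size.
        dx = (x_max - x_min) * erosion_rate / 2
        dy = (y_max - y_min) * erosion_rate / 2
        x1s.append(x_min + dx)
        y1s.append(y_min + dy)
        x2s.append(x_max - dx)
        y2s.append(y_max - dy)
    return (min(x1s), min(y1s), max(x2s), max(y2s))


boxes = [(10, 10, 50, 50), (30, 20, 90, 60)]
plain = union_of_boxes(boxes)
eroded = union_of_boxes(boxes, erosion_rate=0.5)
```

Under this reading, `erosion_rate=1.0` collapses every box to its center point, which is consistent with the description of the upper end of the range.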
Returns
- np.ndarray | None: A bounding box `(x_min, y_min, x_max, y_max)` or None if no bboxes are given or if