albumentations.augmentations.mixing.functional
Functional implementations for image mixing operations. This module provides utility functions for blending and combining images, such as copy-and-paste operations with masking.
Members
- class ProcessedMosaicItem
- function copy_and_paste_blend
- function calculate_mosaic_center_point
- function calculate_cell_placements
- function filter_valid_metadata
- function assign_items_to_grid_cells
- function preprocess_selected_mosaic_items
- function get_opposite_crop_coords
- function process_cell_geometry
- function shift_cell_coordinates
- function assemble_mosaic_from_processed_cells
- function process_all_mosaic_geometries
- function get_cell_relative_position
- function shift_all_coordinates
ProcessedMosaicItem class
ProcessedMosaicItem()
A single mosaic item (primary or additional) after preprocessing: the image, an optional mask, and the preprocessed bboxes and keypoints.
copy_and_paste_blend function
copy_and_paste_blend(
base_image: np.ndarray,
overlay_image: np.ndarray,
overlay_mask: np.ndarray,
offset: tuple[int, int]
)
Copy pixels from the overlay image onto the base image wherever the mask is non-zero (mask > 0). The overlay is placed at the given (y, x) offset from the top-left corner of the base image. The result has the same shape as base_image; overlay_image and overlay_mask must match each other in size.
Returns: np.ndarray: The blended image with the overlay applied to the base image.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| base_image | np.ndarray | - | The destination image that will be modified. |
| overlay_image | np.ndarray | - | The source image containing pixels to copy. |
| overlay_mask | np.ndarray | - | Binary mask indicating which pixels to copy from the overlay. Pixels are copied where mask > 0. |
| offset | tuple[int, int] | - | The (y, x) offset specifying where to place the top-left corner of the overlay relative to the base image. |
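The masked copy described above can be illustrated with a few lines of NumPy. This is a minimal sketch, not the library's implementation: the helper name `paste_where_masked` is hypothetical, the overlay is assumed to fit entirely inside the base image at the given offset, and the sketch returns a copy rather than modifying its input.

```python
import numpy as np


def paste_where_masked(base, overlay, mask, offset):
    """Hypothetical helper: copy overlay pixels onto a copy of base where mask > 0."""
    y0, x0 = offset  # offset is (y, x), as in the documented signature
    h, w = overlay.shape[:2]
    out = base.copy()  # design choice of this sketch: leave the input untouched
    region = out[y0:y0 + h, x0:x0 + w]  # a view into `out`, so writes land in `out`
    selected = mask > 0
    region[selected] = overlay[selected]
    return out


base = np.zeros((4, 4), dtype=np.uint8)
overlay = np.full((2, 2), 9, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
result = paste_where_masked(base, overlay, mask, offset=(1, 2))
```

Only the two masked pixels, at (row 1, col 2) and (row 2, col 3), take the overlay value; everything else keeps the base value.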
calculate_mosaic_center_point function
calculate_mosaic_center_point(
grid_yx: tuple[int, int],
cell_shape: tuple[int, int],
target_size: tuple[int, int],
center_range: tuple[float, float],
py_random: random.Random
)
Compute the mosaic crop center by sampling within the valid zone, so that a crop of target_size overlaps all grid cells. Randomness is applied through center_range and py_random, proportionally within the valid region where the center can lie.
Returns: tuple[int, int]: The calculated (x, y) center point relative to the top-left of the conceptual large grid.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| grid_yx | tuple[int, int] | - | The (rows, cols) of the mosaic grid. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| target_size | tuple[int, int] | - | The final output (height, width). |
| center_range | tuple[float, float] | - | Range within [0.0, 1.0] for sampling the center proportionally within the valid zone. |
| py_random | random.Random | - | Random state instance. |
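One way to read "proportional sampling within the valid zone" is sketched below. The function name `sample_center` is hypothetical, and the sketch assumes the valid zone is the set of centers that keep the crop window inside the conceptual large grid; the library's exact bounds may differ.

```python
import random


def sample_center(grid_yx, cell_shape, target_size, center_range, rng):
    """Hypothetical sketch: pick an (x, y) crop center inside an assumed valid zone."""
    rows, cols = grid_yx
    cell_h, cell_w = cell_shape
    target_h, target_w = target_size
    grid_h, grid_w = rows * cell_h, cols * cell_w

    # Assumed valid zone: centers for which the crop window stays inside the grid.
    min_x, max_x = target_w // 2, grid_w - (target_w + 1) // 2
    min_y, max_y = target_h // 2, grid_h - (target_h + 1) // 2
    span_x = max(0, max_x - min_x)
    span_y = max(0, max_y - min_y)

    # center_range selects a relative position (0.0 to 1.0) inside that zone.
    rel_x = rng.uniform(*center_range)
    rel_y = rng.uniform(*center_range)
    return min_x + int(span_x * rel_x), min_y + int(span_y * rel_y)


center = sample_center((2, 2), (100, 100), (100, 100), (0.3, 0.7), random.Random(0))
```

With these inputs the assumed valid zone spans 50 to 150 on each axis, so a center_range of (0.3, 0.7) confines the sampled center to roughly the 80 to 120 band around the grid's midpoint.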
calculate_cell_placements function
calculate_cell_placements(
grid_yx: tuple[int, int],
cell_shape: tuple[int, int],
target_size: tuple[int, int],
center_xy: tuple[int, int]
)
Compute cell placements by clipping the grid lines to the crop window.
Returns: list[tuple[int, int, int, int]]: A list containing placement coordinates `(x_min, y_min, x_max, y_max)` for each resulting cell part on the final output canvas.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| grid_yx | tuple[int, int] | - | The (rows, cols) of the mosaic grid. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| target_size | tuple[int, int] | - | The final output (height, width). |
| center_xy | tuple[int, int] | - | The calculated (x, y) center of the final crop window, relative to the top-left of the conceptual large grid. |
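The clipping idea can be sketched in plain Python. The name `clip_grid_to_window` is hypothetical, and the sketch assumes the crop window is centered on center_xy with integer halving; it illustrates the technique rather than reproducing the library code.

```python
def clip_grid_to_window(grid_yx, cell_shape, target_size, center_xy):
    """Hypothetical sketch: clip grid lines to the crop window, return cell rectangles."""
    rows, cols = grid_yx
    cell_h, cell_w = cell_shape
    target_h, target_w = target_size
    center_x, center_y = center_xy

    # Crop window in large-grid coordinates (assumed to be centered on center_xy).
    crop_x_min = center_x - target_w // 2
    crop_y_min = center_y - target_h // 2
    crop_x_max = crop_x_min + target_w
    crop_y_max = crop_y_min + target_h

    def clipped_lines(count, step, lo, hi):
        # Clamp each grid line into [lo, hi], drop duplicates, shift to canvas coordinates.
        return sorted({min(max(i * step, lo), hi) - lo for i in range(count + 1)})

    xs = clipped_lines(cols, cell_w, crop_x_min, crop_x_max)
    ys = clipped_lines(rows, cell_h, crop_y_min, crop_y_max)

    # Consecutive line pairs bound the visible part of each cell.
    return [
        (x0, y0, x1, y1)
        for y0, y1 in zip(ys, ys[1:])
        for x0, x1 in zip(xs, xs[1:])
    ]


placements = clip_grid_to_window((2, 2), (100, 100), (100, 100), (100, 100))
# placements == [(0, 0, 50, 50), (50, 0, 100, 50), (0, 50, 50, 100), (50, 50, 100, 100)]
```

Because duplicate clipped lines are removed, cells that fall entirely outside the crop window produce no placement.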
filter_valid_metadata function
filter_valid_metadata(
metadata_input: Sequence[dict[str, Any]] | None,
metadata_key_name: str,
data: dict[str, Any]
)
Filter the metadata dicts down to those compatible with the primary data (image/mask dimensions and channels). Each item is checked with _check_data_compatibility; invalid items are skipped with a warning.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| metadata_input | Sequence[dict[str, Any]] \| None | - | - |
| metadata_key_name | str | - | - |
| data | dict[str, Any] | - | - |
assign_items_to_grid_cells function
assign_items_to_grid_cells(
num_items: int,
cell_placements: list[tuple[int, int, int, int]],
py_random: random.Random
)
Assign the primary item (index 0) to the placement with the largest area, and assign the remaining items (indices 1 to num_items-1) randomly to the remaining placements.
Returns: dict[tuple[int, int, int, int], int]: Dict mapping placement coords (x1, y1, x2, y2) to the assigned item index.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| num_items | int | - | The total number of items to assign (primary + additional + replicas). |
| cell_placements | list[tuple[int, int, int, int]] | - | List of placement coords (x1, y1, x2, y2) for cells to be filled. |
| py_random | random.Random | - | Random state instance. |
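The assignment rule can be illustrated as follows. The helper `assign_items` is a hypothetical stand-in written for this example; how the library breaks ties between equally large placements, or handles more placements than items, is not specified here and is an assumption of the sketch.

```python
import random


def assign_items(num_items, cell_placements, rng):
    """Hypothetical sketch: primary item to the largest cell, the rest shuffled."""
    if num_items <= 0 or not cell_placements:
        return {}

    def area(placement):
        x1, y1, x2, y2 = placement
        return (x2 - x1) * (y2 - y1)

    primary = max(cell_placements, key=area)  # first largest placement wins ties
    mapping = {primary: 0}

    remaining = [p for p in cell_placements if p != primary]
    rng.shuffle(remaining)  # random order for the non-primary items
    for item_index, placement in zip(range(1, num_items), remaining):
        mapping[placement] = item_index
    return mapping


placements = [(0, 0, 30, 30), (30, 0, 100, 30), (0, 30, 30, 100), (30, 30, 100, 100)]
mapping = assign_items(4, placements, random.Random(0))
```

Whatever the seed, the 70x70 placement `(30, 30, 100, 100)` maps to item 0, while items 1 to 3 land on the other three placements in shuffled order.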
preprocess_selected_mosaic_items function
preprocess_selected_mosaic_items(
selected_raw_items: list[dict[str, Any]],
bbox_processor: BboxProcessor | None,
keypoint_processor: KeypointsProcessor | None
)
Preprocess the bboxes and keypoints of each item individually using the given processors, updating their label encoders. Returns a list of ProcessedMosaicItem dicts, each holding the original image and mask together with the corresponding preprocessed bboxes and keypoints.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| selected_raw_items | list[dict[str, Any]] | - | - |
| bbox_processor | BboxProcessor \| None | - | - |
| keypoint_processor | KeypointsProcessor \| None | - | - |
get_opposite_crop_coords function
get_opposite_crop_coords(
cell_size: tuple[int, int],
crop_size: tuple[int, int],
cell_position: Literal['top_left', 'top_right', 'center', 'bottom_left', 'bottom_right']
)
Compute the top-left (x_min, y_min) and bottom-right (x_max, y_max) coordinates of a crop of `crop_size` inside a cell of `cell_size`, such that the crop is located in the corner or center opposite to `cell_position`. For example, if `cell_position` is "top_left", the crop coordinates correspond to the bottom-right region of the cell.
Returns: tuple[int, int, int, int]: (x_min, y_min, x_max, y_max) representing the crop coordinates.
Raises: ValueError: If crop_size is larger than cell_size in either dimension.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| cell_size | tuple[int, int] | - | The (height, width) of the cell from which to crop. |
| crop_size | tuple[int, int] | - | The (height, width) of the desired crop. |
| cell_position | Literal['top_left', 'top_right', 'center', 'bottom_left', 'bottom_right'] | - | The reference position within the cell. The crop will be taken from the opposite position. |
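The "opposite corner" rule is easy to get backwards, so a small sketch may help. The function `opposite_crop` below is hypothetical and written only for illustration; in particular, treating "center" as a centered crop with integer halving is an assumption.

```python
def opposite_crop(cell_size, crop_size, cell_position):
    """Hypothetical sketch: crop box in the corner opposite to cell_position."""
    cell_h, cell_w = cell_size  # sizes are (height, width)
    crop_h, crop_w = crop_size
    if crop_h > cell_h or crop_w > cell_w:
        raise ValueError("crop_size must not exceed cell_size in either dimension")

    if cell_position == "center":
        # Assumption: a "center" cell gets a centered crop.
        x_min = (cell_w - crop_w) // 2
        y_min = (cell_h - crop_h) // 2
    else:
        vertical, horizontal = cell_position.split("_")
        # A "top" cell is cropped from its bottom edge, a "left" cell from its right edge.
        y_min = cell_h - crop_h if vertical == "top" else 0
        x_min = cell_w - crop_w if horizontal == "left" else 0

    return x_min, y_min, x_min + crop_w, y_min + crop_h


top_left_crop = opposite_crop((100, 100), (40, 30), "top_left")
bottom_right_crop = opposite_crop((100, 100), (40, 30), "bottom_right")
# top_left_crop == (70, 60, 100, 100), bottom_right_crop == (0, 0, 30, 40)
```

Note that the sizes are given as (height, width) while the returned box is ordered (x_min, y_min, x_max, y_max).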
process_cell_geometry function
process_cell_geometry(
cell_shape: tuple[int, int],
item: ProcessedMosaicItem,
target_shape: tuple[int, int],
fill: float | tuple[float, ...],
fill_mask: float | tuple[float, ...],
fit_mode: Literal['cover', 'contain'],
interpolation: int,
mask_interpolation: int,
cell_position: Literal['top_left', 'top_right', 'center', 'bottom_left', 'bottom_right']
)
Pad and/or crop one item to target_shape. Uses a Compose pipeline with PadIfNeeded and Crop, driven by fit_mode and cell_position, to ensure the output matches the target cell dimensions exactly, handling both padding and cropping cases.
Returns: ProcessedMosaicItem: Dictionary containing the geometrically processed image, mask, bboxes, and keypoints, fitting the target dimensions.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| cell_shape | tuple[int, int] | - | Shape of the cell. |
| item | ProcessedMosaicItem | - | The preprocessed mosaic item dictionary. |
| target_shape | tuple[int, int] | - | Target shape of the cell. |
| fill | float \| tuple[float, ...] | - | Fill value for image padding. |
| fill_mask | float \| tuple[float, ...] | - | Fill value for mask padding. |
| fit_mode | Literal['cover', 'contain'] | - | Fit mode for the mosaic. |
| interpolation | int | - | Interpolation method for image. |
| mask_interpolation | int | - | Interpolation method for mask. |
| cell_position | Literal['top_left', 'top_right', 'center', 'bottom_left', 'bottom_right'] | - | Position of the cell. |
shift_cell_coordinates function
shift_cell_coordinates(
processed_item_geom: ProcessedMosaicItem,
placement_coords: tuple[int, int, int, int]
)
Shift bbox and keypoint coordinates by the placement offset so that they refer to positions on the final canvas.
Returns: ProcessedMosaicItem: A dictionary with the image and mask, plus the keys 'bboxes' and 'keypoints' containing the shifted numpy arrays (potentially empty).
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| processed_item_geom | ProcessedMosaicItem | - | The output from process_cell_geometry. |
| placement_coords | tuple[int, int, int, int] | - | The (x1, y1, x2, y2) placement on the final canvas. |
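The shift itself amounts to adding the placement's top-left corner to every coordinate. The sketch below uses plain tuples instead of NumPy arrays to stay dependency-free; the helper name `shift_annotations` is hypothetical, and it assumes absolute pixel coordinates with bboxes laid out as (x_min, y_min, x_max, y_max, ...extra) and keypoints as (x, y, ...extra).

```python
def shift_annotations(bboxes, keypoints, placement_coords):
    """Hypothetical sketch: translate annotations by the placement's top-left corner."""
    dx, dy = placement_coords[0], placement_coords[1]  # (x1, y1) of the placement

    shifted_bboxes = [
        (x1 + dx, y1 + dy, x2 + dx, y2 + dy, *extra)  # extra fields (labels) pass through
        for x1, y1, x2, y2, *extra in bboxes
    ]
    shifted_keypoints = [
        (x + dx, y + dy, *extra)
        for x, y, *extra in keypoints
    ]
    return shifted_bboxes, shifted_keypoints


shifted_bboxes, shifted_keypoints = shift_annotations(
    bboxes=[(10, 20, 30, 40, 1)],
    keypoints=[(5, 5)],
    placement_coords=(50, 0, 100, 50),
)
# shifted_bboxes == [(60, 20, 80, 40, 1)], shifted_keypoints == [(55, 5)]
```

Empty input lists simply produce empty output lists, mirroring the "potentially empty" arrays mentioned above.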
assemble_mosaic_from_processed_cells function
assemble_mosaic_from_processed_cells(
processed_cells: dict[tuple[int, int, int, int], dict[str, Any]],
target_shape: tuple[int, ...],
dtype: np.dtype,
data_key: Literal['image', 'mask'],
fill: float | tuple[float, ...] | None
)
Build the mosaic: initialize the canvas with the fill value, then overwrite it with each processed cell segment at its placement. Handles potentially multi-channel masks and addresses potential broadcasting errors if mask segments have unexpected dimensions. Assumes the input data is valid and correctly sized.
Returns: np.ndarray: The assembled mosaic canvas.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| processed_cells | dict[tuple[int, int, int, int], dict[str, Any]] | - | Dictionary mapping placement coords to processed cell data. |
| target_shape | tuple[int, ...] | - | The target shape of the output canvas (e.g., (H, W) or (H, W, C)). |
| dtype | np.dtype | - | NumPy dtype for the canvas. |
| data_key | Literal['image', 'mask'] | - | Specifies whether to assemble 'image' or 'mask'. |
| fill | float \| tuple[float, ...] \| None | - | Value used to initialize the canvas (image fill or mask fill). Should be a float/int or a tuple matching the number of channels. If None, defaults to 0. |
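The fill-then-paste step can be sketched with NumPy as below. The function `assemble_canvas` is a hypothetical illustration, not the library code: it assumes every segment already has exactly the size of its placement, and it skips the multi-channel mask handling mentioned above.

```python
import numpy as np


def assemble_canvas(processed_cells, target_shape, dtype, data_key, fill=None):
    """Hypothetical sketch: fill the canvas, then paste each segment at its placement."""
    canvas = np.full(target_shape, 0 if fill is None else fill, dtype=dtype)
    for (x1, y1, x2, y2), cell in processed_cells.items():
        segment = cell.get(data_key)
        if segment is None:
            continue  # e.g. a cell without a mask keeps the fill value
        canvas[y1:y2, x1:x2] = segment  # placement is (x1, y1, x2, y2); arrays index rows first
    return canvas


cells = {(0, 0, 2, 2): {"image": np.ones((2, 2), dtype=np.uint8)}}
canvas = assemble_canvas(cells, target_shape=(2, 4), dtype=np.uint8, data_key="image", fill=7)
```

The left half of the canvas takes the cell's pixels, while the uncovered right half keeps the fill value of 7.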
process_all_mosaic_geometries function
process_all_mosaic_geometries(
canvas_shape: tuple[int, int],
cell_shape: tuple[int, int],
placement_to_item_index: dict[tuple[int, int, int, int], int],
final_items_for_grid: list[ProcessedMosaicItem],
fill: float | tuple[float, ...],
fill_mask: float | tuple[float, ...],
fit_mode: Literal['cover', 'contain'],
interpolation: Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT],
mask_interpolation: Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT]
)
Crop and/or pad every assigned cell via process_cell_geometry. Iterates through the assigned placements, applies the geometric transforms, and returns a dictionary mapping final placement coordinates to the processed item data. The bbox/keypoint coordinates in the returned dict are *not* shifted yet.
Returns: dict[tuple[int, int, int, int], ProcessedMosaicItem]: Dictionary mapping final placement coordinates (x1, y1, x2, y2) to the geometrically processed item data (image, mask, un-shifted bboxes/keypoints).
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| canvas_shape | tuple[int, int] | - | The shape of the canvas. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| placement_to_item_index | dict[tuple[int, int, int, int], int] | - | Mapping from placement coordinates (x1, y1, x2, y2) to assigned item index. |
| final_items_for_grid | list[ProcessedMosaicItem] | - | List of all preprocessed items available. |
| fill | float \| tuple[float, ...] | - | Fill value for image padding. |
| fill_mask | float \| tuple[float, ...] | - | Fill value for mask padding. |
| fit_mode | Literal['cover', 'contain'] | - | Fit mode for the mosaic. |
| interpolation | Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT] | - | Interpolation for image. |
| mask_interpolation | Literal[cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT] | - | Interpolation for mask. |
get_cell_relative_position function
get_cell_relative_position(
placement_coords: tuple[int, int, int, int],
target_shape: tuple[int, int]
)
Determine where a cell sits relative to the canvas center, for mosaic crop positioning. Compares the cell center to the canvas center and returns its quadrant, or "center" if it lies on or very close to a central axis.
Returns: Literal['top_left', 'top_right', 'center', 'bottom_left', 'bottom_right']: The position of the cell relative to the center of the target canvas.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| placement_coords | tuple[int, int, int, int] | - | The (x_min, y_min, x_max, y_max) coordinates of the cell. |
| target_shape | tuple[int, int] | - | The (height, width) of the overall target canvas. |
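The quadrant test can be sketched as follows. The helper `relative_position` and its `tol` parameter are hypothetical: the text says only "on or very close to a central axis", so the tolerance value, and the choice to return "center" when either axis matches, are assumptions of this sketch.

```python
def relative_position(placement_coords, target_shape, tol=1e-6):
    """Hypothetical sketch: classify a cell by where its center lies on the canvas."""
    x_min, y_min, x_max, y_max = placement_coords
    target_h, target_w = target_shape  # (height, width)

    # Signed distance from the canvas center to the cell center.
    dx = (x_min + x_max) / 2 - target_w / 2
    dy = (y_min + y_max) / 2 - target_h / 2

    # Assumption: touching either central axis counts as "center".
    if abs(dx) <= tol or abs(dy) <= tol:
        return "center"

    vertical = "top" if dy < 0 else "bottom"
    horizontal = "left" if dx < 0 else "right"
    return f"{vertical}_{horizontal}"


corner = relative_position((0, 0, 50, 50), (100, 100))      # "top_left"
whole = relative_position((0, 0, 100, 100), (100, 100))     # "center"
```

The returned label is the kind of value that get_opposite_crop_coords and process_cell_geometry accept as cell_position.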
shift_all_coordinates function
shift_all_coordinates(
processed_cells_geom: dict[tuple[int, int, int, int], ProcessedMosaicItem],
canvas_shape: tuple[int, int]
)
Shift the bbox and keypoint coordinates in each cell to their final canvas positions. Iterates through the processed cells (keyed by placement coords), applies coordinate shifting to bboxes/keypoints, and returns a new dictionary with the same keys but updated ProcessedMosaicItem values containing the *shifted* coordinates.
Returns: dict[tuple[int, int, int, int], ProcessedMosaicItem]: Final dictionary mapping placement coords (x1, y1, x2, y2) to processed cell data with shifted coordinates.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| processed_cells_geom | dict[tuple[int, int, int, int], ProcessedMosaicItem] | - | Output from process_all_mosaic_geometries (keyed by placement coords). |
| canvas_shape | tuple[int, int] | - | The shape of the canvas. |