albumentations.augmentations.mixing.functional


Functional implementations for image mixing operations. This module provides utility functions for blending and combining images, such as copy-and-paste operations with masking.

ProcessedMosaicItem (class)

Represents a single data item (primary or additional) after preprocessing. Includes the original image/mask and the *preprocessed* annotations.

assemble_mosaic_from_processed_cells (function)

Assembles the final mosaic image or mask from processed cell data onto a canvas. Initializes the canvas with the fill value and overwrites with processed segments. Handles potentially multi-channel masks. Addresses potential broadcasting errors if mask segments have unexpected dimensions. Assumes input data is valid and correctly sized.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| processed_cells | dict[tuple[int, int, int, int], dict[str, Any]] | - | Dictionary mapping placement coords to processed cell data. |
| target_shape | tuple[int, ...] | - | The target shape of the output canvas (e.g., (H, W) or (H, W, C)). |
| dtype | np.dtype | - | NumPy dtype for the canvas. |
| data_key | 'image' or 'mask' | - | Specifies whether to assemble 'image' or 'mask'. |
| fill | float, tuple[float, ...], or None | - | Value used to initialize the canvas (image fill or mask fill). Should be a float/int or a tuple matching the number of channels. If None, defaults to 0. |

Returns

  • np.ndarray: The assembled mosaic canvas.
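
The fill-then-overwrite behavior described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the library's implementation: `assemble_canvas_sketch` is a hypothetical name, and the sketch assumes every segment already matches the size of its placement and that cells lacking the requested key simply keep the fill value.

```python
import numpy as np


def assemble_canvas_sketch(processed_cells, target_shape, dtype, data_key, fill=None):
    """Hypothetical stand-in: fill a canvas, then overwrite it cell by cell."""
    # A fill of None falls back to 0, as the parameter description states.
    canvas = np.full(target_shape, 0 if fill is None else fill, dtype=dtype)
    for (x1, y1, x2, y2), cell_data in processed_cells.items():
        segment = cell_data.get(data_key)
        if segment is None:
            continue  # assumption: cells without this key keep the fill value
        canvas[y1:y2, x1:x2] = segment
    return canvas


cells = {
    (0, 0, 2, 2): {"image": np.full((2, 2), 1, dtype=np.uint8)},
    (2, 0, 4, 2): {"image": np.full((2, 2), 2, dtype=np.uint8)},
}
mosaic = assemble_canvas_sketch(cells, (4, 4), np.uint8, "image", fill=9)
```

In this example the top half of the canvas holds the two cells and the bottom half keeps the fill value 9.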

assign_items_to_grid_cells (function)

Assigns item indices to placement coordinate tuples. Assigns the primary item (index 0) to the placement with the largest area, and assigns the remaining items (indices 1 to num_items-1) randomly to the remaining placements.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| num_items | int | - | The total number of items to assign (primary + additional + replicas). |
| cell_placements | list[tuple[int, int, int, int]] | - | List of placement coords (x1, y1, x2, y2) for cells to be filled. |
| py_random | random.Random | - | Random state instance. |

Returns

  • dict[tuple[int, int, int, int], int]: Dict mapping placement coords (x1, y1, x2, y2) to the assigned item index.
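
As a rough sketch of the assignment rule described above (largest cell gets index 0, everything else is assigned at random), the following pure-Python function may help. `assign_items_sketch` is a hypothetical name, and shuffling the remaining placements is one assumed way of making the assignment random; the library may do it differently.

```python
import random


def assign_items_sketch(num_items, cell_placements, py_random):
    """Hypothetical stand-in: primary item to the largest cell, the rest at random."""
    if not cell_placements or num_items < 1:
        return {}

    def area(coords):
        x1, y1, x2, y2 = coords
        return max(0, x2 - x1) * max(0, y2 - y1)

    primary = max(cell_placements, key=area)
    remaining = [coords for coords in cell_placements if coords != primary]
    py_random.shuffle(remaining)  # assumption: randomness comes from a shuffle

    assignment = {primary: 0}
    for item_index, coords in zip(range(1, num_items), remaining):
        assignment[coords] = item_index
    return assignment


placements = [(0, 0, 40, 40), (40, 0, 100, 40), (0, 40, 40, 100), (40, 40, 100, 100)]
assignment = assign_items_sketch(4, placements, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the assignment reproducible, which is presumably why the documented function takes `py_random` as a parameter.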

calculate_cell_placements (function)

Calculates placements by clipping arange-defined grid lines to the crop window.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| grid_yx | tuple[int, int] | - | The (rows, cols) of the mosaic grid. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| target_size | tuple[int, int] | - | The final output (height, width). |
| center_xy | tuple[int, int] | - | The calculated (x, y) center of the final crop window, relative to the top-left of the conceptual large grid. |

Returns

  • list[tuple[int, int, int, int]]: Placement coords (x1, y1, x2, y2) of the grid cells after clipping to the crop window.
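
To make the idea of clipping grid lines to the crop window concrete, here is a small pure-Python sketch. It rests on several assumptions that the text above does not confirm: `cell_shape` is (height, width), the crop window is centered on `center_xy` using integer halving, the returned coordinates are relative to the crop window's top-left corner, and cells with zero area after clipping are dropped. The name `cell_placements_sketch` is hypothetical, and `range` stands in for `np.arange`.

```python
def cell_placements_sketch(grid_yx, cell_shape, target_size, center_xy):
    """Hypothetical stand-in: clip evenly spaced grid lines to a crop window."""
    rows, cols = grid_yx
    cell_h, cell_w = cell_shape
    target_h, target_w = target_size
    center_x, center_y = center_xy

    # Assumed crop window, expressed in large-grid coordinates.
    crop_x1, crop_y1 = center_x - target_w // 2, center_y - target_h // 2
    crop_x2, crop_y2 = crop_x1 + target_w, crop_y1 + target_h

    # Clamp each grid line into the window, then shift to window coordinates.
    x_lines = [min(max(c * cell_w, crop_x1), crop_x2) - crop_x1 for c in range(cols + 1)]
    y_lines = [min(max(r * cell_h, crop_y1), crop_y2) - crop_y1 for r in range(rows + 1)]

    placements = []
    for y1, y2 in zip(y_lines, y_lines[1:]):
        for x1, x2 in zip(x_lines, x_lines[1:]):
            if x2 > x1 and y2 > y1:  # skip cells that fall outside the window
                placements.append((x1, y1, x2, y2))
    return placements


quadrants = cell_placements_sketch((2, 2), (100, 100), (100, 100), (100, 100))
single = cell_placements_sketch((2, 2), (100, 100), (100, 100), (50, 50))
```

With the crop centered on the middle of a 2x2 grid, all four cells contribute a 50x50 quadrant; with the crop centered on the first cell, only that cell survives the clipping.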

calculate_mosaic_center_point (function)

Calculates the center point for the mosaic crop using proportional sampling within the valid zone. Ensures the center point allows a crop of target_size to overlap all grid cells, applying randomness based on center_range proportionally within the valid region where the center can lie.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| grid_yx | tuple[int, int] | - | The (rows, cols) of the mosaic grid. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| target_size | tuple[int, int] | - | The final output (height, width). |
| center_range | tuple[float, float] | - | Range [0.0-1.0] for sampling center proportionally within the valid zone. |
| py_random | random.Random | - | Random state instance. |

Returns

  • tuple[int, int]: The calculated (x, y) center point relative to the top-left of the conceptual large grid.
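
The phrase "proportional sampling within the valid zone" can be illustrated with a short sketch. How the library derives the valid zone from `grid_yx`, `cell_shape`, and `target_size` is not stated here, so this hypothetical `sample_center_sketch` takes the zone bounds as inputs (`valid_x` and `valid_y` are assumed names) and only shows the proportional mapping step.

```python
import random


def sample_center_sketch(valid_x, valid_y, center_range, py_random):
    """Hypothetical stand-in: map proportions from center_range onto a valid zone."""
    (x_lo, x_hi), (y_lo, y_hi) = valid_x, valid_y
    # Draw one proportion per axis; 0.5 means the middle of the valid zone.
    prop_x = py_random.uniform(*center_range)
    prop_y = py_random.uniform(*center_range)
    center_x = int(round(x_lo + prop_x * (x_hi - x_lo)))
    center_y = int(round(y_lo + prop_y * (y_hi - y_lo)))
    return center_x, center_y


midpoint = sample_center_sketch((10, 30), (40, 80), (0.5, 0.5), random.Random(0))
sampled = sample_center_sketch((10, 30), (40, 80), (0.3, 0.7), random.Random(0))
```

A degenerate range such as (0.5, 0.5) always yields the middle of the zone, while a wider range spreads the center over the corresponding fraction of it.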

copy_and_paste_blend (function)

Blend images by copying pixels from an overlay image to a base image using a mask. This function copies pixels from the overlay image to the base image only where the mask has non-zero values. The overlay is placed at the specified offset from the top-left corner of the base image.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| base_image | np.ndarray | - | The destination image that will be modified. |
| overlay_image | np.ndarray | - | The source image containing pixels to copy. |
| overlay_mask | np.ndarray | - | Binary mask indicating which pixels to copy from the overlay. Pixels are copied where mask > 0. |
| offset | tuple[int, int] | - | The (y, x) offset specifying where to place the top-left corner of the overlay relative to the base image. |

Returns

  • np.ndarray: The blended image with the overlay applied to the base image.
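
A minimal NumPy sketch of this masked copy follows. It is an illustration under assumptions, not the library's code: `copy_paste_sketch` is a hypothetical name, the overlay is assumed to fit entirely inside the base image at the given offset, and the sketch works on a copy so the caller's array is left untouched.

```python
import numpy as np


def copy_paste_sketch(base_image, overlay_image, overlay_mask, offset):
    """Hypothetical stand-in: copy overlay pixels onto the base where mask > 0."""
    y, x = offset  # offset is (y, x), as documented above
    h, w = overlay_image.shape[:2]
    result = base_image.copy()  # assumption: do not modify the input in place
    region = result[y:y + h, x:x + w]  # a view into result
    selected = overlay_mask > 0
    region[selected] = overlay_image[selected]
    return result


base = np.zeros((4, 4), dtype=np.uint8)
overlay = np.full((2, 2), 7, dtype=np.uint8)
mask = np.array([[1, 0], [0, 1]], dtype=np.uint8)
blended = copy_paste_sketch(base, overlay, mask, offset=(1, 1))
```

Because the boolean mask is two-dimensional, the same indexing also works for multi-channel images of shape (H, W, C).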

filter_valid_metadata (function)

Filters a list of metadata dicts, keeping only valid ones based on data compatibility.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| metadata_input | Sequence[dict[str, Any]] or None | - | - |
| metadata_key_name | str | - | - |
| data | dict[str, Any] | - | - |

get_cell_relative_position (function)

Determines the position of a cell relative to the center of the target canvas. Compares the cell center to the canvas center and returns its quadrant or "center" if it lies on or very close to a central axis.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| placement_coords | tuple[int, int, int, int] | - | The (x_min, y_min, x_max, y_max) coordinates of the cell. |
| target_shape | tuple[int, int] | - | The (height, width) of the overall target canvas. |

Returns

  • Literal["top_left", "top_right", "center", "bottom_left", "bottom_right"]: The position of the cell relative to the canvas center.
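
The comparison described above might look roughly like the sketch below. The name `cell_position_sketch` and the tolerance `tol` used for "very close to a central axis" are assumptions; the library's actual threshold is not given here.

```python
def cell_position_sketch(placement_coords, target_shape, tol=1e-6):
    """Hypothetical stand-in: classify a cell by where its center sits on the canvas."""
    x_min, y_min, x_max, y_max = placement_coords
    height, width = target_shape
    cell_cx, cell_cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    canvas_cx, canvas_cy = width / 2, height / 2

    # A cell centered on either central axis counts as "center".
    if abs(cell_cx - canvas_cx) <= tol or abs(cell_cy - canvas_cy) <= tol:
        return "center"
    vertical = "top" if cell_cy < canvas_cy else "bottom"
    horizontal = "left" if cell_cx < canvas_cx else "right"
    return f"{vertical}_{horizontal}"


positions = [
    cell_position_sketch(coords, (100, 100))
    for coords in [(0, 0, 50, 50), (50, 0, 100, 50), (0, 50, 50, 100), (0, 0, 100, 50)]
]
```

The last cell spans the full canvas width, so its center lies on the vertical axis and it is classified as "center".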

get_opposite_crop_coords (function)

Calculates crop coordinates positioned opposite to the specified cell_position. Given a cell of `cell_size`, this function determines the top-left (x_min, y_min) and bottom-right (x_max, y_max) coordinates for a crop of `crop_size`, such that the crop is located in the corner or center opposite to `cell_position`. For example, if `cell_position` is "top_left", the crop coordinates will correspond to the bottom-right region of the cell.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| cell_size | tuple[int, int] | - | The (height, width) of the cell from which to crop. |
| crop_size | tuple[int, int] | - | The (height, width) of the desired crop. |
| cell_position | 'top_left', 'top_right', 'center', 'bottom_left', or 'bottom_right' | - | The reference position within the cell. The crop will be taken from the opposite position. |

Returns

  • tuple[int, int, int, int]: (x_min, y_min, x_max, y_max) representing the crop coordinates.
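
The "opposite corner" rule can be written down compactly. The sketch below is a hypothetical reading of the description (`opposite_crop_sketch` is not a library function) and assumes the crop is no larger than the cell and that "center" maps to a centered crop with integer halving.

```python
def opposite_crop_sketch(cell_size, crop_size, cell_position):
    """Hypothetical stand-in: crop from the corner opposite to cell_position."""
    cell_h, cell_w = cell_size
    crop_h, crop_w = crop_size
    if cell_position == "center":
        x_min = (cell_w - crop_w) // 2
        y_min = (cell_h - crop_h) // 2
    else:
        vertical, horizontal = cell_position.split("_")
        # "top" cells take the bottom of the source, "left" cells take the right.
        y_min = cell_h - crop_h if vertical == "top" else 0
        x_min = cell_w - crop_w if horizontal == "left" else 0
    return x_min, y_min, x_min + crop_w, y_min + crop_h


crops = {
    position: opposite_crop_sketch((100, 200), (40, 60), position)
    for position in ["top_left", "top_right", "center", "bottom_left", "bottom_right"]
}
```

For a 100x200 cell and a 40x60 crop, "top_left" yields the bottom-right region (140, 60, 200, 100), matching the example in the description above.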

preprocess_selected_mosaic_items (function)

Preprocesses bboxes/keypoints for selected raw additional items. Iterates through items, preprocesses annotations individually using processors (updating label encoders), and returns a list of dicts with original image/mask and the corresponding preprocessed bboxes/keypoints.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| selected_raw_items | list[dict[str, Any]] | - | - |
| bbox_processor | BboxProcessor or None | - | - |
| keypoint_processor | KeypointsProcessor or None | - | - |

process_all_mosaic_geometries (function)

Processes the geometry (cropping/padding) for all assigned mosaic cells. Iterates through assigned placements, applies geometric transforms via process_cell_geometry, and returns a dictionary mapping final placement coordinates to the processed item data. The bbox/keypoint coordinates in the returned dict are *not* shifted yet.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| canvas_shape | tuple[int, int] | - | The shape of the canvas. |
| cell_shape | tuple[int, int] | - | Shape of each cell in the mosaic grid. |
| placement_to_item_index | dict[tuple[int, int, int, int], int] | - | Mapping from placement coordinates (x1, y1, x2, y2) to assigned item index. |
| final_items_for_grid | list[ProcessedMosaicItem] | - | List of all preprocessed items available. |
| fill | float or tuple[float, ...] | - | Fill value for image padding. |
| fill_mask | float or tuple[float, ...] | - | Fill value for mask padding. |
| fit_mode | 'cover' or 'contain' | - | Fit mode for the mosaic. |
| interpolation | One of cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT | - | Interpolation method for image. |
| mask_interpolation | One of cv2.INTER_NEAREST, cv2.INTER_NEAREST_EXACT, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4, cv2.INTER_LINEAR_EXACT | - | Interpolation method for mask. |

Returns

  • dict[tuple[int, int, int, int], ProcessedMosaicItem]: Dictionary mapping final placement coordinates to the processed item data.

process_cell_geometry (function)

Applies geometric transformations (padding and/or cropping) to a single mosaic item. Uses a Compose pipeline with PadIfNeeded and Crop to ensure the output matches the target cell dimensions exactly, handling both padding and cropping cases.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| cell_shape | tuple[int, int] | - | Shape of the cell. |
| item | ProcessedMosaicItem | - | The preprocessed mosaic item dictionary. |
| target_shape | tuple[int, int] | - | Target shape of the cell. |
| fill | float or tuple[float, ...] | - | Fill value for image padding. |
| fill_mask | float or tuple[float, ...] | - | Fill value for mask padding. |
| fit_mode | 'cover' or 'contain' | - | Fit mode for the mosaic. |
| interpolation | int | - | Interpolation method for image. |
| mask_interpolation | int | - | Interpolation method for mask. |
| cell_position | 'top_left', 'top_right', 'center', 'bottom_left', or 'bottom_right' | - | Position of the cell. |
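
The pad-then-crop idea behind this function can be shown without the Compose pipeline. The NumPy-only sketch below is a simplified stand-in, not the library's PadIfNeeded and Crop pipeline: `pad_then_crop_sketch` is a hypothetical name, padding is assumed to be split evenly between both sides, the crop is assumed to be centered (the real function takes `cell_position` into account), and only a scalar `fill` is supported.

```python
import numpy as np


def pad_then_crop_sketch(image, target_shape, fill=0):
    """Hypothetical stand-in: constant-pad up to the target, then center-crop to it."""
    target_h, target_w = target_shape
    h, w = image.shape[:2]
    pad_h, pad_w = max(0, target_h - h), max(0, target_w - w)
    pad_width = [(pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2)]
    pad_width += [(0, 0)] * (image.ndim - 2)  # leave any channel axis untouched
    padded = np.pad(image, pad_width, mode="constant", constant_values=fill)

    # After padding, both dimensions are at least as large as the target.
    top = (padded.shape[0] - target_h) // 2
    left = (padded.shape[1] - target_w) // 2
    return padded[top:top + target_h, left:left + target_w]


cell = pad_then_crop_sketch(np.ones((2, 6), dtype=np.uint8), (4, 4), fill=0)
```

Here the 2x6 input is too short and too wide for a 4x4 cell, so it is padded vertically and cropped horizontally in a single pass, which is the "handling both padding and cropping cases" behavior mentioned above.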

shift_all_coordinates (function)

Shifts coordinates for all geometrically processed cells. Iterates through the processed cells (keyed by placement coords), applies coordinate shifting to bboxes/keypoints, and returns a new dictionary with the same keys but updated ProcessedMosaicItem values containing the *shifted* coordinates.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| processed_cells_geom | dict[tuple[int, int, int, int], ProcessedMosaicItem] | - | Output from process_all_mosaic_geometries (keyed by placement coords). |
| canvas_shape | tuple[int, int] | - | The shape of the canvas. |

Returns

  • dict[tuple[int, int, int, int], ProcessedMosaicItem]: Final dictionary mapping placement coords to ProcessedMosaicItem values containing the shifted coordinates.

shift_cell_coordinates (function)

Shifts the coordinates of geometrically processed bboxes and keypoints.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| processed_item_geom | ProcessedMosaicItem | - | The output from process_cell_geometry. |
| placement_coords | tuple[int, int, int, int] | - | The (x1, y1, x2, y2) placement on the final canvas. |
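
Coordinate shifting of this kind usually amounts to adding the placement's top-left corner to every annotation. The sketch below illustrates that idea under assumptions the text above does not confirm: bboxes are (x_min, y_min, x_max, y_max, ...) and keypoints are (x, y, ...) in absolute pixel coordinates, with any trailing fields passed through unchanged. `shift_annotations_sketch` is a hypothetical name.

```python
def shift_annotations_sketch(bboxes, keypoints, placement_coords):
    """Hypothetical stand-in: move cell-local annotations into canvas coordinates."""
    dx, dy = placement_coords[0], placement_coords[1]  # top-left of the placement
    shifted_bboxes = [
        (x_min + dx, y_min + dy, x_max + dx, y_max + dy, *rest)
        for x_min, y_min, x_max, y_max, *rest in bboxes
    ]
    shifted_keypoints = [(x + dx, y + dy, *rest) for x, y, *rest in keypoints]
    return shifted_bboxes, shifted_keypoints


shifted_bboxes, shifted_keypoints = shift_annotations_sketch(
    bboxes=[(10, 20, 30, 40, "cat")],
    keypoints=[(5, 5)],
    placement_coords=(100, 50, 200, 150),
)
```

Only the first two placement values are needed for the shift; the bottom-right corner does not affect the result in this sketch.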