Summary

This is a feature + breaking-change release. Highlights:

  • New instance_binding on Compose — keeps masks, bounding boxes and keypoints aligned per object across the whole pipeline, including Mosaic and the new CopyAndPaste. The intended workflow for instance segmentation / pose.
  • 5 new transforms: CopyAndPaste, PixelSpread, ModeFilter, Enhance, Colorize.
  • Sweeping API cleanup: every sampling-range parameter ends in _range, scalar shorthand is gone (tuples only), and several dead/legacy parameters were removed (map_resolution_range, noise_scale_factor, approximation, …). This breaks user code that still passes *_limit= or scalar *_range= values.
  • Bug fixes in additional_targets resolution, mixing label wrappers, and applied_config recording for single-sample range params.

Breaking changes

1. *_limit parameters renamed to *_range

Every sampling-range constructor argument now uses the _range suffix consistently across the library. Old names are no longer accepted (no deprecation period).

Renames:

Old | New
Rotate.limit, SafeRotate.limit | angle_range
Blur / MotionBlur / MedianBlur / GaussianBlur / AdvancedBlur / GlassBlur / ZoomBlur / RingingOvershoot / UnsharpMask.blur_limit | blur_range
*.sigma_limit, sigma_x_limit, sigma_y_limit | sigma_range, sigma_x_range, sigma_y_range
MotionBlur.rotate_limit | rotate_range
AdvancedBlur.beta_limit, noise_limit | beta_range, noise_range
HueSaturationValue.{hue,sat,val}_shift_limit | *_shift_range
RandomBrightnessContrast.{brightness,contrast}_limit | *_range
CLAHE.clip_limit | clip_range
RandomGamma.gamma_limit | gamma_range
ChromaticAberration.{primary,secondary}_distortion_limit | *_range
PlanckianJitter.temperature_limit | temperature_range
RGBShift.{r,g,b}_shift_limit | *_shift_range
RandomShadow.num_shadows_limit | num_shadows_range
RandomScale.scale_limit | scale_range
ShiftScaleRotate.{shift,scale,rotate}_limit, shift_limit_x/y | *_range, shift_range_x/y
OpticalDistortion / GridDistortion.distort_limit | distort_range
FDA.beta_limit | beta_range
Sharpen.{alpha,lightness} | alpha_range, lightness_range
Emboss.{alpha,strength} | alpha_range, strength_range
Superpixels.{p_replace,n_segments} | p_replace_range, n_segments_range
RingingOvershoot.cutoff | cutoff_range
UnsharpMask.alpha | alpha_range
ColorJitter.{brightness,contrast,saturation,hue} | *_range
Defocus.{radius,alias_blur} | radius_range, alias_blur_range
ZoomBlur.{max_factor,step_factor} | *_range
Spatter.{mean,std,gauss_sigma,cutout_threshold,intensity} | *_range
ISONoise.{color_shift,intensity} | color_shift_range, intensity_range
SaltAndPepper.{amount,salt_vs_pepper} | *_range
MaskDropout.max_objects | max_objects_range
PiecewiseAffine.{scale,nb_rows,nb_cols} | *_range
Colorize.{black,white,mid} | *_range
RandomBrightnessContrast.ensure_safe_range (bool) | ensure_safe_output (bool flag, not a range)

A new test (test_range_field_is_two_number_tuple) enforces that every InitSchema field ending in _range is typed as a two-number tuple.
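
For codebases with many call sites, the renames can be applied mechanically to keyword dictionaries. The following is a minimal sketch, not part of the library: migrate_kwargs and RENAMES are hypothetical names, and the mapping is a hand-copied subset of the rename table. It only renames keys; it does not convert scalar values to tuples.

```python
# Hypothetical migration helper; not part of Albumentations.
# RENAMES is a hand-copied subset of the rename table in these notes.
RENAMES = {
    "Rotate": {"limit": "angle_range"},
    "SafeRotate": {"limit": "angle_range"},
    "Blur": {"blur_limit": "blur_range"},
    "CLAHE": {"clip_limit": "clip_range"},
    "RandomGamma": {"gamma_limit": "gamma_range"},
    "Sharpen": {"alpha": "alpha_range", "lightness": "lightness_range"},
}


def migrate_kwargs(transform_name: str, kwargs: dict) -> dict:
    """Return a copy of kwargs with old parameter names replaced by new ones."""
    renames = RENAMES.get(transform_name, {})
    migrated = {}
    for key, value in kwargs.items():
        new_key = renames.get(key, key)
        if new_key in migrated:
            # Both the old and the new name were passed; refuse to guess.
            raise ValueError(f"{transform_name}: {key!r} collides with {new_key!r}")
        migrated[new_key] = value
    return migrated


new_kwargs = migrate_kwargs("Rotate", {"limit": (-30, 30), "p": 0.5})
print(new_kwargs)
```

Unknown transforms and unknown keys pass through unchanged, so the helper can be run over a whole config without special-casing.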

2. Scalar shorthand for _range parameters is gone — tuples only

Every _range constructor argument now requires an explicit tuple[int, int] / tuple[float, float]. Passing a single number (which the library used to expand into (-N, N) or (0, N) depending on transform) raises a validation error.

A.Rotate(angle_range=30)              # was OK in 2.1.x — now fails
A.Rotate(angle_range=(-30, 30))       # required in 2.2.0

A.RandomBrightnessContrast(brightness_range=0.2)        # was OK — now fails
A.RandomBrightnessContrast(brightness_range=(-0.2, 0.2))

A.Blur(blur_range=7)                  # was OK — now fails
A.Blur(blur_range=(7, 7))

Affine, Perspective, and ShiftScaleRotate got the same treatment for their geometric range params:

  • Perspective.scale is now tuple[float, float].
  • Affine.{scale, translate_percent, translate_px, rotate, shear} are tuples or {"x": tuple, "y": tuple} dicts. Scalar int/float are no longer accepted (_handle_dict_arg was simplified accordingly).
  • ShiftScaleRotate.{shift_range, scale_range, rotate_range, shift_range_x, shift_range_y} are tuples only.

Posterize.num_bits is now strictly tuple[int, int] | list[tuple[int, int]].
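
Call sites that relied on scalar shorthand have to spell out the tuple themselves. A minimal sketch of a local helper follows; expand_scalar is a hypothetical name, not a library function. Because the old expansion depended on the transform, the mode has to be chosen per parameter after checking what the previous release did for it.

```python
from numbers import Real


def expand_scalar(value, mode: str = "symmetric"):
    """Turn a legacy scalar into an explicit (low, high) tuple.

    mode="symmetric" gives (-value, value), mode="zero" gives (0, value),
    mode="fixed" gives (value, value). Two-element pairs pass through.
    """
    if isinstance(value, (tuple, list)):
        if len(value) != 2:
            raise ValueError(f"expected two numbers, got {value!r}")
        return (value[0], value[1])
    if not isinstance(value, Real):
        raise TypeError(f"expected a number or a pair, got {type(value).__name__}")
    if mode == "symmetric":
        return (-value, value)
    if mode == "zero":
        return (0, value)
    if mode == "fixed":
        return (value, value)
    raise ValueError(f"unknown mode {mode!r}")


angle_range = expand_scalar(30)             # symmetric expansion
blur_range = expand_scalar(7, mode="fixed")  # single fixed value
```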

3. Removed parameters

Transform / module | Removed
BaseDistortion and all subclasses (ElasticTransform, PiecewiseAffine, OpticalDistortion, GridDistortion, ThinPlateSpline, WaterRefraction, PixelSpread) | map_resolution_range (distortion maps are always computed at full resolution now)
GaussNoise | noise_scale_factor
AdditiveNoise | approximation
geometric/functional.py | upscale_distortion_maps helper
core/utils.py | to_tuple helper
core/pydantic.py | process_non_negative_range, process_non_negative_int_range, convert_to_0plus_range, convert_to_1plus_range, convert_to_1plus_int_range (all redundant with check_range_bounds)

If you imported any of these directly, switch to check_range_bounds or pass values pre-validated.

4. unpack_label_wrappers raises TypeError on non-dict label wrappers

bbox_labels / keypoint_labels are reserved keys with a {<label_field>: [...]} shape. Previously, passing a bare list/tuple/np.ndarray would silently drop the wrapper and surface as a confusing "missing label field" error from a downstream Mosaic / CopyAndPaste validator. Now it raises a TypeError naming the offending key. None is still skipped.
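
The rule described above can be illustrated with a small stand-alone check. This is a sketch of the behaviour, not the library's unpack_label_wrappers; check_label_wrappers and RESERVED_WRAPPER_KEYS are hypothetical names, and the exact error message is an assumption.

```python
# Illustration only; the real check lives inside the library's mixing code.
RESERVED_WRAPPER_KEYS = ("bbox_labels", "keypoint_labels")


def check_label_wrappers(item: dict) -> None:
    """Raise TypeError if a reserved label wrapper is present but not a dict."""
    for key in RESERVED_WRAPPER_KEYS:
        wrapper = item.get(key)
        if wrapper is None:
            # Missing or explicitly None wrappers are skipped.
            continue
        if not isinstance(wrapper, dict):
            raise TypeError(
                f"{key!r} must be a dict of {{label_field: values}}, "
                f"got {type(wrapper).__name__}"
            )


# Accepted: dict wrapper, and None for the other reserved key.
check_label_wrappers({"bbox_labels": {"class": ["cat"]}, "keypoint_labels": None})
```

Passing a bare list such as {"bbox_labels": ["cat"]} to this sketch raises a TypeError that names bbox_labels, which mirrors the new behaviour.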


New features

Compose(instance_binding=...) — per-instance target alignment

Albumentations historically treated masks, bboxes and keypoints as independent arrays. When bbox filtering removed an instance (e.g. crop made it too small) the corresponding mask plane and keypoints stayed behind, breaking index alignment. Pose models were also at the mercy of filter_keypoints dropping individual out-of-bounds keypoints.

instance_binding introduces a structured instances input format (list of per-object dicts) and routes everything through bbox-driven survival:

transform = A.Compose(
    transforms,
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["class"]),
    keypoint_params=A.KeypointParams(format="xy", label_fields=["name"]),
    instance_binding=["masks", "bboxes", "keypoints"],
)

data = {
    "image": img,
    "instances": [
        {
            "mask": mask_i,                       # (H, W)
            "bbox": np.array([x1, y1, x2, y2]),
            "keypoints": np.array([[x, y], ...]),
            "bbox_labels": {"class": "cat"},
            "keypoint_labels": {"name": ["nose", "tail"]},
        },
        ...
    ],
}
out = transform(**data)
out["instances"]  # same structure as the input, only surviving instances

Valid binding targets: "mask", "masks", "bboxes", "keypoints" (min 2; "mask" and "masks" are mutually exclusive). When keypoints are bound, remove_invisible and check_each_transform on KeypointParams are forced to False — only instance survival drives keypoint removal.
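
The constraints on binding targets can be written down as a short validation routine. This is a sketch of the stated rules only; validate_instance_binding is a hypothetical name, and the use of ValueError is an assumption about how the library reports the problem.

```python
VALID_TARGETS = {"mask", "masks", "bboxes", "keypoints"}


def validate_instance_binding(targets) -> list:
    """Check the rules stated above: known names, at least 2, not mask and masks."""
    targets = list(targets)
    unique = set(targets)
    unknown = unique - VALID_TARGETS
    if unknown:
        raise ValueError(f"unknown binding targets: {sorted(unknown)}")
    if len(unique) < 2:
        raise ValueError("instance_binding needs at least two distinct targets")
    if {"mask", "masks"} <= unique:
        raise ValueError("'mask' and 'masks' are mutually exclusive")
    return targets


binding = validate_instance_binding(["masks", "bboxes", "keypoints"])
```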

Mosaic and the new CopyAndPaste keep instance ids and stacked masks aligned with the fused geometry across cells / pasted objects. Compose._repack_instances uses row-aligned mask indexing when len(masks) == len(bboxes) == len(_bbox_instance_id) so freshly pasted rows still match one mask plane per bbox row.

ReplayCompose forwards instance_binding. Serialization round-trips it.

See docs/design/instance_binding.md.

CopyAndPaste (mixing)

Copy-paste augmentation for instance segmentation. Each donor object is tight-cropped to its mask (or bbox rect for bbox-only donors, optionally expanded to include keypoints), shrunk to fit the target image with aspect preserved (no upscaling), optionally jittered by scale_range, then stamped at a uniformly random location. Existing instances that become sufficiently occluded by pasted objects are removed.
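
The shrink-to-fit and placement steps reduce to a little arithmetic. The sketch below is an illustration under assumptions, not the library's code: fit_scale and sample_top_left are hypothetical names, scale jitter is omitted, and it assumes the pasted object must lie fully inside the target image.

```python
import numpy as np


def fit_scale(obj_h: int, obj_w: int, img_h: int, img_w: int) -> float:
    """Largest aspect-preserving scale <= 1 at which the object fits the image."""
    return min(1.0, img_h / obj_h, img_w / obj_w)


def sample_top_left(obj_h: int, obj_w: int, img_h: int, img_w: int, rng):
    """Uniform top-left corner; assumes the object stays fully inside the image."""
    row = int(rng.integers(0, img_h - obj_h + 1))
    col = int(rng.integers(0, img_w - obj_w + 1))
    return row, col


rng = np.random.default_rng(0)
scale = fit_scale(400, 100, img_h=200, img_w=300)   # limited by height
new_h, new_w = int(400 * scale), int(100 * scale)
row, col = sample_top_left(new_h, new_w, 200, 300, rng)
```

Capping the scale at 1.0 is what makes this "no upscaling": a donor that already fits is pasted at its original size.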

Departs from monolithic implementations (e.g. detectron2) by separating donor selection / instance sampling from the actual paste step — the user passes a list[dict] per call (consistent with Mosaic) and the transform pastes every object provided. Per-object content augmentation (rotation, flip, color jitter, scale-up beyond fit) is the user's responsibility.

Init args worth noting: scale_range, min_paste_area (≥1, area filter), blend_sigma_range (Gaussian feathering on the paste edge). Donors without a usable mask and bbox emit a UserWarning.

Plays nicely with instance_binding: paste rows get fresh ids allocated as max(existing_ids) + 1 so they never collide with surviving ids.
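
A minimal sketch of that id allocation scheme is shown below. allocate_paste_ids is a hypothetical name, and starting at 0 when there are no existing instances is an assumption; the notes do not state what happens in that case.

```python
import numpy as np


def allocate_paste_ids(existing_ids: np.ndarray, num_pasted: int) -> np.ndarray:
    """Return num_pasted consecutive ids starting at max(existing_ids) + 1."""
    # Assumption: with no existing instances, numbering starts at 0.
    start = int(existing_ids.max()) + 1 if existing_ids.size else 0
    return np.arange(start, start + num_pasted, dtype=np.int64)


existing = np.array([0, 3, 7])          # ids need not be contiguous
pasted = allocate_paste_ids(existing, 2)
```

Because new ids start above the current maximum, gaps left by instances that were filtered out earlier are never reused.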

PixelSpread (geometric / distortion)

Stochastic per-pixel displacement: for every output pixel (row, col) an offset (d_row, d_col) is drawn independently and uniformly from [-radius, radius] × [-radius, radius], and the value is read from (row + d_row, col + d_col). Same dense remapping field is applied to all targets so spatial annotations stay consistent.

Sits between blur (which aggregates a neighborhood) and smooth elastic warps (coherent displacement fields): the displacement is intentionally non-smooth and high-frequency. Useful for sensor noise, compression artifacts, fine-grained texture corruption, domain shifts where local pixel structure becomes unstable but global geometry is preserved.

Defaults to cv2.INTER_NEAREST — discrete pixel reassignment, not sub-pixel blending. radius=2 by default.
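
The displacement scheme can be sketched in a few lines of NumPy. This is an illustration, not the library's implementation: pixel_spread is a hypothetical function, offsets are drawn as integers (which corresponds to nearest-neighbour reads), and out-of-bounds source coordinates are clamped to the image edge, which is an assumption about border handling.

```python
import numpy as np


def pixel_spread(image: np.ndarray, radius: int, rng: np.random.Generator) -> np.ndarray:
    """Read every output pixel from a randomly offset source location."""
    h, w = image.shape[:2]
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Independent integer offsets in [-radius, radius] for every pixel.
    d_row = rng.integers(-radius, radius + 1, size=(h, w))
    d_col = rng.integers(-radius, radius + 1, size=(h, w))
    # Assumption: clamp source coordinates at the borders.
    src_rows = np.clip(rows + d_row, 0, h - 1)
    src_cols = np.clip(cols + d_col, 0, w - 1)
    return image[src_rows, src_cols]


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
spread = pixel_spread(image, radius=2, rng=rng)
```

Reusing the same (src_rows, src_cols) arrays for a mask would keep it consistent with the image, which is the point of applying one remapping field to all targets.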

ModeFilter (blur)

Replaces each pixel with the most frequent value in its local square neighborhood, computed independently per channel. Implemented with sliding_window_view + scipy.stats.mode (vectorized C). Distinct from MedianBlur: frequency-based vs order-statistic; preserves and expands dominant flat regions. Useful for cartoon / palette / quantised imagery.
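
The idea can be sketched for a single uint8 channel as follows. This is not the library's code: mode_filter is a hypothetical function, it swaps scipy.stats.mode for a NumPy-only np.bincount count to stay dependency-light (and is slower as a result), and reflect padding at the borders is an assumption.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def mode_filter(channel: np.ndarray, ksize: int = 3) -> np.ndarray:
    """Most frequent value in each ksize x ksize window of a 2D uint8 array."""
    pad = ksize // 2
    # Assumption: reflect padding so the output keeps the input size.
    padded = np.pad(channel, pad, mode="reflect")
    windows = sliding_window_view(padded, (ksize, ksize))        # (H, W, k, k)
    flat = windows.reshape(channel.shape[0], channel.shape[1], -1)
    # Count occurrences per window; ties resolve to the smallest value.
    modes = np.apply_along_axis(
        lambda w: np.bincount(w, minlength=256).argmax(), -1, flat
    )
    return modes.astype(channel.dtype)


flat_region = np.full((5, 5), 5, dtype=np.uint8)
flat_region[2, 2] = 9                    # isolated outlier
filtered = mode_filter(flat_region)
```

In this example the isolated outlier is absorbed by the surrounding flat region, which is the "preserves and expands dominant flat regions" behaviour described above.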

Enhance (pixel)

Native implementation of Pillow's EDGE_ENHANCE, EDGE_ENHANCE_MORE, and DETAIL filter family. Single transform with mode={"edge", "detail"} and alpha_range, blending the enhanced image with the original via K(alpha) = (1 - alpha) * I + alpha * E. Kernel calibrated so alpha=1 reproduces Pillow's preset exactly and alpha=2 with mode="edge" reproduces EDGE_ENHANCE_MORE, giving a continuous strength dial across both Pillow variants. Border pixels differ slightly from PIL because cv2 defaults to BORDER_REFLECT_101 while PIL replicates.
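
The claim that alpha=2 reproduces EDGE_ENHANCE_MORE can be checked directly on the kernels, because blending images linearly is equivalent to blending the convolution kernels. The sketch below assumes I is the 3x3 identity kernel and E is Pillow's EDGE_ENHANCE kernel divided by its scale factor; the kernel values are copied from Pillow's ImageFilter presets as best recalled, and blend_kernel is a hypothetical name, not the transform's internal function.

```python
import numpy as np

IDENTITY = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float64)
# Pillow's EDGE_ENHANCE preset, divided by its scale factor of 2.
EDGE_ENHANCE = np.array([[-1, -1, -1], [-1, 10, -1], [-1, -1, -1]], dtype=np.float64) / 2
# Pillow's EDGE_ENHANCE_MORE preset (scale factor 1).
EDGE_ENHANCE_MORE = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]], dtype=np.float64)


def blend_kernel(alpha: float, preset: np.ndarray = EDGE_ENHANCE) -> np.ndarray:
    """K(alpha) = (1 - alpha) * I + alpha * E, expressed on 3x3 kernels."""
    return (1.0 - alpha) * IDENTITY + alpha * preset


k0 = blend_kernel(0.0)   # identity: image unchanged
k1 = blend_kernel(1.0)   # the EDGE_ENHANCE preset
k2 = blend_kernel(2.0)   # extrapolates past the preset
```

With these values, k2 has a centre weight of 9 and -1 elsewhere, which is the EDGE_ENHANCE_MORE kernel; every blended kernel sums to 1, so overall brightness is preserved.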

Colorize (pixel)

Maps single-channel grayscale to a 2- or 3-color RGB gradient (Pillow ImageOps.colorize style). Bit-exact match to PIL via floor-quantized LUT, 2–5× faster on uint8 by applying the (256, 3) LUT through cv2.LUT after replicating the gray channel with cv2.cvtColor.

Each anchor (black_range, white_range, optional mid_range) is (rgb_low, rgb_high) and is sampled per-channel uniformly on every call — cheap domain randomization that PIL has no equivalent for. mid_value_range randomizes the midpoint position.
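
A two-anchor version of the LUT approach might look like the sketch below. It is an illustration under assumptions: build_lut, sample_anchor and colorize are hypothetical names, the floor-division interpolation is modelled on Pillow's two-color formula as understood here, the optional mid anchor is omitted, and plain NumPy indexing stands in for cv2.LUT.

```python
import numpy as np


def build_lut(black, white) -> np.ndarray:
    """(256, 3) uint8 LUT from black to white using floor division."""
    black = np.asarray(black, dtype=np.int64)
    white = np.asarray(white, dtype=np.int64)
    levels = np.arange(256, dtype=np.int64)[:, None]            # (256, 1)
    lut = black + (levels * (white - black)) // 255
    return lut.astype(np.uint8)


def sample_anchor(anchor_range, rng: np.random.Generator) -> np.ndarray:
    """Draw one RGB color, each channel uniform in [low, high] inclusive."""
    low, high = anchor_range
    return rng.integers(np.asarray(low), np.asarray(high) + 1)


def colorize(gray: np.ndarray, black_range, white_range, rng) -> np.ndarray:
    """Map a 2D uint8 image to RGB through a freshly sampled LUT."""
    lut = build_lut(sample_anchor(black_range, rng), sample_anchor(white_range, rng))
    return lut[gray]                                             # (H, W, 3)


rng = np.random.default_rng(0)
gray = np.array([[0, 128], [200, 255]], dtype=np.uint8)
rgb = colorize(gray, ((0, 0, 40), (20, 20, 80)), ((230, 200, 180), (255, 255, 220)), rng)
```

Setting rgb_low equal to rgb_high for an anchor turns the sampling off and gives a fixed gradient.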


Bug fixes

additional_targets for shape and image metadata (#239)

additional_targets aliasing wasn't applied when transforms looked up shape / image metadata, so aliased keys (e.g. image2 → image) silently fell through to the canonical key. Affected several pixel transforms (channel_dropout, color, noise, weather) and the Compose pre/post checks.

Added alias-aware get_image_data / get_shape helpers in core/utils.py, BasicTransform, and Compose. Routed pixel transforms through self.get_image_data. _resolve_volume_key now also skips None values so the canonical key wins. 578 lines of new tests in tests/test_additional_targets.py cover the regression.

applied_config did not record sampled scalars for some range params (#232)

Range constructor parameters are supposed to be resolved to concrete sampled values in applied_config (replay/inspection contract). Two transforms silently violated this:

  • CopyAndPaste.blend_sigma_range
  • PixelSpread.map_resolution_range (parameter has since been removed in #233 but the recording fix went in earlier)

Audit confirmed these were the only true violators on main. Other suspects (Illumination, PhotoMetricDistort, RandomShadow, TextImage, Dithering) are conditionally sampled or sample arrays/per-element values — keeping the range tuple is the right semantics there. A new parametrized test (test_applied_config_resolves_range_param_to_scalar) catches future regressions.

mixing.unpack_label_wrappers silently dropped non-dict wrappers (#236)

See breaking changes section — now raises TypeError.

CopyAndPaste instance binding was index-driven, not id-driven (#235)

CopyAndPaste was mixing positional indices and _bbox_instance_id values, which violated Compose._repack_instances's row-aligned fast-path contract and could raise IndexError when an upstream filtering transform (e.g. crop) dropped bboxes before paste.

Refactor: paste survival is now driven by _bbox_instance_id, paste rows get fresh ids (max(existing_ids) + 1), and apply_to_masks / apply_to_bboxes / apply_to_keypoints filter by id rather than position. Zero-area surviving masks are skipped so upstream crops can't resurrect empty instances. Backwards-compatible: when binding is inactive, the legacy positional path is preserved.
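
The difference between positional and id-driven filtering is easy to see on a toy example. The sketch below is an illustration, not the refactored code: filter_by_id is a hypothetical helper, and the arrays stand in for stacked masks and their _bbox_instance_id values.

```python
import numpy as np


def filter_by_id(rows: np.ndarray, row_ids: np.ndarray, surviving_ids) -> np.ndarray:
    """Keep rows whose instance id survives, regardless of row position."""
    keep = np.isin(row_ids, np.asarray(list(surviving_ids)))
    return rows[keep]


# Three mask planes remain after an upstream crop dropped instance 1,
# so ids are no longer equal to row positions.
masks = np.zeros((3, 4, 4), dtype=np.uint8)
mask_ids = np.array([0, 2, 5])
surviving = {2, 5}

kept = filter_by_id(masks, mask_ids, surviving)   # rows 1 and 2
# Positional indexing with the same ids, masks[[2, 5]], would raise
# IndexError because there is no row 5.
```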


Misc

  • CopyAndPaste donor pipeline redesigned (#237): donors provide bbox / keypoints in the pipeline's coord format on donor dims, the transform tight-crops, shrink-fits, optionally jitters scale, samples a uniform placement, and stamps. Adds bbox-only donor support, scale_range, min_paste_area.
  • CI: actions/checkout v4→v6, upload-artifact v4→v7, download-artifact v4→v8.
  • Saved serialization fixtures (tests/files/transform_serialization_v2*.json) regenerated for the renamed parameter names.

Commits

Commit | PR | Description
d5c7c12 | #239 | fix(core): resolve additional_targets for shape and image metadata
5eb741a | #237 | feat(CopyAndPaste): redesign donor pipeline with tight-crop, shrink-fit, jitter, random placement
16702b3 | #236 | fix(mixing): raise TypeError on non-dict bbox_labels/keypoint_labels wrappers
62e3961 | #235 | fix(CopyAndPaste): make instance binding fully ID-driven
b3cdc6c | #234 | feat(pixel): add Colorize transform
0bae87b | #233 | refactor!: API cleanup — _range rename, tuple-only sampling, drop legacy params
99b5ee0 | #232 | fix(applied_config): resolve range params to sampled scalars in CopyAndPaste, PixelSpread
7cb070e | #231 | feat(pixel): add Enhance transform
31d8ec4 | #230 | feat(blur): add ModeFilter transform
6fd021f | #229 | feat: add PixelSpread stochastic pixel displacement transform
e962762 | #223 | feat(mixing): instance binding for Mosaic / CopyAndPaste and row-aligned repack
742e0a6 | #222 | feat(core): add Compose.instance_binding for per-instance targets
981aefb | #221 | feat(mixing): add CopyAndPaste