Summary
This is a refactor + bug-fix release focused entirely on the instance-binding subsystem. No new transforms, no API renames outside Compose.
- Structural row-alignment invariant. When instance_binding is active, len(masks) == len(bboxes) and positional alignment between the two are now enforced by Compose itself, not by per-transform discipline. Any transform that violates the contract raises RuntimeError immediately from _resync_instance_ids; no more silent desync that surfaces as an IndexError three transforms later.
- Mosaic + Perspective + CopyAndPaste no longer crashes. The remaining 2.2.1 desync case (IndexError from _repack_mask_into after geometric transforms drop mask rows independently from BboxProcessor) is fixed at the source.
- Mosaic and CopyAndPaste now share one keep-mask between their apply_to_{bboxes,masks,keypoints} methods, computed once in get_params_dependent_on_data. The pre-2.2.2 dual mask-layout convention (id-indexed in Mosaic, position-indexed-with-sparse-ids in CopyAndPaste) is gone.
- New filter_bboxes_with_mask helper + BboxProcessor.filter_with_keep_mask return (filtered_bboxes, keep_mask). Compose uses the keep-mask to mirror filtering decisions onto masks (positional) and keypoints (by surviving _instance_id) atomically inside check_data_post_transform.
- Compose gains strict_instance_invariant=True (default). Set to False for one minor version's worth of grace if you maintain a custom transform that violates the row-alignment contract; you'll get a UserWarning instead of RuntimeError. The legacy permissive code path will be removed in 2.3.
Breaking changes
1. Custom DualTransform subclasses must keep masks row-aligned with bboxes
If your transform's apply_to_bboxes drops rows (min-area culling, out-of-frame removal, internal visibility filters, …), apply_to_masks MUST drop the corresponding rows. The default BasicTransform.apply_to_masks is total — it satisfies the contract for any transform whose apply_to_mask is row-preserving.
When you can't make per-mask decisions in isolation, the canonical pattern (used by Mosaic and CopyAndPaste) is to compute the keep-mask once in get_params_dependent_on_data and ferry it through params to all three apply methods.
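The shared keep-mask pattern can be sketched with plain NumPy. This is an illustration only, not the library's code: compute_keep_mask, apply_keep_mask and the min-area rule are hypothetical names chosen for the example, and it assumes each instance id equals its row position on input.

```python
import numpy as np


def compute_keep_mask(bboxes: np.ndarray, min_area: float) -> np.ndarray:
    """Hypothetical survival rule: keep rows whose pixel area is >= min_area."""
    widths = bboxes[:, 2] - bboxes[:, 0]
    heights = bboxes[:, 3] - bboxes[:, 1]
    return (widths * heights) >= min_area


def apply_keep_mask(bboxes, masks, keypoints, kp_instance_ids, keep_mask):
    """Drop the same instances from all three targets using one keep-mask."""
    # Assumption: instance id == row position, so surviving ids are the True positions.
    surviving_ids = np.flatnonzero(keep_mask)
    kept_bboxes = bboxes[keep_mask]  # positional
    kept_masks = masks[keep_mask]    # positional, same rows as bboxes
    kp_keep = np.isin(kp_instance_ids, surviving_ids)  # keypoints follow their instance id
    return kept_bboxes, kept_masks, keypoints[kp_keep], kp_instance_ids[kp_keep]


bboxes = np.array([[0, 0, 10, 10], [5, 5, 6, 6], [20, 20, 40, 30]], dtype=np.float32)
masks = np.zeros((3, 64, 64), dtype=np.uint8)
keypoints = np.array([[1, 1], [5.5, 5.5], [25, 25], [30, 28]], dtype=np.float32)
kp_instance_ids = np.array([0, 1, 2, 2])

# Computed once (the role get_params_dependent_on_data plays), then reused everywhere.
keep_mask = compute_keep_mask(bboxes, min_area=4.0)
out_bboxes, out_masks, out_kps, out_kp_ids = apply_keep_mask(
    bboxes, masks, keypoints, kp_instance_ids, keep_mask
)
```

Because every target is filtered from the same boolean array, the masks and bboxes cannot end up with different row counts.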
The processor-level mirror in Compose._bbox_filter_with_mirror covers the case where BboxProcessor is the sole filter, i.e. the transform does no filtering of its own and min_area/min_visibility drive removal. Transforms that filter internally need their own keep-mask plumbing.
Violations now surface as:
    RuntimeError: Instance-binding invariant violated: masks=N != bboxes=M.
    The last transform must keep masks positionally aligned with bboxes.
If you need time to migrate:
    A.Compose(
        transforms,
        bbox_params=...,
        instance_binding=["masks", "bboxes"],
        strict_instance_invariant=False,  # downgrades RuntimeError to UserWarning, 2.2.2 only
    )
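To make the difference between the two modes concrete, here is a minimal standalone sketch of a strict/permissive length check. check_row_alignment is a hypothetical stand-in written for this example; it does not reproduce the actual Compose internals, only the raise-versus-warn behavior described above.

```python
import warnings


def check_row_alignment(n_masks: int, n_bboxes: int, strict: bool = True) -> bool:
    """Return True when aligned; raise (strict) or warn (permissive) otherwise."""
    if n_masks == n_bboxes:
        return True
    message = (
        f"Instance-binding invariant violated: masks={n_masks} != bboxes={n_bboxes}."
    )
    if strict:
        raise RuntimeError(message)
    warnings.warn(message, UserWarning, stacklevel=2)
    return False


# Strict mode: the mismatch stops the pipeline right away.
try:
    check_row_alignment(5, 4, strict=True)
    strict_outcome = "ok"
except RuntimeError as exc:
    strict_outcome = f"RuntimeError: {exc}"

# Permissive mode: the mismatch is reported but execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    aligned = check_row_alignment(5, 4, strict=False)
```

In permissive mode the data is still misaligned after the warning, so it is a migration aid rather than a fix.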
2. The pre-2.2.2 resync recovery branches are gone
Compose._resync_masks_to_bboxes is renamed to Compose._resync_instance_ids and reduced to ~30 lines: rebase ids to arange(N), translate _kp_instance_id through the old→new table, assert the length invariant. The snapshot helpers (_snapshot_pre_processor_bbox_ids, _mask_positions_for_surviving_ids) and the pre_bbox_ids plumbing through Compose.__call__ are deleted.
If you imported either helper or relied on the old method name, switch to _resync_instance_ids (or, better, don't rely on _-prefixed internals).
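The three steps of the reduced resync (rebase, translate, assert) can be sketched roughly as follows. This is a hedged NumPy illustration, not the real method: the function signature is invented for the example, and dropping keypoints whose instance no longer exists is an assumption.

```python
import numpy as np


def resync_instance_ids(bbox_ids: np.ndarray, kp_ids: np.ndarray, n_masks: int):
    """Rebase bbox ids to arange(N) and translate keypoint ids through the same table."""
    n = len(bbox_ids)
    # Assert the length invariant.
    if n_masks != n:
        raise RuntimeError(
            f"Instance-binding invariant violated: masks={n_masks} != bboxes={n}."
        )
    # Rebase ids to arange(N).
    new_ids = np.arange(n)
    # Build the old->new lookup table; -1 marks ids that no longer exist.
    table_size = int(bbox_ids.max()) + 1 if n else 0
    table = np.full(table_size, -1, dtype=np.int64)
    table[bbox_ids] = new_ids
    # Assumption: keypoints whose instance was removed are dropped.
    kp_alive = np.isin(kp_ids, bbox_ids)
    new_kp_ids = table[kp_ids[kp_alive]]
    return new_ids, new_kp_ids, kp_alive


bbox_ids = np.array([0, 1, 2, 7])   # sparse ids left over after filtering
kp_ids = np.array([0, 2, 7, 7, 3])  # the last keypoint belongs to a removed instance
new_ids, new_kp_ids, kp_alive = resync_instance_ids(bbox_ids, kp_ids, n_masks=4)
```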
New features
filter_bboxes_with_mask and BboxProcessor.filter_with_keep_mask
Public helpers in albumentations/core/bbox_utils.py:
    from albumentations.core.bbox_utils import filter_bboxes_with_mask

    filtered, keep_mask = filter_bboxes_with_mask(
        bboxes, image_shape,
        min_area=1, min_visibility=0.0, min_width=1, min_height=1, max_accept_ratio=None,
    )
    # filtered.shape == (sum(keep_mask), bboxes.shape[1])
    # keep_mask is a bool[bboxes.shape[0]] aligned with the input
BboxProcessor.filter_with_keep_mask(data, shape) -> tuple[np.ndarray, np.ndarray] exposes the same primitive on the processor instance. The original filter_bboxes and BboxProcessor.filter are preserved as one-line wrappers — public API unchanged.
This unblocks any user code that needs to mirror bbox-processor filtering decisions onto a parallel data structure (e.g. extra metadata arrays kept outside data["bboxes"]).
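As an illustration of that use case, the sketch below applies one keep-mask to several row-aligned side structures. mirror_keep_mask is a hypothetical user-side helper, and the keep_mask literal stands in for the second value returned by filter_bboxes_with_mask so the example runs without the library.

```python
import numpy as np


def mirror_keep_mask(keep_mask, *parallel):
    """Apply one keep-mask to any number of row-aligned arrays or lists."""
    keep_mask = np.asarray(keep_mask, dtype=bool)
    mirrored = []
    for seq in parallel:
        if len(seq) != keep_mask.shape[0]:
            raise ValueError("parallel structure is not row-aligned with keep_mask")
        if isinstance(seq, np.ndarray):
            mirrored.append(seq[keep_mask])
        else:
            mirrored.append([item for item, keep in zip(seq, keep_mask) if keep])
    return tuple(mirrored)


# Stand-in for the keep_mask returned alongside the filtered bboxes.
keep_mask = np.array([True, False, True])
track_ids = np.array([11, 12, 13])  # metadata kept outside data["bboxes"]
captions = ["cat", "dog", "bird"]

kept_track_ids, kept_captions = mirror_keep_mask(keep_mask, track_ids, captions)
```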
Compose(strict_instance_invariant: bool = True)
New constructor argument on Compose. Documented above under Breaking changes.
Bug fixes
Mosaic + Perspective + CopyAndPaste no longer raises IndexError
The pre-2.2.2 Mosaic ran filter_bboxes inside apply_to_bboxes but apply_to_masks emitted all assembled mask layers. Perspective's apply_to_masks then dropped fully-out-of-frame rows independently of BboxProcessor's min_visibility/min_area filter on bboxes. By the time the resync ran, bbox_ids = [0, 1, 2, 7] indexed into a masks tensor that had been independently compacted to size 5 — masks[7] blew up.
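The failure mode is easy to reproduce in isolation with NumPy. The array shapes below are made up for the example; only the id values and the mask count come from the description above.

```python
import numpy as np

# Masks were independently compacted to 5 rows...
masks = np.zeros((5, 8, 8), dtype=np.uint8)
# ...while the bboxes still carry ids from the uncompacted layout.
bbox_ids = np.array([0, 1, 2, 7])

try:
    selected = masks[bbox_ids]
    raised_index_error = False
except IndexError:
    # Index 7 is out of bounds for an axis of size 5.
    raised_index_error = True
```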
Fixed at three layers:
- Mosaic computes survival once in get_params_dependent_on_data (_compute_mosaic_survival → shared keep_mask + surviving_instance_ids), and a per-cell pre-pass _filter_cell_masks_to_surviving_bboxes runs before remap_mosaic_instance_label_ids so each cell's masks are aligned with its surviving bboxes before global concatenation.
- Compose.check_data_post_transform uses BboxProcessor.filter_with_keep_mask and mirrors the resulting keep-mask onto masks (positional) and keypoints (by surviving id) in one atomic step. A pre-filter realignment stage handles transforms (CoarseDropout, internal-filter Crop variants) that drop bboxes inside their own apply_to_bboxes without touching masks.
- Compose._resync_instance_ids asserts the invariant before the next transform sees the data, surfacing any future regression as RuntimeError at the structural boundary instead of an IndexError deep in repack.
CopyAndPaste emits dense-id output
apply_to_bboxes re-stamps _bbox_instance_id = arange(N) at output; apply_to_keypoints calls _restamp_keypoint_ids with the matching old→new table. The "sparse-id positional" mask layout the resync had to special-case is no longer reachable.
CoarseDropout with instance_binding no longer breaks the next geometric transform
CoarseDropout's apply_to_bboxes filters out bboxes overlapping holes, but apply_to_mask only carves the hole into the mask data and leaves the row stack intact. The pre-filter realignment stage in Compose._bbox_filter_with_mirror collapses the resulting len(masks) > len(bboxes) mismatch by fancy-indexing masks down to the surviving id set before the next transform runs. Same fix applies to other transforms with internal bbox-only filters.
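A rough NumPy sketch of that realignment step follows. realign_masks_to_bboxes is a hypothetical name, and the sketch assumes that the mask stack is still indexed by the original instance id and that the id sits in the trailing bbox column.

```python
import numpy as np


def realign_masks_to_bboxes(masks: np.ndarray, bboxes: np.ndarray) -> np.ndarray:
    """Fancy-index masks down to the ids that survived bbox-only filtering."""
    if len(masks) == len(bboxes):
        return masks  # already aligned, nothing to do
    # Assumption: the trailing bbox column holds the instance id.
    surviving_ids = bboxes[:, -1].astype(np.int64)
    return masks[surviving_ids]


# Four mask rows, each filled with a distinct value so rows can be told apart.
masks = np.stack([np.full((4, 4), i + 1, dtype=np.uint8) for i in range(4)])
# Only instances 0 and 3 survived the transform's internal bbox filter.
bboxes = np.array(
    [[0.0, 0.0, 2.0, 2.0, 0.0],
     [1.0, 1.0, 3.0, 3.0, 3.0]]
)

aligned_masks = realign_masks_to_bboxes(masks, bboxes)
```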
Misc
- Single conceptual _INSTANCE_ID = "_instance_id" namespace documented for the trailing label column on both bboxes and keypoints. The two ferry-key dict constants _BBOX_INSTANCE_ID = "_bbox_instance_id" and _KP_INSTANCE_ID = "_kp_instance_id" remain as distinct dict keys (different per-row lengths) but are grouped under _INSTANCE_ID_FERRY_KEYS for utilities like _clean_params_dict.
- BasicTransform.apply_to_masks now has an explicit row-alignment contract docstring spelling out the four invariants and the RuntimeError consequence.
- tests/test_instance_binding.py adds:
  - TestPipelineInvariantsHypothesis.test_pipeline_invariant_after_every_transform: 1000-example hypothesis fuzz of pre + mix + post pipelines (including CoarseDropout) asserting the structural invariant after every transform boundary, not just at the end.
  - TestStructuralInvariantContract: contract test that a deliberately-broken DualTransform whose apply_to_masks drops a row triggers RuntimeError in strict mode and UserWarning in legacy mode.
- docs/design/instance_binding.md rewritten with the new single-invariant contract + mermaid state diagram of the three enforcement layers.