Albumentations to Torchvision Transform Mapping
This page maps Albumentations transforms to the closest Torchvision v2 operation when Torchvision has a practical direct equivalent. Use it as a migration and capability guide, not as a performance page.
How to Read This Page
- A named Torchvision transform means Torchvision v2 has a practical direct operation for the same idea.
- `-` means Torchvision does not support that transform as a built-in augmentation primitive.
- `(partial)` means Torchvision covers part of the behavior, but not the full Albumentations transform contract.
- The tables show built-in Torchvision operations. Custom PyTorch implementations are not counted as Torchvision support.
Migration Rules
- Albumentations receives NumPy arrays in OpenCV-style `H,W,C` channel-last layout. Torchvision usually receives PIL images or `C,H,W` tensors.
- Albumentations uses transform names plus `p`; Torchvision often encodes randomness in class names such as `RandomHorizontalFlip`.
- Albumentations can update masks, boxes, keypoints, oriented bounding boxes (OBB), volumes, and videos through `A.Compose`. Torchvision v2 can update supported TVTensor targets, but its policy model and serialization/debug contract are different.
- Torchvision color transforms often follow PIL semantics. Albumentations often uses OpenCV or explicit dtype-aware behavior, so small pixel differences are expected.
- Albumentations `Mosaic` and `CopyAndPaste` are per-sample multi-image policies with target handling. They are not the same product boundary as batch-level MixUp or CutMix after collation.
- MixUp, CutMix, GPU-side normalization, and other batch-level tensor policies are a good reason to combine Albumentations with Torchvision/PyTorch tensor code, not a reason to give up Albumentations for the rest of the pipeline.
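The layout rule in the first bullet can be sketched with NumPy alone. This is a minimal illustration, not part of either library: the helper names `hwc_to_chw` and `chw_to_hwc` are hypothetical, and the zero array stands in for an image loaded with OpenCV or a similar reader.

```python
import numpy as np


def hwc_to_chw(image: np.ndarray) -> np.ndarray:
    """Move channels first: (H, W, C) -> (C, H, W)."""
    # transpose returns a strided view; ascontiguousarray copies it into a dense buffer
    return np.ascontiguousarray(image.transpose(2, 0, 1))


def chw_to_hwc(image: np.ndarray) -> np.ndarray:
    """Move channels last: (C, H, W) -> (H, W, C)."""
    return np.ascontiguousarray(image.transpose(1, 2, 0))


hwc = np.zeros((480, 640, 3), dtype=np.uint8)  # Albumentations-style input
chw = hwc_to_chw(hwc)                          # Torchvision-style tensor layout
print(chw.shape)  # (3, 480, 640)
```

From here, `torch.from_numpy(chw)` would give a `C,H,W` tensor that Torchvision or plain PyTorch code can consume.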
Geometry and Size
| Albumentations transform | Torchvision transform |
|---|---|
| Resize | Resize |
| RandomCrop | RandomCrop |
| RandomResizedCrop | RandomResizedCrop |
| HorizontalFlip | RandomHorizontalFlip |
| VerticalFlip | RandomVerticalFlip |
| Pad | Pad |
| Rotate | RandomRotation |
| Affine | RandomAffine |
| Perspective | RandomPerspective |
| ElasticTransform | ElasticTransform |
| Transpose | - |
| RandomRotate90 | - |
| SquareSymmetry | - |
| SafeRotate | - |
| RandomScale | ScaleJitter / RandomResize (partial) |
| ShiftScaleRotate | RandomAffine (partial) |
| GridDistortion | - |
| PiecewiseAffine | - |
| ThinPlateSpline | - |
| OpticalDistortion | - |
| RandomGridShuffle | - |
| Morphological | - |
| LongestMaxSize | Resize / RandomShortestSize (partial) |
| SmallestMaxSize | Resize / RandomShortestSize (partial) |
| PadIfNeeded | Pad (partial) |
| CropAndPad | RandomZoomOut (partial) |
| RandomSizedCrop | RandomResizedCrop (partial) |
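Some `(partial)` rows can be closed by computing parameters yourself before calling a built-in Torchvision operation. As one hedged illustration, the sketch below computes the output size for a longest-side resize; the result could then be passed to `Resize` as an explicit `(height, width)`. The helper name `longest_max_size_shape` is hypothetical, and the rounding rule is an assumption, so the result may differ from Albumentations by a pixel.

```python
def longest_max_size_shape(height: int, width: int, max_size: int) -> tuple[int, int]:
    """Return (new_height, new_width) so the longest side equals max_size."""
    scale = max_size / max(height, width)
    # Assumption: round to nearest and never collapse a side to zero pixels
    new_height = max(1, round(height * scale))
    new_width = max(1, round(width * scale))
    return new_height, new_width


print(longest_max_size_shape(600, 800, 400))   # (300, 400)
print(longest_max_size_shape(1000, 250, 500))  # (500, 125)
```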
Color, Intensity, and Tensor Operations
Weather, Illumination, and Dropout
| Albumentations transform | Torchvision transform |
|---|---|
| CoarseDropout | RandomErasing (partial) |
| Illumination | - |
| PlasmaBrightnessContrast | - |
| PlasmaShadow | - |
| RandomRain | - |
| SaltAndPepper | - |
| RandomSnow | - |
| RandomFog | - |
| RandomShadow | - |
| RandomSunFlare | - |
| RandomGravel | - |
| GridDropout | - |
| PixelDropout | - |
| ConstrainedCoarseDropout | - |
| AtmosphericFog | - |
| Vignetting | - |
| Dithering | - |
| FilmGrain | - |
| Halftone | - |
| LensFlare | - |
| GridMask | - |
| WaterRefraction | - |
Target-Aware and Multi-Image Policies
| Albumentations transform | Torchvision transform |
|---|---|
| AtLeastOneBBoxRandomCrop | RandomIoUCrop (partial) |
| BBoxSafeRandomCrop | RandomIoUCrop (partial) |
| RandomSizedBBoxSafeCrop | RandomIoUCrop + RandomResizedCrop (partial) |
| CropNonEmptyMaskIfExists | - |
| RandomCropNearBBox | - |
| RandomCropFromBorders | - |
| LetterBox | - |
| MaskDropout | - |
| Mosaic | - |
| CopyAndPaste | - |
Volume and Domain-Specific Transforms
| Albumentations transform | Torchvision transform |
|---|---|
| FDA | - |
| HistogramMatching | - |
| PixelDistributionAdaptation | - |
| HEStain | - |
| FrequencyMasking | - |
| TimeMasking | - |
| TimeReverse | - |
| CenterCrop3D | - |
| RandomCrop3D | - |
| CoarseDropout3D | - |
| CubicSymmetry | - |
| GridShuffle3D | - |
| Pad3D | - |
| PadIfNeeded3D | - |
Migration Example
Torchvision v2 classification pipelines map cleanly when the policy is image-only:

    import torch
    import torchvision.transforms.v2 as T

    pipeline = T.Compose(
        [
            T.RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 4 / 3)),
            T.RandomHorizontalFlip(p=0.5),
            # Scalar arguments expand to ranges, e.g. brightness=0.5 -> factor in [0.5, 1.5]
            T.ColorJitter(brightness=0.5, contrast=1.5, saturation=1.5, hue=0.5),
            # Convert to a float tensor image scaled to [0, 1] before normalizing
            T.ToImage(),
            T.ToDtype(torch.float32, scale=True),
            T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ],
    )
The equivalent Albumentations policy keeps the same structure:

    import albumentations as A

    pipeline = A.Compose(
        [
            A.RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 4 / 3), p=1.0),
            A.HorizontalFlip(p=0.5),
            A.ColorJitter(
                brightness_range=(0.5, 1.5),
                contrast_range=(0, 2.5),
                saturation_range=(0, 2.5),
                hue_range=(-0.5, 0.5),
                # p=1.0 because Torchvision's ColorJitter is applied to every sample
                p=1.0,
            ),
            A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
        ],
    )
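The ranges in the Albumentations example follow from how Torchvision interprets a scalar `ColorJitter` argument: a brightness, contrast, or saturation value `v` samples a factor from `[max(0, 1 - v), 1 + v]`, and a hue value `h` samples a shift from `[-h, h]`. A small sketch of that conversion, using hypothetical helper names that belong to neither library:

```python
def factor_range(value: float) -> tuple[float, float]:
    """Range for brightness, contrast, or saturation given a Torchvision scalar."""
    if value < 0:
        raise ValueError("value must be non-negative")
    # The lower bound is clipped at 0 because a negative factor has no meaning
    return max(0.0, 1.0 - value), 1.0 + value


def hue_range(value: float) -> tuple[float, float]:
    """Range for hue given a Torchvision scalar, which must lie in [0, 0.5]."""
    if not 0.0 <= value <= 0.5:
        raise ValueError("hue must be in [0, 0.5]")
    return -value, value


print(factor_range(0.5))  # (0.5, 1.5)
print(factor_range(1.5))  # (0.0, 2.5)
print(hue_range(0.5))     # (-0.5, 0.5)
```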
For detection, the Torchvision version usually relies on TVTensors:

    import torchvision.transforms.v2 as T

    # Boxes are updated only when they are passed in as TVTensors alongside the image
    pipeline = T.Compose(
        [
            T.RandomResizedCrop(size=(512, 512), scale=(0.8, 1.0)),
            T.RandomHorizontalFlip(p=0.5),
        ],
    )
The Albumentations version puts the target contract in `A.Compose`:

    import albumentations as A

    pipeline = A.Compose(
        [
            A.RandomResizedCrop(size=(512, 512), scale=(0.8, 1.0), p=1.0),
            A.HorizontalFlip(p=0.5),
        ],
        # Box format and label fields are declared once for the whole pipeline
        bbox_params=A.BboxParams(coord_format="coco", label_fields=["labels"]),
    )
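The `coco` format stores each box as `(x_min, y_min, width, height)`, while Torchvision detection code frequently uses corner coordinates (`XYXY`). When moving annotations between the two pipelines, a conversion like the following sketch may be needed. The function names are hypothetical and the code is plain Python, independent of both libraries.

```python
def coco_to_xyxy(box):
    """(x_min, y_min, width, height) -> (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xyxy_to_coco(box):
    """(x_min, y_min, x_max, y_max) -> (x_min, y_min, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


coco_box = (100.0, 150.0, 200.0, 250.0)
xyxy_box = coco_to_xyxy(coco_box)
print(xyxy_box)                            # (100.0, 150.0, 300.0, 400.0)
print(xyxy_to_coco(xyxy_box) == coco_box)  # True
```

Torchvision's `BoundingBoxes` also accepts `format="XYWH"`, which matches the COCO layout and can avoid the conversion entirely.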