Albumentations vs Kornia

Use Albumentations for normal training augmentation. Use Kornia only when you specifically need differentiable transforms, augmentation inside a PyTorch graph, or GPU tensor augmentation that profiling proves is not competing with model training.

That is a narrow exception. Most training pipelines do not need differentiable augmentation. They need fast, diverse, correct augmentation before the forward pass, while each sample still carries its image, masks, boxes, keypoints, labels, and metadata together. That is the Albumentations use case.

Short Version

Choose Albumentations when:

  • augmentation runs in the dataloader before batching
  • you train detection, segmentation, pose, OCR, medical, remote-sensing, or video models
  • masks, boxes, keypoints, rotated boxes, volumes, or video frames must stay aligned with the image
  • images may have arbitrary channel counts
  • you need replay, sampled parameters, and reproducible debugging
  • you want a broader augmentation catalog without writing PyTorch modules for every missing transform

Choose Kornia only when:

  • the transform must be differentiable
  • augmentation is part of a model-like PyTorch graph
  • profiling shows GPU augmentation uses idle compute instead of slowing training
  • the task is image-only or you already own all annotation transformation logic

What You Get with Albumentations

  • Correct samples before batching. Albumentations transforms an image and its related targets in one call (see the sketch after this list). That is the clean boundary for crops, flips, affine transforms, perspective transforms, bbox filtering, mask updates, and keypoint handling.
  • Less custom target code. With a tensor-first augmentation layer, annotation propagation often becomes code you write and maintain. With Albumentations, common target handling is part of the pipeline.
  • Faster CPU augmentation in the common case. The Kornia benchmark page and benchmark source show Albumentations ahead for many CPU-side transforms used in real pipelines, including Affine, Blur, GaussianBlur, MotionBlur, Perspective, RandomResizedCrop, RandomGamma, and Solarize.
  • More augmentation diversity. The Kornia transform mapping lists many Albumentations transforms with no direct Kornia benchmark equivalent (marked "-"): weather effects, camera effects, dropout variants, padding/cropping utilities, and annotation-aware transforms.
  • No GPU tax unless you asked for one. CPU-side augmentation can prepare the next batch while the GPU trains. GPU augmentation only helps when it uses otherwise idle GPU time; otherwise it steals compute from forward/backward passes.
  • Debuggable and serializable policies. Replay, saved sampled parameters, and pipeline serialization make augmentation part of the experiment definition instead of invisible tensor-side randomness.
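
The first point in the list, correct samples before batching, looks like this in practice. A minimal sketch with illustrative shapes and label names; parameter spellings can differ slightly across Albumentations versions:

import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)   # illustrative sample
mask = np.zeros((480, 640), dtype=np.uint8)
bboxes = [(32, 40, 200, 180)]        # pascal_voc: x_min, y_min, x_max, y_max
labels = ["person"]

pipeline = A.Compose(
    [
        A.RandomCrop(height=256, width=256, p=1.0),
        A.HorizontalFlip(p=0.5),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

out = pipeline(image=image, mask=mask, bboxes=bboxes, labels=labels)
# out["image"], out["mask"], out["bboxes"], and out["labels"] stay aligned:
# the crop clips and filters boxes, and labels follow their boxes.

One call transforms the image and updates every target, so no separate bookkeeping pass runs after augmentation.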

What You Lose If You Default to Kornia

If you do not need differentiability, using Kornia for the main augmentation pipeline usually buys little and can add work:

  • you may move augmentation onto the same GPU that should train the model
  • you often keep owning bbox clipping, filtering, visibility, and keypoint semantics
  • object-level crop logic can happen after batching, where it is harder to reason about individual samples
  • replay/debug tooling for sampled augmentation parameters becomes your responsibility (the Albumentations built-in version is sketched after this list)
  • transforms missing from Kornia's augmentation API become custom PyTorch modules
  • annotation-heavy workloads still need separate correctness tests
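
For comparison, the replay tooling item above is built in on the Albumentations side. A minimal sketch with illustrative arrays:

import numpy as np
import albumentations as A

image_a = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
image_b = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

pipeline = A.ReplayCompose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.5),
])

first = pipeline(image=image_a)
# first["replay"] records which transforms fired and with what sampled parameters,
# so the exact augmentation can be logged or re-applied to another image.
same = A.ReplayCompose.replay(first["replay"], image=image_b)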

For image-only differentiable transforms, Kornia can be the right tool. For ordinary training augmentation, those benefits rarely matter; correctness, diversity, throughput, and maintainability matter more.

Pipeline Placement

The difference is not just array type. It is where augmentation happens.

Albumentations: decode -> per-sample augmentation with targets -> batch -> model
Kornia in a dataloader: decode -> tensor conversion -> per-sample tensor augmentation -> batch -> model

If augmentation must update annotations before collation, Albumentations puts that work at the natural boundary. The sample is still a sample. Boxes can be filtered, masks can stay aligned, keypoint labels can move with coordinates, and metadata can remain attached.

Kornia is useful when the operation is naturally a PyTorch tensor operation: differentiable geometry, model-graph image processing, or GPU-side tensor math. It is a poor default when the main problem is bookkeeping: keeping boxes, masks, keypoints, labels, and per-object filtering correct after each random transform.
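
A minimal sketch of that placement, assuming a standard PyTorch Dataset; the class and field names are illustrative, not from either library:

from torch.utils.data import Dataset

class DetectionSamples(Dataset):
    def __init__(self, samples, pipeline):
        self.samples = samples      # each sample: dict with "image", "bboxes", "labels"
        self.pipeline = pipeline    # an A.Compose built with bbox_params

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        s = self.samples[idx]
        # Per-sample augmentation runs here, before the collate function batches
        # tensors, so box filtering and label bookkeeping happen per sample.
        out = self.pipeline(image=s["image"], bboxes=s["bboxes"], labels=s["labels"])
        return out["image"], out["bboxes"], out["labels"]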

Channel Support

Kornia tensors can hold arbitrary channel counts for many tensor operations. Albumentations likewise preserves arbitrary channel counts for its channel-agnostic transforms. In both libraries, color-specific transforms still assume RGB input.

For hyperspectral, medical, microscopy, remote-sensing, or sensor-fusion inputs, the rule is simple: use channel-agnostic transforms unless converting to RGB is deliberate. Albumentations keeps this workflow in the same per-sample pipeline as the rest of your targets.
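
A minimal sketch with a hypothetical six-channel input (shapes illustrative):

import numpy as np
import albumentations as A

image = np.random.rand(256, 256, 6).astype(np.float32)   # e.g., stacked spectral bands

pipeline = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomCrop(height=224, width=224, p=1.0),
])

out = pipeline(image=image)
assert out["image"].shape[2] == 6   # channel-agnostic transforms keep all six channels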

Migration Example

For image-only geometry, the code shape is familiar:

import kornia.augmentation as K
import albumentations as A

kornia_pipeline = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=(-10, 10), p=0.5),
    data_keys=["input"],
)

albumentations_pipeline = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(angle_range=(-10, 10), p=0.5),
])

The migration value appears when the sample has targets:

pipeline = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.Affine(rotate=(-10, 10), p=0.5),
    ],
    keypoint_params=A.KeypointParams(coord_format="xy", label_fields=["keypoint_labels"]),
)
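
Applying it stays a single per-sample call; the image, coordinates, and label names below are illustrative:

import numpy as np

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
out = pipeline(
    image=image,
    keypoints=[(120.0, 80.0), (300.0, 210.0)],
    keypoint_labels=["left_eye", "right_eye"],
)
# out["keypoints"] and out["keypoint_labels"] stay paired; keypoints dropped by a
# transform have their labels dropped with them.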

Now the augmentation policy and target contract live in one place.