Albumentations vs Kornia

Use Albumentations for normal training augmentation. Use Kornia only when you specifically need differentiable transforms, augmentation inside a PyTorch graph, or GPU tensor augmentation that profiling proves is not competing with model training.

That is a narrow exception. Most training pipelines do not need differentiable augmentation. They need fast, diverse, correct augmentation before the forward pass, while each sample still carries its image, masks, boxes, keypoints, labels, and metadata together. That is the Albumentations use case.

Short Version

Choose Albumentations when:

  • augmentation runs in the dataloader before batching
  • you train detection, segmentation, pose, OCR, medical, remote-sensing, or video models
  • masks, boxes, keypoints, rotated boxes, volumes, or video frames must stay aligned with the image
  • images may have arbitrary channel counts
  • you need replay, sampled parameters, and reproducible debugging
  • you want a broader augmentation catalog without writing PyTorch modules for every missing transform

Choose Kornia only when:

  • the transform must be differentiable
  • augmentation is part of a model-like PyTorch graph
  • profiling shows GPU augmentation uses idle compute instead of slowing training
  • the task is image-only or you already own all annotation transformation logic

What You Get with Albumentations

  • Correct samples before batching. Albumentations transforms an image and its related targets in one call (see the sketch after this list). That is the clean boundary for crops, flips, affine transforms, perspective transforms, bbox filtering, mask updates, and keypoint handling.
  • Less custom target code. With a tensor-first augmentation layer, annotation propagation often becomes code you write and maintain. With Albumentations, common target handling is part of the pipeline.
  • Faster CPU augmentation in the common case. The Kornia benchmark page and benchmark source show Albumentations ahead for many CPU-side transforms used in real pipelines, including Affine, Blur, GaussianBlur, MotionBlur, Perspective, RandomResizedCrop, RandomGamma, and Solarize.
  • More augmentation diversity. The Kornia transform mapping lists many Albumentations transforms with no direct Kornia benchmark equivalent (marked "-"): weather effects, camera effects, dropout variants, padding/cropping utilities, and annotation-aware transforms.
  • No GPU tax unless you asked for one. CPU-side augmentation can prepare the next batch while the GPU trains. GPU augmentation only helps when it uses otherwise idle GPU time; otherwise it steals compute from forward/backward passes.
  • Debuggable and serializable policies. Replay, saved sampled parameters, and pipeline serialization make augmentation part of the experiment definition instead of invisible tensor-side randomness.
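
The first point in the list, correct samples before batching, looks like this in practice. A minimal sketch with illustrative shapes and label names; parameter spellings can differ slightly across Albumentations versions:

import numpy as np
import albumentations as A

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)   # illustrative sample
mask = np.zeros((480, 640), dtype=np.uint8)
bboxes = [(32, 40, 200, 180)]        # pascal_voc: x_min, y_min, x_max, y_max
labels = ["person"]

pipeline = A.Compose(
    [
        A.RandomCrop(height=256, width=256, p=1.0),
        A.HorizontalFlip(p=0.5),
    ],
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

out = pipeline(image=image, mask=mask, bboxes=bboxes, labels=labels)
# out["image"], out["mask"], out["bboxes"], and out["labels"] stay aligned:
# the crop clips and filters boxes, and labels follow their boxes.

One call transforms the image and updates every target, so no separate bookkeeping pass runs after augmentation.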

What You Lose If You Default to Kornia

If you do not need differentiability, using Kornia for the main augmentation pipeline usually buys little and can add work:

  • you may move augmentation onto the same GPU that should train the model
  • you often keep owning bbox clipping, filtering, visibility, and keypoint semantics
  • object-level crop logic can happen after batching, where it is harder to reason about individual samples
  • replay/debug tooling for sampled augmentation parameters becomes your responsibility (the Albumentations built-in version is sketched after this list)
  • transforms missing from Kornia's augmentation API become custom PyTorch modules
  • annotation-heavy workloads still need separate correctness tests
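
For comparison, the replay tooling item above is built in on the Albumentations side. A minimal sketch with illustrative arrays:

import numpy as np
import albumentations as A

image_a = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
image_b = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

pipeline = A.ReplayCompose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.5),
])

first = pipeline(image=image_a)
# first["replay"] records which transforms fired and with what sampled parameters,
# so the exact augmentation can be logged or re-applied to another image.
same = A.ReplayCompose.replay(first["replay"], image=image_b)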

For image-only differentiable transforms, Kornia can be the right tool. For ordinary training augmentation, those benefits rarely matter; correctness, diversity, throughput, and maintainability matter more.

Pipeline Placement

The difference is not just array type. It is where augmentation happens.

Albumentations: decode -> per-sample augmentation with targets -> batch -> model
Kornia in a dataloader: decode -> tensor conversion -> per-sample tensor augmentation -> batch -> model

If augmentation must update annotations before collation, Albumentations puts that work at the natural boundary. The sample is still a sample. Boxes can be filtered, masks can stay aligned, keypoint labels can move with coordinates, and metadata can remain attached.

Kornia is useful when the operation is naturally a PyTorch tensor operation: differentiable geometry, model-graph image processing, or GPU-side tensor math. It is a poor default when the main problem is bookkeeping: keeping boxes, masks, keypoints, labels, and per-object filtering correct after each random transform.
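
A minimal sketch of that placement, assuming a standard PyTorch Dataset; the class and field names are illustrative, not from either library:

from torch.utils.data import Dataset

class DetectionSamples(Dataset):
    def __init__(self, samples, pipeline):
        self.samples = samples      # each sample: dict with "image", "bboxes", "labels"
        self.pipeline = pipeline    # an A.Compose built with bbox_params

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        s = self.samples[idx]
        # Per-sample augmentation runs here, before the collate function batches
        # tensors, so box filtering and label bookkeeping happen per sample.
        out = self.pipeline(image=s["image"], bboxes=s["bboxes"], labels=s["labels"])
        return out["image"], out["bboxes"], out["labels"]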

Channel Support

Kornia tensors can hold arbitrary channel counts for many tensor operations. Albumentations likewise preserves arbitrary channel counts for its channel-agnostic transforms. In both libraries, color-specific transforms still assume RGB input.

For hyperspectral, medical, microscopy, remote-sensing, or sensor-fusion inputs, the rule is simple: use channel-agnostic transforms unless converting to RGB is deliberate. Albumentations keeps this workflow in the same per-sample pipeline as the rest of your targets.
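
A minimal sketch with a hypothetical six-channel input (shapes illustrative):

import numpy as np
import albumentations as A

image = np.random.rand(256, 256, 6).astype(np.float32)   # e.g., stacked spectral bands

pipeline = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomCrop(height=224, width=224, p=1.0),
])

out = pipeline(image=image)
assert out["image"].shape[2] == 6   # channel-agnostic transforms keep all six channels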

Migration Example

For image-only geometry, the code shape is familiar:

import kornia.augmentation as K
import albumentations as A

kornia_pipeline = K.AugmentationSequential(
    K.RandomHorizontalFlip(p=0.5),
    K.RandomRotation(degrees=(-10, 10), p=0.5),
    data_keys=["input"],
)

albumentations_pipeline = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(angle_range=(-10, 10), p=0.5),
])

The migration value appears when the sample has targets:

pipeline = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.Affine(rotate=(-10, 10), p=0.5),
    ],
    keypoint_params=A.KeypointParams(coord_format="xy", label_fields=["keypoint_labels"]),
)
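
Applying it stays a single per-sample call; the image, coordinates, and label names below are illustrative:

import numpy as np

image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
out = pipeline(
    image=image,
    keypoints=[(120.0, 80.0), (300.0, 210.0)],
    keypoint_labels=["left_eye", "right_eye"],
)
# out["keypoints"] and out["keypoint_labels"] stay paired; keypoints dropped by a
# transform have their labels dropped with them.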

Now the augmentation policy and target contract live in one place.