AlbumentationsX vs Kornia

Compare AlbumentationsX with Kornia for image augmentation: CPU/GPU tradeoffs, RGB benchmark results, and conversion examples.

What Is Different?

Kornia is a differentiable computer vision library that operates on PyTorch tensors. AlbumentationsX is a fast CPU augmentation library that operates on NumPy arrays before the data reaches the model.

  • Kornia is tensor-first and shines on batched GPU augmentation; AlbumentationsX is NumPy-first and shines in CPU data-loading pipelines.
  • Kornia transforms can be differentiable and participate in model graphs; AlbumentationsX transforms are preprocessing/data augmentation steps.
  • AlbumentationsX has broad task-level target handling for images, masks, boxes, keypoints, volumes, and multiple related inputs.
  • Kornia is a better fit when augmentation must happen on-device after batching; AlbumentationsX is usually simpler before batching.

Benchmark vs Kornia

The benchmark data is generated during the site build from the published benchmark artifacts. Results are shown for the full PyTorch DataLoader pipeline and for the single-operation RGB micro benchmark. Published GPU artifacts are shown in the DataLoader table where they exist; the RGB micro table on this page is CPU-only.

DataLoader Pipeline

End-to-end in-memory DataLoader throughput with 8 workers and batch size 256; normalization, tensor conversion, batching, and collation are included in the measurement. Each row is labeled with only the transform that varies between pipelines. Higher images/second is better. CPU and GPU columns are shown where published artifacts exist.

  • 56 / 57 pipelines where AlbumentationsX is faster
  • 3.6x average speedup (average computed from benchmark table rows)
  • Library versions: AlbumentationsX 2.2.6 vs Kornia 0.8.2 (from generated benchmark metadata)
| Transform | Speedup (Albx / best) | AlbumentationsX CPU 2.2.6 (img/s) | Kornia CPU 0.8.2 (img/s) | Kornia GPU 0.8.2 (img/s) |
| --- | --- | --- | --- | --- |
| MedianBlur | 12x | 4038 ± 0 | 89 ± 0 | 327 ± 2 |
| Elastic | 9.0-9.2x | 2878 ± 0 | 103 ± 0 | 316 ± 2 |
| JpegCompression | 5.5x | 4267 ± 0 | 771 ± 0 | 590 ± 2 |
| ColorJiggle | 5.3x | 4255 ± 0 | 800 ± 0 | 529 ± 2 |
| PlasmaBrightness | 4.7-5.1x | 2727 ± 0 | 458 ± 0 | 562 ± 22 |
| CLAHE | 4.4x | 3505 ± 0 | 789 ± 0 | 165 ± 0 |
| ColorJitter | 4.4x | 4299 ± 0 | 987 ± 0 | 573 ± 3 |
| Blur | 4.2x | 5507 ± 0 | 1321 ± 0 | 651 ± 2 |
| Hue | 4.1x | 4723 ± 0 | 1144 ± 0 | 586 ± 6 |
| PlasmaContrast | 4.1x | 2291 ± 0 | 455 ± 0 | 557 ± 2 |
| Saturation | 4.1x | 4646 ± 0 | 1141 ± 0 | 587 ± 4 |
| Sharpen | 3.9x | 4978 ± 0 | 1267 ± 0 | 627 ± 5 |
| GaussianBlur | 3.9x | 5029 ± 0 | 1289 ± 0 | 650 ± 5 |
| MotionBlur | 3.9x | 4785 ± 0 | 1238 ± 0 | 657 ± 2 |
| Snow | 3.8x | 4135 ± 0 | 1102 ± 0 | 596 ± 3 |
| Erasing | 3.6x | 5415 ± 0 | 1522 ± 0 | 503 ± 2 |
| Solarize | 3.4x | 5407 ± 0 | 1609 ± 0 | 644 ± 2 |
| RandomGamma | 3.4x | 5450 ± 0 | 1624 ± 0 | 653 ± 5 |
| Equalize | 3.4x | 4511 ± 0 | 1344 ± 0 | 321 ± 0 |
| RandomRotate90 | 3.3x | 5087 ± 0 | 1520 ± 0 | 632 ± 8 |
| Posterize | 3.3x | 5431 ± 0 | 1660 ± 0 | 562 ± 3 |
| Rotate | 3.3x | 4865 ± 0 | 1488 ± 0 | 642 ± 5 |
| ChannelShuffle | 3.1x | 5467 ± 0 | 1739 ± 0 | 643 ± 2 |
| Perspective | 3.1x | 4132 ± 0 | 1325 ± 0 | 635 ± 1 |
| SaltAndPepper | 3.1x | 4508 ± 0 | 1462 ± 0 | 375 ± 11 |
| ChannelDropout | 3.0x | 5314 ± 0 | 1742 ± 0 | 660 ± 3 |
| RandomResizedCrop | 3.0x | 5056 ± 0 | 1659 ± 0 | 905 ± 7 |
| Contrast | 3.0x | 5435 ± 0 | 1829 ± 0 | 649 ± 5 |
| Invert | 2.9x | 5569 ± 0 | 1898 ± 0 | 659 ± 3 |
| PlasmaShadow | 2.9x | 2797 ± 0 | 956 ± 0 | 612 ± 22 |
| RandomJigsaw | 2.9x | 4863 ± 0 | 1664 ± 0 | 636 ± 2 |
| VerticalFlip | 2.9x | 5512 ± 0 | 1887 ± 0 | 669 ± 2 |
| Grayscale | 2.9x | 5378 ± 0 | 1845 ± 0 | 655 ± 2 |
| Brightness | 2.9x | 5310 ± 0 | 1832 ± 0 | 651 ± 5 |
| Affine | 2.9x | 4528 ± 0 | 1566 ± 0 | 646 ± 2 |
| PlankianJitter | 2.8x | 4979 ± 0 | 1763 ± 0 | 654 ± 8 |
| RGBShift | 2.8x | 4977 ± 0 | 1770 ± 0 | 658 ± 3 |
| RandomCrop224 | 2.8x | 5130 ± 0 | 1856 ± 0 | 959 ± 9 |
| HorizontalFlip | 2.8x | 5002 ± 0 | 1819 ± 0 | 670 ± 2 |
| Rain | 2.7x | 4198 ± 0 | 1541 ± 0 | 306 ± 2 |
| AutoContrast | 2.7x | 4646 ± 0 | 1721 ± 0 | 651 ± 4 |
| Shear | 2.7x | 4141 ± 0 | 1557 ± 0 | - |
| CornerIllumination | 2.6x | 3969 ± 0 | 1502 ± 0 | 395 ± 2 |
| GaussianIllumination | 2.5x | 3656 ± 0 | 1458 ± 0 | - |
| Resize | 2.5x | 1350 ± 0 | 538 ± 0 | 181 ± 0 |
| OpticalDistortion | 2.5x | 3579 ± 0 | 1445 ± 0 | 641 ± 6 |
| SmallestMaxSize | 2.5x | 1370 ± 0 | 555 ± 0 | 179 ± 0 |
| LongestMaxSize | 2.4x | 1363 ± 0 | 560 ± 0 | 179 ± 0 |
| LinearIllumination | 2.4x | 4179 ± 0 | 1726 ± 0 | 503 ± 2 |
| GaussianNoise | 2.1x | 3415 ± 0 | 1634 ± 0 | 660 ± 7 |
| ThinPlateSpline | 0.9x | 743 ± 0 | 793 ± 0 | 598 ± 2 |
| EnhanceDetail | - | 5033 ± 0 | - | - |
| EnhanceEdge | - | 4923 ± 0 | - | - |
| Pad | - | 4996 ± 0 | - | - |
| PhotoMetricDistort | - | 4269 ± 0 | - | - |
| Transpose | - | 5338 ± 0 | - | - |
| UnsharpMask | - | 4635 ± 0 | - | - |
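
The "Albx / best" speedup column can be reproduced from the throughput columns. The sketch below is one reading of that header, under the assumption that "best" means the higher of the Kornia CPU and Kornia GPU results and that a missing cell is skipped. The four rows are copied from the table above; the `speedup` helper and the `ROWS` layout are illustrative names, not part of the benchmark code.

```python
# Throughput in images/second (means only), copied from the DataLoader table.
# Tuple order: (AlbumentationsX CPU, Kornia CPU, Kornia GPU); None = no published artifact.
ROWS = {
    "MedianBlur": (4038, 89, 327),
    "Elastic": (2878, 103, 316),
    "Shear": (4141, 1557, None),
    "ThinPlateSpline": (743, 793, 598),
}


def speedup(albx, kornia_cpu, kornia_gpu):
    """AlbumentationsX throughput divided by the best available Kornia throughput."""
    candidates = [v for v in (kornia_cpu, kornia_gpu) if v is not None]
    if not candidates:
        return None  # nothing to compare against
    return albx / max(candidates)


speedups = {name: speedup(*values) for name, values in ROWS.items()}

# A pipeline counts as "faster" when the ratio exceeds 1.
faster = [name for name, s in speedups.items() if s is not None and s > 1.0]

for name, s in speedups.items():
    print(f"{name:16s} {s:.1f}x")
print(f"{len(faster)} / {len(speedups)} sampled pipelines where AlbumentationsX is faster")
```

Note that the best Kornia result is not always the GPU one: for MedianBlur the GPU column is higher, while for Shear only the CPU column is published.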

Micro Benchmark

The benchmark below compares single-threaded RGB image augmentation throughput. Kornia receives normalized tensors; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Higher images/second is better.
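
To make the "images/second" and "±" figures concrete, here is a minimal sketch of how single-threaded throughput could be measured. It is not the published benchmark harness: the `images_per_second` function, the number of runs, and the nested-list stand-in for an image are all assumptions chosen so the example runs with the standard library alone.

```python
import statistics
import time


def images_per_second(transform, images, runs=5):
    """Return (mean, stdev) throughput over several timed passes on one thread."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        for image in images:
            transform(image)
        elapsed = time.perf_counter() - start
        rates.append(len(images) / elapsed)
    return statistics.mean(rates), statistics.stdev(rates)


def hflip(image):
    """Stand-in transform: horizontal flip of a nested-list 'image'."""
    return [row[::-1] for row in image]


# 200 synthetic 64x64 single-channel images (illustrative workload only).
images = [[[0] * 64 for _ in range(64)] for _ in range(200)]

mean, std = images_per_second(hflip, images)
print(f"{mean:.0f} ± {std:.0f} images/second")
```

In a real comparison the transform would be an AlbumentationsX or Kornia call, and each library would receive the input type it expects, as described above.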

  • 51 / 51 transforms where AlbumentationsX is faster
  • 9.10x median speedup (IQR 4.61x-17.06x)
  • Library versions: AlbumentationsX 2.2.6 vs Kornia 0.8.2 (from generated benchmark metadata)
| Transform | Speedup (Albx / best) | AlbumentationsX 2.2.6 (img/s) | Kornia 0.8.2 (img/s) |
| --- | --- | --- | --- |
| Blur | 78-79x | 4449 ± 17 | 57 ± 0 |
| Posterize | 48-51x | 14399 ± 58 | 290 ± 9 |
| Solarize | 45-46x | 9760 ± 34 | 214 ± 1 |
| MedianBlur | ≥42x | 843 ± 4 | ≤20 |
| GaussianBlur | 41x | 2343 ± 4 | 57 ± 0 |
| RandomCrop224 | 39-40x | 38380 ± 192 | 981 ± 5 |
| RandomGamma | 32-33x | 9938 ± 46 | 308 ± 2 |
| Erasing | 32x | 9511 ± 74 | 298 ± 1 |
| MotionBlur | 24-25x | 1953 ± 21 | 81 ± 1 |
| Sharpen | 24x | 1388 ± 5 | 58 ± 0 |
| RandomJigsaw | 23-24x | 5172 ± 16 | 219 ± 2 |
| ColorJiggle | 19x | 639 ± 5 | 34 ± 0 |
| RandomRotate90 | 18x | 5990 ± 85 | 333 ± 4 |
| JpegCompression | 16x | 692 ± 7 | 43 ± 0 |
| Invert | 15x | 15095 ± 61 | 1015 ± 2 |
| Hue | 15x | 967 ± 1 | 66 ± 0 |
| PlasmaBrightness | ≥13x | 267 ± 1 | ≤20 |
| VerticalFlip | 13x | 14051 ± 55 | 1067 ± 2 |
| Saturation | 12-13x | 847 ± 17 | 67 ± 0 |
| Grayscale | 12x | 5194 ± 1 | 418 ± 1 |
| ColorJitter | 12-13x | 641 ± 1 | 52 ± 1 |
| RandomResizedCrop | 11-12x | 7150 ± 19 | 622 ± 3 |
| Elastic | ≥9.5x | 191 ± 0 | ≤20 |
| SmallestMaxSize | 9.3-9.6x | 2017 ± 25 | 214 ± 1 |
| HorizontalFlip | 9.0-9.3x | 8416 ± 19 | 920 ± 10 |
| Resize | 8.9-9.3x | 2463 ± 37 | 271 ± 1 |
| Brightness | 8.9-9.1x | 6912 ± 13 | 766 ± 7 |
| Contrast | 8.8-9.1x | 6933 ± 30 | 771 ± 9 |
| ChannelShuffle | 8.9-9.0x | 4337 ± 13 | 487 ± 2 |
| LongestMaxSize | 8.4-8.7x | 2825 ± 42 | 330 ± 1 |
| ChannelDropout | 8.1-8.4x | 6810 ± 65 | 828 ± 6 |
| PlasmaShadow | 7.9-8.0x | 420 ± 3 | 53 ± 0 |
| Snow | 7.8-7.9x | 489 ± 3 | 62 ± 0 |
| PlasmaContrast | ≥7.1x | 143 ± 0 | ≤20 |
| Equalize | 6.3x | 807 ± 3 | 128 ± 0 |
| AutoContrast | 5.3-5.5x | 1243 ± 19 | 231 ± 1 |
| SaltAndPepper | 4.7-4.9x | 738 ± 10 | 154 ± 1 |
| GaussianNoise | 4.6x | 225 ± 0 | 49 ± 0 |
| CLAHE | 4.6x | 283 ± 1 | 62 ± 0 |
| Rotate | 4.2-4.5x | 1408 ± 40 | 325 ± 2 |
| PlankianJitter | 3.8-3.9x | 2253 ± 17 | 580 ± 2 |
| RGBShift | 3.8-3.9x | 2292 ± 3 | 597 ± 3 |
| Perspective | 3.1x | 559 ± 2 | 181 ± 1 |
| CornerIllumination | 2.7x | 425 ± 2 | 157 ± 0 |
| Rain | 2.4x | 1259 ± 2 | 527 ± 5 |
| Affine | 2.1-2.2x | 872 ± 8 | 402 ± 3 |
| GaussianIllumination | 2.1x | 388 ± 1 | 188 ± 0 |
| Shear | 1.9-2.0x | 784 ± 6 | 403 ± 1 |
| LinearIllumination | 1.6x | 521 ± 1 | 327 ± 3 |
| ThinPlateSpline | 1.4x | 52 ± 0 | 36 ± 0 |
| OpticalDistortion | 1.4x | 274 ± 1 | 201 ± 1 |
| EnhanceDetail | - | 2148 ± 13 | - |
| EnhanceEdge | - | 1373 ± 16 | - |
| Pad | - | 13181 ± 118 | - |
| PhotoMetricDistort | - | 581 ± 4 | - |
| Transpose | - | 4627 ± 26 | - |
| UnsharpMask | - | 906 ± 2 | - |
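
Some Kornia cells are reported only as an upper bound ("≤20"), which is why the matching speedups appear as lower bounds ("≥42x"). The sketch below shows that arithmetic on a few rows copied from the table, then computes a median and interquartile range for the same sample. The five-row sample will not reproduce the page's 9.10x median. The quartile method (Python's default "exclusive" method) and the choice to include lower-bound rows are assumptions, since the page does not state how its summary was computed.

```python
import statistics

# (AlbumentationsX img/s, Kornia img/s, Kornia value is an upper bound?)
ROWS = {
    "MedianBlur": (843, 20, True),    # Kornia reported as "≤20"
    "GaussianBlur": (2343, 57, False),
    "Elastic": (191, 20, True),       # Kornia reported as "≤20"
    "Equalize": (807, 128, False),
    "Rain": (1259, 527, False),
}


def speedup_label(albx, kornia, kornia_is_upper_bound):
    """Format albx / kornia; an upper bound on Kornia gives a lower bound on the ratio."""
    prefix = "≥" if kornia_is_upper_bound else ""
    return f"{prefix}{albx / kornia:.2f}x"


labels = {name: speedup_label(*row) for name, row in ROWS.items()}

# Summary statistics over the sampled ratios (lower bounds treated as point values).
ratios = sorted(albx / kornia for albx, kornia, _ in ROWS.values())
median = statistics.median(ratios)
q1, _, q3 = statistics.quantiles(ratios, n=4)

for name, label in labels.items():
    print(f"{name:13s} {label}")
print(f"median {median:.2f}x, IQR {q1:.2f}x-{q3:.2f}x (sample only)")
```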

See the aggregate image benchmark or inspect the benchmark source code.

Conversion Guide

The conversion usually moves augmentation from tensor batches in the training step into the Dataset/DataLoader preprocessing path.

  • Apply AlbumentationsX before converting the sample to a tensor.
  • Use ToTensorV2 at the end of the pipeline if the model expects PyTorch tensors.
  • Move per-batch GPU-only transforms to AlbumentationsX only when they do not require differentiability or batched tensor semantics.
  • Keep Kornia for model-integrated or differentiable computer vision operations.
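Kornia
import kornia.augmentation as K
import torch
import torch.nn as nn

# Kornia augmentations are nn.Modules, so they compose like model layers.
augment = nn.Sequential(
    K.RandomHorizontalFlip(p=0.5),
    K.ColorJitter(brightness=0.2, contrast=0.2, p=0.5),
)

# Illustrative input: a float batch of shape (B, C, H, W) with values in [0, 1].
batch = torch.rand(8, 3, 224, 224)
# The whole batch is augmented at once, on whatever device the tensor lives on.
batch = augment(batch)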
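AlbumentationsX
import albumentations as A
import numpy as np
from albumentations.pytorch import ToTensorV2

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
    ToTensorV2(),  # HWC NumPy array -> CHW PyTorch tensor
])

# Illustrative input: one H x W x 3 uint8 RGB image as a NumPy array.
image_np = np.zeros((224, 224, 3), dtype=np.uint8)
# Transforms are called with keyword arguments and return a dict.
sample = transform(image=image_np)
image = sample["image"]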
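
The structural change in the conversion is where the call happens: once per sample inside the dataset, rather than once per batch in the training step. The sketch below shows that placement with a plain map-style dataset. `AugmentedDataset` and `stand_in_transform` are hypothetical names introduced for illustration, and the stand-in only mimics the `transform(image=...)["image"]` call convention so the example runs without extra dependencies. In a real pipeline you would typically subclass `torch.utils.data.Dataset` and pass the `A.Compose([...])` object as the transform.

```python
import random


class AugmentedDataset:
    """Map-style dataset that augments one sample at a time, before batching."""

    def __init__(self, images, labels, transform):
        self.images = images
        self.labels = labels
        # Any callable that accepts image=... and returns a dict with an "image" key.
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # Augmentation runs here, per sample, in the data-loading path.
        augmented = self.transform(image=self.images[index])
        return augmented["image"], self.labels[index]


def stand_in_transform(*, image):
    """Hypothetical stand-in for an augmentation pipeline: random horizontal flip."""
    if random.random() < 0.5:
        image = [row[::-1] for row in image]
    return {"image": image}


dataset = AugmentedDataset(
    images=[[[1, 2, 3], [4, 5, 6]]],  # one tiny 2x3 "image" as nested lists
    labels=[0],
    transform=stand_in_transform,
)
image, label = dataset[0]
print(image, label)
```

Because the transform is injected, the training loop no longer needs an augmentation step; batching and collation happen after `__getitem__` returns.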

Use AlbumentationsX When

  • You want CPU-side data augmentation before batching.
  • You run classic supervised CV pipelines with masks, bounding boxes, keypoints, or multiple aligned images.
  • Augmentation speed in the input pipeline matters more than differentiability.

Use Kornia When

  • You need differentiable image processing inside PyTorch models.
  • You want batched GPU augmentation, especially when the input pipeline is already tensor-native.
  • Your research code needs gradients through geometric or photometric image operations.