AlbumentationsX vs Kornia

Compare AlbumentationsX with Kornia for image augmentation: CPU/GPU tradeoffs, RGB benchmark results, and conversion examples.

What Is Different?

Kornia is a differentiable computer vision library that operates on PyTorch tensors. AlbumentationsX is a fast CPU augmentation library that operates on NumPy arrays before the data reaches the model.

Kornia is tensor-first and shines on batched GPU augmentation; AlbumentationsX is NumPy-first and shines in CPU data-loading pipelines.
Kornia transforms can be differentiable and participate in model graphs; AlbumentationsX transforms are preprocessing/data augmentation steps.
AlbumentationsX has broad task-level target handling for images, masks, boxes, keypoints, volumes, and multiple related inputs.
Kornia is a better fit when augmentation must happen on-device after batching; AlbumentationsX is usually simpler before batching.

Benchmark vs Kornia

The benchmark data is generated during the site build from the published benchmark artifacts. Results are shown for the full PyTorch DataLoader pipeline and for the single-operation RGB micro benchmark. Published GPU artifacts are shown in the DataLoader table where they exist; the RGB micro benchmark table on this page is CPU-only.

DataLoader Pipeline

End-to-end in-memory DataLoader throughput with 8 workers and batch size 256; normalization, tensor conversion, batching, and collation are included in the measurement. Each pipeline differs only in one transform, and the table lists that transform's name. Higher images/second is better. CPU and GPU columns are shown where published artifacts exist.

56 / 57 pipelines where AlbumentationsX is faster
3.6x average speedup (computed from the benchmark table rows)
2.2.6 vs 0.8.2 library versions, AlbumentationsX vs Kornia (from generated benchmark metadata)

Transform	Speedup (AlbumentationsX / best Kornia)	AlbumentationsX CPU · 2.2.6 (images/s)	Kornia CPU · 0.8.2 (images/s)	Kornia GPU · 0.8.2 (images/s)
MedianBlur	12x	4038 ± 0	89 ± 0	327 ± 2
Elastic	9.0-9.2x	2878 ± 0	103 ± 0	316 ± 2
JpegCompression	5.5x	4267 ± 0	771 ± 0	590 ± 2
ColorJiggle	5.3x	4255 ± 0	800 ± 0	529 ± 2
PlasmaBrightness	4.7-5.1x	2727 ± 0	458 ± 0	562 ± 22
CLAHE	4.4x	3505 ± 0	789 ± 0	165 ± 0
ColorJitter	4.4x	4299 ± 0	987 ± 0	573 ± 3
Blur	4.2x	5507 ± 0	1321 ± 0	651 ± 2
Hue	4.1x	4723 ± 0	1144 ± 0	586 ± 6
PlasmaContrast	4.1x	2291 ± 0	455 ± 0	557 ± 2
Saturation	4.1x	4646 ± 0	1141 ± 0	587 ± 4
Sharpen	3.9x	4978 ± 0	1267 ± 0	627 ± 5
GaussianBlur	3.9x	5029 ± 0	1289 ± 0	650 ± 5
MotionBlur	3.9x	4785 ± 0	1238 ± 0	657 ± 2
Snow	3.8x	4135 ± 0	1102 ± 0	596 ± 3
Erasing	3.6x	5415 ± 0	1522 ± 0	503 ± 2
Solarize	3.4x	5407 ± 0	1609 ± 0	644 ± 2
RandomGamma	3.4x	5450 ± 0	1624 ± 0	653 ± 5
Equalize	3.4x	4511 ± 0	1344 ± 0	321 ± 0
RandomRotate90	3.3x	5087 ± 0	1520 ± 0	632 ± 8
Posterize	3.3x	5431 ± 0	1660 ± 0	562 ± 3
Rotate	3.3x	4865 ± 0	1488 ± 0	642 ± 5
ChannelShuffle	3.1x	5467 ± 0	1739 ± 0	643 ± 2
Perspective	3.1x	4132 ± 0	1325 ± 0	635 ± 1
SaltAndPepper	3.1x	4508 ± 0	1462 ± 0	375 ± 11
ChannelDropout	3.0x	5314 ± 0	1742 ± 0	660 ± 3
RandomResizedCrop	3.0x	5056 ± 0	1659 ± 0	905 ± 7
Contrast	3.0x	5435 ± 0	1829 ± 0	649 ± 5
Invert	2.9x	5569 ± 0	1898 ± 0	659 ± 3
PlasmaShadow	2.9x	2797 ± 0	956 ± 0	612 ± 22
RandomJigsaw	2.9x	4863 ± 0	1664 ± 0	636 ± 2
VerticalFlip	2.9x	5512 ± 0	1887 ± 0	669 ± 2
Grayscale	2.9x	5378 ± 0	1845 ± 0	655 ± 2
Brightness	2.9x	5310 ± 0	1832 ± 0	651 ± 5
Affine	2.9x	4528 ± 0	1566 ± 0	646 ± 2
PlankianJitter	2.8x	4979 ± 0	1763 ± 0	654 ± 8
RGBShift	2.8x	4977 ± 0	1770 ± 0	658 ± 3
RandomCrop224	2.8x	5130 ± 0	1856 ± 0	959 ± 9
HorizontalFlip	2.8x	5002 ± 0	1819 ± 0	670 ± 2
Rain	2.7x	4198 ± 0	1541 ± 0	306 ± 2
AutoContrast	2.7x	4646 ± 0	1721 ± 0	651 ± 4
Shear	2.7x	4141 ± 0	1557 ± 0	—
CornerIllumination	2.6x	3969 ± 0	1502 ± 0	395 ± 2
GaussianIllumination	2.5x	3656 ± 0	1458 ± 0	—
Resize	2.5x	1350 ± 0	538 ± 0	181 ± 0
OpticalDistortion	2.5x	3579 ± 0	1445 ± 0	641 ± 6
SmallestMaxSize	2.5x	1370 ± 0	555 ± 0	179 ± 0
LongestMaxSize	2.4x	1363 ± 0	560 ± 0	179 ± 0
LinearIllumination	2.4x	4179 ± 0	1726 ± 0	503 ± 2
GaussianNoise	2.1x	3415 ± 0	1634 ± 0	660 ± 7
ThinPlateSpline	0.9x	743 ± 0	793 ± 0	598 ± 2
EnhanceDetail	—	5033 ± 0	—	—
EnhanceEdge	—	4923 ± 0	—	—
Pad	—	4996 ± 0	—	—
PhotoMetricDistort	—	4269 ± 0	—	—
Transpose	—	5338 ± 0	—	—
UnsharpMask	—	4635 ± 0	—	—
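
Checking the rows by hand suggests that the speedup column divides the AlbumentationsX throughput by the faster of the two Kornia results (CPU or GPU), and leaves the cell empty when no Kornia artifact exists. The sketch below reproduces that arithmetic for a few rows copied from the table. The function name `speedup_vs_best` is hypothetical, and the rule itself is inferred from the published numbers rather than taken from the benchmark source.

```python
from typing import Optional


def speedup_vs_best(albx: float,
                    kornia_cpu: Optional[float],
                    kornia_gpu: Optional[float]) -> Optional[float]:
    """Return AlbumentationsX throughput divided by the fastest available Kornia result."""
    # Missing artifacts (shown as a dash in the table) are passed as None.
    candidates = [v for v in (kornia_cpu, kornia_gpu) if v is not None]
    if not candidates:
        return None  # no Kornia counterpart, so no speedup can be reported
    return albx / max(candidates)


# Rows copied from the DataLoader table:
# (transform, AlbumentationsX CPU, Kornia CPU, Kornia GPU), all in images/second.
rows = [
    ("MedianBlur", 4038, 89, 327),
    ("JpegCompression", 4267, 771, 590),
    ("Shear", 4141, 1557, None),
    ("ThinPlateSpline", 743, 793, 598),
    ("Pad", 4996, None, None),
]

speedups = {name: speedup_vs_best(a, c, g) for name, a, c, g in rows}
for name, value in speedups.items():
    print(name, "n/a" if value is None else f"{value:.1f}x")
```

Note that the reference can switch between devices from row to row: MedianBlur is compared against Kornia on GPU, while JpegCompression is compared against Kornia on CPU, because that is where Kornia was faster in each case.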

Micro Benchmark

The benchmark below compares single-threaded RGB image augmentation throughput. Kornia receives normalized tensors; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Higher images/second is better.
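
As a rough illustration of what a single-threaded images/second figure measures, the sketch below times a transform over a fixed set of inputs several times and reports the mean and standard deviation of the resulting rates. It is not the benchmark's actual harness. The helper `images_per_second`, the stand-in `flip_rows` transform, and the reading of "±" as a standard deviation across runs are all assumptions made for illustration.

```python
import statistics
import time
from typing import Callable, Sequence, Tuple


def images_per_second(transform: Callable, images: Sequence, runs: int = 5) -> Tuple[float, float]:
    """Time `transform` over `images` several times; return (mean, std) of the throughput."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        for image in images:
            transform(image)
        elapsed = time.perf_counter() - start
        rates.append(len(images) / elapsed)
    return statistics.mean(rates), statistics.stdev(rates)


# Stand-in workload: each "image" is a 32x32 nested list, so the sketch needs no extra packages.
fake_images = [[[x + y for x in range(32)] for y in range(32)] for _ in range(200)]


def flip_rows(image):
    """Reverse every row, i.e. a horizontal flip on a nested-list image."""
    return [row[::-1] for row in image]


mean_rate, std_rate = images_per_second(flip_rows, fake_images)
print(f"{mean_rate:.0f} ± {std_rate:.0f} images/second")
```

A real comparison would pass an AlbumentationsX or Kornia transform and real image data in place of the stand-ins, and would usually add warm-up iterations before timing.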

51 / 51 transforms where AlbumentationsX is faster
9.10x median speedup (IQR 4.61x-17.06x)
2.2.6 vs 0.8.2 library versions, AlbumentationsX vs Kornia (from generated benchmark metadata)

Transform	Speedup (AlbumentationsX / Kornia)	AlbumentationsX 2.2.6 (images/s)	Kornia 0.8.2 (images/s)
Blur	78-79x	4449 ± 17	57 ± 0
Posterize	48-51x	14399 ± 58	290 ± 9
Solarize	45-46x	9760 ± 34	214 ± 1
MedianBlur	≥42x	843 ± 4	≤20
GaussianBlur	41x	2343 ± 4	57 ± 0
RandomCrop224	39-40x	38380 ± 192	981 ± 5
RandomGamma	32-33x	9938 ± 46	308 ± 2
Erasing	32x	9511 ± 74	298 ± 1
MotionBlur	24-25x	1953 ± 21	81 ± 1
Sharpen	24x	1388 ± 5	58 ± 0
RandomJigsaw	23-24x	5172 ± 16	219 ± 2
ColorJiggle	19x	639 ± 5	34 ± 0
RandomRotate90	18x	5990 ± 85	333 ± 4
JpegCompression	16x	692 ± 7	43 ± 0
Invert	15x	15095 ± 61	1015 ± 2
Hue	15x	967 ± 1	66 ± 0
PlasmaBrightness	≥13x	267 ± 1	≤20
VerticalFlip	13x	14051 ± 55	1067 ± 2
Saturation	12-13x	847 ± 17	67 ± 0
Grayscale	12x	5194 ± 1	418 ± 1
ColorJitter	12-13x	641 ± 1	52 ± 1
RandomResizedCrop	11-12x	7150 ± 19	622 ± 3
Elastic	≥9.5x	191 ± 0	≤20
SmallestMaxSize	9.3-9.6x	2017 ± 25	214 ± 1
HorizontalFlip	9.0-9.3x	8416 ± 19	920 ± 10
Resize	8.9-9.3x	2463 ± 37	271 ± 1
Brightness	8.9-9.1x	6912 ± 13	766 ± 7
Contrast	8.8-9.1x	6933 ± 30	771 ± 9
ChannelShuffle	8.9-9.0x	4337 ± 13	487 ± 2
LongestMaxSize	8.4-8.7x	2825 ± 42	330 ± 1
ChannelDropout	8.1-8.4x	6810 ± 65	828 ± 6
PlasmaShadow	7.9-8.0x	420 ± 3	53 ± 0
Snow	7.8-7.9x	489 ± 3	62 ± 0
PlasmaContrast	≥7.1x	143 ± 0	≤20
Equalize	6.3x	807 ± 3	128 ± 0
AutoContrast	5.3-5.5x	1243 ± 19	231 ± 1
SaltAndPepper	4.7-4.9x	738 ± 10	154 ± 1
GaussianNoise	4.6x	225 ± 0	49 ± 0
CLAHE	4.6x	283 ± 1	62 ± 0
Rotate	4.2-4.5x	1408 ± 40	325 ± 2
PlankianJitter	3.8-3.9x	2253 ± 17	580 ± 2
RGBShift	3.8-3.9x	2292 ± 3	597 ± 3
Perspective	3.1x	559 ± 2	181 ± 1
CornerIllumination	2.7x	425 ± 2	157 ± 0
Rain	2.4x	1259 ± 2	527 ± 5
Affine	2.1-2.2x	872 ± 8	402 ± 3
GaussianIllumination	2.1x	388 ± 1	188 ± 0
Shear	1.9-2.0x	784 ± 6	403 ± 1
LinearIllumination	1.6x	521 ± 1	327 ± 3
ThinPlateSpline	1.4x	52 ± 0	36 ± 0
OpticalDistortion	1.4x	274 ± 1	201 ± 1
EnhanceDetail	—	2148 ± 13	—
EnhanceEdge	—	1373 ± 16	—
Pad	—	13181 ± 118	—
PhotoMetricDistort	—	581 ± 4	—
Transpose	—	4627 ± 26	—
UnsharpMask	—	906 ± 2	—
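
The median and IQR in the micro benchmark summary are statistics over per-transform speedups. The sketch below shows the computation on five rows copied from the table, so its output describes that subset only and will not match the published IQR. The quantile method used here (the default of Python's `statistics.quantiles`) is an assumption; the benchmark may compute quartiles differently.

```python
import statistics

# (transform, AlbumentationsX images/s, Kornia images/s), copied from a few table rows.
subset = [
    ("Blur", 4449, 57),
    ("Invert", 15095, 1015),
    ("Resize", 2463, 271),
    ("Rotate", 1408, 325),
    ("Shear", 784, 403),
]

# Per-transform speedup is AlbumentationsX throughput divided by Kornia throughput.
speedups = sorted(albx / kornia for _, albx, kornia in subset)

median = statistics.median(speedups)
# quantiles(n=4) returns the three quartile cut points; the IQR spans the first to the third.
q1, _, q3 = statistics.quantiles(speedups, n=4)

print(f"median {median:.2f}x, IQR {q1:.2f}x-{q3:.2f}x")
```

The median is less sensitive than the mean to extreme rows such as Blur, which is one reason to summarize a table with this much spread by median and IQR.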

See the aggregate image benchmark or inspect the benchmark source code.

Conversion Guide

Converting from Kornia usually means moving augmentation out of the training step, where it runs on tensor batches, and into the Dataset/DataLoader preprocessing path, where it runs on individual samples.

Apply AlbumentationsX before converting the sample to a tensor.
Use ToTensorV2 at the end of the pipeline if the model expects PyTorch tensors.
Move per-batch GPU-only transforms to AlbumentationsX only when they do not require differentiability or batched tensor semantics.
Keep Kornia for model-integrated or differentiable computer vision operations.

Kornia

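import kornia.augmentation as K
import torch.nn as nn

# Kornia augmentations are nn.Module instances, so they compose with nn.Sequential.
augment = nn.Sequential(
    K.RandomHorizontalFlip(p=0.5),
    K.ColorJitter(brightness=0.2, contrast=0.2, p=0.5),
)

# batch: float image tensor of shape (B, C, H, W), typically already on the GPU.
batch = augment(batch)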

AlbumentationsX

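import albumentations as A
from albumentations.pytorch import ToTensorV2

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
    ToTensorV2(),  # HWC NumPy array -> CHW torch.Tensor; pixel values are not rescaled
])

# image_np: a single RGB image as an (H, W, C) NumPy array, e.g. uint8.
sample = transform(image=image_np)
image = sample["image"]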
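
To show where such a transform typically sits after conversion, here is a minimal sketch of a map-style dataset that applies the augmentation per sample in `__getitem__`. The class name `AugmentedDataset` and the `identity_transform` stand-in are hypothetical; the only assumption about the real transform is the calling convention used above, `transform(image=...)` returning a dict with an `"image"` key. The sketch deliberately avoids importing PyTorch or AlbumentationsX so that it runs on its own.

```python
from typing import Any, Callable, Dict, Sequence, Tuple


class AugmentedDataset:
    """Map-style dataset that applies a per-sample transform in __getitem__."""

    def __init__(self, images: Sequence[Any], labels: Sequence[int],
                 transform: Callable[..., Dict[str, Any]]) -> None:
        self.images = images
        self.labels = labels
        # Assumed to follow the A.Compose calling convention:
        # transform(image=...) returns a dict with an "image" key.
        self.transform = transform

    def __len__(self) -> int:
        return len(self.images)

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        # Augmentation runs here, one sample at a time, before batching.
        augmented = self.transform(image=self.images[index])
        return augmented["image"], self.labels[index]


def identity_transform(image: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for A.Compose so the sketch runs without extra packages."""
    return {"image": image}


dataset = AugmentedDataset(images=["img0", "img1"], labels=[0, 1],
                           transform=identity_transform)
print(len(dataset), dataset[1])
```

In a real pipeline the `A.Compose` object would replace `identity_transform`, the images would be NumPy arrays, and the dataset would be passed to `torch.utils.data.DataLoader`, whose worker processes then run the augmentation in parallel on the CPU before collation.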

Use AlbumentationsX When

You need CPU-side data augmentation before batching.
You run classic supervised CV pipelines with masks, bounding boxes, keypoints, or multiple aligned images.
Augmentation speed in the input pipeline matters more to you than differentiability.

Use Kornia When

You need differentiable image processing inside PyTorch models.
You want batched GPU augmentation, especially when the input pipeline is already tensor-native.
Your research code needs gradients through geometric or photometric image operations.