AlbumentationsX vs Kornia
Compare AlbumentationsX with Kornia for image augmentation: CPU/GPU tradeoffs, RGB benchmark results, and conversion examples.
What Is Different?
Kornia is a differentiable computer vision library that operates on PyTorch tensors. AlbumentationsX is a fast CPU augmentation library that operates on NumPy arrays, before the data reaches the model.
- Kornia is tensor-first and shines on batched GPU augmentation; AlbumentationsX is NumPy-first and shines in CPU data-loading pipelines.
- Kornia transforms can be differentiable and participate in model graphs; AlbumentationsX transforms are preprocessing/data augmentation steps.
- AlbumentationsX has broad task-level target handling for images, masks, boxes, keypoints, volumes, and multiple related inputs.
- Kornia is a better fit when augmentation must happen on-device after batching; AlbumentationsX is usually simpler before batching.
Benchmark vs Kornia
The benchmark below compares single-threaded throughput for RGB image augmentation on CPU. Kornia receives normalized tensors; AlbumentationsX receives OpenCV-loaded RGB NumPy arrays. Values are images per second, so higher is better.
- 52 / 53 transforms where AlbumentationsX is faster
- 4.99x median speedup (IQR 2.74x-10.13x)
- Library versions: AlbumentationsX 2.2.5 vs Kornia 0.8.2 (from generated benchmark metadata)
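The summary figures above can be approximately recovered from the rounded mean throughputs in the table below. The following sketch is not the benchmark's own code: it assumes the speedup is the ratio of the two means and that the quartiles use inclusive linear interpolation. The `PAIRS` list is copied by hand from the 53 table rows that have values for both libraries.

```python
from statistics import quantiles

# (AlbumentationsX img/s, Kornia img/s) mean throughputs, in table order.
PAIRS = [
    (1546, 7), (453, 4), (14482, 252), (93574, 3223), (28724, 1080),
    (27849, 1228), (13505, 695), (8652, 464), (95346, 5226), (9413, 638),
    (3847, 322), (19593, 1679), (1208, 107), (7544, 745), (13200, 1352),
    (1908, 204), (29169, 3409), (31753, 3718), (1221, 169), (2996, 442),
    (1351, 202), (4354, 653), (1389, 216), (1185, 214), (3542, 677),
    (2221, 434), (2676, 537), (8235, 1729), (3847, 855), (11971, 2878),
    (754, 188), (394, 115), (2462, 717), (10045, 2983), (9849, 2992),
    (1456, 456), (644, 206), (5025, 1710), (1086, 390), (1322, 482),
    (328, 133), (1619, 739), (250, 117), (526, 281), (946, 510),
    (395, 243), (866, 596), (1642, 1226), (2169, 1725), (92, 78),
    (773, 680), (3278, 2996), (557, 1195),
]

# Speedup per transform: AlbumentationsX throughput divided by Kornia throughput.
speedups = [albx / kornia for albx, kornia in PAIRS]

# Quartiles with inclusive linear interpolation (an assumption about the method).
q1, median, q3 = quantiles(speedups, n=4, method="inclusive")
faster = sum(s > 1 for s in speedups)

print(f"{faster} / {len(speedups)} transforms faster")
print(f"median {median:.2f}x, IQR {q1:.2f}x-{q3:.2f}x")
```

With the rounded table values this gives 52 / 53, an IQR of 2.74x-10.13x, and a median of about 4.98x. The small gap to the reported 4.99x is likely because the published figure was computed from unrounded measurements.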
| Transform | AlbumentationsX 2.2.5 (CPU, macOS arm64), img/s | Kornia 0.8.2 (CPU, macOS arm64), img/s | Speedup, AlbumentationsX / Kornia (range) |
|---|---|---|---|
| MedianBlur | 1546 ± 16 | 7 ± 0 | 231-240x |
| Elastic | 453 ± 2 | 4 ± 0 | 123-127x |
| RandomGamma | 14482 ± 424 | 252 ± 4 | 55-60x |
| RandomCrop128 | 93574 ± 1964 | 3223 ± 17 | 28-30x |
| Posterize | 28724 ± 3259 | 1080 ± 67 | 22-32x |
| Erasing | 27849 ± 4028 | 1228 ± 8 | 19-26x |
| Solarize | 13505 ± 442 | 695 ± 18 | 18-21x |
| RandomRotate90 | 8652 ± 167 | 464 ± 2 | 18-19x |
| CenterCrop128 | 95346 ± 1281 | 5226 ± 57 | 18-19x |
| RandomJigsaw | 9413 ± 136 | 638 ± 2 | 14-15x |
| MotionBlur | 3847 ± 49 | 322 ± 6 | 12-12x |
| Grayscale | 19593 ± 350 | 1679 ± 16 | 11-12x |
| ColorJiggle | 1208 ± 16 | 107 ± 0 | 11-11x |
| Blur | 7544 ± 134 | 745 ± 2 | 9.9-10x |
| HorizontalFlip | 13200 ± 430 | 1352 ± 8 | 9.4-10x |
| Hue | 1908 ± 18 | 204 ± 1 | 9.2-9.5x |
| VerticalFlip | 29169 ± 2657 | 3409 ± 3 | 7.8-9.3x |
| Invert | 31753 ± 1327 | 3718 ± 23 | 8.1-9.0x |
| ColorJitter | 1221 ± 10 | 169 ± 1 | 7.1-7.3x |
| Rotate | 2996 ± 12 | 442 ± 3 | 6.7-6.8x |
| JpegCompression | 1351 ± 11 | 202 ± 2 | 6.6-6.8x |
| RandomResizedCrop | 4354 ± 22 | 653 ± 6 | 6.6-6.8x |
| Saturation | 1389 ± 27 | 216 ± 1 | 6.3-6.6x |
| Perspective | 1185 ± 9 | 214 ± 2 | 5.4-5.6x |
| Resize | 3542 ± 11 | 677 ± 7 | 5.2-5.3x |
| Sharpen | 2221 ± 35 | 434 ± 6 | 5.0-5.3x |
| SmallestMaxSize | 2676 ± 7 | 537 ± 1 | 5.0-5.0x |
| ChannelShuffle | 8235 ± 86 | 1729 ± 3 | 4.7-4.8x |
| LongestMaxSize | 3847 ± 62 | 855 ± 7 | 4.4-4.6x |
| ChannelDropout | 11971 ± 434 | 2878 ± 11 | 4.0-4.3x |
| Snow | 754 ± 4 | 188 ± 0 | 4.0-4.0x |
| PlasmaBrightness | 394 ± 9 | 115 ± 0 | 3.3-3.5x |
| GaussianBlur | 2462 ± 11 | 717 ± 11 | 3.4-3.5x |
| Contrast | 10045 ± 119 | 2983 ± 11 | 3.3-3.4x |
| Brightness | 9849 ± 99 | 2992 ± 8 | 3.2-3.3x |
| Affine | 1456 ± 23 | 456 ± 2 | 3.1-3.3x |
| CLAHE | 644 ± 5 | 206 ± 1 | 3.1-3.2x |
| RGBShift | 5025 ± 48 | 1710 ± 15 | 2.9-3.0x |
| Equalize | 1086 ± 12 | 390 ± 6 | 2.7-2.9x |
| Shear | 1322 ± 7 | 482 ± 4 | 2.7-2.8x |
| GaussianNoise | 328 ± 20 | 133 ± 1 | 2.3-2.6x |
| AutoContrast | 1619 ± 44 | 739 ± 14 | 2.1-2.3x |
| PlasmaContrast | 250 ± 6 | 117 ± 0 | 2.1-2.2x |
| PlasmaShadow | 526 ± 8 | 281 ± 1 | 1.8-1.9x |
| SaltAndPepper | 946 ± 4 | 510 ± 1 | 1.8-1.9x |
| OpticalDistortion | 395 ± 4 | 243 ± 1 | 1.6-1.6x |
| CornerIllumination | 866 ± 28 | 596 ± 3 | 1.4-1.5x |
| Normalize | 1642 ± 26 | 1226 ± 13 | 1.3-1.4x |
| Rain | 2169 ± 27 | 1725 ± 3 | 1.2-1.3x |
| ThinPlateSpline | 92 ± 1 | 78 ± 0 | 1.2-1.2x |
| GaussianIllumination | 773 ± 21 | 680 ± 13 | 1.1-1.2x |
| PlankianJitter | 3278 ± 13 | 2996 ± 26 | 1.1-1.1x |
| LinearIllumination | 557 ± 18 | 1195 ± 7 | 0.4-0.5x |
| Colorize | 3858 ± 11 | — | — |
| Dithering | 6 ± 0 | — | — |
| Pad | 34979 ± 3274 | — | — |
| PhotoMetricDistort | 1070 ± 19 | — | — |
| Transpose | 8184 ± 199 | — | — |
| UnsharpMask | 3063 ± 37 | — | — |
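The page does not state how the range in the last column is derived. One reading that reproduces several rows (RandomGamma, Posterize, Erasing, Blur) is a worst-case/best-case ratio of the two `mean ± spread` values. The sketch below illustrates that reading; the function name `speedup_range` is hypothetical, and the formula is an inference from the published numbers rather than a documented method.

```python
def speedup_range(albx_mean, albx_err, other_mean, other_err):
    """Return (low, high) bounds for albx/other given a ± spread on each value."""
    if other_mean - other_err <= 0:
        raise ValueError("denominator lower bound must be positive")
    # Pessimistic case: slowest AlbumentationsX run vs fastest competing run.
    low = (albx_mean - albx_err) / (other_mean + other_err)
    # Optimistic case: fastest AlbumentationsX run vs slowest competing run.
    high = (albx_mean + albx_err) / (other_mean - other_err)
    return low, high


# RandomGamma row: 14482 ± 424 vs 252 ± 4
low, high = speedup_range(14482, 424, 252, 4)
print(f"{low:.0f}-{high:.0f}x")  # prints 55-60x
```

Rows where the Kornia throughput is very small (for example MedianBlur at 7 ± 0) cannot be reproduced this way from the table alone, because rounding to whole images per second discards too much precision.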
See the aggregate image benchmark or inspect the benchmark source code.
Conversion Guide
The conversion usually moves augmentation from tensor batches in the training step into the Dataset/DataLoader preprocessing path.
- Apply AlbumentationsX before converting the sample to a tensor.
- Use ToTensorV2 at the end of the pipeline if the model expects PyTorch tensors.
- Move per-batch GPU-only transforms to AlbumentationsX only when they do not require differentiability or batched tensor semantics.
- Keep Kornia for model-integrated or differentiable computer vision operations.
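The steps above can be sketched as a map-style dataset that augments each sample before batching. This is a minimal illustration, not a prescribed integration: the class name `AugmentedDataset` and the stand-in `fake_transform` are hypothetical, and the only assumption about the real pipeline is that it follows the `transform(image=...)["image"]` calling convention shown in the conversion example on this page.

```python
class AugmentedDataset:
    """Map-style dataset that augments each sample before it is batched.

    `transform` is any callable that accepts `image=` and returns a dict
    with an "image" key, such as an A.Compose pipeline.
    """

    def __init__(self, images, labels, transform=None):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        image = self.images[index]
        if self.transform is not None:
            # Per-sample augmentation happens here, before any batching.
            image = self.transform(image=image)["image"]
        return image, self.labels[index]


def fake_transform(image):
    """Stand-in for a real pipeline so the sketch runs without dependencies."""
    return {"image": image[::-1]}


dataset = AugmentedDataset(
    images=[[1, 2, 3], [4, 5, 6]], labels=[0, 1], transform=fake_transform
)
first_image, first_label = dataset[0]
```

In a real project the stand-in would be replaced by an AlbumentationsX pipeline ending in `ToTensorV2()`, and the dataset could be handed to `torch.utils.data.DataLoader`, which accepts map-style objects that implement `__getitem__` and `__len__`.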
Kornia
import kornia.augmentation as K
import torch.nn as nn
augment = nn.Sequential(
    K.RandomHorizontalFlip(p=0.5),
    K.ColorJitter(brightness=0.2, contrast=0.2, p=0.5),
)
batch = augment(batch)
AlbumentationsX
import albumentations as A
from albumentations.pytorch import ToTensorV2
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
    ToTensorV2(),
])
sample = transform(image=image_np)
image = sample["image"]
Use AlbumentationsX When
- You need CPU-side data augmentation before batching.
- You run classic supervised CV pipelines with masks, bounding boxes, keypoints, or multiple aligned images.
- Augmentation speed in the input pipeline matters more than differentiability.
Use Kornia When
- You need differentiable image processing inside PyTorch models.
- You want batched GPU augmentation, especially when the input pipeline is already tensor-native.
- Your research code needs gradients through geometric or photometric image operations.