Albumentations in Geospatial: Who Actually Uses It

Vladimir Iglovikov
Maintainer
8 min read
Tags: geospatial, remote-sensing, satellite-imagery, earth-observation, multispectral, adoption

Albumentations is infrastructure in the satellite / remote-sensing ecosystem, not a research curio. This post is the receipts: which named organizations import it, which OSS geo libraries declare it as a direct dependency, how many papers cite it, and how that adoption has grown year over year.

All numbers below are reproducible from public APIs: OpenAlex (citations), GitHub Code Search (org-scoped import queries), the Hugging Face Hub (tagged model cards), and root-level packaging files (requirements.txt, pyproject.toml, etc.) in each OSS repo. The headline org-scoped grep is org:<name> "import albumentations".
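The org-scoped step can be sketched with the GitHub REST `/search/code` endpoint. This is a minimal sketch, assuming a `GITHUB_TOKEN` environment variable for the authenticated call; the org list here is abridged for illustration:

```python
# Sketch: build and (optionally) run the org-scoped GitHub code search
# query from the post, org:<name> "import albumentations".
import json
import urllib.parse
import urllib.request

def code_search_url(org: str, phrase: str = "import albumentations") -> str:
    """Build the /search/code URL for one organization."""
    q = f'org:{org} "{phrase}"'
    return "https://api.github.com/search/code?q=" + urllib.parse.quote(q)

def count_matches(org: str, token: str) -> int:
    """Fetch total_count for the query (needs network + a GitHub token)."""
    req = urllib.request.Request(
        code_search_url(org),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["total_count"]

# Query URLs only (no network); pass a token to count_matches to run them.
for org in ["IBM", "NASA-IMPACT", "microsoft"]:  # abridged org list
    print(org, code_search_url(org))
```

Deduplicating the returned file hits by repository gives the per-org repo counts in the table below.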

Headline

  • 382 geospatial papers cite Albumentations
  • 5 OSS geospatial libraries declare it as a direct dependency
  • 44 public repositories across 19 named geospatial organizations import it
  • 3 HuggingFace artifacts in the geo / remote-sensing tag space reference it

"Albumentations" here means the project stewarded by Albumentations LLC: the legacy MIT albumentations package (archived June 2025) plus the maintained successor albumentationsx (AGPL-3.0 + Commercial), which preserves API compatibility — see the dual-licensing post for context.

Why Geospatial Pulls in an Augmentation Library at All

Three things make satellite / drone / aerial imagery harder than the consumer-photo case Albumentations was originally designed for, and all three are exactly what an augmentation library buys you:

  1. Multi-band, non-RGB rasters. Sentinel-2 has 13 bands, Landsat-8 has 11, Planet has 4–8, hyperspectral sensors can have 200+. Most ImageNet-era augmentation code assumes 3 channels of uint8. Albumentations transforms operate on arbitrary (H, W, C) arrays in uint8 or float32. Native EO rasters are usually uint16 (Sentinel-2 L1C/L2A reflectance, Landsat surface reflectance) — the standard approach is to scale to float32 once at load time (e.g. arr.astype(np.float32) / 10000.0 for Sentinel-2 reflectance) and let the augmentation pipeline run on the float tensor. Chromatic-shift / spectral / atmospheric ops stay band-aware.
  2. Tight label co-transforms. A geo training sample is typically the image plus a segmentation mask (land cover, building footprint, burn scar) plus optionally bounding boxes (vehicles, ships, planes) plus keypoints (tower bases, well heads). Geometric ops have to apply identically to all of them or the labels silently drift. Albumentations is built around Compose over (image, mask, bboxes, keypoints) — that's why every geo OSS library below ends up using it.
  3. Tile pipelines. Geo training is rarely "load whole image, augment, train." It's "stream tiles from a COG / GeoTIFF / Zarr, augment per-tile, batch." Augmentation has to be CPU-side and fast enough to feed the GPU. Albumentations is OpenCV-backed and dominates on multi-channel inputs: in our 9-channel CPU benchmark it is fastest on 58 of 68 transforms, with a median 3.7× speedup vs Kornia and 2.3× vs Torchvision on the head-to-head subset, and a long tail of transforms that the other libraries don't implement for arbitrary channel counts at all. Both of those facts matter for geo: the speed feeds the GPU, and the coverage means you don't silently lose half your augmentation toolbox the moment you go past 3 channels.
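The load-time scaling from point 1 can be sketched in a few lines. The 1/10000 factor is the standard Sentinel-2 L1C/L2A reflectance scale; the random array stands in for a real tile read from disk:

```python
# Sketch: scale a native uint16 EO raster to float32 once at load time,
# before the augmentation pipeline runs on the float tensor.
import numpy as np

def load_s2_tile(raw: np.ndarray) -> np.ndarray:
    """uint16 (H, W, C) digital numbers -> float32 reflectance in [0, ~1]."""
    assert raw.dtype == np.uint16
    return raw.astype(np.float32) / 10000.0

# Stand-in for a 13-band Sentinel-2 tile read from a GeoTIFF / Zarr store.
raw = np.random.randint(0, 10001, size=(256, 256, 13), dtype=np.uint16)
tile = load_s2_tile(raw)  # float32, all 13 bands preserved
```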

Concretely, the typical geospatial use looks like this — note the multi-channel input, the paired mask, and the chromatic ops chosen specifically to be safe across bands:

import albumentations as A
import numpy as np

image = np.load("sentinel2_tile.npy")
mask = np.load("landcover_tile.npy")

transform = A.Compose([
    A.RandomCrop(height=256, width=256),
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomRotate90(p=0.5),
    A.RandomBrightnessContrast(
        brightness_limit=(-0.1, 0.1),
        contrast_limit=(-0.1, 0.1),
        p=0.5,
    ),
    A.GaussNoise(std_range=(0.02, 0.08), p=0.3),
])

out = transform(image=image, mask=mask)
tile, label = out["image"], out["mask"]

The same Compose also accepts bboxes=... and keypoints=... and keeps them aligned with the image and mask.

OSS Geospatial Libraries That Depend on Albumentations

These are repository-rooted facts — the dependency is declared in pyproject.toml / requirements.txt / setup.py / environment.yml, not inferred from a citation graph.

| Library | Org | Evidence file(s) | Repo |
|---|---|---|---|
| raster-vision | Azavea / Element 84 | requirements.txt | azavea/raster-vision |
| solaris | CosmiQ / IQT | setup.py; requirements.txt; environment.yml | CosmiQ/solaris |
| TerraTorch | IBM Research | pyproject.toml | IBM/terratorch |
| prithvi-pytorch | NASA / IBM | requirements.txt | NASA-IMPACT/Prithvi-EO-2.0 |
| GeoSeg | Academic (Wuhan University) | requirements.txt | WangLibo1995/GeoSeg |

Notable: Prithvi is the NASA/IBM foundation model for Earth Observation. TerraTorch is IBM's geospatial fine-tuning toolkit built on top of Prithvi. Raster Vision is Azavea's (now Element 84's) production geospatial deep-learning framework. Solaris is the CosmiQ / IQT toolkit used for SpaceNet challenges. All four declare Albumentations as a direct, hard dependency — meaning anyone who pip installs these libraries pulls Albumentations transitively.

Named Geospatial Organizations Using It

Org-scoped GitHub Code Search (org:<name> "import albumentations") found import albumentations in 44 repositories across 19 organizations from a hand-curated tier-1 list (commercial EO providers, space agencies, research labs, OSS geo ML projects).

| Organization | Repos | Notes |
|---|---|---|
| aws-samples | 10 | AWS reference architectures (SageMaker, etc.) |
| IBM | 6 | TerraTorch, TerraMind, ML4EO, peft-geofm |
| microsoft | 5 | Microsoft AI for Earth / Planetary Computer |
| satellogic | 3 | Commercial EO constellation operator |
| developmentseed | 3 | Geospatial ML consultancy (NASA, World Bank) |
| zhu-xlab | 2 | TUM Prof. Zhu's lab, a major SSL-for-EO group |
| nasa-jpl | 2 | NASA Jet Propulsion Laboratory |
| DLR-MF-DAS | 2 | German Aerospace Center (SSL4EO-S12, etc.) |
| allenai | 1 | Allen Institute for AI |
| radiantearth | 1 | Radiant Earth Foundation |
| azavea | 1 | Maker of raster-vision (now Element 84) |
| awslabs | 1 | AWS Labs |
| CosmiQ | 1 | CosmiQ Works / IQT (SpaceNet) |
| NASA-IMPACT | 1 | NASA IMPACT (Prithvi, ESA-NASA workshops) |
| spaceml-org | 1 | SpaceML / FDL (NASA Frontier Development Lab) |
| tudelft3d | 1 | TU Delft 3D geoinformation |
| wri | 1 | World Resources Institute |
| WildMeOrg | 1 | Wildlife computer vision |
| GlobalFishingWatch | 1 | Global Fishing Watch (industrial activity SAR) |

The interesting cluster here is the foundation-model orgs — IBM (TerraTorch / TerraMind / Prithvi tooling), NASA-IMPACT (Prithvi-EO-2.0, ESA-NASA workshop notebooks), DLR (SSL4EO-S12), zhu-xlab (TUM SSL-for-EO). All of them ship public training notebooks where the augmentation pipeline is import albumentations as A.

A few representative paths from the search (one per org, abridged):

| Repo | File |
|---|---|
| CosmiQ/solaris | solaris/nets/transform.py |
| NASA-IMPACT/ESA-NASA-workshop-2025 | Track 1 (EO)/TerraMind/notebooks/terramind_v1_base_sen1floods11.ipynb |
| IBM/terramind | notebooks/terramind_v1_small_burnscars.ipynb |
| IBM/peft-geofm | src/peft_geofm/datamodules/utils.py |
| DLR-MF-DAS/SSL4EO-S12-v1.1 | README.md |
| GlobalFishingWatch/paper-industrial-activity | nnets/fishing/dataset.py |
| aws-samples/aws-vegetation-management-workshop | remars2022-workshop/dataset.py |
| azavea/raster-vision | rastervision_pytorch_backend/.../semantic_segmentation/utils.py |

Academic Citations

Filtered from 2,403 unique citing papers (12,015 author-paper-affiliation rows in OpenAlex), keeping only those whose title / abstract / venue contain geospatial keywords (satellite, remote sensing, aerial, UAV, drone, multispectral, hyperspectral, land cover, crop, wildfire, canopy, etc.) — 382 unique geospatial papers cite Albumentations.
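The filter itself is a straightforward keyword match. A minimal sketch, with the keyword list abridged from the post and two hypothetical paper records for illustration:

```python
# Sketch: keep only citing papers whose title / abstract / venue mention
# a geospatial keyword (list abridged).
GEO_KEYWORDS = [
    "satellite", "remote sensing", "aerial", "uav", "drone",
    "multispectral", "hyperspectral", "land cover", "crop",
    "wildfire", "canopy",
]

def is_geo(paper: dict) -> bool:
    """True if any geospatial keyword appears in the paper's metadata."""
    text = " ".join(
        (paper.get(k) or "") for k in ("title", "abstract", "venue")
    ).lower()
    return any(kw in text for kw in GEO_KEYWORDS)

papers = [  # hypothetical records standing in for OpenAlex rows
    {"title": "Burn scar mapping from Sentinel-2 satellite imagery"},
    {"title": "A faster optimizer for language models"},
]
geo = [p for p in papers if is_geo(p)]  # keeps only the first record
```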

Year-over-year growth

| Year | Geo papers citing Albumentations |
|---|---|
| 2020 | 6 |
| 2021 | 28 |
| 2022 | 56 |
| 2023 | 76 |
| 2024 | 64 |
| 2025 | 132 |
| 2026 | 20 (YTD, April) |

The 2024→2025 jump (64 → 132) tracks the rise of geospatial foundation models (Prithvi, TerraMind, SatMAE, Clay) — each one ships a downstream-task notebook, and almost all of them ship it with Albumentations as the augmentation layer.

Top-cited geo papers (sample)

Top affiliations (≥ 3 geo papers)

| Affiliation | # papers |
|---|---|
| Michigan State University | 7 |
| Wuhan University | 7 |
| Chinese Academy of Sciences | 4 |
| Skolkovo Institute of Science and Technology | 4 |
| Zhejiang University | 4 |
| Central South University | 3 |
| Facultad de Minas | 3 |
| Institute of Intelligent Emergency Information Processing | 3 |
| Ocean University of China | 3 |
| Silesian University of Technology | 3 |
| Technical University of Munich (TUM) | 3 |
| University of California, Davis | 3 |

HuggingFace Ecosystem

Across HuggingFace Hub artifacts tagged remote-sensing / satellite-imagery / earth-observation / aerial-imagery / geospatial / land-cover, 3 model cards reference Albumentations in their training recipe (0 datasets — datasets typically don't carry augmentation pipelines, only training notebooks do).
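The card scan reduces to a substring check over fetched card texts. A sketch of the counting step, assuming the card markdown has already been fetched from the Hub (e.g. via `huggingface_hub.ModelCard.load`); the repo ids and card texts here are hypothetical:

```python
# Sketch: count model cards whose text mentions Albumentations.
def count_albumentations_mentions(cards: dict) -> int:
    """cards maps repo id -> raw model-card markdown."""
    return sum("albumentations" in text.lower() for text in cards.values())

cards = {  # hypothetical card texts for illustration
    "org/geo-segformer": "Trained with Albumentations flips and crops.",
    "org/sar-ship-detector": "No augmentation details given.",
}
n = count_albumentations_mentions(cards)  # 1 for this toy input
```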

What This Means

The geospatial / Earth-observation ML ecosystem already runs on Albumentations for the imagery half of essentially every supervised pipeline that ingests sensor frames, orthophotos, drone tiles, or satellite rasters. Funding maintenance of the underlying augmentation primitives — chromatic-shift, atmospheric, geometric, multispectral-safe ops — directly reduces friction for every grantee, every academic group, and every commercial EO operator listed above.

Every named org in the table above is a current, public-code user. Every library in the dependency table ships Albumentations transitively to its own users. The 382-paper citation count is a lower bound — it only counts papers whose metadata explicitly contains a geospatial keyword.

If you maintain a geospatial OSS project, foundation model, or training pipeline and want to be added to (or removed from) this evidence set, ping me — the methodology is fully scripted and the audit is rerun on demand.


This brief is regenerated from the public APIs above. All counts are reproducible. Last regenerated 2026-04-19.

Hero image: nine Albumentations 2.2.0 transforms applied to the same Sentinel-2 tile of Moorea (French Polynesia). Source tile: ESA / CNES, Copernicus Sentinel-2 imagery, 21 June 2021, CC BY-SA 3.0 IGO. Grid produced by build_hero.py.