AutoAlbument introduction and core concepts
What is AutoAlbument
AutoAlbument is a tool that automatically searches for the best augmentation policies for your data.
Under the hood, it uses the Faster AutoAugment algorithm. In brief, the idea is to use a GAN-like architecture in which the Generator applies augmentations to input images and the Discriminator must determine whether an image was augmented. This process helps to find augmentation policies that produce images similar to the original images.
How to use AutoAlbument
To use AutoAlbument, you need to define two things: a PyTorch Dataset for your data and configuration parameters for AutoAlbument. You can read the detailed instructions in the How to use AutoAlbument article.
Internally, AutoAlbument uses PyTorch Lightning for training a GAN and Hydra for handling configuration parameters.
Here are a few things to know about how AutoAlbument uses each of them.
Hydra
The main internal configuration file is located at autoalbument/cli/conf/config.yaml.
Here are its contents:
defaults:
- _version
- task
- policy_model: default
- classification_model: default
- semantic_segmentation_model: default
- data: default
- searcher: default
- trainer: default
- optim: default
- callbacks: default
- logger: default
- hydra: default
- seed
- search
This file pulls in a set of config files with default values. These config files are grouped by closely related parameters, such as model parameters or optimizer parameters. All default config files are located in their respective directories inside autoalbument/cli/conf.
The main config file also includes the search.yaml file, which you will use for overriding default parameters for your specific dataset and task (you can read more about creating the search.yaml file with autoalbument-create in How to use AutoAlbument).
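The override behaviour described above can be pictured as a recursive dictionary merge. The following is a minimal sketch of that idea, not Hydra's actual implementation. The deep_merge helper and the parameter names and values (optim.lr, optim.betas, trainer.max_epochs) are hypothetical and chosen only for illustration.

```python
from copy import deepcopy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a new dict in which values from overrides replace those in defaults."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the override replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical default values, standing in for the files under autoalbument/cli/conf.
default_config = {
    "optim": {"lr": 0.001, "betas": [0.0, 0.999]},
    "trainer": {"max_epochs": 20},
}

# Hypothetical search.yaml content: only the values that differ from the defaults.
search_config = {"trainer": {"max_epochs": 40}}

final_config = deep_merge(default_config, search_config)
print(final_config)
```

In this sketch, search_config only needs to list the parameters that change. Everything else keeps its default value, which is the practical benefit of keeping dataset-specific settings in a separate file.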
To allow great flexibility, AutoAlbument relies heavily on the instantiate function from Hydra. This function lets you define a path to a Python class in a YAML config (using the _target_ parameter) along with arguments to that class, and Hydra will create an instance of this class with the provided arguments.
As a practical example, if a config contains a definition like this:
_target_: autoalbument.faster_autoaugment.models.ClassificationModel
num_classes: 10
architecture: resnet18
pretrained: False
AutoAlbument will translate it approximately to the following call:
from autoalbument.faster_autoaugment.models import ClassificationModel
model = ClassificationModel(num_classes=10, architecture='resnet18', pretrained=False)
By relying on this feature, AutoAlbument allows customizing its behavior without changing the library's internal code.
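To make the mechanism more concrete, here is a simplified stand-in for what instantiate does with a _target_ entry. It is a sketch under stated assumptions, not Hydra's code: the instantiate_from_config helper is hypothetical, and a standard-library class (datetime.timedelta) is used as the target so the example runs without AutoAlbument or Hydra installed.

```python
import importlib


def instantiate_from_config(config: dict):
    """Simplified stand-in for Hydra's instantiate: import _target_ and call it."""
    kwargs = dict(config)  # copy so the caller's config is left untouched
    # Split "package.module.ClassName" into the module path and the attribute name.
    module_path, _, attr_name = kwargs.pop("_target_").rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    # The remaining keys become keyword arguments of the target.
    return target(**kwargs)


# Equivalent to: from datetime import timedelta; timedelta(days=1, hours=12)
config = {"_target_": "datetime.timedelta", "days": 1, "hours": 12}
delta = instantiate_from_config(config)
print(delta.total_seconds())
```

Replacing the _target_ string with the import path of your own class is enough to change which object gets created. The real function in hydra.utils covers more cases than this sketch does.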
PyTorch Lightning
AutoAlbument relies on PyTorch Lightning to train a GAN. In AutoAlbument configs, you can configure PyTorch Lightning by passing the appropriate arguments to Trainer through the trainer config or defining a list of Callbacks through the callbacks config.