YAML
---
title: "Example on how to apply TextImage Augmentation"
description: "An example of using Albumentations to add text to an image, featuring Mooze and Meeko"
image: "images/cats.jpg"
---

Example on how to write on top of images

Note:

- The code for the transform is based on the code from https://github.com/danaaubakirova/doc-augmentation by Dana Aubakirova.
- Many thanks to Sarah Bieszczad for letting us feature her cats Mooze (the small one) and Meeko (the giant) in our project.

Important!

As input, this transform takes bounding boxes in the Albumentations format, which is normalized Pascal VOC, i.e.

bbox = [x_min / width, y_min / height, x_max / width, y_max / height]
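
As a minimal sketch of that conversion, the helper below turns a pixel-space Pascal VOC box into the normalized form. The name `to_albumentations_bbox` is hypothetical (it is not part of Albumentations), and the 960x1280 image size is assumed purely for illustration.

```python
def to_albumentations_bbox(x_min, y_min, x_max, y_max, width, height):
    """Normalize a pixel-space Pascal VOC box to the [0, 1] range."""
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError("box must lie inside the image and have positive area")
    return [x_min / width, y_min / height, x_max / width, y_max / height]


# Assumed image size, for illustration only.
bbox = to_albumentations_bbox(96, 128, 864, 256, width=960, height=1280)
print(bbox)
```

The resulting list has the same shape as the "bbox" values used in the metadata dictionaries in this example.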

For this transform to work, we need to install the optional dependency Pillow.

Python
from __future__ import annotations
Python
%load_ext autoreload
%autoreload 2
Python
!pip install -U pillow
Requirement already satisfied: pillow in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (10.4.0)
Python
import albumentations as A
import cv2
from matplotlib import pyplot as plt
Python
def visualize(image):
    plt.figure(figsize=(10, 5))
    plt.axis('off')
    plt.imshow(image)
Python
bgr_image = cv2.imread("images/cats.jpg")
image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
Python
font_path = "../data/documents/LiberationSerif-Regular.ttf"
Python
visualize(image)

png

Write text

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="yellow")])
Python
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.98],
    "text": "Mooze and Meeko",
}
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png

Inpaint background

We black out the parts of the image where the text will be inserted and inpaint them. This can be useful when replacing old text with new text.

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color=(255, 0, 0), clear_bg=True)])
Python
metadata = {
    "bbox": [0.1, 0.3, 0.9, 0.38],
    "text": "Dangerous Tigers",
}
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png

Write several lines

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="black", clear_bg=True)])
Python
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.17],
        "text": "Big dreams in small packages...",
    },
    {
        "bbox": [0.02, 0.85, 0.95, 0.91],
        "text": "...and even bigger in bigger ones.",
    },
]
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png
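
With many lines, writing each box by hand gets tedious. The sketch below builds the metadata list from a list of strings by stacking equal-height boxes. `stack_lines` and all of its parameters are hypothetical helpers, not part of Albumentations; the only thing it relies on is the "bbox"/"text" dictionary layout used in this section.

```python
def stack_lines(lines, x_min=0.02, x_max=0.95, top=0.1, line_height=0.06, gap=0.02):
    """Return one {"bbox", "text"} dict per line, stacked top to bottom."""
    metadata = []
    for i, text in enumerate(lines):
        y_min = top + i * (line_height + gap)
        y_max = y_min + line_height
        if y_max > 1.0:
            raise ValueError(f"line {i} does not fit inside the image")
        # Round to hide floating-point noise in the normalized coordinates.
        bbox = [x_min, round(y_min, 4), x_max, round(y_max, 4)]
        metadata.append({"bbox": bbox, "text": text})
    return metadata


metadata = stack_lines(
    ["Big dreams in small packages...", "...and even bigger in bigger ones."]
)
for item in metadata:
    print(item)
```

The returned list can be passed as `textimage_metadata=metadata` in the same way as the hand-written list.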

Augment text

We can insert text as is, or augment it on the fly.
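
To make the three augmentation names used in this section concrete, here is a rough pure-Python sketch of what word-level swap, deletion, and insertion can look like. It is an illustration under assumptions, not the code that `A.TextImage` runs: the function names, the single-swap behavior, the deletion probability `p`, and the insertion count `n` are all hypothetical.

```python
import random


def swap_words(text, rng):
    """Swap two randomly chosen words."""
    words = text.split()
    if len(words) < 2:
        return text
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


def delete_words(text, rng, p=0.3):
    """Drop each word with probability p, keeping at least one word."""
    words = text.split()
    kept = [w for w in words if rng.random() >= p]
    if not kept and words:
        kept = [rng.choice(words)]
    return " ".join(kept)


def insert_stopwords(text, rng, stopwords, n=2):
    """Insert n random stopwords at random positions."""
    words = text.split()
    for _ in range(n):
        words.insert(rng.randint(0, len(words)), rng.choice(stopwords))
    return " ".join(words)


rng = random.Random(0)  # seeded so that runs are repeatable
text = "Big dreams in small packages..."
print(swap_words(text, rng))
print(delete_words(text, rng))
print(insert_stopwords(text, rng, stopwords=["the", "a", "of"]))
```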

Swap words

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["swap"])])
Python
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {
        "bbox": [0.02, 0.85, 0.95, 0.91],
        "text": "...and even bigger in bigger ones.",
    },
]
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png

Random Deletion

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="red", augmentations=["deletion"])])
Python
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Growing up with a giant...",
    },
    {
        "bbox": [0.02, 0.85, 0.95, 0.91],
        "text": "...is always an adventure..",
    },
]
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png

Insert random stopwords

Python
!pip install nltk
Requirement already satisfied: nltk in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (3.8.1)
Requirement already satisfied: click in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (from nltk) (8.1.7)
Requirement already satisfied: joblib in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (from nltk) (1.3.2)
Requirement already satisfied: regex>=2021.8.3 in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (from nltk) (2024.7.24)
Requirement already satisfied: tqdm in /Users/vladimiriglovikov/anaconda3/envs/albumentations_examples/lib/python3.10/site-packages (from nltk) (4.66.2)
Python
import nltk

nltk.download('stopwords')
from nltk.corpus import stopwords
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/vladimiriglovikov/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
Python
stops = stopwords.words('english')
Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion"], stopwords=stops)])
Python
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.95],
    "text": "Mooze and Meeko",
}
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
visualize(transformed["image"])

png

Returning augmented text

If you need the text that was actually written to the image after "swap", "insertion", or "deletion", you can read it from the "overlay_data" key of the result:

Python
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion", "swap"], stopwords=stops)])
Python
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {
        "bbox": [0.02, 0.85, 0.95, 0.91],
        "text": "...and even bigger in bigger ones.",
    },
]
Python
transformed = transform(image=image, textimage_metadata=metadata)
Python
transformed["overlay_data"]
[{'bbox_coords': (19, 1088, 912, 1164),
  'text': 'for ...and even bigger in as bigger didn ones.',
  'original_text': '...and even bigger in bigger ones.',
  'bbox_index': 1,
  'font_color': 'white'},
 {'bbox_coords': (19, 128, 912, 204),
  'text': 'dreams in Big small packages...',
  'original_text': 'Big dreams in small packages...',
  'bbox_index': 0,
  'font_color': 'white'}]
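
Note that the entries in this output are not in input order (index 1 comes before index 0). The sketch below restores the input order with `bbox_index` and pairs each original string with its augmented version. It uses the output shown here as literal data, and it assumes that `bbox_coords` holds pixel coordinates, which is what the values suggest.

```python
overlay_data = [
    {"bbox_coords": (19, 1088, 912, 1164),
     "text": "for ...and even bigger in as bigger didn ones.",
     "original_text": "...and even bigger in bigger ones.",
     "bbox_index": 1,
     "font_color": "white"},
    {"bbox_coords": (19, 128, 912, 204),
     "text": "dreams in Big small packages...",
     "original_text": "Big dreams in small packages...",
     "bbox_index": 0,
     "font_color": "white"},
]

# Restore the order of the input metadata list.
ordered = sorted(overlay_data, key=lambda item: item["bbox_index"])

for item in ordered:
    # Assumed to be pixel coordinates: (x_min, y_min, x_max, y_max).
    x_min, y_min, x_max, y_max = item["bbox_coords"]
    print(f'{item["bbox_index"]}: {item["original_text"]!r} -> {item["text"]!r} '
          f"({x_max - x_min}x{y_max - y_min} px)")
```

Keeping the augmented strings alongside their boxes is handy when the rendered images are used as training data for text recognition, since the label has to match what was actually drawn.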