---
title: "Example on how to apply TextImage Augmentation"
description: "An example of using Albumentations to add text to an image, featuring Mooze and Meeko"
image: "images/cats.jpg"
---
Example on how to write on top of images
Note:
- Code for the transform is based on the code from https://github.com/danaaubakirova/doc-augmentation by Dana Aubakirova
- Many thanks to Sarah Bieszczad for letting us feature her cats Mooze (small one) and Meeko (the giant) in our project
Important!
As input, this transform takes bounding boxes in the Albumentations format, which is normalized Pascal VOC, i.e.
bbox = [x_min / width, y_min / height, x_max / width, y_max / height]
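To make the format concrete, here is a minimal sketch of converting between pixel Pascal VOC coordinates and the normalized form. The helper names `normalize_bbox` and `denormalize_bbox` are hypothetical and not part of Albumentations. The 960x1280 image size is an assumption inferred from the pixel coordinates printed at the end of this notebook, and truncating to integers is simply what reproduces those printed values, not a confirmed description of the library's internals.

```python
from __future__ import annotations


def normalize_bbox(box, width, height):
    """Pixel Pascal VOC (x_min, y_min, x_max, y_max) -> normalized Pascal VOC."""
    x_min, y_min, x_max, y_max = box
    if not (0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height):
        raise ValueError(f"box {box} does not fit a {width}x{height} image")
    return [x_min / width, y_min / height, x_max / width, y_max / height]


def denormalize_bbox(bbox, width, height):
    """Normalized Pascal VOC -> integer pixel coordinates (truncated toward zero)."""
    x_min, y_min, x_max, y_max = bbox
    return (int(x_min * width), int(y_min * height), int(x_max * width), int(y_max * height))


# Assumed image size: 960 px wide, 1280 px high.
print(normalize_bbox((96, 128, 480, 640), width=960, height=1280))
# [0.1, 0.1, 0.5, 0.5]
print(denormalize_bbox([0.02, 0.1, 0.95, 0.16], width=960, height=1280))
# (19, 128, 912, 204)
```

Note that x values are divided by the width and y values by the height, so every coordinate ends up in the range [0, 1].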
For this transform to work, we need to install the optional dependency pillow.
from __future__ import annotations
%load_ext autoreload
%autoreload 2
!pip install -U pillow
import albumentations as A
import cv2
from matplotlib import pyplot as plt
def visualize(image):
    plt.figure(figsize=(10, 5))
    plt.axis("off")
    plt.imshow(image)
image = cv2.imread("images/cats.jpg", cv2.IMREAD_COLOR_RGB)
font_path = "../data/documents/LiberationSerif-Regular.ttf"
visualize(image)
Write text
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="yellow")])
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.98],
    "text": "Mooze and Meeko",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
Inpaint background
We black out the parts of the image where the text will be inserted and inpaint them. This can be useful when replacing old text with new text.
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color=(255, 0, 0), clear_bg=True)],
    strict=True,
    seed=137,
)
metadata = {
    "bbox": [0.1, 0.3, 0.9, 0.38],
    "text": "Dangerous Tigers",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
Write several lines
transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="black", clear_bg=True)])
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.17],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
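When there are many lines, writing each bounding box by hand gets tedious. The sketch below builds the metadata list from a list of strings by stacking equally tall boxes from top to bottom. The helper `stack_lines` and its default spacing values are hypothetical, chosen only for illustration. Only the "bbox" and "text" keys come from the examples above.

```python
from __future__ import annotations


def stack_lines(lines, x_min=0.02, x_max=0.95, top=0.1, line_height=0.06, gap=0.02):
    """Build one metadata dict per line, stacked top to bottom (hypothetical helper)."""
    metadata = []
    for i, text in enumerate(lines):
        y_min = top + i * (line_height + gap)
        y_max = y_min + line_height
        if y_max > 1.0:
            # Normalized coordinates must stay inside the image.
            raise ValueError(f"line {i} ({text!r}) would fall below the image")
        metadata.append({"bbox": [x_min, round(y_min, 4), x_max, round(y_max, 4)], "text": text})
    return metadata


metadata = stack_lines(["Big dreams in small packages...", "...and even bigger in bigger ones."])
for item in metadata:
    print(item)

# The result can then be passed to the transform exactly as before:
# transformed = transform(image=image, textimage_metadata=metadata)
```

Rounding to four decimals only keeps the printed boxes readable. It is not something the transform requires.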
Augment text
We can insert text as is, or augment it on the fly.
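To give an intuition for what word-level augmentation does before the text is drawn, here is a rough, library-independent sketch of swap, deletion, and insertion on a list of words. This is not the Albumentations implementation. The function names, the `fraction` and `count` parameters, and the exact sampling strategy are all assumptions made for illustration, and the library's actual behavior (for example, how many words it swaps) may differ.

```python
from __future__ import annotations

import random


def swap_words(words, rng):
    """Exchange the words at two randomly chosen positions."""
    words = list(words)
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def delete_words(words, rng, fraction=0.3):
    """Drop each word with probability `fraction`, keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= fraction]
    return kept or [rng.choice(words)]


def insert_stopwords(words, rng, stopwords, count=2):
    """Insert `count` randomly chosen stopwords at random positions."""
    words = list(words)
    for _ in range(count):
        words.insert(rng.randrange(len(words) + 1), rng.choice(stopwords))
    return words


rng = random.Random(137)  # fixed seed so repeated runs give the same result
words = "Big dreams in small packages...".split()
print(" ".join(swap_words(words, rng)))
print(" ".join(delete_words(words, rng)))
print(" ".join(insert_stopwords(words, rng, stopwords=["the", "of", "where"])))
```

Each function returns a new list and leaves its input untouched, so the original text stays available for comparison.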
Swap words
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["swap"])],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
Random Deletion
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="red", augmentations=["deletion"])],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Growing up with a giant...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...is always an adventure.."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
Insert random stopwords
!pip install nltk
import nltk
nltk.download("stopwords")
from nltk.corpus import stopwords
stops = stopwords.words("english")
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion"], stopwords=stops)],
    strict=True,
    seed=137,
)
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.95],
    "text": "Mooze and Meeko",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
Returning augmented text
If you need the text that was actually drawn on the image after "swap", "insertion", or "deletion", read it from the "overlay_data" key of the result:
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion", "swap"], stopwords=stops)],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
transformed["overlay_data"]
[{'bbox_coords': (19, 1088, 912, 1164), 'text': 'where ...and even bigger down in bigger ones. before', 'original_text': '...and even bigger in bigger ones.', 'bbox_index': 1, 'font_color': 'white'}, {'bbox_coords': (19, 128, 912, 204), 'text': 'Big small packages... dreams in', 'original_text': 'Big dreams in small packages...', 'bbox_index': 0, 'font_color': 'white'}]
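In the output above, the entries do not come back in the same order as the input metadata (the item with bbox_index 1 is listed first). A minimal sketch of restoring the input order and pairing each original string with its augmented version could look like this. The helper `texts_by_bbox_index` is hypothetical, and the `overlay_data` literal is copied from the printed output rather than produced by running the transform.

```python
from __future__ import annotations

# Copied from the output shown above.
overlay_data = [
    {
        "bbox_coords": (19, 1088, 912, 1164),
        "text": "where ...and even bigger down in bigger ones. before",
        "original_text": "...and even bigger in bigger ones.",
        "bbox_index": 1,
        "font_color": "white",
    },
    {
        "bbox_coords": (19, 128, 912, 204),
        "text": "Big small packages... dreams in",
        "original_text": "Big dreams in small packages...",
        "bbox_index": 0,
        "font_color": "white",
    },
]


def texts_by_bbox_index(overlay_data):
    """Return (original_text, augmented_text) pairs in input-metadata order."""
    ordered = sorted(overlay_data, key=lambda item: item["bbox_index"])
    return [(item["original_text"], item["text"]) for item in ordered]


for original, augmented in texts_by_bbox_index(overlay_data):
    print(f"{original!r} -> {augmented!r}")
```

Pairs like these are handy when the rendered text serves as a label, for example as ground truth for an OCR model.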