---
title: "Example on how to apply TextImage Augmentation"
description: "An example of using Albumentations to add text to an image, featuring Mooze and Meeko"
image: "images/cats.jpg"
---

Example on how to write on top of images

Note:

Important!

As input, this transform takes bounding boxes in the Albumentations format, which is normalized Pascal VOC, i.e.:

bbox = [x_min / width, y_min / height, x_max / width, y_max / height]
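
If your boxes are in pixel coordinates (plain Pascal VOC), dividing each coordinate by the matching image dimension gives the normalized form. A minimal sketch: `normalize_bbox` is a hypothetical helper written for this example, not part of Albumentations, and the 1000x500 image size is made up.

```python
def normalize_bbox(x_min, y_min, x_max, y_max, width, height):
    """Convert a pixel Pascal VOC box to normalized [0, 1] coordinates."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [x_min / width, y_min / height, x_max / width, y_max / height]


# Hypothetical 1000x500 image with a text strip near the bottom edge.
bbox = normalize_bbox(150, 450, 900, 490, width=1000, height=500)
```

The resulting list can be used as the "bbox" value in the metadata dictionaries shown in the examples on this page.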

For this transform to work, we need to install the optional dependency Pillow.

from __future__ import annotations
%load_ext autoreload
%autoreload 2
!pip install -U pillow
import albumentations as A
import cv2
from matplotlib import pyplot as plt
def visualize(image):
    plt.figure(figsize=(10, 5))
    plt.axis("off")
    plt.imshow(image)
image = cv2.imread("images/cats.jpg", cv2.IMREAD_COLOR_RGB)
font_path = "../data/documents/LiberationSerif-Regular.ttf"
visualize(image)

Write text

transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="yellow")])
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.98],
    "text": "Mooze and Meeko",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])

Inpaint background

With clear_bg=True, the transform blacks out the regions where text will be inserted and inpaints them before drawing the text. This can be useful when replacing old text with new text.

transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color=(255, 0, 0), clear_bg=True)],
    strict=True,
    seed=137,
)
metadata = {
    "bbox": [0.1, 0.3, 0.9, 0.38],
    "text": "Dangerous Tigers",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])

Write several lines

transform = A.Compose([A.TextImage(font_path=font_path, p=1, font_color="black", clear_bg=True)])
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.17],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])
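
When there are many lines, writing each box by hand gets tedious. The sketch below builds the metadata list for evenly stacked lines. `stack_lines` and all of its parameters are hypothetical names introduced for this example; only the "bbox" and "text" keys come from the examples above.

```python
def stack_lines(lines, x_min=0.02, x_max=0.95, y_start=0.1, line_height=0.06, gap=0.02):
    """Build one metadata dict per line, stacked top to bottom in normalized coordinates."""
    metadata = []
    for i, text in enumerate(lines):
        y_min = y_start + i * (line_height + gap)
        y_max = y_min + line_height
        if y_max > 1.0:
            raise ValueError(f"line {i} does not fit inside the image")
        # Round to hide floating-point noise such as 0.16000000000000003.
        metadata.append({"bbox": [x_min, round(y_min, 4), x_max, round(y_max, 4)], "text": text})
    return metadata


stacked = stack_lines(["Big dreams in small packages...", "...and even bigger in bigger ones."])
```

The result has the same shape as the hand-written list and could be passed as `textimage_metadata=stacked`.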

Augment text

We can insert text as is, or augment it on the fly.
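
To make the three augmentation names concrete, here is a toy pure-Python sketch of what swapping, deleting, and inserting words can look like. It is not Albumentations' internal implementation, which may pick words and counts differently. The function names and the `p` and `n` parameters are assumptions made for this illustration.

```python
import random


def swap_words(text, rng):
    """Swap two randomly chosen words."""
    words = text.split()
    if len(words) < 2:
        return text
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


def delete_words(text, rng, p=0.3):
    """Drop each word with probability p, keeping at least one word."""
    words = text.split()
    kept = [w for w in words if rng.random() >= p]
    if not kept and words:
        kept = [rng.choice(words)]
    return " ".join(kept)


def insert_stopwords(text, rng, stopwords, n=2):
    """Insert n random stopwords at random positions."""
    words = text.split()
    for _ in range(n):
        words.insert(rng.randint(0, len(words)), rng.choice(stopwords))
    return " ".join(words)


rng = random.Random(137)  # seeded so repeated runs give the same result
sample = "Big dreams in small packages..."
swapped = swap_words(sample, rng)
shortened = delete_words(sample, rng)
padded = insert_stopwords(sample, rng, ["the", "and", "of"])
```

Swapping keeps the same words in a different order, deletion returns a subset of the words, and insertion adds `n` extra words.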

Swap words

transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["swap"])],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])

Random deletion

transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="red", augmentations=["deletion"])],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Growing up with a giant...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...is always an adventure.."},
]
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])

Insert random stopwords

!pip install nltk
import nltk

nltk.download("stopwords")
from nltk.corpus import stopwords
stops = stopwords.words("english")
transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion"], stopwords=stops)],
    strict=True,
    seed=137,
)
metadata = {
    "bbox": [0.15, 0.9, 0.9, 0.95],
    "text": "Mooze and Meeko",
}
transformed = transform(image=image, textimage_metadata=metadata)
visualize(transformed["image"])

Returning augmented text

If you need the text that was actually rendered on the image after "swap", "insertion", or "deletion", read it from the "overlay_data" key of the result:

transform = A.Compose(
    [A.TextImage(font_path=font_path, p=1, font_color="white", augmentations=["insertion", "swap"], stopwords=stops)],
    strict=True,
    seed=137,
)
metadata = [
    {
        "bbox": [0.02, 0.1, 0.95, 0.16],
        "text": "Big dreams in small packages...",
    },
    {"bbox": [0.02, 0.85, 0.95, 0.91], "text": "...and even bigger in bigger ones."},
]
transformed = transform(image=image, textimage_metadata=metadata)
transformed["overlay_data"]
[{'bbox_coords': (19, 1088, 912, 1164),
  'text': 'where ...and even bigger down in bigger ones. before',
  'original_text': '...and even bigger in bigger ones.',
  'bbox_index': 1,
  'font_color': 'white'},
 {'bbox_coords': (19, 128, 912, 204),
  'text': 'Big small packages... dreams in',
  'original_text': 'Big dreams in small packages...',
  'bbox_index': 0,
  'font_color': 'white'}]
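
In the output above, the entry with `bbox_index` 1 comes before the entry with `bbox_index` 0, so the list does not necessarily follow the input order. The sketch below copies that output as literal data, re-orders it by `bbox_index`, and lists which entries were changed. In the notebook you would use `transformed["overlay_data"]` instead of the literal. The `bbox_coords` values appear to be pixel coordinates, but that is inferred from this output and not stated in the source.

```python
# Literal copy of the `overlay_data` output shown above.
overlay_data = [
    {
        "bbox_coords": (19, 1088, 912, 1164),
        "text": "where ...and even bigger down in bigger ones. before",
        "original_text": "...and even bigger in bigger ones.",
        "bbox_index": 1,
        "font_color": "white",
    },
    {
        "bbox_coords": (19, 128, 912, 204),
        "text": "Big small packages... dreams in",
        "original_text": "Big dreams in small packages...",
        "bbox_index": 0,
        "font_color": "white",
    },
]

# Re-order entries so they line up with the input metadata list.
ordered = sorted(overlay_data, key=lambda item: item["bbox_index"])
rendered_texts = [item["text"] for item in ordered]

# Indices of the boxes whose text was changed by the augmentation.
changed = [item["bbox_index"] for item in ordered if item["text"] != item["original_text"]]
```

After sorting, `rendered_texts[i]` corresponds to the i-th entry of the input metadata list, which is convenient when the rendered text serves as a label.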