Data Augmentation in TensorFlow

Data augmentation is a powerful technique used in training neural networks, particularly in image classification, computer vision, natural language processing, and speech recognition. It helps improve a model’s ability to generalize to new, unseen data, making it more robust and effective in real-world applications. This article shows an example of how to use this technique with TensorFlow for image classification.

What is Data Augmentation?

“Data augmentation” is a technique for creating additional training data from existing data. It involves applying various transformations to the input data to expand the size and diversity of the training dataset artificially. Deep neural networks require training on extensive datasets to obtain the best results. Consequently, if the initial training set has a limited number of images, it becomes essential to apply data augmentation techniques to enhance the model’s performance.

Why is Data Augmentation needed?

When building machine learning models, limited source data is a common problem. The more closely the training sample approximates the full distribution of images that will be input to your system, the higher the achievable quality of the result.

Data augmentation helps in:

  • Preventing Overfitting: By introducing variations in the training data, models are less likely to memorize the training examples and more likely to generalize well to new, unseen data.
  • Improving Model Robustness: Augmented data can help models become more resilient to variations and noise in real-world data.
  • Increasing Dataset Size: It effectively increases the size of the training dataset, which can be particularly useful when the original dataset is small.

Assembling a representative training sample is therefore a nontrivial technical task in its own right. In many such cases, it makes sense to use a technique called data augmentation.

How to use Data Augmentation in TensorFlow? 

TensorFlow offers many augmentation options. The most popular are horizontal and vertical flips, random crops, and color changes. Transformations can also be combined, for example, performing rotation and random scaling simultaneously. In addition, you can vary the saturation and value of all pixels (the S and V components of the HSV color space): raise these components to a power from the interval [0.25, 4], multiply them by a coefficient from [0.7, 1.4], or add a value from [-0.1, 0.1]. You can likewise add a value from [-0.1, 0.1] to the hue of all pixels (the H component of the HSV color space). Similar transformations can be applied to individual image fragments. Let’s try building an image augmentation pipeline in TensorFlow in practice.
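Several of these color-space adjustments can be expressed directly with TensorFlow’s `tf.image` utilities. Here is a minimal sketch; the intervals mirror those described above, and the random image is a stand-in for a real photo:

```python
import tensorflow as tf

def random_color_jitter(image):
    """Randomly perturb saturation, value (brightness), and hue."""
    # Multiply saturation (the S component) by a factor in [0.7, 1.4]
    image = tf.image.random_saturation(image, lower=0.7, upper=1.4)
    # Shift brightness (the V component) by a value in [-0.1, 0.1]
    image = tf.image.random_brightness(image, max_delta=0.1)
    # Shift hue (the H component) by a value in [-0.1, 0.1]
    image = tf.image.random_hue(image, max_delta=0.1)
    # Keep pixel values in the valid [0, 1] range
    return tf.clip_by_value(image, 0.0, 1.0)

img = tf.random.uniform((64, 64, 3))  # dummy RGB image in [0, 1]
jittered = random_color_jitter(img)
print(jittered.shape)  # (64, 64, 3)
```

These functions operate element-wise on the image tensor, so they can also be mapped over a whole `tf.data.Dataset`.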

How to set up all necessary libraries?

As is standard practice, we begin by importing all the essential libraries. TensorFlow conveniently includes all the tools needed for data augmentation. The snippet below shows the imports needed for implementing augmentation techniques:

import tensorflow as tf

# Recent Keras versions no longer expose HORIZONTAL/VERTICAL constants
# via keras.layers.preprocessing; the flip modes are plain strings:
HORIZONTAL, VERTICAL = "horizontal", "vertical"

How to implement Data Augmentation using Python?

self._data_augmentation = tf.keras.Sequential([
    # In TF 2.6+ these layers live directly under tf.keras.layers
    # (the experimental.preprocessing namespace is deprecated).
    # Each random layer requires a factor; the values below are examples.
    tf.keras.layers.RandomRotation(factor=0.1),
    tf.keras.layers.RandomContrast(factor=0.2),
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomFlip("vertical"),
    tf.keras.layers.RandomZoom(height_factor=0.2),
    tf.keras.layers.CenterCrop(height=224, width=224),  # crop size is illustrative
    tf.keras.layers.Rescaling(scale=1.0 / 255)
])

This code snippet creates a data augmentation pipeline using TensorFlow’s Keras API. First, it initializes a Sequential model to stack layers. It then adds several preprocessing layers: RandomRotation to randomly rotate images, RandomContrast to adjust contrast, two RandomFlip layers to flip images horizontally and vertically, RandomZoom to zoom in or out, CenterCrop to crop the central region, and Rescaling to normalize pixel values. These transformations enhance the training dataset’s diversity, improving the model’s robustness.
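Such a pipeline is typically applied to whole batches through `tf.data`. A minimal self-contained sketch (the layer factors, dataset shapes, and labels are illustrative, not from the article):

```python
import tensorflow as tf

# Stand-alone copy of an augmentation pipeline like the one above
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomContrast(0.2),
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomZoom(0.2),
])

# Dummy dataset: 8 RGB images with integer labels, in batches of 4
images = tf.random.uniform((8, 64, 64, 3))
labels = tf.zeros((8,), dtype=tf.int32)
ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(4)

# Augment only the images; training=True keeps the random layers
# active even outside of model.fit()
ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y))

for batch, _ in ds.take(1):
    print(batch.shape)  # (4, 64, 64, 3)
```

Mapping the pipeline over the dataset keeps augmentation on the input side, so the model itself stays unchanged.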

When is Data Augmentation used in the code?

The augmentation pipeline is applied right after the input layer is defined, ensuring that all input images undergo the specified transformations before being fed into the subsequent layers of the model. This enhances the diversity of the training data, making the model more robust and better at generalizing to new, unseen data. In the current example, the input layer is an image with 3 color channels (RGB):

def _build_model(self, ef=0):
    # Input: images of self._img_size_list resolution with 3 color channels (RGB)
    inp = tf.keras.layers.Input(shape=(*self._img_size_list, 3))
    # Route the input through the augmentation pipeline defined earlier
    x = self._data_augmentation(inp)


This input layer is then passed through the data augmentation pipeline. Note that the random layers are active only during training (when the model is called with training=True, as happens inside model.fit); at inference time they pass images through unchanged.
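For context, here is a minimal self-contained sketch of a model built this way. The image size, layer choices, and class count are assumptions for illustration, not the article’s actual model:

```python
import tensorflow as tf

IMG_SIZE = (64, 64)  # assumed; self._img_size_list plays this role in the article

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

def build_model(num_classes=10):
    """Minimal model whose first step is the augmentation pipeline."""
    inp = tf.keras.layers.Input(shape=(*IMG_SIZE, 3))
    x = data_augmentation(inp)                # active only when training=True
    x = tf.keras.layers.Rescaling(1.0 / 255)(x)
    x = tf.keras.layers.Conv2D(16, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inp, out)

model = build_model()
model.summary()
```

Because the augmentation layers are part of the model graph, they are saved and exported with it, but they become identity operations at inference time.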

What does it look like before and after?

In practice, augmentation is usually applied to the entire dataset rather than a single image, but for visualization purposes, below is an example of applying it to one random image. The figure illustrates the effects of the augmentation process on a sample image, showcasing the differences before and after augmentation:

[Figure: a sample image before and after the image augmentation pipeline]

In the augmented image, several random modifications have been applied, including adjustments to contrast, vertical and horizontal flips, rotations, and other transformations. These changes enhance the diversity of the training dataset, making the model more robust and capable of handling various real-world scenarios. 
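A before-and-after comparison like the one in the figure can be reproduced with a few lines of code. In this sketch the pipeline layers and factors are illustrative, and a random tensor stands in for a real photo:

```python
import tensorflow as tf

# Illustrative pipeline; factors are assumptions, not from the article
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.15),
    tf.keras.layers.RandomContrast(0.3),
])

image = tf.random.uniform((1, 64, 64, 3))  # "before": a batch of one image

variant = augment(image, training=True)    # "after": one random variant

# The shape is unchanged; only the pixel content differs
diff = float(tf.reduce_mean(tf.abs(variant - image)))
print(variant.shape, round(diff, 3))
```

To view the result, display `image[0]` and `variant[0]` side by side, e.g. with `matplotlib.pyplot.imshow`.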

Conclusion

In this example, we have implemented an image augmentation pipeline using TensorFlow. This technique allows you to significantly expand your dataset, making it more suitable for training robust neural network models. By applying various transformations, you can generate a diverse set of images from a single input image. For instance, you can produce approximately 20-30 different variations from one original image. Consequently, a dataset with 100 images per class can be augmented to contain 2,000-3,000 images per class. This expanded dataset size is typically sufficient for training neural networks effectively, ensuring the model learns to generalize well to new, unseen data. Additionally, you can adjust the number of augmented images produced at each stage to optimize the dataset for your specific task, enhancing the overall performance of your model.
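The dataset expansion described above can be sketched as a simple loop that draws several random variants from one image. The pipeline layers, factors, and the number of copies are illustrative:

```python
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.2),
])

def expand(image, copies=20):
    """Produce `copies` randomly augmented variants of a single image."""
    # Add a batch dimension, augment, then drop the batch dimension again
    return tf.stack([augment(image[tf.newaxis], training=True)[0]
                     for _ in range(copies)])

original = tf.random.uniform((64, 64, 3))   # stand-in for one training image
variants = expand(original, copies=20)
print(variants.shape)  # (20, 64, 64, 3)
```

Applied to 100 images per class with 20-30 copies each, this yields the 2,000-3,000 images per class mentioned above.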
