The accuracy of any deep learning model depends on the dataset we train it on. If the dataset is not sufficient for the machine to learn from, the model will not give correct answers for the kinds of data that are missing. The model and its dataset are closely tied, because depending on the dataset the model can end up underfitting or overfitting.
Underfitting means that even after we train the model on more data, it still cannot bring the loss down (the loss stays high).
Overfitting is the opposite case: the model reaches high accuracy and low loss on the training data but cannot generalize to unseen data.
Data augmentation helps keep the model from overfitting, and it can also improve the model's overall performance. We can augment any type of data, such as:
1. Image data
2. Audio data
3. Text data
4. Any other data type.
In this article, however, we will focus on image augmentation.
We can apply a number of techniques to image data, some of which are listed below:
Colour space: We can convert the RGB image to another colour space, such as greyscale.
Kernel filters: We can apply kernels to sharpen or blur the image before passing it to the model for training.
Random erasing: We can erase a random part of the image so that the model has to learn more features to predict well.
Geometric transformations: We can apply transformations such as flip, rotate, zoom, and crop.
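The four image techniques above can be sketched in a few lines of NumPy. This is only a minimal illustration, not a production pipeline: the function names are hypothetical, the image is random data standing in for a real photo, and the greyscale weights are the commonly used luma coefficients, which is an assumption and not something this article prescribes.

```python
import numpy as np

def to_grayscale(image):
    """Colour space: collapse an RGB image (H, W, 3) to one grey channel (H, W)."""
    weights = np.array([0.299, 0.587, 0.114])  # assumed luma weights for R, G, B
    return image @ weights

def box_blur(gray):
    """Kernel filter: 3x3 mean blur on a 2-D image, edges padded by replication."""
    padded = np.pad(gray, 1, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w]
    return out / 9.0

def random_erase(image, size, rng):
    """Random erasing: zero out one size x size square at a random position."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    erased = image.copy()
    erased[top:top + size, left:left + size] = 0
    return erased

def horizontal_flip(image):
    """Geometric transformation: mirror the image left to right."""
    return image[:, ::-1]

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))  # stand-in for a real photo, values in [0, 1)

gray = to_grayscale(image)
blurred = box_blur(gray)
erased = random_erase(image, size=3, rng=rng)
flipped = horizontal_flip(image)
```

Each function returns a new array and leaves the input untouched, so the same original image can be reused to produce many variants.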
We can apply the following techniques to text data during augmentation:
Shuffle the sentences or words.
Replace words with synonyms.
Rephrase the sentence using different words with a similar meaning.
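As a rough sketch of the first two text techniques, the snippet below shuffles words and swaps in synonyms. The `SYNONYMS` table and both function names are hypothetical and exist only for illustration; a real project would draw synonyms from a proper thesaurus. Whether word shuffling keeps the label intact depends on the task, so treat it as an option to test rather than a safe default.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}

def shuffle_words(sentence, rng):
    """Return the sentence with its words in random order."""
    words = sentence.split()
    rng.shuffle(words)
    return " ".join(words)

def replace_synonyms(sentence, rng):
    """Replace every word that has an entry in SYNONYMS with a random synonym."""
    words = sentence.split()
    return " ".join(
        rng.choice(SYNONYMS[word]) if word in SYNONYMS else word
        for word in words
    )

rng = random.Random(0)
sentence = "the quick dog looks happy"

shuffled = shuffle_words(sentence, rng)
replaced = replace_synonyms(sentence, rng)
```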
For audio data we can use the following techniques:
Inject noise into the audio.
Change the playback speed of the audio.
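Both audio techniques can be sketched with NumPy on a synthetic tone. The sample rate, the 440 Hz test tone, and the function names are assumptions made for this example. The speed change here is plain linear resampling, which also shifts the pitch, much like playing a tape faster or slower.

```python
import numpy as np

def add_noise(signal, noise_level, rng):
    """Inject Gaussian noise scaled by noise_level."""
    return signal + noise_level * rng.standard_normal(signal.shape)

def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 gives a faster, shorter signal."""
    new_length = int(round(len(signal) / rate))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, new_length)
    return np.interp(new_positions, old_positions, signal)

rng = np.random.default_rng(0)
sample_rate = 8000                       # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)       # one second of a 440 Hz tone

noisy = add_noise(tone, noise_level=0.05, rng=rng)
faster = change_speed(tone, rate=2.0)    # half as many samples
slower = change_speed(tone, rate=0.5)    # twice as many samples
```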
From the discussion above, we now know which augmentation techniques can be applied to data to make it better for training. Another important point is that we generally need a large amount of data to train a model; when we do not have that much, data augmentation comes into play.
Data augmentation is a method of generating new training data, without changing the class labels, by applying random jitters and perturbations. Its main purpose is to increase the model's generalizability: if we feed the neural network more varied data, it can learn more robust features because it keeps seeing new examples. We should apply data augmentation only at training time, not at test time. A model trained with data augmentation will often reach higher accuracy than the same model trained without it.
Data augmentation plays a very important role in training a model. The example below explains why:
Imagine we are training a system to detect an object such as a car or a bicycle in an image. Will every image have the object at the centre, or in an easily detectable position? No, that is not possible. When we take a picture, the object may end up on the right or left side of the image, near the top, or touching the bottom border. Position is not the only factor; lighting conditions matter too. How the object appears depends on the light when the picture was taken: sometimes it comes out very bright and sometimes very dark. The model has to cope with all of these scenarios during training, otherwise it will not give good results. Augmentation helps us overcome these problems to some extent.
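The position and lighting variations described above can be imitated with a random shift and a random brightness change. This is a minimal NumPy sketch; the function names, the shift limit, and the brightness range are illustrative assumptions, and pixel values are assumed to lie in [0, 1].

```python
import numpy as np

def random_shift(image, max_shift, rng):
    """Move the content by up to max_shift pixels per axis, zero-filling the gap."""
    h, w = image.shape[:2]
    dy = int(rng.integers(-max_shift, max_shift + 1))
    dx = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.zeros_like(image)
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted

def random_brightness(image, low, high, rng):
    """Scale pixel values by a random factor in [low, high], then clip to [0, 1]."""
    factor = rng.uniform(low, high)
    return np.clip(image * factor, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # stand-in for a photo with values in [0, 1)

shifted = random_shift(image, max_shift=4, rng=rng)
relit = random_brightness(image, low=0.5, high=1.5, rng=rng)
```

Applying both to the same photo many times gives versions where the object sits in different places under different lighting, while the class label stays the same.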
There are basically two ways to apply data augmentation.
The first is to apply the augmentation to every image in the dataset before training starts and save the result as a new dataset.
The second is to apply the augmentation during training itself. In either case we do not augment at test time: the test data is already unseen, and we need it unchanged to check the model's results on it.
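The rule of augmenting only during training can be expressed as a single switch. The sketch below is a hypothetical helper, not part of Keras: `augment` is a stand-in transform (a random horizontal flip), and `prepare_batch` applies it only when `training` is true.

```python
import numpy as np

def augment(image, rng):
    """Stand-in augmentation: flip left to right about half of the time."""
    return image[:, ::-1] if rng.random() < 0.5 else image

def prepare_batch(images, training, rng):
    """Augment only while training; test data passes through unchanged."""
    if not training:
        return images
    return np.stack([augment(img, rng) for img in images])

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))

train_batch = prepare_batch(images, training=True, rng=rng)
test_batch = prepare_batch(images, training=False, rng=rng)
```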
In this tutorial, we are going to learn the following concepts:
Data augmentation types
Clearing up doubts about data augmentation
Finally, how to apply data augmentation to our model using ImageDataGenerator and Keras.
Image data augmentation is an important part of computer vision. Data augmentation uses geometric transformations such as the following to generate new image data:
Changes in scale
Horizontal and vertical flips
So, to generate more training data, data augmentation applies small changes to the existing images and creates new ones, which helps computer vision models give better results. Creating new data from existing data in this way is natural, and it does not change the original class label.
There are three main types of data augmentation in computer vision, described below:
The first type is dataset generation. In this technique, we generate more data from the existing data without changing the labels. This method is well known but used less often in practice, because the other methods offer more powerful features.
This method is only about generating data. As we know, training a neural network needs a lot of data, so that the machine can learn more features from the training data and give the best result.
But consider an extreme case: what if you have only one image and want to use data augmentation on it to create a whole dataset for training the neural network? You would follow these steps:
1. Load the image from disk.
2. Apply a series of random augmentation transformations to that image.
3. Save the transformed image back to disk.
4. Repeat steps 2 and 3 for N iterations.
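The load, transform, save, and repeat steps above can be sketched as a short loop. To stay self-contained, this example stores images as `.npy` arrays in a temporary directory instead of real image files; that choice, the file names, and the stand-in `random_transform` are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np

def random_transform(image, rng):
    """Stand-in for a series of augmentations: random flip plus brightness jitter."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    return np.clip(image * rng.uniform(0.7, 1.3), 0.0, 1.0)

def generate_dataset(source_path, output_dir, n_iterations, rng):
    """Create n_iterations augmented copies of one image and write them to disk."""
    image = np.load(source_path)                   # load the image from disk
    output_dir.mkdir(parents=True, exist_ok=True)
    for i in range(n_iterations):                  # repeat N times
        augmented = random_transform(image, rng)   # apply the transformations
        np.save(output_dir / f"augmented_{i:03d}.npy", augmented)  # save to disk
    return sorted(output_dir.glob("augmented_*.npy"))

rng = np.random.default_rng(0)
workdir = Path(tempfile.mkdtemp())
source_path = workdir / "original.npy"
np.save(source_path, rng.random((16, 16, 3)))      # stand-in for the single photo

written = generate_dataset(source_path, workdir / "augmented", n_iterations=10, rng=rng)
```

Because every augmented copy is written out before training starts, the cost in time and disk space grows with both the number of source images and N.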
Once this process is over, we have many images derived from a single image, and all of them can be used to train the model. So this method is a very simple way to create more data.
But what if you have more than one image, say 100 images, and need to create 500+ images from each of them? This approach is then not worthwhile, because we first have to generate the whole dataset and only then train the model, and the data creation itself takes a lot of time.
The method above does generate data, but in the end we have only created a somewhat larger dataset from a small set of images. We have not actually made the model generalize, that is, make good predictions on new data it was not trained on.
As we know, the more data we give a neural network, the more it learns and the better it generalizes when new data comes in for prediction. But the small amount of data we created this way is not enough for that. So generating an expanded dataset from a small one up front does not work well. Instead, we need a way to create data with augmentation during training itself.
The second type is called in-place data augmentation, which is what the Keras ImageDataGenerator class does. This type of augmentation creates new variations of the training data during each epoch, so that the neural network sees new variations of the data at every epoch.
This second method works as follows:
1. A batch of images is passed to the ImageDataGenerator.
2. The ImageDataGenerator transforms every image in the batch using scaling, rotation, etc. and produces a new set of images.
3. The transformed images are returned to the calling function, and from there they go on to CNN training.
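The batch flow above can be imitated with a plain Python generator. This is a NumPy stand-in written for illustration, not the Keras implementation: `random_transform` and `augmented_batches` are hypothetical names, and the transform is deliberately simple. The point it shows is that each batch handed to training contains only freshly transformed images, with labels unchanged, and that a new epoch produces new variations.

```python
import numpy as np

def random_transform(image, rng):
    """Stand-in transform: random horizontal flip plus a small brightness change."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    return np.clip(image * rng.uniform(0.8, 1.2), 0.0, 1.0)

def augmented_batches(images, labels, batch_size, rng):
    """Yield one epoch of batches; every image is transformed, labels are untouched."""
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        idx = order[start:start + batch_size]
        batch = np.stack([random_transform(images[i], rng) for i in idx])
        yield batch, labels[idx]

rng = np.random.default_rng(0)
images = rng.random((10, 8, 8, 3))  # stand-in training images in [0, 1)
labels = np.arange(10) % 2

for epoch in range(2):
    for batch_images, batch_labels in augmented_batches(images, labels, batch_size=4, rng=rng):
        pass  # a real loop would run one training step on this batch here
```

In Keras, the `flow` method of an `ImageDataGenerator` instance plays the role of this generator, yielding batches of transformed images and their labels.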
The two main points about this method are:
The ImageDataGenerator does not pass the original data back to the calling function. It passes back only the newly generated, randomly transformed data.
We call this training-time augmentation because we did not generate the transformed images before training, as we did in the first case. Here the transformation happens during training.
So in this method, we can say that ImageDataGenerator intercepts the data during training: it passes only randomly modified data to the neural network, and the network does not know that the images it is working on are not the originals.
People often think that ImageDataGenerator takes a batch, transforms it into new data, combines the original data with the newly generated data, and then sends both to the neural network for training. But ImageDataGenerator does not work like this. It sends the neural network only the data newly produced by the random transformations.
The main goal of augmentation is to increase the generalizability of the model, so the network should see only new images in each epoch.
If we passed the newly generated data along with the original data, the network would see the same original data every time, which is not what we want. That is why ImageDataGenerator sends only the newly transformed data to the neural network. This method helps us achieve our goal of generalizability and tends to work well at test time, but training accuracy can be somewhat lower, since the network trains on randomly transformed images.
The third type combines the original data with newly created data. Datasets of this type are generally used in real-world scenarios such as self-driving cars.
Creating a dataset for a self-driving car is a very tedious job. So instead of collecting it by hand, we can play a video game and record the data the game generates, then use that data to train the self-driving car.
Once we have the whole training dataset, we can go to the second method and train our model.
In the next blog post, we will look at all of the image geometric transformations above in detail.
Data augmentation is a form of regularization that works on the training data. It generates new training data, without changing the class labels, by applying random jitters and perturbations. Our main aim in using it is to increase the model's generalizability: if we feed the neural network more data, it can train more accurately because it keeps seeing new data. From the discussion above, we now understand how data augmentation can help improve model accuracy and how it lets us enlarge a dataset when we have too little data.
However, it is always recommended to gather natural data and not depend entirely on data augmentation, which is best suited to cases where the dataset is small. If we are able to gather more data, we should collect as much as possible rather than rely on augmentation.