Transfer learning is a valuable technique that lets us take advantage of models already trained by other developers and apply them to our own datasets.
In this module we will look at what transfer learning is, how to load pre-trained models, how to consume them from third-party sources, and how to apply them to our projects.

How does transfer learning work?
To understand how transfer learning works, let's use an analogy: as a child you learned to ride a bicycle, and in doing so you internalized concepts such as balance, force, and speed. Later, when you learn to ride a motorcycle, you can transfer most of those concepts and build on what you already know.
At the machine level, transfer learning works through the features the original model has already learned. Suppose you have an apple detector and now want to detect other fruits (oranges, pineapples, etc.).
There is no need to retrain a model from scratch: you already have a network that detects shapes and colors, so a few training iterations on this base model with the new data are enough to obtain an equally functional model.
On the web you can find dozens of architectures that have been trained for months by leading figures in deep learning research.
The configuration process consists of removing the final layer of the network we want to reuse (the original prediction layer) and replacing it with our own output layer.

Using a pre-trained network
Before using a pre-trained model it is essential to understand its architecture.
The MobileNet V2 architecture was designed for vision tasks on embedded and mobile devices. It takes an image as input (300x300 pixels in its object-detection variant) and, through a series of convolutional layers, extracts the features that a final classification stage turns into predictions. To reuse it, it is enough to remove that last layer and replace it with one customized to our needs.
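As a sketch of that idea with the MobileNet V2 bundled in Keras (note: `weights=None` is used here only so the example runs without downloading anything; in practice you would pass `weights="imagenet"`, and the 5-class head is hypothetical):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.applications import MobileNetV2

# include_top=False drops MobileNet V2's original prediction layer.
base = MobileNetV2(include_top=False, weights=None, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained feature extractor

model = Sequential([
    base,
    GlobalAveragePooling2D(),        # collapse the spatial feature maps
    Dense(5, activation="softmax"),  # hypothetical: 5 fruit classes
])
```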

On this occasion we will load the Inception V3 model (another well-known convolutional network architecture). We import our Keras dependencies, load the weights from the location where they are stored on disk, create a sequential model and inject the pre-trained network as its first layer (note that with `include_top = False` the original output layer is not included).
We then add our new output layer and freeze the pre-trained base so its weights are not updated during training.
````python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.applications.inception_v3 import InceptionV3

# Path to the Inception V3 weights ("notop": without the final layer)
weights_file = "/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5"

new_model = Sequential()
new_model.add(InceptionV3(include_top = False, pooling = "avg", weights = weights_file))

# num_classes is the number of categories in our own dataset
new_model.add(Dense(num_classes, activation = "softmax"))
new_model.layers[0].trainable = False
````
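Once assembled, the model is compiled and trained like any other Keras model. A self-contained sketch follows (using `weights=None` in place of the downloaded file and a placeholder `num_classes`, so it runs even without the weights on disk):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.applications.inception_v3 import InceptionV3

num_classes = 4  # placeholder: number of classes in your dataset

# weights=None stands in for the weights file; pooling="avg" flattens the
# convolutional features so Dense receives one vector per image.
base = InceptionV3(include_top=False, pooling="avg",
                   weights=None, input_shape=(150, 150, 3))
new_model = Sequential([base, Dense(num_classes, activation="softmax")])
new_model.layers[0].trainable = False  # freeze the pre-trained base

new_model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
# new_model.fit(train_data, epochs=...)  # then train on your own dataset
```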
With this, we can leverage hundreds of models already trained by other AI developers.
Contribution created by Sebastián Franco Gómez.