What is transfer learning? — Computer Vision Division

When looking into which network architectures and learning methods we could use, we ran into NVIDIA’s Transfer Learning Toolkit (TLT). Transfer learning is the process of taking the features learned by a network previously trained on some task and feeding them into a new network that solves a similar problem on new data. The technique is very popular in deep learning because it lets deep networks be trained with a small dataset, which comes in handy in real-life problems where there aren’t millions of tagged images available to train a model from scratch.

But how is this done? Well, first we need to talk about Convolutional Neural Networks, and we will do so by looking at an oversimplified diagram of one of these networks.

Image taken from: https://docs.ecognition.com/v9.5.0/eCognition_documentation/Reference%20Book/23%20Convolutional%20Neural%20Network%20Algorithms/Convolutional%20Neural%20Network%20Algorithms.html

In convolutional neural networks (known as CNNs), three types of layers can be seen: convolution layers, pooling layers and classifier layers. Convolution and pooling layers make up the first and middle stages of the CNN, as can be seen above, and through a few mathematical operations they are able to recognize edges (mostly associated with the convolutional layers) and shapes (mostly associated with the pooling layers) in the input images. Finally, this data is fed to one or more classifier layers, which classify the objects in the images according to pre-fed classes.
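To make the three layer types a bit more concrete, here is a minimal sketch of such a network written with Keras. This is purely illustrative and is our own assumption for the example (the post itself works with NVIDIA’s TLT); the input size and the number of classes are hypothetical.

```python
# Minimal sketch of a CNN with the three layer types described above.
# Keras is used here only for illustration; the class count is hypothetical.
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 10  # hypothetical number of object classes

model = models.Sequential([
    # Convolution layers: slide small filters over the image to pick up
    # local patterns such as edges.
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    # Pooling layers: downsample the feature maps so later layers respond
    # to larger shapes rather than to individual pixels.
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classifier layers: flatten the extracted features and map them to
    # the pre-fed classes.
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```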

In transfer learning, the convolutional and pooling layers are merely reused as they are, and only the final classifier layers are retrained so that the network can recognize the new classes that may have been added in our dataset for a given problem.
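As a rough sketch of that idea, the snippet below freezes an ImageNet-pretrained backbone and attaches a fresh classifier head that is the only part being retrained. The choice of MobileNetV2 and the number of new classes are assumptions made for the example; the models we actually retrain come from TLT.

```python
# Minimal transfer-learning sketch: reuse a pretrained feature extractor,
# retrain only a new classifier head. MobileNetV2 and the class count are
# illustrative assumptions, not the models used in the project.
import tensorflow as tf
from tensorflow.keras import layers, models

num_new_classes = 3  # hypothetical classes from our own dataset

# Reuse the convolutional/pooling feature extractor as-is...
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # ...by freezing its pretrained weights.

# ...and attach a new classifier head to be trained on the new classes.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(num_new_classes, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(new_train_dataset, epochs=5)  # trains only the classifier layers
```

Because the backbone stays frozen, only the small head needs to learn from our data, which is exactly why a modest dataset is enough.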

Why use transfer learning? Well, for one, there’s no way we could generate and tag the millions of images needed to train a neural network from scratch: even if we could store that many images, tagging them would be extremely time consuming. Transfer learning significantly reduces the size of the training dataset needed, since the network was already trained on an extensive dataset. Furthermore, TLT already ships with models trained to recognize pedestrians and other classes similar to the ones we are trying to detect, so retraining should be a fairly straightforward process.

--

We’re FlowLabs, a Software Company specialized in AI, APP and WEB solutions. Join us on our journey through some interesting solutions to our daily challenges!
