Tags: #AI

References:
What is Few-Shot Learning? - Unite.AI

Introduction

Few-shot learning refers to a family of algorithms and techniques for training an AI model from a very small amount of training data.

Methods

Most few-shot learning approaches fit into one of three categories: data-level approaches, parameter-level approaches, and metric-based approaches.

Data-level

Get more training data

  1. Similar training data:
    If you are training a classifier to recognize specific breeds of dog but lack many images of the particular breed you are trying to classify, you can include many images of dogs in general, which helps the classifier learn the general features that make up a dog.
  2. Data augmentation:
    Apply transformations to existing data, e.g. rotating or flipping images, or synthesizing new samples with GANs.
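A minimal sketch of the augmentation idea, using NumPy rotations and a flip on a stand-in image (the `augment` helper is hypothetical; GAN-based synthesis would replace these fixed transforms with a trained generator):

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Generate extra training samples from one image via simple
    geometric transformations."""
    variants = []
    for k in (1, 2, 3):                      # 90/180/270-degree rotations
        variants.append(np.rot90(image, k))
    variants.append(np.flip(image, axis=1))  # horizontal flip
    return variants

img = np.arange(9).reshape(3, 3)             # stand-in for a 3x3 grayscale image
print(len(augment(img)))                     # 4 new samples from 1 original
```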

Parameter-level

Meta-learning

Meta-Learning: Learning to Learn Fast seems to be a good overview.

Teach a model how to learn.
One problem with few-shot training: overfitting the training data ⇐ high-dimensional parameter spaces.
To solve ⇒ limit the parameter space ⇒ regularization techniques & proper loss functions & a teacher model

The process of a gradient-based training:

  1. Create the base-learner (teacher) model
  2. Train the base-learner model on the support set
  3. Have the base-learner return predictions for the query set
  4. Train the meta-learner (student) on the loss derived from the classification error
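The four steps above can be sketched with a toy linear-regression task in NumPy (the single gradient step, learning rates, and `mse_grad` helper are illustrative assumptions, not the article's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

def mse_grad(w, X, y):
    """Gradient of mean-squared error for a linear model y_hat = X @ w."""
    return 2 * X.T @ (X @ w - y) / len(y)

# Hypothetical toy task: 1-D linear regression with a tiny support/query split.
X_support, y_support = rng.normal(size=(5, 1)), rng.normal(size=5)
X_query,   y_query   = rng.normal(size=(5, 1)), rng.normal(size=5)

meta_w = np.zeros(1)                  # meta-learner (student) parameters

# 1. Create the base-learner (teacher), starting from the student's parameters.
base_w = meta_w.copy()
# 2. Train the base-learner on the support set (one gradient step here).
base_w -= 0.1 * mse_grad(base_w, X_support, y_support)
# 3. The base-learner returns predictions for the query set.
query_loss = np.mean((X_query @ base_w - y_query) ** 2)
# 4. The meta-learner is updated from the loss on those query predictions.
meta_w -= 0.1 * mse_grad(base_w, X_query, y_query)
```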

Starting with randomly initialized parameters ⇒ still potentially overfit the data.
A "model-agnostic" meta-learner is created by limiting the influence of the teacher (base) model: instead of training the student model directly on the loss for the teacher model's predictions, the student model is trained on the loss for its own predictions.
The process of a model-agnostic training:

  1. A copy of the current meta-learner model is created.
  2. The copy is trained with the assistance of the base model/teacher model.
  3. The copy returns predictions for the training data.
  4. Computed loss is used to update the meta-learner.
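The copy-train-update loop above can be sketched as a first-order-MAML-style loop in NumPy (the task generator, learning rates, and linear model are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(w, X, y):
    """Gradient of mean-squared error for a linear model."""
    return 2 * X.T @ (X @ w - y) / len(y)

def sample_task():
    """Hypothetical task generator: each task is y = a*x with its own slope a."""
    a = rng.uniform(-2, 2)
    X = rng.normal(size=(10, 1))
    return X[:5], a * X[:5, 0], X[5:], a * X[5:, 0]   # support / query

meta_w = np.zeros(1)
inner_lr, outer_lr = 0.1, 0.01

for _ in range(100):                    # meta-training iterations
    Xs, ys, Xq, yq = sample_task()
    # 1. Copy the current meta-learner parameters.
    fast_w = meta_w.copy()
    # 2. Train the copy on this task's support set (inner loop).
    fast_w -= inner_lr * grad(fast_w, Xs, ys)
    # 3-4. The loss on the copy's OWN query predictions updates the
    #      meta-learner (first-order MAML approximation).
    meta_w -= outer_lr * grad(fast_w, Xq, yq)
```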

Metric-based

  1. Use basic distance metrics to classify query samples based on their similarity to the support samples.
  2. Prototypical networks cluster data points together, combining clustering with the metric-based classification described above (similar to K-means clustering).
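A minimal prototypical-network-style episode in NumPy, assuming the embedding step has already been applied (the 2-D "embeddings" and helper names are hypothetical): each class prototype is the mean of that class's support points, and each query is assigned to the nearest prototype.

```python
import numpy as np

def prototypes(support_x: np.ndarray, support_y: np.ndarray) -> np.ndarray:
    """One prototype per class: the mean of that class's support embeddings."""
    classes = np.unique(support_y)
    return np.stack([support_x[support_y == c].mean(axis=0) for c in classes])

def classify(query_x: np.ndarray, protos: np.ndarray) -> np.ndarray:
    """Assign each query point to the nearest prototype (Euclidean distance)."""
    dists = np.linalg.norm(query_x[:, None, :] - protos[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Hypothetical 2-way, 2-shot episode with 2-D embeddings.
support_x = np.array([[0.0, 0.0], [0.2, 0.1],    # class 0
                      [3.0, 3.0], [2.8, 3.1]])   # class 1
support_y = np.array([0, 0, 1, 1])
query_x   = np.array([[0.1, 0.2], [3.1, 2.9]])

protos = prototypes(support_x, support_y)
print(classify(query_x, protos))                 # → [0 1]
```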