This AI Paper Proposes CaFo: A Cascade of Foundation Models that Incorporates Diverse Prior Knowledge of Various Pre-Training Paradigms for Better Few-Shot Learning




Rather than relying on large-scale annotated datasets, few-shot learning, in which networks must learn from only a handful of annotated images, has become a research hotspot for data-deficient and resource-constrained scenarios. Recent results show that CLIP, pre-trained on large-scale language-image pairs, transfers well in a zero-shot manner to open-vocabulary visual recognition. This indicates that such networks retain strong representational capabilities even when few-shot training data is scarce, which greatly aids few-shot learning on downstream domains. The authors combine CLIP, DINO, DALL-E, and GPT-3 to give CaFo four forms of prior knowledge, as seen in Figure 1. Their key contributions are summarized below:

• They propose CaFo, which incorporates prior knowledge from diverse pre-training paradigms for better few-shot learning (a rough sketch of this fusion idea follows the list).
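To make the cascade idea concrete, the snippet below is a minimal, illustrative sketch of fusing predictions derived from different frozen priors: zero-shot logits from CLIP text prompts and cache-style logits looked up from few-shot features (the role DINO and DALL-E-augmented samples play in the paper). All array shapes, the `alpha` blending weight, and the exact fusion rule are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: random arrays stand in for frozen-model features.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
num_test, num_classes, dim = 4, 10, 512

clip_feats = rng.normal(size=(num_test, dim))       # stand-in for CLIP image features
dino_feats = rng.normal(size=(num_test, dim))       # stand-in for DINO image features
text_weights = rng.normal(size=(num_classes, dim))  # CLIP text embeddings of (GPT-3-style) prompts
cache_keys = rng.normal(size=(num_classes, dim))    # few-shot (and generated) training features
cache_values = np.eye(num_classes)                  # one-hot labels of the cached samples

# Prior 1: CLIP zero-shot logits from language prompts.
zero_shot_logits = clip_feats @ text_weights.T

# Prior 2: cache-model logits from DINO features and the few-shot cache
# (a nearest-feature lookup in the spirit of Tip-Adapter).
affinity = np.exp(-(1.0 - dino_feats @ cache_keys.T))
cache_logits = affinity @ cache_values

# Fuse the two sources of prior knowledge; alpha is a hypothetical blending weight.
alpha = 1.0
fused = softmax(zero_shot_logits + alpha * cache_logits)
print("predicted classes:", fused.argmax(axis=1))
```

The point of the sketch is simply that each pre-training paradigm contributes its own class evidence, and a weighted fusion of those predictions is what lets the cascade benefit from all of them at once.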