A common shortcoming of modern computer vision models is their inability to generalize to new classes, the problem of one- and few-shot image recognition. We propose a new formulation of this task and present a network architecture and training methodology to solve it. Further, we provide insights into how not just the data itself, but the way the data is presented to the model, can have a significant impact on performance. Using these methods, we achieve high accuracy on few-shot image recognition tasks.
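As a rough illustration of the embedding-based few-shot setting the abstract describes (the function names and toy vectors below are hypothetical, not from the thesis), a query can be classified by comparing its embedding against one labeled support embedding per class:

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def one_shot_classify(query, support, labels):
    """Assign the query the label of its most similar support embedding."""
    sims = [cosine(query, s) for s in support]
    return labels[sims.index(max(sims))]

# Toy vectors standing in for the output of a trained encoder.
support = [[1.0, 0.0], [0.0, 1.0]]
labels = ["cat", "dog"]
print(one_shot_classify([0.9, 0.1], support, labels))  # → cat
```

In this sketch all of the learning lives in the encoder that produces the embeddings; at test time, recognizing a new class requires only a single labeled example of it.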
College and Department
Physical and Mathematical Sciences; Computer Science
BYU ScholarsArchive Citation
Hurlburt, Daniel, "The "What"-"Where" Network: A Tool for One-Shot Image Recognition and Localization" (2021). Theses and Dissertations. 9366.
computer vision, semantic segmentation, few-shot learning, one-shot learning, embedding