Abstract

Over the past decade, state-of-the-art deep learning models have achieved impressive performance on many computer vision tasks by learning from large and diverse image datasets. Most of these datasets consist of web-scraped image collections. This approach, however, makes it very challenging to obtain desirable data, such as multiple views of the same object, 3D geometric information, or camera parameters, for a large-scale image dataset. In this paper, we propose a 3D-scanned, multi-view 2D image dataset of fine-grained category instances with accurate camera calibration parameters. We describe our bi-directional, multi-camera and 3D scanning system and the data collection pipeline. Our target objects are relatively small, highly detailed instances of fine-grained categories, such as insects. We present this dataset as a contribution to fine-grained visual categorization and 3D representation learning, and for use in other computer vision tasks.

Degree

College and Department

Computational, Mathematical, and Physical Sciences; Computer Science

Rights

https://lib.byu.edu/about/copyright/