Emergent Mind

Abstract

Fine-Grained Visual Classification (FGVC) is a longstanding and fundamental problem in computer vision and pattern recognition, and underpins a diverse set of real-world applications. This paper describes our contribution at SnakeCLEF2022 with FGVC. Firstly, we design a strong multimodal backbone to utilize various meta-information to assist in fine-grained identification. Secondly, we provide new loss functions to solve the long tail distribution with dataset. Then, in order to take full advantage of unlabeled datasets, we use self-supervised learning and supervised learning joint training to provide pre-trained model. Moreover, some effective data process tricks also are considered in our experiments. Last but not least, fine-tuned in downstream task with hard mining, ensambled kinds of model performance. Extensive experiments demonstrate that our method can effectively improve the performance of fine-grained recognition. Our method can achieve a macro f1 score 92.7% and 89.4% on private and public dataset, respectively, which is the 1st place among the participators on private leaderboard.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.