
Integrating kNN with Foundation Models for Adaptable and Privacy-Aware Image Classification (2402.12500v1)

Published 19 Feb 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Traditional deep learning models implicitly encode knowledge, limiting their transparency and ability to adapt to data changes. Yet this adaptability is vital for addressing user data privacy concerns. We address this limitation by storing embeddings of the underlying training data independently of the model weights, enabling dynamic data modifications without retraining. Specifically, our approach integrates the $k$-Nearest Neighbor ($k$-NN) classifier with a vision-based foundation model, pre-trained in a self-supervised manner on natural images, enhancing interpretability and adaptability. We share open-source implementations of a previously unpublished baseline method as well as our performance-improving contributions. Quantitative experiments confirm improved classification across established benchmark datasets and the method's applicability to distinct medical image classification tasks. Additionally, we assess the method's robustness in continual learning and data removal scenarios. The approach exhibits great promise for bridging the gap between foundation models' performance and challenges tied to data privacy. The source code is available at https://github.com/TobArc/privacy-aware-image-classification-with-kNN.
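
The core idea in the abstract, keeping training-data embeddings in a store separate from the model weights and classifying with $k$-NN, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The `EmbeddingStore` class, its method names, and the toy 2-D vectors are hypothetical; in practice the embeddings would come from a frozen, pre-trained vision foundation model. Cosine similarity and plain majority voting are assumed choices for illustration.

```python
import numpy as np
from collections import Counter


class EmbeddingStore:
    """Hypothetical in-memory store of (id, embedding, label) records.

    The store is independent of any model weights: samples can be added
    or removed at any time without retraining.
    """

    def __init__(self, dim):
        self.dim = dim
        self.ids = []
        self.labels = []
        self.vectors = np.empty((0, dim), dtype=np.float64)

    def add(self, sample_id, embedding, label):
        # L2-normalize so that a dot product equals cosine similarity.
        v = np.asarray(embedding, dtype=np.float64).reshape(1, self.dim)
        v = v / np.linalg.norm(v)
        self.vectors = np.vstack([self.vectors, v])
        self.ids.append(sample_id)
        self.labels.append(label)

    def remove(self, sample_id):
        # Data removal: delete the record; no model update is needed.
        idx = self.ids.index(sample_id)
        self.vectors = np.delete(self.vectors, idx, axis=0)
        del self.ids[idx]
        del self.labels[idx]

    def predict(self, embedding, k=3):
        # Majority vote among the k most similar stored embeddings.
        q = np.asarray(embedding, dtype=np.float64)
        q = q / np.linalg.norm(q)
        sims = self.vectors @ q
        k = min(k, len(self.ids))
        top = np.argsort(-sims)[:k]
        votes = Counter(self.labels[i] for i in top)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy 2-D vectors stand in for foundation-model embeddings.
    store = EmbeddingStore(dim=2)
    store.add("a1", [1.0, 0.0], "cat")
    store.add("a2", [0.9, 0.1], "cat")
    store.add("b1", [0.0, 1.0], "dog")
    store.add("b2", [0.1, 0.9], "dog")
    print(store.predict([0.8, 0.2], k=3))

    # A removal request is honored by deleting the stored embeddings.
    store.remove("a1")
    store.remove("a2")
    print(store.predict([0.8, 0.2], k=3))
```

In this sketch, adding a new class (a continual learning scenario) is just another call to `add`, and honoring a deletion request is a call to `remove`; the embedding model itself is never touched.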
