Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models (2403.06088v1)

Published 10 Mar 2024 in cs.CV, cs.AI, cs.LG, and eess.IV

Abstract: In the burgeoning field of intelligent transportation systems, enhancing vehicle-driver interaction through facial attribute recognition, such as facial expression, eye gaze, age, etc., is of paramount importance for safety, personalization, and overall user experience. However, the scarcity of comprehensive large-scale, real-world datasets poses a significant challenge for training robust multi-task models. Existing literature often overlooks the potential of synthetic datasets and the comparative efficacy of state-of-the-art vision foundation models in such constrained settings. This paper addresses these gaps by investigating the utility of synthetic datasets for training complex multi-task models that recognize facial attributes of passengers of a vehicle, such as gaze plane, age, and facial expression. Utilizing transfer learning techniques with both pre-trained Vision Transformer (ViT) and Residual Network (ResNet) models, we explore various training and adaptation methods to optimize performance, particularly when data availability is limited. We provide extensive post-evaluation analysis, investigating the effects of synthetic data distributions on model performance in in-distribution data and out-of-distribution inference. Our study unveils counter-intuitive findings, notably the superior performance of ResNet over ViTs in our specific multi-task context, which is attributed to the mismatch in model complexity relative to task complexity. Our results highlight the challenges and opportunities for enhancing the use of synthetic data and vision foundation models in practical applications.
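The abstract describes transfer learning with pretrained ViT and ResNet backbones for joint recognition of gaze plane, age, and facial expression. Below is a minimal PyTorch sketch of that kind of setup, not the authors' implementation: the class counts, the choice of ResNet-50, the frozen-backbone (linear-probe) option, and the summed per-task loss are all illustrative assumptions.

```python
# Hedged sketch of a multi-task facial-attribute model via transfer learning.
# Backbone, head sizes, and training details are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision import models

class MultiTaskFacialNet(nn.Module):
    def __init__(self, n_gaze=4, n_age=5, n_expr=7, freeze_backbone=True):
        super().__init__()
        # ImageNet-pretrained ResNet-50 as the shared feature extractor.
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        feat_dim = backbone.fc.in_features      # 2048 for ResNet-50
        backbone.fc = nn.Identity()             # drop the ImageNet classifier
        if freeze_backbone:                     # linear probing; set False to fine-tune
            for p in backbone.parameters():
                p.requires_grad = False
        self.backbone = backbone
        # One lightweight head per task, sharing the same features.
        self.gaze_head = nn.Linear(feat_dim, n_gaze)
        self.age_head = nn.Linear(feat_dim, n_age)
        self.expr_head = nn.Linear(feat_dim, n_expr)

    def forward(self, x):
        f = self.backbone(x)
        return self.gaze_head(f), self.age_head(f), self.expr_head(f)

# One joint training step: sum of per-task cross-entropy losses on a dummy batch.
model = MultiTaskFacialNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)

images = torch.randn(8, 3, 224, 224)            # stand-in for synthetic face crops
gaze_y = torch.randint(0, 4, (8,))
age_y = torch.randint(0, 5, (8,))
expr_y = torch.randint(0, 7, (8,))

optimizer.zero_grad()
gaze_p, age_p, expr_p = model(images)
loss = criterion(gaze_p, gaze_y) + criterion(age_p, age_y) + criterion(expr_p, expr_y)
loss.backward()
optimizer.step()
```

The same head structure could be attached to a pretrained ViT feature extractor; whether to freeze or fine-tune the backbone is exactly the kind of adaptation choice the paper compares under limited data.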
