
Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations (2307.04036v1)

Published 8 Jul 2023 in cs.HC, cs.AI, cs.CV, and cs.LG

Abstract: Local explanations provide heatmaps on images to show how Convolutional Neural Networks (CNNs) derive their outputs. Owing to their visual immediacy, they have become one of the most popular explainable AI (XAI) methods for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalence about local explanations: they regard the method as a valuable, indispensable aid in building CNNs, yet the heuristic nature of detecting vulnerabilities makes the process exhausting. Moreover, steering a CNN based on the vulnerabilities learned from diagnosis appeared highly challenging. To bridge this gap, we designed DeepFuse, the first interactive system that realizes a direct feedback loop between a user and a CNN for diagnosing and revising the CNN's vulnerabilities using local explanations. DeepFuse helps CNN engineers systematically search for "unreasonable" local explanations and annotate new boundaries for those identified as unreasonable in a labor-efficient manner. It then steers the model based on the given annotations so that the model does not repeat similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants built models that were more accurate and more "reasonable" than the current state of the art. Participants also found that the case-based reasoning DeepFuse guides can practically improve their current practice. We provide implications for design that explain how future HCI-driven design can move our practice forward to make XAI-driven insights more actionable.
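The "local explanation" heatmaps the abstract refers to are saliency-style attributions such as Grad-CAM. As a rough illustration only (the paper does not specify DeepFuse's internals, and `grad_cam` here is a hypothetical helper), a Grad-CAM-style heatmap can be computed from a convolutional layer's activations and the gradients of the target class score with respect to them:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM-style heatmap from one conv layer.

    activations, gradients: arrays of shape (K, H, W) -- the layer's
    K feature maps and the gradients of the class score w.r.t. them.
    Returns an (H, W) heatmap normalized to [0, 1].
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    cam = np.maximum(cam, 0)
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

In practice the activations and gradients would come from forward/backward hooks on a real CNN; the resulting heatmap, upsampled to the input size, is the kind of visual evidence an engineer would inspect for "unreasonable" attention.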
