WatChat: Explaining perplexing programs by debugging mental models (2403.05334v2)
Abstract: Often, a good explanation for a program's unexpected behavior is a bug in the programmer's code. But sometimes, an even better explanation is a bug in the programmer's mental model of the language or API they are using. Instead of merely debugging our current code ("giving the programmer a fish"), what if our tools could directly debug our mental models ("teaching the programmer to fish")? In this paper, we apply recent ideas from computational cognitive science to offer a principled framework for doing exactly that. Given a "why?" question about a program, we automatically infer potential misconceptions about the language/API that might cause the user to be surprised by the program's behavior, and then analyze those misconceptions to provide explanations of the program's behavior. Our key idea is to formally represent misconceptions as counterfactual (erroneous) semantics for the language/API, which can be inferred and debugged using program synthesis techniques. We demonstrate our framework, WatChat, by building systems for explanation in two domains: JavaScript type coercion and the Git version control system. We evaluate WatChatJS and WatChatGit by comparing their outputs to experimentally collected human-written explanations in these two domains. We show that WatChat's explanations exhibit key features of human-written explanation, unlike those of a state-of-the-art LLM.
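To make the key idea concrete, the following is a minimal sketch, not the WatChat implementation. It assumes a toy fragment of JavaScript's `+` operator restricted to integers and numeric strings, and it models each misconception as a switch that replaces the real rule with an erroneous one. The misconception names, the `js_plus` and `infer_misconceptions` functions, and the `"TypeError"` marker string are all hypothetical, introduced only for illustration. Brute-force enumeration stands in here for the program synthesis techniques the abstract refers to.

```python
from itertools import combinations

# Hypothetical misconception names, invented for this sketch.
MISCONCEPTIONS = ("plus_prefers_numbers", "plus_rejects_mixed_types")


def js_plus(left, right, misconceptions=frozenset()):
    """Toy semantics for JavaScript `+` on integers and numeric strings.

    With no misconceptions this mirrors the real rule: if either operand
    is a string, both are converted to strings and concatenated.
    """
    mixed = isinstance(left, str) != isinstance(right, str)
    if mixed and "plus_rejects_mixed_types" in misconceptions:
        return "TypeError"  # erroneous belief: mixing types is an error
    if mixed and "plus_prefers_numbers" in misconceptions:
        return int(left) + int(right)  # erroneous belief: strings become numbers
    if isinstance(left, str) or isinstance(right, str):
        return str(left) + str(right)
    return left + right


def infer_misconceptions(left, right, expected):
    """Return the smallest misconception set under which `+` yields `expected`.

    Returns an empty set when the expectation matches the real semantics,
    and None when no combination of known misconceptions explains it.
    """
    for size in range(len(MISCONCEPTIONS) + 1):
        for subset in combinations(MISCONCEPTIONS, size):
            candidate = frozenset(subset)
            if js_plus(left, right, candidate) == expected:
                return candidate
    return None


if __name__ == "__main__":
    actual = js_plus("5", 3)
    belief = infer_misconceptions("5", 3, expected=8)
    print(f"actual result: {actual!r}; inferred misconceptions: {sorted(belief)}")
```

A user who asks why `"5" + 3` is `"53"` rather than `8` is matched to the `plus_prefers_numbers` misconception, which an explanation could then address directly. Searching subsets in order of increasing size is one simple way to prefer the most economical account of the user's surprise; a real system would need a richer space of counterfactual semantics than two switches.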