Explaining CLIP through Co-Creative Drawings and Interaction (2306.07429v1)
Abstract: This paper analyses a visual archive of drawings produced by an interactive robotic art installation in which audience members narrated their dreams to a system powered by the CLIPdraw deep learning (DL) model, which interpreted and transformed those dreams into images. The resulting archive of prompt-image pairs was examined and clustered based on concept representation accuracy. As a result of the analysis, the paper proposes four groupings for describing and explaining CLIP-generated results: clear concept, text-to-text as image, indeterminacy and confusion, and lost in translation. This article offers a glimpse into a collection of dreams interpreted, mediated and given form by AI, showcasing the often unexpected, visually compelling or, indeed, dream-like output of the system, with an emphasis on the processes and results of translation between languages, sign systems and the various modules of the installation. In the end, the paper argues that the proposed clusters support a better understanding of the neural model.
Authors: Varvara Guljajeva, Isaac Joseph Clarke, Mar Canet Solà