Wordflow: Social Prompt Engineering for Large Language Models (2401.14447v1)
Abstract: LLMs require well-crafted prompts for effective use. Prompt engineering, the process of designing prompts, is challenging, particularly for non-experts who are less familiar with AI technologies. While researchers have proposed techniques and tools to assist LLM users in prompt design, these works primarily target AI application developers rather than non-experts. To address this research gap, we propose social prompt engineering, a novel paradigm that leverages social computing techniques to facilitate collaborative prompt design. To investigate social prompt engineering, we introduce Wordflow, an open-source and social text editor that enables everyday users to easily create, run, share, and discover LLM prompts. Additionally, by leveraging modern web technologies, Wordflow allows users to run LLMs locally and privately in their browsers. Two usage scenarios highlight how social prompt engineering and our tool can enhance laypeople's interaction with LLMs. Wordflow is publicly accessible at https://poloclub.github.io/wordflow.