
ProAgent: From Robotic Process Automation to Agentic Process Automation (2311.10751v2)

Published 2 Nov 2023 in cs.RO, cs.AI, and cs.CL

Abstract: From ancient water wheels to robotic process automation (RPA), automation technology has evolved throughout history to liberate human beings from arduous tasks. Yet, RPA struggles with tasks needing human-like intelligence, especially in elaborate design of workflow construction and dynamic decision-making in workflow execution. As LLMs have emerged human-like intelligence, this paper introduces Agentic Process Automation (APA), a groundbreaking automation paradigm using LLM-based agents for advanced automation by offloading the human labor to agents associated with construction and execution. We then instantiate ProAgent, an LLM-based agent designed to craft workflows from human instructions and make intricate decisions by coordinating specialized agents. Empirical experiments are conducted to detail its construction and execution procedure of workflow, showcasing the feasibility of APA, unveiling the possibility of a new paradigm of automation driven by agents. Our code is public at https://github.com/OpenBMB/ProAgent.

Summary

  • The paper introduces Agentic Process Automation (APA) as an evolution of RPA by leveraging LLMs to achieve dynamic decision-making and workflow management.
  • It presents an Agentic Workflow Description Language that uses JSON for data flow and Python for control flow, and adds two specialized agents, DataAgent and ControlAgent, for steps that need dynamic decisions.
  • The proof-of-concept experiment demonstrates ProAgent’s capability to autonomously generate and execute adaptive workflows in business scenarios.

Introduction

The paper introduces Agentic Process Automation (APA) as the next step after traditional Robotic Process Automation (RPA). RPA is effective at mechanizing routine digital tasks but is limited where human-like intelligence is required, notably in designing workflows and in making dynamic decisions during execution. The paper argues that LLMs, which exhibit human-like intelligence, make it possible to overcome these limits by handing both jobs to LLM-based agents. ProAgent is proposed as an instantiation of the APA paradigm: an LLM-based agent that constructs workflows from human instructions and manages their execution.

Figure 1: The comparison between Robotic Process Automation and Agentic Process Automation.

Methodology

Agentic Workflow Description Language

The core proposal of the paper is the Agentic Workflow Description Language, which standardizes the data flow between workflow actions as JSON and expresses the control flow in Python.

Figure 2: Illustration of Agentic Workflow Description Language.

This format covers both data flow and control logic, which is enough to describe a complete workflow. It also suits LLMs, which are already familiar with programming syntax such as Python and can therefore read and generate such workflows proficiently.
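The split between JSON data flow and Python control flow can be sketched as follows. This is a minimal illustration, not the paper's implementation: the action names (`action_fetch_rows`, `action_send_message`), the field names, and the `main_workflow` entry point are all hypothetical, and the actions return hard-coded data instead of calling real tools.

```python
import json

def action_fetch_rows(params: dict) -> dict:
    """Hypothetical action: returns rows as JSON-serializable data."""
    # A real action would call an external tool; the data here is hard-coded.
    return {"rows": [
        {"name": "Acme", "line": "A"},
        {"name": "Bob", "line": "B"},
    ]}

def action_send_message(params: dict) -> dict:
    """Hypothetical action: pretends to send a message and echoes what it did."""
    return {"sent_to": params["channel"], "text": params["text"]}

def main_workflow() -> list:
    """Control flow is ordinary Python: a loop and a branch over the data."""
    results = []
    fetched = action_fetch_rows({})
    for row in fetched["rows"]:
        # Branch on a field of the JSON data (labels "A"/"B" are made up).
        channel = "slack" if row["line"] == "A" else "email"
        out = action_send_message(
            {"channel": channel, "text": f"Report for {row['name']}"}
        )
        # Round-trip through JSON to show every hand-off is JSON-serializable.
        results.append(json.loads(json.dumps(out)))
    return results
```

Calling `main_workflow()` returns one result dict per row. The point of the sketch is the division of labor: every value passed between actions is plain JSON-compatible data, while loops and conditions stay in Python.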

Agent-Integrated Workflow

The paper introduces two specialized agents, DataAgent and ControlAgent, that bring dynamic decision-making into workflows and make them more flexible.

Figure 3: Illustration of Agentic Workflow Description Language with DataAgent and ControlAgent.

DataAgent handles data processing that is too complex for conventional rule-based systems. ControlAgent handles control logic that cannot be written as deterministic rules, deciding at run time which branch of the workflow to execute.
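As a rough sketch of how the two agents could sit inside a workflow, the example below treats each agent as a function that wraps an LLM call. Everything here is an assumption made for illustration: the function names `data_agent` and `control_agent`, their signatures, the prompt wording, and the `fake_llm` stub that replaces a real model so the code runs offline.

```python
from typing import Callable

# Stand-in type for an LLM call: maps a prompt string to a text reply.
LLM = Callable[[str], str]

def data_agent(llm: LLM, task: str, data: dict) -> dict:
    """Hypothetical DataAgent: delegates a data-processing step to the LLM."""
    reply = llm(f"Task: {task}\nInput: {data}")
    return {"output": reply}

def control_agent(llm: LLM, question: str, data: dict) -> bool:
    """Hypothetical ControlAgent: asks the LLM for a yes/no branching decision."""
    reply = llm(f"Answer yes or no. {question}\nInput: {data}")
    return reply.strip().lower().startswith("yes")

def fake_llm(prompt: str) -> str:
    """Deterministic stub so the sketch runs without a model or network access."""
    if prompt.startswith("Answer yes or no."):
        return "yes" if "enterprise" in prompt else "no"
    return "Summary: " + prompt.splitlines()[-1]

def main_workflow(llm: LLM, record: dict) -> dict:
    # The branch condition is decided by the agent, not by a fixed rule.
    if control_agent(llm, "Is this a business customer?", record):
        return data_agent(llm, "Write a short report for the manager.", record)
    return {"output": "no report needed"}
```

With the stub, `main_workflow(fake_llm, {"segment": "enterprise"})` takes the report branch and any other record takes the fallback. Swapping `fake_llm` for a real model client would keep the workflow code unchanged, which is the design choice the sketch is meant to show.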

Workflow Construction and Execution

ProAgent uses OpenAI's GPT-4 and treats workflow construction as a code generation task. Construction is iterative: the agent defines the actions the workflow needs, implements them, and then orchestrates them in a main workflow function written in Python, which holds the execution control logic.

Figure 4: The Illustration of the workflow construction procedure of ProAgent.

Three techniques are integrated to improve the generated workflow code: Testing-on-Constructing, Function Calling, and Chain-of-Thought reasoning. Together they help ProAgent construct and execute workflows that follow the human instructions precisely.
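The idea of testing code while it is being constructed can be illustrated with a generate, run, and repair loop. The sketch below is an assumption about the general shape of such a loop, not ProAgent's actual procedure: `construct_with_testing`, the feedback format, and the `fake_generator` stub (which stands in for the LLM and returns a buggy draft first) are all hypothetical.

```python
import traceback
from typing import Callable, Optional

def construct_with_testing(generate: Callable[[Optional[str]], str],
                           test_input: dict,
                           max_rounds: int = 3) -> str:
    """Generate workflow code, trial-run it, and feed any error back."""
    feedback = None
    for _ in range(max_rounds):
        code = generate(feedback)
        namespace: dict = {}
        try:
            exec(code, namespace)                    # defines main_workflow
            namespace["main_workflow"](test_input)   # trial run on test data
            return code                              # the draft ran cleanly
        except Exception:
            # The traceback becomes the feedback for the next draft.
            feedback = traceback.format_exc()
    raise RuntimeError("no working workflow within the round limit")

def fake_generator(feedback: Optional[str]) -> str:
    """Stub standing in for the LLM: the first draft is buggy, the second is fixed."""
    if feedback is None:
        return "def main_workflow(d):\n    return d['missing_key']\n"
    return "def main_workflow(d):\n    return d.get('value', 0) + 1\n"
```

Running `construct_with_testing(fake_generator, {"value": 1})` rejects the first draft because it raises a `KeyError` and accepts the second. Note that `exec` on generated code is only acceptable in a sandboxed setting; a real system would need isolation around this step.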

Proof-of-Concept Experiment

The paper walks through a business scenario in which ProAgent constructs and executes a workflow autonomously.

Figure 5:

The task combines data extraction with communication steps that depend on which business line the data belongs to. ProAgent's agents make these decisions inside the workflow at run time.

Discussion and Implications

The paper positions APA and ProAgent within the broader landscape of tool learning and process mining. It highlights that agents can act as both tool users and tool creators, which improves on RPA's efficiency and adaptability. It also discusses how ProAgent could work with process mining to discover, analyze, and refine workflows, and stresses the need to balance task allocation between humans and machines. Finally, it raises automation bias as a concern: systems of this kind must be reliable and transparent, which calls for further research into the safety and interpretability of AI-driven automation.

Conclusion

ProAgent is an early step toward process automation driven by LLM-based agents. By integrating agents into both workflow construction and workflow execution, the paper moves the role of AI from using individual tools to making dynamic decisions inside the automation itself. The authors expect this to improve operational efficiency across scales and domains, and the APA framework anticipates a setting in which humans and AI collaborate, each contributing its own strengths.
