How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts (2404.17739v2)

Published 27 Apr 2024 in cs.SE

Abstract: Since the emergence of GPT-3, LLMs have caught the eyes of researchers, practitioners, and educators in the field of software engineering. However, there has been relatively little investigation regarding the performance of LLMs in assisting with requirements analysis and UML modeling. This paper explores how LLMs can assist novice analysts in creating three types of typical UML models: use case models, class diagrams, and sequence diagrams. For this purpose, we designed the modeling tasks of these three UML models for 45 undergraduate students who participated in a requirements modeling course, with the help of LLMs. By analyzing their project reports, we found that LLMs can assist undergraduate students as novice analysts in UML modeling tasks, but LLMs also have shortcomings and limitations that should be considered when using them.


Summary

  • The paper demonstrates that LLMs support UML model drafting, achieving 88.89% correctness in identifying use cases and 82.22% in sequencing messages.
  • The study found that while LLMs perform moderately in class diagram creation (66.67% for classes, 75.56% for operations), they struggle with identifying relationships (24.44% correctness).
  • Hybrid-created diagrams, which combine AI generation with human refinement, outperformed other formats, underscoring the importance of human oversight.

How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts

Introduction

The paper "How LLMs Aid in UML Modeling: An Exploratory Study with Novice Analysts" explores the capacity of LLMs to assist undergraduate students in creating UML models, specifically use case diagrams, class diagrams, and sequence diagrams. This investigation comes in the context of a requirements modeling course involving 45 participants, aiming to understand the practical impact of LLMs in software engineering tasks.

Experimentation and Design

The experimental design involved a structured task in which students used LLMs, predominantly ChatGPT, to assist in creating UML diagrams for a given case study. Each participant submitted a project report comprising the generated UML models and a transcript of their interactions with the LLMs.

Figure 1: The process of the experiment.

Results of UML Model Creation

Use Case Modeling

In evaluating the use case models generated with LLM assistance, several insights emerged:

  • LLMs excelled at identifying use cases, achieving 88.89% correctness.
  • However, identifying actors and the relationships between actors and use cases was notably less effective, at only 31.11% and 17.78% correctness, respectively (see the sketch below).
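
To make this concrete, below is a minimal PlantUML sketch of the kind of use case model an LLM might draft. The library-lending domain, actor names, and use case names are hypothetical illustrations, not the paper's actual case study.

@startuml
' Hypothetical example; the paper's actual case study is not reproduced here.
left to right direction
actor "Member" as member
actor "Librarian" as librarian

rectangle "Library System" {
  usecase "Borrow Book" as UC1
  usecase "Return Book" as UC2
  usecase "Manage Catalog" as UC3
}

' Actor-to-use-case relationships: the element with the lowest
' reported correctness (17.78%) in the study.
member --> UC1
member --> UC2
librarian --> UC3
@enduml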

Class Diagram Modeling

For class diagram creation, LLMs demonstrated good performance in identifying classes and operations, with correctness rates of 66.67% and 75.56%, respectively.

Figure 2: Distribution of the participants with/without experience of using LLMs.

  • The recognition of relationships among classes presented a clear challenge, with a correctness rate of merely 24.44% (see the sketch below).
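
As an illustration of where the difficulty lies, here is a minimal PlantUML class diagram sketch. The classes, operations, and multiplicities are hypothetical; the relationship lines at the end are exactly the kind of element the study found LLMs handled poorly.

@startuml
' Hypothetical classes; not taken from the paper's case study.
class Member {
  -name : String
  +borrow(copy : BookCopy)
}
class Book {
  -title : String
}
class BookCopy {
  -barcode : String
}
class Loan {
  -dueDate : Date
}

' Classes and operations (above) correspond to the 66.67% and 75.56%
' correctness results; the relationships (below) to the 24.44% result.
Book "1" *-- "1..*" BookCopy
Member "1" -- "0..*" Loan
Loan "0..*" -- "1" BookCopy
@enduml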

Sequence Diagram Modeling

Sequence diagrams benefited from LLM assistance in identifying objects and ordering messages: object identification reached 73.33% correctness.

  • Message sequencing achieved 82.22% correctness, indicating that LLMs can comprehend and arrange chronological interactions effectively (see the sketch below).
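
For concreteness, a minimal PlantUML sequence diagram sketch follows. The objects and messages are hypothetical stand-ins for the kind of interaction an LLM would be asked to identify and order.

@startuml
' Hypothetical interaction; object and message names are illustrative.
actor Member
participant "LoanController" as ctrl
participant "Catalog" as cat

' Getting this ordering right is where LLMs reached 82.22% correctness.
Member -> ctrl : borrow(barcode)
ctrl -> cat : findCopy(barcode)
cat --> ctrl : copy
ctrl -> ctrl : createLoan(copy)
ctrl --> Member : dueDate
@enduml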

Output Formats and Analysis

The study further examined the output formats used to create the UML diagrams:

  • Hybrid-created diagrams performed best, with an average score of 8.20, showcasing the significant role of human intervention and refinement (sketched below, after Figure 3).
  • PlantUML-based diagrams performed moderately (average score 6.94), benefiting from auto-generated code but still requiring manual correction.
  • Simple wireframe outputs were the least effective, with an average score of 5.5, often lacking the necessary detail and accuracy.

Figure 3: Distribution of the LLMs used in UML modeling tasks.
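
To illustrate the hybrid format, the sketch below keeps a hypothetical LLM-generated PlantUML draft as the starting point and revises the relationships by hand; the specific corrections are illustrative, not taken from any participant's report.

@startuml
' Classes kept verbatim from a (hypothetical) LLM draft.
class Member
class Loan
class BookCopy

' The LLM draft connected everything with plain, unlabeled associations:
'   Member -- Loan
'   Loan -- BookCopy

' Human refinement: add multiplicities, direction, and aggregation.
Member "1" o-- "0..*" Loan
Loan "0..*" --> "1" BookCopy
@enduml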

Discussion

This paper highlights that while LLMs are capable of aiding in software modeling, substantial limitations persist. LLMs often struggle with identifying complex relationships, underscoring a need for further enhancement in understanding relational constructs. These findings are pivotal for educators and industry professionals, suggesting that while LLMs serve as useful tools, reliance on them for complete accuracy without human intervention is premature.

Implications for Software Engineering

The implications for software engineering education are profound. LLMs can be integrated as supplementary tools for teaching UML modeling, leveraging their capacity to generate initial drafts of models while retaining critical human oversight. Educators and professionals must focus on training students to collaborate effectively with LLMs, deepening their understanding while avoiding blind reliance on AI-generated outputs.

Figure 4: Distribution of the languages used in the human-LLM interaction.

Conclusion

This exploratory study demonstrates that LLMs hold potential for assisting novice analysts with UML modeling tasks but still show significant shortcomings in relational analysis and diagram precision. As AI continues to evolve, ongoing research and refinement are essential to turn LLMs into reliable partners in software engineering practice.
