
OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection (2407.16237v2)

Published 23 Jul 2024 in cs.AR, cs.AI, and cs.LG

Abstract: Recent studies have demonstrated the significant potential of LLMs in generating Register Transfer Level (RTL) code, with notable advancements showcased by commercial models such as GPT-4 and Claude3-Opus. However, these proprietary LLMs often raise concerns regarding privacy and security. While open-source LLMs offer solutions to these concerns, they typically underperform commercial models in RTL code generation tasks, primarily due to the scarcity of high-quality open-source RTL datasets. To address this challenge, we introduce OriGen, a fully open-source framework that incorporates self-reflection capabilities and a novel dataset augmentation methodology for generating high-quality, large-scale RTL code. Our approach employs a code-to-code augmentation technique to enhance the quality of open-source RTL code datasets. Furthermore, OriGen can rectify syntactic errors through a self-reflection process that leverages compiler feedback. Experimental results demonstrate that OriGen significantly outperforms other open-source alternatives in RTL code generation. It surpasses the previous best-performing open-source LLM by 12.8% and even exceeds GPT-4 Turbo in the pass@1 metric on the VerilogEval-Human benchmark. Moreover, OriGen exhibits superior capabilities in self-reflection and error correction, outperforming GPT-4 by 19.9% on a benchmark designed to evaluate self-reflection capabilities.
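The self-reflection process described in the abstract (generate RTL, compile it, feed the compiler log back to the model for repair) can be pictured as a short loop. The sketch below is a minimal illustration rather than OriGen's actual implementation: the function names, the callable stand-ins for the model, the `max_rounds` cap, and the choice of Icarus Verilog (`iverilog`, reference 33 below) as the syntax checker are all assumptions.

```python
# Minimal sketch of a compiler-feedback self-reflection loop for RTL
# generation. Only the generate -> compile -> repair structure comes from
# the abstract; names, the round cap, and the model interfaces are
# illustrative assumptions.
import os
import subprocess
import tempfile
from typing import Callable

def compile_verilog(code: str) -> str:
    """Syntax-check Verilog with Icarus Verilog; return '' if clean, else the error log."""
    with tempfile.NamedTemporaryFile("w", suffix=".v", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        # '-t null' elaborates the design without emitting an output file.
        result = subprocess.run(["iverilog", "-t", "null", path],
                                capture_output=True, text=True)
        return "" if result.returncode == 0 else result.stderr
    finally:
        os.unlink(path)

def generate_with_self_reflection(
    spec: str,
    generate: Callable[[str], str],           # spec -> initial Verilog draft
    reflect: Callable[[str, str, str], str],  # (spec, code, errors) -> repaired code
    max_rounds: int = 3,
) -> str:
    """Generate RTL, then iteratively repair it using compiler error logs."""
    code = generate(spec)
    for _ in range(max_rounds):
        errors = compile_verilog(code)
        if not errors:          # compiles cleanly: stop reflecting
            return code
        # Feed the compiler log back so the model can correct its own output.
        code = reflect(spec, code, errors)
    return code                 # best effort after max_rounds repairs
```

Capping the number of repair rounds keeps the loop from spinning on errors the model cannot fix; the abstract does not state what stopping criterion OriGen itself uses.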

References (36)
  1. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023).
  2. Anthropic. 2024. Introducing the next generation of Claude. https://www.anthropic.com/news/claude-3-family
  3. Qwen Technical Report. arXiv preprint arXiv:2309.16609 (2023).
  4. Chip-Chat: Challenges and opportunities in conversational hardware design. In 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD). IEEE, 1–6.
  5. LegUp: high-level synthesis for FPGA-based processor/accelerator systems. In Proceedings of the 19th ACM/SIGDA international symposium on Field programmable gate arrays. 33–36.
  6. Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework. arXiv preprint arXiv:2403.11202 (2024).
  7. ChipGPT: How far are we from natural language hardware design. arXiv preprint arXiv:2305.14019 (2023).
  8. An introduction to high-level synthesis. IEEE Design & Test of Computers 26, 4 (2009), 8–17.
  9. Steve Dai and Zhiru Zhang. 2019. Improving scalability of exact modulo scheduling with specialized conflict-driven learning. In Proceedings of the 56th Annual Design Automation Conference 2019. 1–6.
  10. A deep learning framework for Verilog autocompletion towards design and verification automation. arXiv preprint arXiv:2304.13840 (2023).
  11. GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models. In 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 1–9.
  12. DeepSeek-Coder: When the Large Language Model Meets Programming–The Rise of Code Intelligence. arXiv preprint arXiv:2401.14196 (2024).
  13. Hsuan Hsiao and Jason Anderson. 2019. Thread weaving: Static resource scheduling for multithreaded high-level synthesis. In Proceedings of the 56th Annual Design Automation Conference 2019. 1–6.
  14. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685 (2021).
  15. TensorLib: A spatial accelerator generation framework for tensor algebra. In 2021 58th ACM/IEEE Design Automation Conference (DAC). IEEE, 865–870.
  16. EMS: efficient memory subsystem synthesis for spatial accelerators. In Proceedings of the 59th ACM/IEEE Design Automation Conference. 67–72.
  17. Dynamically scheduled high-level synthesis. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 127–136.
  18. AMMUS: A survey of transformer-based pretrained models in natural language processing. arXiv preprint arXiv:2108.05542 (2021).
  19. ChipNeMo: Domain-adapted LLMs for chip design. arXiv preprint arXiv:2311.00176 (2023).
  20. VerilogEval: Evaluating large language models for Verilog code generation. In 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 1–8.
  21. RTLCoder: Outperforming GPT-3.5 in design RTL generation with our open-source dataset and lightweight solution. arXiv preprint arXiv:2312.08617 (2023).
  22. StarCoder 2 and The Stack v2: The Next Generation. arXiv preprint arXiv:2402.19173 (2024).
  23. RTLLM: An open-source benchmark for design RTL generation with large language model. In 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 722–727.
  24. Bardia Nadimi and Hao Zheng. 2024. A Multi-Expert Large Language Model Architecture for Verilog Code Generation. arXiv preprint arXiv:2404.08029 (2024).
  25. DAVE: Deriving automatically Verilog from English. In Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD. 27–32.
  26. BetterV: Controlled Verilog Generation with Discriminative Guidance. arXiv preprint arXiv:2402.03375 (2024).
  27. Code Llama: Open foundation models for code. arXiv preprint arXiv:2308.12950 (2023).
  28. A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond. arXiv preprint arXiv:2403.14734 (2024).
  29. Benchmarking large language models for automated Verilog RTL code generation. In 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1–6.
  30. VeriGen: A large language model for Verilog code generation. ACM Transactions on Design Automation of Electronic Systems (2023).
  31. AutoChip: Automating HDL generation using LLM feedback. arXiv preprint arXiv:2311.04887 (2023).
  32. RTLFixer: Automatically fixing RTL syntax errors with large language models. arXiv preprint arXiv:2311.16543 (2023).
  33. Stephen Williams and Michael Baxter. 2002. Icarus Verilog: open-source Verilog more than a year later. Linux Journal 2002, 99 (2002), 3.
  34. ChatEDA: A large language model powered autonomous agent for EDA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2024).
  35. HECTOR: A multi-level intermediate representation for hardware synthesis methodologies. In Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design. 1–9.
  36. SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models.