From Chat to Publication Management: Organizing your related work using BibSonomy & LLMs (2401.09092v1)

Published 17 Jan 2024 in cs.IR and cs.HC

Abstract: The ever-growing corpus of scientific literature presents significant challenges for researchers with respect to discovery, management, and annotation of relevant publications. Traditional platforms like Semantic Scholar, BibSonomy, and Zotero offer tools for literature management, but largely require laborious and error-prone manual input of tags and metadata. Here, we introduce a novel retrieval-augmented generation system that leverages chat-based LLMs to streamline and enhance the process of publication management. It provides a unified chat-based interface, enabling intuitive interactions with various backends, including Semantic Scholar, BibSonomy, and the Zotero Webscraper. It supports two main use cases: (1) Explorative Search & Retrieval, leveraging LLMs to search for and retrieve both specific and general scientific publications while addressing the challenges of content hallucination and data obsolescence; and (2) Cataloguing & Management, aiding the organization of personal publication libraries (in this case, BibSonomy) by automating the addition of metadata and tags while facilitating manual edits and updates. We compare our system against different LLMs in three settings, including a user study, and demonstrate its advantages across several metrics.
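
The core mechanism the abstract describes, retrieval-augmented generation over live bibliographic backends, can be illustrated with a brief sketch. It grounds a chat query against the public Semantic Scholar Graph API (one of the backends named above) and then asks an LLM to suggest BibSonomy-style tags from the retrieved metadata alone. This is a minimal sketch of the general pattern, not the authors' published implementation; in particular, `llm_complete` is a hypothetical placeholder for any chat-LLM completion call.

```python
# Illustrative sketch only: NOT the paper's implementation.
import requests

# Public Semantic Scholar Graph API search endpoint.
S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"


def retrieve_papers(query: str, limit: int = 5) -> list[dict]:
    """Ground a chat query in live publication metadata instead of the
    LLM's parametric memory, mitigating hallucinated or stale references."""
    resp = requests.get(
        S2_SEARCH,
        params={
            "query": query,
            "limit": limit,
            "fields": "title,abstract,year,externalIds",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])


def suggest_tags(paper: dict, llm_complete) -> list[str]:
    """Ask a chat LLM for BibSonomy-style tags, conditioned only on the
    retrieved metadata. `llm_complete` is a placeholder for any
    prompt-in/text-out completion function (an assumption of this sketch)."""
    prompt = (
        "Suggest 3-5 short lowercase tags for this publication.\n"
        f"Title: {paper.get('title')}\n"
        f"Abstract: {paper.get('abstract') or 'n/a'}\n"
        "Tags (comma-separated):"
    )
    return [t.strip() for t in llm_complete(prompt).split(",") if t.strip()]
```

Constraining the tagging prompt to retrieved metadata, rather than letting the model answer from memory, is the design choice that addresses the hallucination and data-obsolescence problems the abstract highlights.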

Authors (4)
  1. Tom Völker (1 paper)
  2. Jan Pfister (5 papers)
  3. Tobias Koopmann (2 papers)
  4. Andreas Hotho (49 papers)
