
Named entity recognition using GPT for identifying comparable companies (2307.07420v2)

Published 11 Jul 2023 in cs.CL, cs.AI, and cs.NE

Abstract: Comparable company analysis is widely used to value both public and private firms, and it is especially valuable for valuing private equity companies. Most approaches to the comparable companies method identify peer companies qualitatively, relying on established industry classification schemes and/or analyst intuition and knowledge. However, more quantitative methods, in particular machine learning clustering and natural language processing (NLP), have started to appear in the literature and in the private equity industry. For NLP methods, the process consists of extracting product entities from, e.g., the company's website or company descriptions in a financial database system, and then performing similarity analysis. Here, using company descriptions/summaries from companies' publicly available Wikipedia pages, we show that LLMs, such as GPT from OpenAI, achieve a much higher precision and success rate than standard named entity recognition (NER) methods, which rely on manual annotation. We demonstrate a higher precision rate quantitatively and show qualitatively that the approach can be used to create appropriate peer groups of comparable companies, which could then be used for equity valuation.
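The two quantities the abstract refers to, similarity between companies based on extracted product entities and the precision of an extraction method, can be illustrated with a minimal sketch. The abstract does not say which similarity measure the paper uses, so Jaccard overlap is an assumption here, and the company names and entity sets are hypothetical placeholders rather than data from the paper.

```python
# Illustrative sketch only: Jaccard overlap is an assumed similarity measure,
# and all company names and product entities below are hypothetical.

def jaccard(a, b):
    """Jaccard similarity between two collections of product entities."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def precision(predicted, gold):
    """Fraction of predicted entities that also appear in the gold set."""
    predicted, gold = set(predicted), set(gold)
    if not predicted:
        return 0.0
    return len(predicted & gold) / len(predicted)


def rank_peers(target, entities_by_company):
    """Rank every other company by entity overlap with the target company."""
    target_entities = entities_by_company[target]
    scores = [
        (name, jaccard(target_entities, entities))
        for name, entities in entities_by_company.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical extracted product entities per company.
EXAMPLE_ENTITIES = {
    "AlphaMotors": {"cars", "trucks", "electric vehicles"},
    "BetaAuto": {"cars", "electric vehicles", "batteries"},
    "GammaFoods": {"snacks", "soft drinks"},
}

if __name__ == "__main__":
    for name, score in rank_peers("AlphaMotors", EXAMPLE_ENTITIES):
        print(f"{name}: {score:.2f}")
```

In this toy setting, a candidate peer group for a company would be the top-ranked names from `rank_peers`, and `precision` would compare the entities returned by an extractor against a hand-checked reference set.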
