Foundation Model Transparency Reports (2402.16268v1)
Abstract: Foundation models are critical digital technologies with sweeping societal impact that necessitates transparency. To codify how foundation model developers should provide transparency about the development and deployment of their models, we propose Foundation Model Transparency Reports, drawing upon the transparency reporting practices in social media. While external documentation of societal harms prompted social media transparency reports, our objective is to institutionalize transparency reporting for foundation models while the industry is still nascent. To design our reports, we identify six design principles informed by the successes and shortcomings of social media transparency reporting. To further schematize our reports, we draw upon the 100 transparency indicators from the Foundation Model Transparency Index. Given these indicators, we measure the extent to which they overlap with the transparency requirements included in six prominent government policies (e.g., the EU AI Act, the US Executive Order on Safe, Secure, and Trustworthy AI). Well-designed transparency reports could reduce compliance costs, in part due to overlapping regulatory requirements across different jurisdictions. We encourage foundation model developers to regularly publish transparency reports, building upon recommendations from the G7 and the White House.