Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies that prominent AI companies use to deter model misuse also disincentivize good-faith safety evaluations. As a result, some researchers fear that conducting such research or releasing their findings will result in account suspension or legal reprisal. Although some companies offer researcher access programs, these are an inadequate substitute for independent research access: they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspension or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests without exacerbating model misuse. We believe these commitments are a necessary step toward more inclusive and unimpeded community efforts to tackle the risks of generative AI.
The paper proposes establishing legal and technical safe harbors for researchers to conduct safety and security studies on generative AI systems without facing legal or account-related consequences.
It emphasizes the need for independent evaluation of AI systems to identify risks such as bias, privacy breaches, disinformation, and copyright infringement, which are currently hampered by AI companies' restrictive practices.
The proposed legal safe harbor would protect researchers engaged in good-faith vulnerability testing from legal retaliation, akin to practices in cybersecurity research.
A technical safe harbor would create exemptions that protect researchers from account suspension and other forms of technical enforcement, possibly through verification and endorsement by third parties such as universities or nonprofits.
With the rapid deployment of generative AI systems and their significant societal impact, there is an urgent need for independent evaluation and red-team exercises to ensure the safety, security, and trustworthiness of these systems. Current practices and terms of service by AI companies, however, pose significant barriers to such essential research activities. To address these issues, this paper proposes the establishment of legal and technical safe harbors for public interest safety research on generative AI systems. These safe harbors aim to indemnify and technically protect researchers from potential legal and account-related repercussions that may arise from their work in identifying risks and vulnerabilities in AI systems.
Generative AI systems have become pervasive, presenting risks ranging from bias and privacy breaches to disinformation and copyright infringement. Despite these concerns, transparent and independent evaluations of these systems remain scarce due to the limited access provided by AI companies and the fear of legal consequences among researchers. This lack of independent research into AI systems' safety and vulnerabilities undermines public trust and hampers efforts to mitigate potential harms.
AI companies' terms of service and usage policies restrict unauthorized interactions with their systems, including interactions that may be crucial for identifying vulnerabilities and risks. These restrictions are intended to curb misuse, but they inadvertently also deter, and can expose to legal liability, necessary safety and security research. Researchers face a dilemma: the very activities essential for ensuring AI safety may violate terms of service, risking legal action or account suspension. This environment has created a chilling effect, stifling the advancement and dissemination of knowledge about AI systems' safety and limitations.
This paper advocates for two primary interventions by AI companies: a legal safe harbor, which indemnifies good-faith safety and security research against legal action, and a technical safe harbor, which protects such research from account suspension and other forms of technical enforcement.
Realizing these safe harbors requires careful consideration to prevent misuse. The paper suggests involving reputable third parties such as universities or nonprofit organizations to vet and endorse researchers, ensuring that only qualified individuals benefit from these protections. This approach could facilitate broader, more inclusive research efforts without compromising the integrity and security of AI systems.
By instituting legal and technical safe harbors, AI companies can foster a more collaborative and transparent research environment. Such steps would not only enhance the safety and reliability of AI systems but also reinforce public trust in these technologies. The proposal represents a necessary evolution in the governance of AI research, encouraging a proactive approach to understanding and mitigating the risks associated with generative AI systems.