
Researchers, legal experts want AI firms to open up for safety checks


More than 150 leading artificial intelligence (AI) researchers, ethicists and others have signed an open letter calling on generative AI (genAI) companies to submit to independent evaluations of their systems, the lack of which has led to concerns about basic protections.

The letter, drafted by researchers from MIT, Princeton, and Stanford University, called for legal and technical protections for good-faith research on genAI models, the absence of which, they said, hampers safety measures that could help protect the public.

The letter, and a study behind it, was created with the help of nearly two dozen professors and researchers who called for a legal “safe harbor” for independent evaluation of genAI products.

The letter was sent to companies including OpenAI, Anthropic, Google, Meta, and Midjourney, and asks them to allow researchers to investigate their products to ensure consumers are protected from bias, alleged copyright infringement, and non-consensual intimate imagery.

“Independent evaluation of AI models that are already deployed is widely regarded as essential for ensuring safety, security, and trust,” two of the researchers responsible for the letter wrote in a blog post. “Independent red-teaming research of AI models has uncovered vulnerabilities related to low-resource languages, bypassing safety measures, and a wide range of jailbreaks.

“These evaluations investigate a broad set of often unanticipated model flaws, related to misuse, bias, copyright, and other issues,” they said.

Last April, a who’s who of technologists called for AI labs to stop training the most powerful systems for at least six months, citing “profound risks to society and humanity.”

That open letter now has more than 3,100 signatories, including Apple co-founder Steve Wozniak. Tech leaders singled out San Francisco-based OpenAI’s recently announced GPT-4 algorithm in particular, saying the company should halt further development until oversight standards were in place.

The latest letter said AI companies, academic researchers, and civil society “agree that generative AI systems pose notable risks and that independent evaluation of these risks is an essential form of accountability.”

The signatories include professors from Ivy League schools and other prominent universities, including MIT, as well as executives from companies such as Hugging Face and Mozilla. The list also includes researchers and ethicists such as Dhanaraj Thakur, research director at the Center for Democracy and Technology, and Subhabrata Majumdar, president of the AI Risk and Vulnerability Alliance.

[Screenshot of the safe harbor letter: Knight First Amendment Institute, Columbia University]

While the letter acknowledges and even praises the fact that some genAI makers have special programs to give researchers access to their systems, it also calls them out for being subjective about who can or cannot see their tech.

In particular, the researchers called out AI companies Cohere and OpenAI as exceptions to the rule, “though some ambiguity remains as to the scope of protected activities.”

“Cohere allows ‘intentional stress testing of the API and adversarial attacks’ provided appropriate vulnerability disclosure (without explicit legal promises),” the researchers wrote. “And OpenAI expanded its safe harbor to include ‘model vulnerability research’ and ‘academic model safety research’ in response to an early draft of our proposal.”

In other cases, genAI firms have already suspended researcher accounts and even changed their terms of service to deter some types of evaluation, according to the researchers. As they put it, “disempowering independent researchers is not in AI companies’ own interests.”

Independent evaluators who do investigate genAI products fear account suspension (without an opportunity for appeal) and legal risks, “both of which can have chilling effects on research,” the letter argued.

To help protect users, the signatories want AI companies to provide two levels of protection to research:

  1. A legal safe harbor to protect good-faith independent AI safety, security, and trustworthiness research that is conducted in line with well-established vulnerability disclosure practices.
  2. A corporate commitment to more equitable access by using independent reviewers to moderate researchers’ evaluation applications.

Computerworld reached out to OpenAI and Google for a response, but neither company had immediate comment.

Copyright © 2024 IDG Communications, Inc.


