Nathan Strauss, a spokesperson for Amazon, said the company is closely reviewing the index. “Titan Text is still in private preview, and it would be premature to gauge the transparency of a foundation model before it’s ready for general availability,” he says. Meta declined to comment on the Stanford report, and OpenAI did not respond to a request for comment.
Rishi Bommasani, a PhD student at Stanford who worked on the study, says it reflects the fact that AI is becoming more opaque even as it becomes more influential. That contrasts sharply with the last big boom in AI, when openness helped fuel major advances in capabilities including speech and image recognition. “In the late 2010s, companies were more transparent about their research and published a lot more,” Bommasani says. “This is the reason we had the success of deep learning.”
The Stanford report also suggests that models do not need to be so secret for competitive reasons. Kevin Klyman, a policy researcher at Stanford, says the fact that a range of leading models score relatively highly on different measures of transparency suggests that all of them could become more open without losing out to rivals.
As AI experts try to figure out where the recent flourishing of certain approaches to AI will lead, some say secrecy risks making the field less of a scientific discipline and more of a profit-driven one.
“This is a pivotal time in the history of AI,” says Jesse Dodge, a research scientist at the Allen Institute for AI, or AI2. “The most influential players building generative AI systems today are increasingly closed, failing to share key details of their data and their processes.”
AI2 is trying to develop a much more transparent AI language model, called OLMo. It is being trained on a collection of data sourced from the web, academic publications, code, books, and encyclopedias. That data set, called Dolma, has been released under AI2’s ImpACT license. When OLMo is ready, AI2 plans to release the working AI system along with the code behind it, allowing others to build upon the project.
Dodge says widening access to the data behind powerful AI models is especially important. Without direct access, it is generally impossible to know why or how a model can do what it does. “Advancing science requires reproducibility,” he says. “Without being provided open access to these crucial building blocks of model creation, we will remain in a ‘closed,’ stagnating, and proprietary situation.”
Given how widely AI models are being deployed—and how dangerous some experts warn they might be—a little more openness could go a long way.