Today in Singapore, MLCommons® and AI Verify signed a memorandum of intent to collaborate on developing a set of common safety testing benchmarks for generative AI models for the betterment of AI safety globally.
A mature safety ecosystem includes collaboration across AI testing companies, national safety institutes, auditors, and researchers. The aim of the AI Safety benchmark effort that this agreement advances is to provide AI developers, integrators, purchasers, and policy makers with a globally accepted baseline approach to safety testing for generative AI.
“There is significant interest in the generative AI community globally to develop a common approach towards generative AI safety evaluations,” said Peter Mattson, MLCommons President and AI Safety working group co-chair. “The MLCommons AI Verify collaboration is a step forward towards creating a global and inclusive standard for AI safety testing, with benchmarks designed to address safety risks across diverse contexts, languages, cultures, and value systems.”
The MLCommons AI Safety working group, a global group of academic researchers, industry technical experts, policy and standards representatives, and civil society advocates, recently announced a v0.5 AI Safety benchmark proof of concept (POC). AI Verify will develop interoperable AI testing tools that will inform an inclusive v1.0 release, which is expected this fall. In addition, AI Verify is building a toolkit for interactive testing to support benchmarking and red-teaming.
“Making first moves towards globally accepted AI safety benchmarks and testing standards, AI Verify Foundation is excited to partner with MLCommons to help our partners build trust in their models and applications across the diversity of cultural contexts and languages in which they were developed. We invite more partners to join this effort to promote responsible use of AI in Singapore and the world,” said Dr Ong Chen Hui, Chair of the Governing Committee at AI Verify Foundation.
The AI Safety working group encourages global participation to help shape the v1.0 AI Safety benchmark suite and beyond. To contribute, please join the MLCommons AI Safety working group.