Inside the British Lab Hunting for Dangers Lurking in AI

Beside Parliament Square in London sits the United Kingdom's AI Security Institute. Its staff includes weapons inspectors, epidemiologists, and code breakers, and their mission is to identify and address the risks posed by artificial intelligence.

On a recent Tuesday, four AI specialists at the institute ran an unusual experiment. They tried to get an AI chatbot to reveal instructions for producing anthrax, a dangerous bioweapon. At first the chatbot refused, responding, “I’m sorry I can’t help with that.” But the experts had a strategy: using a custom algorithm, they flooded the AI with thousands of automated queries and prompts.

Eventually, the persistence paid off. The AI produced a detailed list of materials and equipment, along with a recipe for assembling the lethal substance at home. For safety reasons, The New York Times is not disclosing the specifics of the AI system.

“There are certain questions you definitely don’t want the model to answer,” said Xander Davies, the 25-year-old American who leads the institute's red team, which simulates attacks on AI systems. The team even managed to bypass the safeguards of OpenAI’s latest version of ChatGPT, tricking it into offering hacking tips within six hours. Once the team identifies a problem, it shares the results with the company responsible.

“They work to rectify the vulnerabilities and report back to us,” added Davies, who joined the institute after studying at Harvard rather than entering the tech industry in San Francisco. “They actually enhance their systems with our findings.”