5 Critical AI Safety Insights Leaders Can't Ignore
Hello everyone, it's great to be back! I've been busy with my new role at the Gradient Institute, a non-profit research organization focused on Responsible AI and AI Safety. Last week, we organized an AI Safety Forum in Sydney, and I want to share some key insights with you.
---
Day 1 of the AI Safety Forum made me reflect on some important tensions and opportunities in AI governance. Here are some of the key takeaways and questions we discussed.
1. AI Safety vs. General AI Governance
There are several differences between AI safety and general AI governance. In a nutshell, AI safety is about preventing severe risks from advanced AI systems, focusing on managing specific high-stakes risks as AI becomes more powerful. This area is still developing compared to broader governance. A common issue discussed was "safety creep"—when risks that aren't strictly safety-related get included, making it harder to stay focused. However, this might be a practical approach, given how interconnected these challenges are.
2. GenAI Is A Developing Science
Daniel Murfet — a mathematician and pioneer of “developmental interpretability” — pointed out that GenAI is still in its early stages, unlike more mature technologies. Training AI models is not as straightforward as manufacturing a car. Instead, we’re still learning about surprising structures within AI—internal algorithms that seem to evolve during training. This shows that AI development is more complex and emergent than we fully understand.
AI systems can perform cognitive tasks—changing information in ways that have a real impact—at superhuman speed and scale. The risks are not hypothetical. They include things like manipulation through bots or losing our ability to guide an increasingly AI-driven world. These challenges go beyond technology—they affect human control and decision-making.
3. Rules, Incentives, and Global Differences in AI Governance
Professor of law at USyd and member of the Commonwealth Government's AI Expert Group Kimberlee Weatherall kicked off her talks by emphasising that just because people could, in theory, reduce or prevent a risk of harm doesn't mean they will. Effective governance isn’t just about having the right tools—it’s also about the right incentives, rules, and global cooperation. But there’s no global agreement yet on how to regulate AI effectively. Different regions are taking different approaches—the EU has its AI Act, China has specific registration requirements, and the US is still debating. As a middle power, Australia could play a key role in helping to bring these efforts together.
4. The Role of Red Teaming and Socio-Technical Evaluations
Hoda Heidari, Assistant Professor in Machine Learning and Societal Computing at CMU, talked about how we need to rethink how we assess the risks of GenAI. Traditional red teaming—where we test systems for weaknesses—is important, but it’s not enough. We need to look at both the technical weaknesses and the broader social impacts. Red teaming often focuses narrowly on finding specific issues, but we need a more comprehensive approach that takes into account how AI affects society.
5. “All of This Is Sociotechnical”
Professor of Philosophy at the ANU and leader of the Machine Intelligence and Normative Theory Lab Seth Lazar summed it up nicely:
“All of this needs to be sociotechnical, based on understanding that all AI systems are sociotechnical systems.”
AI is not just about code. It’s about how technology interacts with people—how we use it, how it shapes our lives, and how we shape its development.
A Call for Coherent Action
The discussions made one thing clear: regulating AI is a shared global challenge that requires understanding both technology and human values. There’s no simple solution or a purely technical fix—we need collaboration across disciplines, international cooperation, and a deep understanding of how AI affects society.
Where does this leave us? It leaves us with both urgency and opportunity. As AI continues to evolve, we need to ensure it serves humanity in the best way possible. We might not have all the answers yet, but asking the right questions is the first step toward meaningful solutions.
What are your thoughts on these challenges? Is your organization focused on AI safety, or are you looking at broader governance issues?
I’d love to hear how these ideas resonate with the work you’re doing or the concerns you have.