In early August, Eric Schmidt, former CEO of Google, spoke at Stanford Engineering about AI.
The video caused some controversy, so it is no longer on the Stanford site, but it can still be found.
The one detail that attracted my attention was Eric's inclusion of "Red Teaming AI" as one of his Top 3 emerging needs in the next few years.
Red Teaming is a crucial part of all advanced cybersecurity programmes and certifications.
And I've written previously about the vulnerabilities AI faces via malicious prompt engineering.
It makes sense that red-teaming AI is important, but is it in the Top 3?
During a Red Team exercise or adversarial test, cybersecurity experts act like hackers to break into a computer system.
Often, the Red Team are given a target to access specific sensitive data or gain access to a particular system.
Red team exercises can be highly emotive - as a single vulnerability can appear to dismantle months of hard work.
However, I have always seen the exercise as a valuable critique of my work.
A fresh set of eyes - like asking a trusted colleague for feedback on a crucial presentation or document.
Red team exercises can be highly effective.
One example was in the mid-2000s - during the dying days of Windows XP.
Microsoft realised they couldn't continue with bug-by-bug security fixes.
Hackers were automating finding vulnerabilities - and they were discovering a lot of problems.
So Microsoft created a team of security specialists to break Windows XP and Windows Vista before the bad guys did.
This work became the foundation of new security features, ultimately making Windows 7 a watershed product that started to rebuild IT teams' trust in Microsoft's product security.
Fundamental security techniques became included in the Windows operating system.
So, to make the future of AI "secure," there will inevitably need to be some Red Teaming.
The whole point of Red Teaming is to learn from our mistakes.
To improve processes and configurations.
To take action and address the criticism.
Unfortunately, with AI, the ability to take action seems limited.
We have no fundamental understanding of architecture or an approach to fixing issues.
So, the industry is using firewall-like permit and block rules to filter inputs based on specific words, which can be bypassed in interesting and devious ways.
For example, some of the ways Red Teams have got around these rules include:
Each one of these techniques is being fixed.
One by one.
This reminds me of Microsoft's bug-by-bug security fixes in the mid-2000s.
I suggest that understanding the inner processes of AI agents and models should be a top-three priority.
So, when a problem is found, a robust security foundation can be applied... rather than a bugfix.