Fresh Thoughts #108: Using AI: Over Confident & Over Generalised


"You wouldn't be standing there if it was daylight."
"Why not?"
"Because there's a 400ft drop right next to your left foot."

That was the conversation at 10:45 pm on the fourth night of my Summer Mountain Leader training.
While practising night navigation, I was told to stand within 2 metres of an exact grid reference.

Counting steps.
Using compass bearings.
And using the shape of small hillocks I could just about see by torchlight...
I got to the location.

I knew there was a cliff.
It was marked on the map...

But in all the detail work, I had missed the contours.
The 510m and 250m contour lines were just 150m apart.
With the cloud cover, I couldn't see a thing.
It looked like a sea of black.

"That's a pretty long way to fall..."
"Maybe you should step away from the edge."

This was a personal near-miss resulting from overconfidence.

Overconfident AI

I remembered this experience while watching the excellent video "A cautionary tale about ChatGPT for advanced developers".

https://youtu.be/6CGtwF_5kzY

The video is an insightful, experience-based story about how ChatGPT has profound strengths - but also significant weaknesses that will catch the unwary.

The story mirrors my own experiences working with ChatGPT and other AI systems.

Filling Knowledge Gaps

I have been impressed by ChatGPT's ability to accelerate daily tasks and learning.
Once you know how to interact with it, ChatGPT is great at boring, mundane work:

  • Fixing those annoying syntax issues
  • Understanding the meaning of error messages
  • Writing boilerplate to set the structure of a project

But where ChatGPT really excels is in smoothing out small knowledge gaps.
It also provides excellent introductions and pointers on topics you aren't familiar with.

I love asking it questions:

  • I'm trying to [describe objective]. How should I approach the problem?
  • What are the most common ways to solve [describe problem]?
  • Tell me more about [insert topic].

But here's the catch...

What is a "small" knowledge gap?

In the video, the developer asked ChatGPT to fix a bug.
The first suggestion didn't work...
And then ChatGPT came out with its usual suggestion...
Make sweeping changes and completely rewrite crucial parts of the project.

I have made the same request of ChatGPT many times, and the response is very familiar.
When working on the details, it would be easy to follow ChatGPT's lead and focus on rewriting the work.
And, for the unwary, it would be time to walk off the cliff.

However, the developer better understood the context and intended outcome.
He chose a different path.
He crafted a halfway solution and asked for a review.
ChatGPT provided sycophantic praise.
"Your proposed solution is an elegant approach..."

But when challenged on why it hadn't come up with that answer itself, ChatGPT suggested the solution was too specialised and that its answers focus on more generalised, less complex solutions.

Analysing Malware

I have experienced this type of over-generalised response from ChatGPT in many situations.
Most recently, I asked ChatGPT to decode some obfuscated JavaScript.

Unsurprisingly, it couldn't decode the JavaScript malware.
But the way it responded was interesting...

It started by reformatting the whitespace of the code to make it readable.
Then it provided a generic summary and explained that the code was obfuscated or "packed".
Then it stopped.

No matter what I tried, I couldn't nudge it into decoding any aspect of the malware.

Moreover, it actively disagreed with approaches I suggested, which I later proved to be steps in the right direction.
ChatGPT could not follow the endless, meandering dead ends that eventually led to a solution.
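
To give a feel for what decoding "any aspect" of packed code can mean, here is a minimal sketch of undoing a single, simple obfuscation layer: hex escapes inside string literals. It is an illustration only. The sample line is invented, the function name decode_hex_escapes is hypothetical, and this is not the malware or the method described above.

```python
import re

# Matches a JavaScript-style hex escape such as \x68 (exactly two hex digits).
HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_hex_escapes(source: str) -> str:
    """Replace each \\xNN escape with the character it encodes."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)


# Hypothetical obfuscated line, invented for this example.
sample = r'var a = "\x68\x65\x6c\x6c\x6f";'
print(decode_hex_escapes(sample))  # var a = "hello";
```

Real samples often stack several layers like this, so each decoded result needs a human look before choosing the next step.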

Final Thoughts

It is essential to remember, and I fully understand...
ChatGPT is doing nothing more than predicting the next token (a word or fragment of a word) in a sequence...
It has no understanding...

But it is, nonetheless, a very impressive statistical engine.
And for the general or generic, it is a valuable tool.

But for specific, specialist, and opinionated tasks...
Beware of the cliff.

March 5, 2024


Fresh Sec Limited
