Last week, I wrote about how many of the threats businesses and schools face when using Generative AI are the same as those posed by any other web-based application.
This is true for most threats...
However, some aspects of Generative AI will cause problems for the uninformed.
Threat 6: Data Leakage Via Prompts and Inputs
In 2010, Andrew Lewis first posted the phrase, "If you are not paying for it, you're not the customer; you're the product being sold."
And it's nearly a decade since researchers found that Facebook knows you better than your friends do.
Will the generative AI boom be different from social media privacy practices?
Well… it depends on which version of the product you are using.
OpenAI (ChatGPT)
If you're using the consumer-grade version of ChatGPT, by default, everything you send to OpenAI is used to train new OpenAI models.
You are gifting your data, just like on social media…
However, you can opt out and prevent OpenAI from using your data this way by following the instructions in How do I turn off model training (i.e. "Improve the model for everyone")?
The situation for customers subscribing to the OpenAI Enterprise plan is different.
By default, no data is used to train OpenAI models. Moreover, you can agree on how long your data will be retained.
This is the ideal situation, but few organisations outside the Fortune 2000 have such an agreement in place.
Anthropic (Claude)
Typically, no prompts or input data are used to train Anthropic models. The exceptions are when you provide feedback on the quality of the generated output, or when your prompt is flagged for trust and safety review.
While this sounds like an ideal situation, reading the Anthropic website left me with one question: what are you trying to hide from me?
It may be the writing style, but I ended up using ChatGPT o1-preview to dissect the logic in some sentences and find counterfactual loopholes.
Threat Mitigation:
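One practical control against prompt-level data leakage is to redact obvious personal data before a prompt ever leaves your systems. Below is a minimal sketch in Python; the regular expressions and placeholder tokens are illustrative assumptions on my part, not a complete PII detector (names, addresses, and account numbers would need a dedicated tool or NER model):

```python
import re

# Illustrative patterns only; real-world PII detection needs a dedicated tool.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "UK_PHONE": re.compile(r"\b(?:\+44|0)(?:\s?\d){9,10}\b"),
}

def redact(prompt: str) -> str:
    """Replace PII matches with placeholder tokens before the prompt
    is sent to any third-party LLM service."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(redact("Contact Jane at jane.doe@example.com or 020 7946 0958."))
# → Contact Jane at [EMAIL] or [UK_PHONE].
```

Note that "Jane" survives redaction: regex-based scrubbing catches structured identifiers, not free-text names, which is one reason the safest policy remains keeping customer data out of prompts entirely.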
Threat 7: Compliance and Legal Risks (e.g., GDPR Violations)
If you're not using customer data in your generative AI work, compliance (especially GDPR) is a relatively minor threat.
However, if you plan to use customer data in your generative AI projects, I strongly urge you to complete a full legal review. Four significant GDPR-related issues you may run into are:
Threat Mitigation:
Threat 8: Misuse of LLMs for Plagiarism
The US court system is working through a backlog of high-profile copyright claims relating to Generative AI, and these cases will take several years to resolve.
However, I am pessimistic about the prospects of several of these cases, which rest on the mistaken belief that the copyrighted works are stored verbatim within the AI model.
In the meantime, there is a stark difference between the legal protections different licenses offer customers.
Notably, Anthropic have two different sets of terms of service.
Anthropic's Commercial Terms of Service indemnify its customers against copyright claims. However, six areas of exclusion drastically constrain the scope of the indemnity.
As a consumer, by contrast, Anthropic offers no warranties, and liability is limited to £100 or the cost of a six-month subscription.
While OpenAI uses a single Service Terms document, the situation is essentially the same.
If you have an OpenAI Enterprise or Team account, you have some indemnity, with constraints similar to Anthropic's. But if you are on the consumer plan, there is no indemnity.
Threat Mitigation:
Generative AI offers crucial opportunities to increase productivity and create new products and services in the years ahead.
But... please... don't put customer data anywhere near it.
So many GDPR tripwires are waiting to snare you that, at present, it's not worth the risk.
Should you use the cheaper, consumer-grade versions of the service? They are fine for learning about the capabilities of generative AI.
But for serious businesses, the value proposition of the vendors' Enterprise plans should read: "All of the legal and compliance protections you must have ...and a little more capacity."