AI generated content and charities

This article outlines the state of the art for AI-written content, scans near-horizon capabilities and threats, and sets out some discussion points for AI-created content and how it sits in the charity world, both for fundraising and for topic information.

1. Introduction

I rewrote a white paper that I have been circulating internally at work to make it more generic.

2. Why now?

I have been watching this space for a while and follow organisations like the Lovelace Institute and the ODI, as well as reading longer-term analyses of this landscape (i.e. beyond the immediate marketing horizon).

What compelled me to write this was that the SEO lead and I were pitched a pilot project by a supplier that would use our search data (the ‘graph’) to feed a language model, which would then create paragraphs and selections of paragraphs that could be edited, rewritten and so on. The idea was that this would generate content that would then be rewritten – a ‘helping hand’ model combining data analysis with something like the GPT-3 language model.

And this guy (an obvious 'data double' for my demographic) keeps turning up in my Instagram feed:

I did initially read this as ‘new AI creates content for you’...

So if savvy suppliers are already pitching pilots like this to us, there will be a barrage of these kinds of pitches and services in the next year and beyond.

3. Current state of play

Let’s take a very quick tour of where we are with the capability of AI created content without thinking about the charity context.

So let’s set the bar: AI content doesn’t need to be real time (though some is); the test is the classic Turing test. If content generated by a digital entity cannot be distinguished from that of a biological entity, it passes.

For our purposes it doesn’t matter if the AI is in any way ‘intelligent’ it just has to produce results that will work (this is the ‘Chinese room’ thought experiment).

In short can a digital thing produce content that will ‘pass’?

Textual content

Let’s start with real world examples of text related tasks: First generic advertising copy generation, then a summary, then a generated outline:

Many more examples here: https://beta.openai.com/examples

These are from a paid-for service already in use via an API, built on a vast generalised language model called GPT-3; GPT-4 is due soon.

Straightforward uses include content moderation, keyword indexing, FAQ writing and chat, but this is already real time, and it is conceivable that sites already use AI not only to generate text but to generate customised content for segments on the fly. I think I read somewhere that football results in the UK are AI generated – you can see how that sort of domain would provide a great use case for AI content: defined players with long histories, a limited number of outcomes, masses of stats. That, coupled with a model of your readership, could generate different versions of football results for readers of, say, The Mirror and The Guardian, using language, grammar and analogies to suit.
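To make the football example concrete, here is a toy sketch of data-to-text generation: one match record rendered in two different ‘house styles’. The teams, scorer and style templates are all invented for illustration; a real system would use a language model rather than fixed templates, but the principle – structured data in, readership-tuned prose out – is the same.

```python
# Toy data-to-text sketch: one match record, two invented "house styles".
# A real system would use a language model; fixed templates just
# illustrate how structured sports data can drive different versions
# of the same story for different readerships.
MATCH = {"home": "Northton FC", "away": "Southby Town",
         "home_goals": 3, "away_goals": 1, "scorer": "Smith"}

STYLES = {
    "tabloid": "{home} DEMOLISH {away} {home_goals}-{away_goals} as {scorer} runs riot!",
    "broadsheet": "{home} defeated {away} {home_goals}-{away_goals}, with {scorer} the decisive figure.",
}

def write_report(match, style):
    """Render the match data in the requested house style."""
    return STYLES[style].format(**match)

print(write_report(MATCH, "tabloid"))
print(write_report(MATCH, "broadsheet"))
```

Swap in a model of the reader instead of a hand-written template dictionary and you have the segment-customised content described above.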

It will become commonplace for companies to use this tech to keep content ‘fresh’, or to generate page after page of keyword-enhanced, search-friendly content as ‘SEO bombs’.

As this makes gaming search trivial, Google and other search providers are onto it; we will come back to their response later.

Images

I took three minutes to generate these from a publicly available, copyright-free AI image generator called ‘Stable Diffusion’:

I don’t know why I chose that phrase (!) but nevertheless you can see that in a couple of minutes I can create a batch of fantasy book-cover-worthy images that I could use free of charge in any medium. If you look closely you can see extra legs, feet pointing the wrong way and so on, but a five-legged horse is a nice addition given the search term – of course the angel of death has a mutant horse.

These are not composites of existing images; they are combined ideas of images rendered in a particular style or aesthetic model – for copyright purposes they are original images under a fair use licence (controversial).

Try it here: https://huggingface.co/spaces/stabilityai/stable-diffusion

Let’s try another one, something closer to home (I work for a cancer support charity):

Again, copyright free, and free. If you look at the details you can see that things go awry fairly quickly, but for free and in seconds this is pretty extraordinary. A couple of them wouldn't bother anyone at thumbnail size.

This is an open-source project which aims for inclusion, has built-in safety checks and combines image ‘meaning’ and aesthetic ‘judgements’ in the model. Ethical arguments are already surfacing about artists’ images having been fed into the model, but these probably wouldn’t trouble most organisations or businesses.

Which is all to say that the generation of images by AI will be commonplace soon. If rendering were fast enough you could expect to serve custom images in close to real time (and think of the marketing potential of that – localised pictures of fundraising based on the location and demographic of the web user). Commercial versions are close to market.

Note that in the five days since I started this, Meta has showcased a (fairly crude) text-to-video service, and DALL-E, a higher-resolution text-to-image service, has opened for general use.

Conversations with the dead

OK, this is really interesting: a company is already selling a service online where you are interviewed by a bot; then, when you die, people can ask the virtual you questions, which it answers using a model trained on your stories – so you can 'chat with grandma' when she's gone.

No, really: https://www.hereafter.ai/

Taste does not stand in the way of making money.

Potential scenarios when AI content becomes common

Enough of the capabilities – they will just keep coming. Let's look at the ways this could play out for a charity looking to fundraise, provide end-user information or grab a share of attention for influencing:

1. Market share drops due to increased pace of targeted content production

Other charities or businesses use AI-written content to create content very quickly based on current search trends and social ‘ideas’. E.g. seeing a spike in search around cannabis oil, I use AI to create a whole bunch of content around it with search-bait titles: ‘Why you should consider cannabis oil for your breast cancer recovery plan’.

The impact of this on search is huge. If someone is creating content daily, automatically, they will eat your monthly blog posts for breakfast.
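The ‘spotting a spike’ step in this scenario is mechanically trivial, which is part of why the threat is credible. A minimal sketch – the two-standard-deviation threshold and the example numbers are arbitrary assumptions for illustration, not a recommendation:

```python
# Toy spike detector: flag a search term whose latest daily volume
# jumps well above its recent average. The threshold (mean + 2 standard
# deviations) is an arbitrary choice for this sketch.
from statistics import mean, stdev

def is_spiking(daily_counts, threshold_sds=2.0):
    """Return True if the most recent count exceeds the historical
    mean by more than `threshold_sds` standard deviations."""
    history, latest = daily_counts[:-1], daily_counts[-1]
    if len(history) < 2:
        return False  # not enough history to judge
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest > baseline
    return latest > baseline + threshold_sds * spread

# A flat week of searches followed by a surge:
print(is_spiking([120, 130, 125, 118, 122, 410]))  # spike
print(is_spiking([120, 130, 125, 118, 122, 128]))  # business as usual
```

Wire the output of something like this into an AI copy generator and you have exactly the daily, automatic content pipeline described above.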

2. Someone starts creating core content trained on specialist information language models

If I were a small charity or (worse) someone selling pseudo-cures, I could create search-bait content using the above methods, but use (or hire) language models ‘trained’ against existing cancer information. Cancer models, models that target pet lovers, language models designed to appeal to legacy donors who like the National Trust. I could use all of that to write content tailored to that audience that just happens to mention my product, or indirectly recommends it.

3. Someone uses domain information to create 'copyalike' information language models

To be fair, Google already does this for the purposes of search, but I am thinking of a commercial entity that takes all the text from, say, the NHS, Macmillan, CRUK, Breast Cancer Now and Prostate UK, creates a language model trained on those datasets, and then sells, rents or uses that model to generate content biased towards sales vectors or political views.

It's pretty easy to imagine a radical group creating content that uses an audience segment's social data for targeting and profiling, takes their views, 'writes' about an influential domain and then shapes the story – Cambridge Analytica, but with automatic content creation. With enough computing cycles you could approach real time – or at least fit comfortably into the current news/attention cycle.

4. High quality topic based information becomes effectively free and endlessly repeatable

If someone created this model they could use it to sell content models back to orgs of all kinds (via API for a fee).

I am expecting this to happen across all domains – sport, lifestyle and so on – so why would charity information be exempt?

In this context it becomes trivial to create content that will rank well but will not necessarily be accurate, or that will recommend something we would consider harmful or unproven. I am thinking here of the interest in turmeric in relation to cancer treatment, but there are multiple ways you could bend this information to sell nutritional products, exercise kit, orthotics, counselling… pretty much anything.

In this case the *only* value topic-based information has online is its context of use and reputation – and what else you add to it in terms of services and authority (we’ll come back to this).

Let’s leave aside ethical issues for a moment. AI content could be used to:

· Summarise content
· Run chatbots
· Generate images for segments on the fly
· Generate test copy for marketing campaigns
· Generate web page text using SEO principles
· Generate news articles
· Create topic outlines for writers and bloggers
· Write proactive content pieces and sales pitches.

In a conscious effort to create an acronym so that I can sell a book :-) I am going to call this FMCC – fast-moving consumer content, a pun on fast-moving consumer goods.

Should AI created content be used by your charity?

Should it? Maybe.

Fundraising

In this world it is hard to see moral or ethical complexities for generic images or texts. Using A/B texts written by AI as a starting point for marketing optimisation presents no difficulties. Perhaps advertising agencies already have similar tools in-house to generate ideas and text and they just don't tell us!

Quality topic information

In the informing-and-educating world it is more nuanced. You can see that the creation of generic images could be beneficial, particularly for inclusion purposes. I think stock image companies will have to get on board with this pretty quickly or go out of business, so it’s conceivable that we could use AI image creation through a third party as a super-stock service. Ah, I see that there is already one AI stock image service.

Clinical health or financial guidance content

This kind of content is obviously problematic. Assuming language models gain enough complexity and breadth to generate good content, there is the ethical question: should such content be used? The questions are practical, SEO-related and mission/brand-based.

Practically speaking, it doesn’t make much difference where text comes from if it is good and accurate. Clinical and grammar checks would ensure quality, so it is conceivably practical. My suspicion is that any efficiency gained in content writing would simply be taken up by rewriting and checking, but I have no proof of that.

Brand and reputation – if your charity is the human face of its niche and relies on empathy marketing and brand, I suspect it would trouble people to think of your content as machine-written.

SEO and AI content

I have pulled this into its own section because this gets really interesting.

Google almost certainly knows there is an AI tsunami coming (it is contributing to it), and a lot of its recent work makes sense in that context, particularly the growing emphasis on EAT (Expertise, Authority and Trust) in search. As search is its money maker, Google is in a tight spot: if you can’t tell whether content is written by a person or an AI, how do you know whether the information you are linking to is any good?

EAT extends Google’s search graph by mapping and boosting content according to personal and organisational relationships. It might just look good at the moment for organisations publishing health-related information to have recognised academic connections, but in a year or two it will be essential. These ‘signals’ to Google will be the only guarantee of content quality. To look at this the other way around: Google has been working on the assumption that all content is human-generated, but clearly in the (very near) future that will not be the case.

As an example of the power of the graph, both as a social and an informational construct, just consider the signals we consume when we go to see a doctor – signals that (hopefully) engender and warrant trust: the NHS logo, the terrible reception system, the smell of clinical cleanliness, the consulting room itself with a bed, a sharps bin and that stethoscope. We might map all of those signals and that narrative into a 'semantic graph' that 'means' the NHS.

Online, and in the context of digital search, the quality and therefore the ranking of pages will be assessed through similar signals, and these will largely be relationships that establish authority. Note that authority has ‘author’ as its root. To show that our content is ‘really real’ we will want things like:

· Names and affiliations of content writers, such as training, qualifications and professional pages, author websites and so on.

· Expertise and academic chops of specialists such as links to academic profiles, Wikipedia pages, memberships of professional bodies.

· Key reference citations in information with links to academic journals where possible.

All of this will enhance the user experience and the EAT of pages, and it will be essential if search engines are to prioritise your content over content which is effectively the same but tweaked to create monetisation possibilities.
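In practice, authorship signals like those listed above are usually expressed as schema.org structured data embedded in the page. A minimal sketch in Python that builds such a JSON-LD block – the `Article` and `Person` types and their properties are real schema.org vocabulary, but the name, qualifications and URL are invented for illustration:

```python
# Sketch: building schema.org JSON-LD that declares authorship and
# credentials - the kind of machine-readable "really real" signal a
# search engine can read. The person, credentials and URL are invented.
import json

def author_jsonld(article_title, author_name, credentials, profile_url):
    """Return a JSON-LD string asserting who wrote an article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": article_title,
        "author": {
            "@type": "Person",
            "name": author_name,
            "honorificSuffix": credentials,  # e.g. professional qualifications
            "url": profile_url,              # link to an authoritative profile
        },
    }
    return json.dumps(data, indent=2)

print(author_jsonld("Understanding chemotherapy",
                    "Dr Jane Example", "MBBS, FRCP",
                    "https://example.org/profiles/jane-example"))
```

Dropped into a page as a `<script type="application/ld+json">` block, this is one concrete way the writer, their credentials and their professional profile become crawlable signals rather than just visible bylines.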

Given the threats above, EAT is the best way to protect the ranking of your content. Of course brand and reputation count too, but these can change. You could instead wage your own AI content war, but that could very quickly bite you on the ethical, reputational and regulatory backside if you provide information in certain sectors.

Discussion points

As this article is derived from an internal discussion paper, I made a number of statements as starting points:

1. AI content will saturate all markets within a couple of years

2. Readers will start to be confused about the provenance of content

3. Readers and search bots will need more indications that content is ‘really real’ and of high quality

4. EAT is important in the effort to keep our valuable content high up the search rankings, but it will become vital when AI content starts to flood the market

Fundraising - things to look at

1. We could look at AI created content for ideas to plug into content testing

2. We could accept pitches from AI driven marketing agencies for solutions

Quality information

1. We could look at AI generated content for ‘stubs’ for ideas that get edited or written out by authors who are subject experts – eg as ‘idea generators’.

2. We should make EAT a number one priority for SEO with perhaps the only thing coming higher being Core Web Vitals (AKA site speed checks)

3. We should not consider AI written content in a regulatory context – including translation – until the tech is settled and proven and governance is in place.

4. We must keep an active watching brief on the coming wave of AI content, in terms of both search and user effects.

Influencing

It's conceivable that influencing and advocacy could turn into the equivalent of contemporary algorithm-driven financial trading: automated, super fast, bubbletastic, prone to 'black swan' effects.

1. We should be monitoring the space and the offers in it

2. EAT is vital in this space too

Conclusion

It's on the way. Prepare. Get EAT sorted.