Mumsnet sues OpenAI over copyright infringement

It's the first British human-made media org to take legal action against the generative AI poster child

Jul 22, 2024

WHAT’S HAPPENED?

MUMSNET, the influential parenting network, has launched legal action against OpenAI, accusing it of scraping its six-billion-word website without consent and breaching its copyright. In a post, Mumsnet CEO Justine Roberts said the legal action, the first taken by a British human-made media org against the generative AI poster child, followed talks with OpenAI that failed to progress towards a licensing agreement.

Roberts felt there were “very good reasons” why the AIs should train their large language models (LLMs) on Mumsnet’s conversational data, saying the website provided a “unique record of 24 years of female conversation about everything from global politics to fashion to relationships with in-laws”. “By contrast the majority of the content on the web was written by and for men. AI models have misogyny baked in and we’d love to help counter the gender bias likely to be present in many of them and raise women’s voices. Their response was that they were more interested in datasets that are not easily accessible online,” said Roberts.

Roberts contrasted “the theft of online content for model-training” with crawling conducted by search engines. While Google offered a “clear value exchange” by delivering traffic to websites, the LLMs were powering chatbots such as ChatGPT using “scraped content from the websites they are poised to replace”.

Roberts said Mumsnet was “in a stronger position than most” media orgs since much of its traffic — it had 34.6 million visits last month, according to Similarweb — came direct. “If these trillion-dollar giants are simply allowed to pillage content from online publishers — and get away with it — they will destroy many of them. Everything that’s unique and brilliant about sites like ours will be lost, and a handful of Silicon Valley giants will be left with even more control over the world’s content and commerce,” said Roberts, who co-founded Mumsnet in 2000 after working in finance and later as a sports journalist.

“We know that taking on a multinational giant like OpenAI, with its $3 billion of revenues, is not an easy task in the face of the huge resources they’ll throw at us, but this is too important an issue to simply roll over. Not just for Mumsnet but for every website you’ve ever landed on for news, advice or simply to ask if you’re being unreasonable,” she added. According to The Times, Mumsnet is demanding that OpenAI stops using its content and deletes it from its training data, and is claiming breach of its terms of use as well as database rights infringement.

In January, OpenAI said in evidence to a House of Lords inquiry that “it would be impossible to train today’s leading AI models without using copyrighted materials”. In March, OpenAI CTO Mira Murati admitted Sora, its hyper-realistic video generator, was trained on “publicly available data” as well as licensed content. And in June Microsoft AI chief Mustafa Suleyman claimed that since the 1990s a “social contract” had existed giving anyone a “fair use” defence under copyright law to copy, recreate or reproduce content on the open web. “That has been freeware, if you like,” he said. No such contract has ever existed.

WHY SHOULD WE CARE?

✨ Bravo Justine Roberts! This is EXACTLY the stance that aggrieved online publishers should be taking against the AIs and their breathtaking arrogance. The AIs are desperate to talk up the capabilities of their latest shiny models and the progress they’re making towards artificial general intelligence (AGI, the point at which AI matches or exceeds the capabilities of a human being). But they’re less keen to talk about how those models have been trained, even though they’ve frequently said the quiet part out loud. Roberts’ principled stance makes clear the AIs present an existential threat to web publishers since they’ve plundered their copyrighted material to create substitutive products, the precise charge levelled against OpenAI and its principal backer Microsoft by The New York Times in its lawsuit. OpenAI’s dismissal of Roberts’ approach exhibits a callous disregard towards a unique and important perspective (Mumsnet’s female audience creates 8 million posts a year), and ignorance of the power and influence that Mumsnet has (it’s made several UK prime ministers quake in their boots). Mumsnet could have been the perfect partner for OpenAI. Instead, OpenAI has created a formidable adversary.

More on this and other generative AI developments that threaten to reshape the human-made media landscape in Friday’s Weekly Newsletter.

💯HUMAN MADE. No hallucinations, confabulations or fabrications here! 😇

🙏 Please help us grow the Charting Gen AI community by sharing with friends, colleagues and industry contacts.
