
I have a confession to make. I still subscribe to a local newspaper. The kind made from trees and dropped on doorsteps. It arrives each morning with stories about city council debates, high school soccer tournaments, and sometimes even a lost-dog reunion photo that makes my coffee taste better. This habit marks me as increasingly old-fashioned, I know. But the recent lawsuit filed by the Chicago Tribune against artificial intelligence company Perplexity made me realize something terrifying. That humble newspaper isn't just competing with digital media anymore. It's fighting for its life against algorithms engineered to extract its value without contributing a dime toward creating it.
The legal battle centers on a technology called retrieval-augmented generation, or RAG. Let's unpack that mouthful. Imagine you hire a research assistant who secretly photocopies entire chapters of books from the library, memorizes them, and then writes reports blending that stolen content with their own words. That's essentially what critics claim RAG systems do. They vacuum up copyrighted material from paywalled news sites, academic journals, or subscription databases, then repackage the information in chatbot responses or summaries. The company behind the technology profits. The original creators get nothing.
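For readers who want the mechanics behind the metaphor, here is a minimal sketch of the pattern in Python. Everything in it is illustrative: the two-article "corpus," the crude keyword scoring, and the stand-in summarizer are my own simplifications, not any vendor's actual system. Real products embed documents as vectors and hand the retrieved text to a large language model.

```python
# Minimal, illustrative sketch of retrieval-augmented generation (RAG).
# The corpus, the keyword-overlap scoring, and the stand-in summarizer
# are all hypothetical simplifications, not any company's real system.

def score(query: str, doc: str) -> int:
    """Toy relevance: count query words that appear in the document."""
    return sum(1 for w in set(query.lower().split()) if w in doc.lower())

def retrieve(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Step 1: pull the k most relevant documents from a source collection.
    In the systems at issue, that collection can include news articles."""
    ranked = sorted(corpus, key=lambda name: score(query, corpus[name]), reverse=True)
    return [corpus[name] for name in ranked[:k]]

def summarize(query: str, context: list[str]) -> str:
    """Step 2: stand-in for the model call. A real RAG system sends the
    query plus the retrieved text to a language model; here we just echo
    the lead sentence of each retrieved document."""
    leads = [doc.split(".")[0].strip() for doc in context]
    return f"Q: {query} | A (from retrieved sources): " + "; ".join(leads)

# Hypothetical two-article corpus standing in for a publisher's archive.
corpus = {
    "article_a": "The council voted to expand the downtown shelter program. Funding starts in May.",
    "article_b": "The high school soccer final was postponed by weather. A makeup date is pending.",
}
print(summarize("what did the council decide", retrieve("what did the council decide", corpus)))
```

Notice where the value comes from in this loop: the "answer" is built entirely out of text someone else wrote and stored in that corpus.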
What makes this court case particularly revealing is Perplexity's alleged two-step shuffle. According to the Tribune's lawyers, the company first denied directly training its artificial intelligence models on the newspaper's articles, then admitted its system might generate summaries of the facts within them. We've seen this dance before. Tech companies leaning on carefully worded denials while building empires atop unlicensed content. It echoes the early years of music streaming, when platforms argued that playing song clips counted as transformation rather than theft.
Here's where it gets personal. When Perplexity's so-called Comet browser allegedly circumvents newspaper paywalls to summarize entire articles, it isn't just copying words. It's dismantling a fragile economic model that keeps reporters in jobs. Think about what happens next. Why would anyone pay for a subscription if artificial intelligence can pirate the meat of the story for free? Yet without those subscriptions, newspapers vanish. The uncomfortable truth is that computer-generated summaries depend entirely on the expensive human reporting they're rendering obsolete. It's like burning down an orchard to enjoy the warmth, then wondering why there's no fruit next season.
Consumer convenience often blinds us to these consequences. We love asking a chatbot for quick news digests without considering whose labor funded the original journalism. But there's a hidden quality tax here too. Human reporters don't just transmit facts. They verify them through multiple sources, contextualize events with historical knowledge, and occasionally detect when a press release smells fishy. Algorithms tuned for engagement often amplify errors or strip away nuance. I've tested this myself. Ask several AI search engines about recent medical studies or municipal policy debates, and they'll confidently deliver answers that miss critical caveats contained in the original articles. The summaries feel authoritative but read more like educated guesses.
This isn't just about newspapers versus technology. We're witnessing a fundamental shift in how society gathers reliable information and compensates those who produce it. Remember Napster? The music piracy service argued it was simply helping people discover artists, but musicians rightly saw it as theft. The industry eventually adapted with streaming royalties. Yet today's content scraping feels even more dangerous. At least songs remain intact when pirated. News summaries distort and dilute. Imagine a future where local government meetings go uncovered because newspapers folded. Where school board decisions happen in darkness because AI companies extracted all the value from journalism without supporting its survival. That's not hypothetical. Hundreds of local newsrooms have vanished in the past decade, creating what researchers call news deserts: communities starved of credible information.
Here's what the artificial intelligence companies miss in their race to dominate search. Journalism isn't data. It's a social contract. When you read an investigative report about contaminated water or corrupt contracting, you're seeing the result of months spent filing public records requests, cultivating sources, and yes, paying journalists enough to keep them from fleeing to public relations jobs. Machine learning models treat these stories as free fuel, like oxygen in the atmosphere. They don't account for the cost of creating that oxygen.
The legal questions raised here extend beyond newsrooms. Think about medical journals, scientific research repositories, even recipe databases. Any industry that relies on paid expertise or specialized content now faces algorithmic strip mining. The stakes couldn't be higher. If courts allow artificial intelligence companies to freely ingest and repackage copyrighted material under the banner of innovation, it concentrates power dangerously. Only the wealthiest publishers might survive, emerging as walled gardens while smaller players perish. This would leave artificial intelligence summaries as the dominant way most people access information, even though those systems have no mechanism to fund the content's creation. The internet's promise of democratized knowledge could flip into a reality where knowledge is controlled by three or four summarizing engines powered by yesterday's dying media.
Solutions exist, but they require tech companies to abandon their digital colonialism mindset. Imagine a world where artificial intelligence tools work like radio stations once did, paying licensing fees through collective rights organizations whenever they summarize a story. Platforms could integrate micropayments into their interfaces, allowing users to seamlessly compensate publishers while enjoying the convenience of summarized news. These models already work in music and photography. But implementing them demands something tech giants hate. Sharing profits.
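To make the radio analogy concrete, here is a purely hypothetical sketch of what per-summary licensing could look like. Every name in it is invented, no such clearinghouse API exists today, and the flat two-cent rate is an arbitrary placeholder.

```python
# Purely hypothetical sketch of collective licensing for AI summaries,
# modeled loosely on how rights organizations meter radio airplay.
# The ledger, the flat rate, and all names here are invented.
from collections import defaultdict

RATE_PER_SUMMARY = 0.02  # placeholder royalty in dollars, not a real rate

class RoyaltyLedger:
    """Accrues fees owed to each publisher whenever an article is summarized."""

    def __init__(self) -> None:
        self.owed: dict[str, float] = defaultdict(float)
        self.events: list[tuple[str, str]] = []  # audit trail of (publisher, article)

    def record_summary(self, publisher: str, article_id: str) -> None:
        # A real clearinghouse would authenticate this call and set rates by contract.
        self.owed[publisher] += RATE_PER_SUMMARY
        self.events.append((publisher, article_id))

    def statement(self) -> dict[str, float]:
        """Snapshot of what the summarizing platform owes each publisher."""
        return dict(self.owed)

ledger = RoyaltyLedger()
ledger.record_summary("Local Tribune", "shelter-story")
ledger.record_summary("Local Tribune", "teacher-profile")
print(ledger.statement())  # {'Local Tribune': 0.04}
```

The hard part isn't the bookkeeping, which is trivial, as the sketch shows. It's getting platforms to agree that the meter should be running at all.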
There's a historical echo here worth noting. When Google launched twenty-five years ago, publishers initially feared its search results would cannibalize their audiences. Instead, it drove traffic their way. Search became the internet's front door, and publishers learned to game search engine algorithms for clicks. This lawsuit reveals how generative artificial intelligence flips that dynamic completely. Rather than sending users to news sites, summarizing engines keep them inside walled gardens where every query answered means one less reason to visit the source. It's the difference between a librarian recommending books and a clerk photocopying the best chapters so you never need to check them out.
As this legal battle unfolds, watch for three key developments. First, whether courts recognize retrieval-augmented generation systems as fundamentally different from conventional search engines in their economic impact. Second, whether artificial intelligence companies will establish licensing deals preemptively to avoid lawsuits, as some have started doing with image generators. Third, and most crucially, whether consumers start valuing human journalism enough to reject convenient plagiarism. Because in the end, this isn't just about copyright law. It's about whether we want a future where truth has guardians, or just aggregators.
My local newspaper arrived this morning with a front page story about overflowing homeless shelters and a profile of a retiring fourth-grade teacher. Before artificial intelligence, I might have skimmed the headlines online. Now I linger on each paragraph, knowing someone spent days reporting these stories, and that their work could become training data for some distant server farm. That's not nostalgia. It's resistance. Supporting real journalism with our attention and dollars remains the most powerful algorithm of all.
By Emily Saunders