My previous post on the demise of webpages and the need for AI-native content has elicited good feedback and questions. I wanted to elaborate more on how publishers will need to take greater ownership of AI applications as users visit webpages less and less.
Some questions concerned how consumers will access AI-native content. Many folks imagined that customers would access the content through a third-party AI platform such as ChatGPT, Google, Claude, Perplexity, X, or Microsoft Bing Copilot. That’s certainly possible, but it is not what I envision as the default.
The goal of AI-native content is for publishers to take ownership of their AI pipeline rather than delegate that responsibility to a third party. The result is first-party AI tools, where the process and outcomes are entirely under the control and supervision of the publisher.
In the current era, third parties such as Google scrape webpages, extract information, rewrite the content, and publish it themselves. Much of the resulting traffic now stays with Google rather than flowing to the publisher, which is why publishers' traffic is down.
But numerous risks are associated with the third-party extraction of webpage content. The major one is that the third party won’t represent the content in the same way that the original publishers would. The third party is interpreting your content based on their bot’s internal (often opaque) criteria.
No one will care more about your content than you will. What's good enough for a third party may, in some cases, be damaging for your organization. Consider how a third party might get its summary wrong, even if its technology is generally robust and popular with users:
- Leaving out information you or your customers would consider essential
- Using the wrong tone of voice
- Substituting words that have specific meaning to your customers
- Providing misleading information by drawing on similar products or different timeframes that aren’t relevant to the user’s needs
All these potential issues can be quality-checked, but only when the AI bot is overseen by a publisher who understands these nuances.
But today, even enterprises developing their own AI tools tend to rely on general-purpose third-party platforms with generic settings that purport to provide everything needed in a single package. The results, unsurprisingly, have been disappointing.
Few publishers have yet invested in the foundations necessary for AI-native content:
- a stable LLM controlled by the publisher that can be tuned if necessary;
- organization of resources according to their role in content generation;
- mappings to other resources the AI engine must access (for example, RAG and MCP connectors); or
- libraries of repeatable prompts, output patterns, and rule engines.
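As a rough illustration of the last of these foundations, a library of repeatable prompts and output rules could be as simple as a versioned store of templates plus checks that run before anything is published. This is a minimal sketch under assumed names (`PromptTemplate`, `PromptLibrary`, and the example rule are all hypothetical, not any particular product's API):

```python
from dataclasses import dataclass, field

@dataclass
class PromptTemplate:
    """A repeatable prompt with named slots the publisher controls."""
    name: str
    template: str

    def render(self, **slots: str) -> str:
        return self.template.format(**slots)

@dataclass
class PromptLibrary:
    """Publisher-owned store of approved prompts and output rules."""
    prompts: dict = field(default_factory=dict)
    rules: list = field(default_factory=list)  # callables: str -> bool

    def register(self, prompt: PromptTemplate) -> None:
        self.prompts[prompt.name] = prompt

    def check_output(self, text: str) -> bool:
        """Run every governance rule against generated text before it ships."""
        return all(rule(text) for rule in self.rules)

library = PromptLibrary()
library.register(PromptTemplate(
    name="product_summary",
    template="Summarize {product} for {audience} in under 80 words.",
))
# Example rule: phrasing the publisher never wants in generated output.
library.rules.append(lambda text: "best in the world" not in text.lower())

prompt = library.prompts["product_summary"].render(
    product="Model X widget", audience="first-time buyers")
print(prompt)
print(library.check_output("A compact widget suited to new users."))
```

The point is ownership: the templates and rules live with the publisher, can be versioned and audited, and apply regardless of which underlying model generates the text.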
With AI-native content, there are no webpages for third parties to crawl and misinterpret. Third parties can’t mislead customers because they are starved of the source material on which to base their summaries.
Instead, customers will get content directly from the source organization using first-party AI tools.
First-party AI will be a radical shift from past decades, when Google supplied answers directly and was always the first, and sometimes only, port of call. In the post-webpage era, users will interact with many AI bots, both directly and indirectly.
If your enterprise is an airline or a major retailer that customers use regularly, those customers will access your AI tools via an app. An infrequent or first-time customer may start with a traditional search, but instead of getting a full website, they will get a URL that is a portal to your AI tools.
It may also be possible for the publisher to directly supply AI-native content to third parties, such as Google or ChatGPT, as a feed. What's important is that publishers retain control over how AI-generated content is provided. That's unlike the current wave of licensing deals, where certain publishers grant permission for their content to be crawled by third parties in exchange for payment, and the third party assumes responsibility for generating the summary.
With first-party AI, publishers can gate access to content in terms of topics, details, and quantity.
Already, we see vendors such as Cloudflare offering "pay per crawl" tools that block AI platforms from using publishers' content unless those platforms pay a license fee for access. This kind of contractual arrangement can easily be extended to AI-native content. And the growing availability of AI connection protocols will make it possible to control access much as APIs do.
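As an illustration of API-style gating (a sketch, not any vendor's actual implementation; keys, tiers, and quotas here are all hypothetical), a publisher's feed endpoint could refuse unlicensed crawlers and enforce per-license quotas:

```python
# Hypothetical license registry and per-key usage counter.
LICENSED_PLATFORMS = {"platform-abc-key": {"tier": "basic", "daily_quota": 1000}}
usage: dict = {}

def authorize_crawl(api_key):
    """Return an HTTP-style (status, message) for a crawl request."""
    if api_key is None or api_key not in LICENSED_PLATFORMS:
        # 402 Payment Required is the status pay-per-crawl schemes use.
        return 402, "Payment Required: license this feed to access content"
    plan = LICENSED_PLATFORMS[api_key]
    used = usage.get(api_key, 0)
    if used >= plan["daily_quota"]:
        return 429, "Quota exceeded for today"
    usage[api_key] = used + 1
    return 200, "OK: serving AI-native content feed"

print(authorize_crawl(None))
print(authorize_crawl("platform-abc-key"))
```

In practice this logic would sit at the edge (a CDN or gateway), but the principle is the same: access to the content feed is a metered, contractual resource rather than an open crawl target.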
For high-value content and interactions, firms will want to steer customers directly to their AI tools, and they will limit third parties from intermediating these interactions.
But for lower-value content and interactions, firms may allow AI platforms limited direct access to their AI-native content. The publishers retain control over how the content is offered but gain wider exposure through the third-party platform’s reach.
For content that is entirely promotional in nature, firms may supply AI-native content to third-party platforms on a fee basis, paying the platforms to show this content in generic queries, similar to how search ads work today. Despite the reliance on the platform for visibility, the publisher retains control over how messages appear, instead of allowing third parties to decide for themselves.
AI-native content enables publishers to provide first-party AI experiences. Publishers can control many parameters of content to ensure that generated content aligns with publisher goals.
In my previous post, I mentioned the need for a new kind of schema to support AI-native content. This schema will be richer than a traditional content model or data model. It will allow the mixing of structured data within semistructured narrative (text, video, audio). It will describe recurring word patterns that must appear in an exact way, while allowing for adaptable text that need only conform in a general way to style or other governance guidelines. It will allow defined content variables to be referenced by prompts or agents. It may include factual rules with which generated statements must agree.
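To make this concrete, here is one hedged sketch of what a single schema entry might hold. The field names and the rendering helper are invented for illustration; a real schema would need a formal specification:

```python
# Hypothetical AI-native content schema entry, mixing fixed wording,
# adaptable text constraints, content variables, and factual rules.
schema_entry = {
    "topic": "return_policy",
    "variables": {"return_window_days": 30},   # referenced by prompts or agents
    "exact_phrases": [                          # must appear verbatim
        "Returns are accepted within {return_window_days} days.",
    ],
    "adaptable_text": {                         # may vary within governance rules
        "style": "friendly, second person",
        "max_words": 120,
    },
    "factual_rules": [                          # generated statements must agree
        "return_window_days == 30",
    ],
}

def render_exact_phrases(entry):
    """Fill publisher-controlled variables into the fixed wording."""
    return [p.format(**entry["variables"]) for p in entry["exact_phrases"]]

print(render_exact_phrases(schema_entry))
# → ['Returns are accepted within 30 days.']
```

Because the variables live in one place, updating the return window updates every prompt, exact phrase, and factual check that references it, which is precisely the kind of supervision a first-party AI pipeline needs.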
While we are still in the earliest days of this transition, I am impressed by how quickly language models have become commoditized and open-sourced, and how widespread RAG and MCP tools have become. Medium- and large-sized firms now have the opportunity to build first-party AI tools without outsourcing their customers' AI experience to third parties.
— Michael Andrews