Categories: Content Engineering

The death of the webpage and rise of AI-native content

The internet is undergoing its most fundamental shift since the rise of the World Wide Web in the 1990s. The shift is so significant that many content professionals don’t appreciate how radically it will change current practices. The webpage is dying, yet what will replace it has yet to be defined.

Organizations have been building webpages for three decades; only a few of us remember the internet before webpages. Webpages are all that most of us have ever known.

AI promises to make building webpages even easier. Some experts imagine AI will trigger an explosion of webpages, increasing their number manyfold. According to this thinking, AI will make it easier to build personalized webpages. We will finally realize the dream of having webpages designed for an audience of one.

That vision is one embraced by developers, for whom building webpages has been the major preoccupation.

But AI isn’t revolutionary because it makes doing the same thing easier. AI is disruptive because it changes user behavior — not developer behavior.

Already, we see evidence that visitor traffic to webpages is down significantly. Users aren’t that into webpages anymore.

It would be a mistake to assume that if webpages became more personalized, people would visit them more often. The website is a declining channel. There is little possibility that it will regain its historic status.

AI bots and agents can provide information more directly than a webpage. Publishers are working on how AI can:

  • Answer questions and provide updates
  • Book travel or tickets
  • Plan tasks
  • Find and buy the best product
  • Solve customer problems

These topics are the bread and butter of webpages. As attention spans get ever shorter, information must be delivered immediately to be used. Hardly anyone wants to scroll through a webpage anymore if they don’t have to. That’s especially true for users whose expectations have been conditioned by algorithmic feeds such as TikTok.

Some will doubt that webpages will be displaced. They believe that many people prefer webpages over other channels, or that webpages will remain necessary.

Skeptics imagine AI bots and agents will be just another channel, and that webpages will remain vital in the future.

In the short run, as the internet undergoes its dramatic transition, we can expect a mixed environment, with AI and webpages coexisting. Organizations will need to support both.

But in the longer term, webpages will become unnecessary for most topics currently addressed by websites. This could happen faster than many people expect.

Three factors will influence how quickly webpages disappear:

  1. How quickly user behavior shifts to the adoption of AI tools
  2. How effectively AI bots and agents can address both customer and enterprise tasks
  3. How readily organizations can support AI bots and agents without creating webpages

The first two factors are somewhat interrelated. User adoption depends partly on the quality of AI tools, in addition to their perceived convenience. The evolution of those tools will depend on practicalities relating to AI infrastructure, such as models, orchestration, and ecosystems. While uncertainties remain with both, the amazing strides already realized and the phenomenal investments underway suggest that progress on both will continue.

However, the third factor — webpage-free content for AI — remains largely unaddressed.

Unfortunately, the AI engineers developing tools have little expertise in content management. Their tools assume webpages, at least as an initial input. Webpages are published so they can be crawled, and the crawled text is then tokenized.

But it makes little sense to publish webpages when the webpage’s main audience will be bots seeking to crawl them. It would be better to create content in a format that AI tools can use without needing to convert it.

AI methods and protocols require access to information in machine-readable formats. While they process human-readable text, they also rely on parameters that convey information about the text.

What’s been missing is a definition of what constitutes AI-native content. So far, most efforts have been focused on retrofitting webpage content so that AI tools understand it. Examples include:

  • Creating parallel llms.txt files
  • Adding schema.org structured data to webpages to help orient AI bots (see the sketch after this list)
  • Training bots to understand XML tags, such as the NISO Standards Tag Suite, used to exchange or assemble content that becomes webpages
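
To make the retrofitting concrete, here is a minimal sketch of the second approach: schema.org structured data expressed as JSON-LD. The property choices below (an Article with a headline, author, and subject) are illustrative only, not a recommendation for any particular page type.

```python
import json

# A minimal JSON-LD block describing an article, using schema.org vocabulary.
# The specific properties chosen here are illustrative; real pages would use
# types and properties that match their actual content.
article_metadata = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to return a purchased item",
    "author": {"@type": "Organization", "name": "Example Corp"},
    "datePublished": "2025-01-15",
    "about": {"@type": "Thing", "name": "product returns"},
}

# Embedded in a page as a <script type="application/ld+json"> element,
# this gives crawling bots explicit signals about what the page covers.
print(json.dumps(article_metadata, indent=2))
```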

None of these approaches is truly AI-native, because they still presume the creation of webpages before making content available to AI tools.

Once legacy webpages have been made AI-ready, the focus will shift to how to create new content efficiently — how best to create AI-native content.

Some original content can be database-generated. But much narrative content will still require editorial oversight. Writers will need to decide on the messages to use, the emphasis of information, and the best phrasing. And some new content will be unique in the sense that it isn’t derived from prior content, and must be drafted by humans.

We are still missing the content authoring and management tools that support the development of AI-native content. Human writers need guidance on what bots need, so that bots can make good decisions and don’t get confused. Bots require predictability: assurance that the information needed to address a question or task is available.

The current approach of creating more webpages and expecting bots to untangle them and find what’s needed won’t be sustainable. Bots are thrown off by duplication. And crunching through repetitive webpages wastes time, money, and environmental resources.

We have many of the pieces to build an AI-native content development system. (I’m not calling it a CMS, since CMSs are intrinsically linked to the website era we are leaving.)

What will be needed is to combine:

  • AI editorial writing tools such as prompts, guidelines, and information QA
  • Schemas to shape message fragments, roles, descriptions, and provenance information, building on the granular structures developed by the translation industry (sketched below)
  • Connection points to protocols for agents and other databases
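
As a rough sketch of the second ingredient, a message fragment in such a system might carry its role, description, and provenance alongside the text itself. The field names below are hypothetical, meant only to suggest the shape, not drawn from any existing standard:

```python
from dataclasses import dataclass, field

# Hypothetical shape for an AI-native content fragment. All field names are
# invented for illustration; no existing schema is being quoted here.
@dataclass
class MessageFragment:
    text: str                     # the human-readable statement itself
    role: str                     # e.g., "claim", "evidence", "instruction"
    description: str              # what the fragment is about, for retrieval
    source: str                   # provenance: where the statement originated
    audience: str = "general"     # intended reader or agent
    tags: list[str] = field(default_factory=list)  # associative hooks

fragment = MessageFragment(
    text="Items can be returned within 30 days of purchase.",
    role="claim",
    description="Summary of the returns policy window",
    source="returns-policy-v4",
    tags=["returns", "policy"],
)
```

Fragments shaped this way could be handed to bots and agents directly, without a webpage ever being assembled.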

AI-native content will be very different from the content of the webpage era. Content creators who are comfortable thinking in terms of small pieces will be well-positioned to make the transition.

— Michael Andrews

Categories: Content Experience

The context of content

As content becomes increasingly fragmented and modularized, its context gets lost. Many people advocate for understanding the context in which content is used, but they have different ideas about what the context of content represents. Current technological approaches to managing content pieces don’t address the full range of contextual dimensions.

Distinguish the delivery and the discovery contexts

Much discussion of the content context revolves around what the user is seeking. The concern is to get the right content to the user when they seek it. I refer to this as the delivery context.

The delivery context is about matching what is known about the user with the dimensional variables of the content. For example, we ideally want to know who the user is, what they know and have done already, their goal, and so on. With that information, a publisher can select appropriate content according to the topic, level of detail, formats, and perhaps even messaging.
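
As a minimal illustration of that matching, imagine a publisher that stores content variants keyed by topic, level of detail, and format, then selects against a simple user profile. All the structures here are invented for the example:

```python
# Hypothetical sketch of delivery-context matching: pairing what is known
# about the user with the dimensional variables of the content.
content_variants = [
    {"topic": "returns", "detail": "summary", "format": "text"},
    {"topic": "returns", "detail": "step-by-step", "format": "text"},
    {"topic": "returns", "detail": "step-by-step", "format": "video"},
]

user_profile = {"topic": "returns", "knows_basics": True, "prefers": "video"}

def select_variant(variants, profile):
    """Pick the variant whose dimensions best fit the user's context."""
    detail = "step-by-step" if profile["knows_basics"] else "summary"
    for variant in variants:
        if (variant["topic"] == profile["topic"]
                and variant["detail"] == detail
                and variant["format"] == profile["prefers"]):
            return variant
    return None  # a real system would fall back to default content

print(select_variant(content_variants, user_profile))
# -> {'topic': 'returns', 'detail': 'step-by-step', 'format': 'video'}
```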

An entire industry of content orchestration has emerged to develop insights and practices to address the user’s delivery context. It’s by no means a simple problem to solve, but it is one that promises great profits to marketers and others who want to push content to customers.

But another contextual dimension of content gets less attention: the discovery context. Users aren’t always waiting for the right content to find them. In many cases, a stream of predetermined content is antithetical to what they seek. They need to define their own journey to discover what they should know more about.

Discovery supplies the context of meaning

Content does not always convey its meaning clearly, especially when statements are skimmed or read in isolation. Users may not understand something about the content, or they may read meanings into the content that aren’t explicitly stated. The discovery context concerns what users may want to know beyond the statement seen.

As content is increasingly decontextualized — appearing as snippets rather than as long-form articles — users must discover the missing context themselves. They must supply explanations beyond what is conveyed by the string of text.

The discovery context consists of three dimensions:

  1. The context of collective understanding
  2. The authorial context
  3. The associative context

Collective understanding is the layer of knowledge the author might presume the reader knows, whether or not that is true for a specific individual reader. A statement might refer to people, places, dates, metaphors, or other things that are not described, only referenced. The reader is expected to understand these references, or to look them up if they don’t.

The authorial context refers to the broader context from which a statement was lifted. Authors can be quoted out of context. Bots pull snippets from source material or paraphrase it. The danger is that such text selections end up vague or misleading.

Even when a text snippet is an honest summary of what the author wished to convey, it may not be clear what point they were trying to make with the statement. Was the statement a claim or assertion, a warrant or evidentiary rationale, or the grounds or justification for their argument? In other words, what was the role of the statement, and what did the author assume the reader already knew when they made it?

The associative context concerns the broader context in which the user evaluates statements. What else is similar to this statement? How does it fit together with other statements?

The associative context becomes important as users rely on content that’s abstracted from its original source. They utilize snippets of content that have been compiled by others or by themselves. The associative context is a defined layer of curation, collecting related items together. Such curation provides meaning to users by allowing them to recontextualize the content fragments.

A simple example of an associative context is the highlights from a book. These highlights can be kept together to recall the key points of a book, but they can also be combined to compare how different books and authors discuss similar or related topics. The snippets by themselves convey limited information, but collectively they tell much more.
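
A small sketch of that recontextualization: highlights from different sources, tagged by topic and grouped so a reader can compare how different authors treat the same theme. The data is invented for illustration:

```python
from collections import defaultdict

# Invented highlights from different books, each tagged with topics.
highlights = [
    {"source": "Book A", "text": "Attention is a finite resource.",
     "topics": ["attention"]},
    {"source": "Book B", "text": "Feeds are engineered to capture attention.",
     "topics": ["attention", "feeds"]},
    {"source": "Book A", "text": "Context shapes how statements are read.",
     "topics": ["context"]},
]

# Group snippets by topic: an associative context that lets the reader
# recontextualize fragments by collecting related items together.
by_topic = defaultdict(list)
for highlight in highlights:
    for topic in highlight["topics"]:
        by_topic[topic].append((highlight["source"], highlight["text"]))

for topic, items in by_topic.items():
    print(topic)
    for source, text in items:
        print(f"  {source}: {text}")
```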

Discovery remains undersupported

Although there is an evident trend in content toward providing direct “answers” to users, it is also clear that such approaches can’t be “zero-shot” ones, where users must settle for the predefined answer the bot offers. Digital content is becoming more conversational and dialogical, allowing users to ask follow-up questions and get clarification.

Bots offer opportunities for discovering information, but the situation currently remains fragmented. Generative AI seems split into global solutions that can supply information relating to collective understanding (but not specific sources) and localized solutions that can answer questions about specific sources (but not general knowledge). To a large extent, the split seems driven by vendor advocacy of preferred models and technologies (big vs. small models, knowledge graphs vs. vector RAG): debates that interest only engineers, not ordinary users.

Users have limited opportunities to build their own knowledge base and define their own associative context with these tools, which largely lack memory of what users have told them in the past.

Rather than expect a single technical solution to solve everything, we would be better served by having the freedom to compose our own suite of tools. Ebook devices provide one inspiration: they allow users to add their own dictionaries, notes, and highlights, and export snippets elsewhere. Google’s NotebookLM paradigm also points to ways to bridge local and global capabilities. Eventually, we may have many AI capabilities built into our browser.

Personal knowledge management may eventually succeed the list-making technologies of personal information management. Before we feel comfortable delegating tasks to AI agents, we will need to be confident we understand what we want and what’s available.

— Michael Andrews