Categories
Content Engineering

How will bots see your content?

Your customers aren’t that into your website anymore. Most website publishers have seen traffic decline as users query bots and bots supply answers. Bots generate few clicks to web pages, and the proportion of referral clicks seems to be falling.

Web publishers are aware of the existential threat they face. So far, they’ve tried to make themselves more lovable for bots. They scheme to get noticed by bots (GEO – generative engine optimization). Or they try to make their pages “friendlier” for bots (Google’s WebMCP is the latest example). The legacy thinking still frames the problem as one of visibility — getting noticed in a crowd.

Yet bots aren’t people, and don’t need to be wooed. The old psychology of wooing is no longer relevant. If bots need something, they will take it from your website, whether you invite them or not. In many cases, they will take content even if you don’t want them to.

The problem websites must solve now is how to ensure bots extract the right content from your site. If your organization cares about the accuracy and relevance of what bots provide, your existing HTML content, built for web browsers and human surfers, isn’t what bots need. JavaScript, the foundation of most websites, is a liability for bots.

AI platforms are evolving quickly. They are pivoting away from indiscriminate web scraping for “training” and towards RAG, where they search first for information before generating answers. AI platforms have also embraced the Model Context Protocol (MCP) standard, which, when enabled, allows them to access enterprise content directly. Already, third-party MCP platforms such as Scite and Tollbit have emerged to connect content publishers with AI platforms.

Publishers will continue to publish webpages for human readers, but they need to ensure that AI platforms access the right content for bot users. The best practices for doing this are still emerging, and several initiatives are underway to define protocols and standards.

What’s becoming apparent is that MCP will play an important role in controlling bot access and content governance. The diagram below illustrates a potential content pipeline for a scholarly publisher. A similar pipeline might be adopted by a website publisher — but some additional steps are needed to transform HTML-centric content into bot-ready content.

Example pipeline. Source: Scholarly Kitchen

How are publishers getting ready? Let’s look at how Tollbit helps web publishers. Tollbit works with the Associated Press and other publishers to make their content ready for AI platforms.

The first task is to “clean” the web content to remove material that’s not relevant or canonical. This can be done through DOM filtering to exclude certain classes of content, such as navigation text, promotional assets, or customer comments.
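A minimal sketch of DOM filtering using only Python’s standard library is below. The excluded tag and class names are hypothetical examples of this kind of rule, not Tollbit’s actual configuration, and the parser assumes well-formed HTML:

```python
from html.parser import HTMLParser

# Hypothetical exclusion rules for non-canonical material.
EXCLUDED_TAGS = {"nav", "aside", "footer", "script", "style"}
EXCLUDED_CLASSES = {"promo", "comments"}
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}  # no closing tag

class ContentFilter(HTMLParser):
    """Keeps body text, drops text inside excluded elements.

    Assumes well-formed HTML: every non-void start tag has a matching end tag.
    """
    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside an excluded subtree
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self.skip_depth or tag in EXCLUDED_TAGS or classes & EXCLUDED_CLASSES:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.parts.append(data.strip())

def clean(html: str) -> str:
    f = ContentFilter()
    f.feed(html)
    return " ".join(f.parts)
```

Production pipelines typically use a full DOM library for this, but the principle is the same: walk the tree and discard subtrees that match exclusion rules.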

Additional filtering can be done by excluding pages or directories that are procedural or administrative rather than substantive in focus.
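Directory-level exclusion can be as simple as a path-prefix check. The prefixes below are hypothetical examples of procedural or administrative sections:

```python
# Hypothetical path prefixes for procedural/administrative pages.
EXCLUDED_PATH_PREFIXES = ("/legal/", "/careers/", "/account/")

def is_substantive(url_path: str) -> bool:
    """True if the page should be exposed to bots under these rules."""
    return not url_path.startswith(EXCLUDED_PATH_PREFIXES)
```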

Next, the content should be transformed by removing clunky HTML tags to convert the content into a bot-readable format. Many organizations opt to convert content into Markdown, which preserves heading hierarchies (useful for bots) while stripping away extraneous markup that bots don’t need.
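A minimal stdlib-only sketch of such a conversion is below. Production pipelines typically use a dedicated library (such as markdownify or html2text), but the principle is the same: keep the heading hierarchy, drop the markup:

```python
from html.parser import HTMLParser

class MarkdownConverter(HTMLParser):
    """Minimal HTML-to-Markdown sketch: handles headings and paragraphs
    only, and flattens inline markup (em, strong, links) to plain text."""
    BLOCK_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished Markdown blocks
        self.buffer = []   # text fragments of the current block
        self.prefix = ""   # "# ", "## ", etc. for headings

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            # h1 -> "# ", h2 -> "## "; paragraphs get no prefix.
            self.prefix = "#" * int(tag[1]) + " " if tag.startswith("h") else ""
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self.buffer:
            self.blocks.append(self.prefix + " ".join(self.buffer))
            self.buffer = []

    def handle_data(self, data):
        if data.strip():
            self.buffer.append(data.strip())

def to_markdown(html: str) -> str:
    c = MarkdownConverter()
    c.feed(html)
    return "\n\n".join(c.blocks)
```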

Bots benefit from metadata, but need help identifying it. The content transformation process should address metadata that’s not visible to human readers. This includes descriptive metadata (such as schema.org) about the content for external systems like search engines, and internal administrative and technical metadata (such as geolocation coordinates) used for web page delivery. This conversion, known as re-serialization, makes the metadata queryable. The metadata can be “hydrated” into the bot’s payload.
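One way to make such metadata queryable is to extract schema.org JSON-LD blocks from the page and carry them alongside the cleaned text. A minimal sketch follows; the payload shape is invented for illustration, and the regex assumes the script tag is written exactly in this common form:

```python
import json
import re

def extract_json_ld(html: str) -> list:
    """Pull schema.org JSON-LD blocks out of a page.

    Assumes scripts appear as <script type="application/ld+json">...</script>;
    a production extractor would use a real HTML parser instead of a regex.
    """
    pattern = r'<script type="application/ld\+json">(.*?)</script>'
    return [json.loads(m) for m in re.findall(pattern, html, re.DOTALL)]

def hydrate_payload(body_markdown: str, html: str) -> dict:
    """Re-serialize cleaned text plus its metadata into one bot payload
    (hypothetical payload shape)."""
    return {"content": body_markdown, "metadata": extract_json_ld(html)}
```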

AI platforms, ever motivated to increase the sophistication of their products, will take advantage of these content enhancements.

Getting content “bot-ready” will become crucial as AI platforms expand their agentic capabilities. Publishers will need to define access rights and permissions. What materials can bots read, re-publish, or process?

Publishers will shape these affordances through both explicit statements and implicit decisions that influence the ease with which bots can perform actions.

— Michael Andrews

Categories
Content Operations

Developing a new mental model for AI content technology

This post is motivated by irony and frustration. AI is supposed to make work simpler. Yet disagreement about what AI does and how it works has become more common. 

Prospective users of AI technology are bewildered as the tools’ capabilities become more elaborate, described by ever more specialized jargon.  AI tools invoke a confounding array of metaphors: contexts, arguments, commands, plans, recipes, skills, subagents, and negotiations. There’s no common information architecture defining AI capabilities. Vendors and initiatives propose their own terminologies as they introduce new features and options.  I haven’t yet come across a vendor talking about AI wallets, but I won’t be surprised to encounter that term soon.

AI-managed content does represent a new paradigm, but we don’t yet have a shared mental model of how it works. 

What is clear is that prior mental models for content management aren’t aligned with the new paradigm.

Evidence of a paradigm shift in content technology

Advanced web content technology has, for over 25 years, been dominated by two paradigms, both focused on automating content.  The first, structured content, was introduced by IBM’s development of the XML-based DITA and updated for the API era by a wave of headless CMS vendors ten years later.  The second, the Semantic Web, debuted with the development of RDF standards at W3C and was later operationalized by Google’s championing of the schema.org structured data vocabulary for web content. Only a small subset of writers — self-described content engineers — worked with these technical details.

The debut of ChatGPT in 2022 introduced a new paradigm centered on natural-language prompts with LLMs.  Language, rather than code, took center stage. Overnight, every writer was directly engaged with the latest content technology.

Momentum has quickly shifted toward natural language technologies: LLMs to generate content and agentic AI to coordinate workflows. IBM long ago dropped DITA, but has lately reinvented itself as an agentic AI powerhouse with soaring stock values. Google has downsized its schema.org operations and reinvented itself as a leader in LLMs and cloud AI. 

The vanguard of content technology is now LLMs, not content automation. Two venerable technology giants, pioneers of the older paradigms, have pivoted dramatically (and successfully) to the new paradigm. 

How LLMs change content structure and semantics

Computers depend on instructions to do tasks. Text and code are two distinct ways of representing instructions.  Because LLMs can understand plain-language instructions, they provide an alternative to computer code to direct computers.

By relying on language, rather than code, to develop content, LLMs are far more human-centric than machine-centric automation.  That means that LLMs approach the structure (organization) and semantics (meaning) of a piece of content in a way that’s similar to how humans do. LLMs treat text as knowledge.

The human-centric and the machine-centric approaches to content involve distinct mental models. 

A human’s mental model of content involves editorial structure and the meaning of words. Writers build and draw on resources to help them develop new content.  These include examples of other content, templates, style guides, message patterns, and so on.  Because LLMs process content as words, they can use these same resources when generating content – but at scale. The foundation is what we will call the “text base.”

The mental model that engineers have when dealing with machines is vastly different, centered on the “code base.”  Engineers use a range of tools (databases, code procedures, APIs, etc.) to manipulate content. Because these tools don’t understand plain language, the content is translated into machine-interpretable objects, such as a Document Object Model and entities. Engineers’ mental model is to treat content as data.

As LLM-based technologies continue to develop, we see increasing overlap between what LLMs can do and the tasks handled by code-centric technologies. LLMs can generate a document structure based on existing document examples rather than relying on code-based assembly logic.  LLMs can recognize the meaning of words without needing to check a schema entity reference.

This means that LLMs are a partial substitute for traditional code. LLMs can assemble text and evaluate words, tasks previously handled by conventional code.

LLMs handle these processes very differently from traditional content automation approaches, which brings both benefits and drawbacks. And while LLMs look at text in ways that are analogous to humans, their processes are quite different.

The most significant difference between humans and LLMs, on the one hand, and content automation, on the other, lies in their scope of action. Humans and LLMs can generate novel content, while traditional content automation can’t.  Humans can give LLMs open-ended instructions to create something that isn’t a variation of something past, but a distinctively new output that takes inspiration from many sources and ideas.  Traditional content automation can only accept closed-ended instructions that generate routine decisions. Whether this determinism is a virtue depends on the use case. 

Different ways of looking at parts and wholes

Humans, LLMs, and traditional code approach content at different levels of granularity.  Humans absorb information based on prior knowledge and Gestalt.  LLMs simulate these approaches through vector distance.  Traditional programming, by contrast, evaluates through decomposition, where each item is assessed within a procedural routine of varying complexity; sometimes routines are short, and at other times they call external routines.
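Vector distance can be illustrated with a minimal cosine-similarity function. In a real pipeline the vectors would be model-produced embeddings with hundreds of dimensions; the toy vectors here just show the mechanics:

```python
import math

def cosine_similarity(a, b):
    """How LLM pipelines approximate relatedness: embedding vectors that
    point in similar directions represent semantically similar text."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

Identical directions score 1.0; orthogonal (unrelated) directions score 0.0.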

For many years, content professionals have used the term “content chunking,” but this metaphor can gloss over differences.  Humans chunk content based on units of recognition: grouped information they perceive and recognize as related. Traditional code encodes structure into content to support computer operations.  The encoded content structure may not match the cognitive structure most humans perceive.  LLMs also rely on chunking to break text into segments that reveal word context.  How LLMs chunk text (by sentence, paragraph, or document, for example) will influence the performance of the LLM.
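As a sketch of how chunk granularity differs, the two naive stdlib-only splitters below produce very different segments from the same text; real pipelines use tokenizer-aware splitters, but the granularity trade-off is the same:

```python
import re

def chunk_paragraphs(text: str) -> list:
    """Coarse chunks: split on blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_sentences(text: str) -> list:
    """Fine chunks: naive split after sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
```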

The mental models of writers and engineers differ in whether they think in terms of narratives or data. Narratives tend to be holistic, while data is atomic.  This divergence has implications for how ambiguous the context may be to different parties.  

Ambiguity is not an absolute quality that’s always bad, but a relative, contextual issue. A statement could be clear to an insider with contextual knowledge but baffling to an outsider who lacks that knowledge.  

Eliminating potential ambiguity can be costly when it results in redundant information that isn’t needed by consuming parties. Determining whether content is unambiguous depends on knowing the audience. Misjudging the audience results in either instructions that are overspecified or underspecified. 

Insiders understand information and concepts that outsiders don’t. The relevance of text depends on a determination: Is the content intended for people with insider knowledge, or should it assume that outsiders with no prior knowledge will rely on it?

Narratives often assume prior, insider knowledge.  Data also needs context, but it will often be more explicit.  The examples below show narratives and data that presume insider or outsider knowledge. 

Insider knowledge (ambiguous to a random reader, an LLM, or an autonomous IT system):

  • Narrative-focused: a complex sentence with a referent to something said earlier, or assumed to be understood by the intended reader
  • Data-focused: pair values (field name and value) where the reader, LLM, or machine doesn’t understand the meaning of the field and/or the value (a “mystery schema”)

Contractual statements (unambiguous to outsiders):

  • Narrative-focused: a simple declarative sentence with a clear subject, verb, and object, as in many legal contracts
  • Data-focused: structured data based on a declared, referenceable schema and resolvable entities or values
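To make the data-focused contrast concrete, here is a “mystery schema” record next to one based on a declared schema.org vocabulary. The field names and values are invented for illustration:

```python
# Insider-only "mystery schema": what do "stat_cd" or "amt_x" mean?
# Only someone with contextual knowledge of this system can say.
mystery_record = {"stat_cd": 4, "amt_x": 120}

# Declared, referenceable schema: every field and value resolves to a
# public definition any reader, LLM, or machine can look up.
declared_record = {
    "@context": "https://schema.org",
    "@type": "Offer",
    "availability": "https://schema.org/InStock",
    "price": 120,
    "priceCurrency": "USD",
}
```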

LLMs, like people, may not understand a statement outside of its context. But it’s demanding for everyone, writers and readers alike, to develop and consume context-free text. Legal contracts are tedious because they attempt to define every term. Such documents must stand on their own.  

Yet, contrary to some engineers’ beliefs, the root problem is not that language is inherently ambiguous, whereas data is not.  Data can also be ambiguous. Machines often can’t understand the structure of content or the meaning of data without an engineer’s oversight.  The API era assumed that engineers would write queries directly after reading and understanding API docs. If they couldn’t find the answer in a README, they would post a comment or question in GitHub or Slack.

LLMs can’t understand data schemas or interpret the meaning of data fields and values when they lack context about what that information represents. This problem places a greater burden on documentation for APIs, data schemas, IT systems, and protocols.  Can LLMs access this information, understand its relevance, and use it to guide how they perform tasks?

Another issue with instructions is precision.  Developers assume that code is more precise than language.  But an instruction expressed in natural language can still be deterministic. Highly prescriptive instructions can be written as text, though they are prone to being verbose. 

Most recently, agents have emerged as brokers between people and machines. Can they make content work more frictionless?

How agents deal with people and machines

AI agents have cemented the new content technology paradigm by making coding in plain language possible. Using plain words to change outcomes at scale is both exciting and problematic. 

Agents answer many problems with the IT Tower of Babel, but also create new ones. They promise to act on our behalf autonomously, making decisions and taking actions. They raise the question: who are they working for? 

Some agents are for writers. Yet most agents are for developers and deal with issues that writers should not need to worry about.  Writers should be wary of the suggestion that everyone will now become an engineer and be responsible for debugging process glitches.  It’s more likely that agents will elevate some non-writer roles into content contributors.  

Even though agents rely on natural language, only a subset of agents handle content management activities such as evaluating content performance, checking quality, and preparing content for distribution.  Most agents handle the arcane details of business processes and IT systems that fall outside the scope of content professionals’ responsibilities.  When AI engineers talk about “context,” they aren’t necessarily talking about context that relates to content, but rather to business process and IT system context. 

Agents support many kinds of goals. It’s best to break down agents by whom they interact with and the roles they play. 

Agents act as an intermediary between humans and machines. They can be human-directed, in which an individual specifies what the agent should or should not do, or autonomous, in which the agent makes decisions independently of explicit instructions. 

Agents interact with:

  • humans (humans-to-agents or H2A)
  • other agents (agents-to-agents or A2A)
  • machines (agents-to-machines or A2M)

The diagram below illustrates the kinds of interactions. 

Agents rely on plain language instructions, but the scope of those instructions varies widely. A key issue is how well the agent is matched to the requests it receives and the responses it must supply.

H2A instructions differ from A2M ones.  Writers aren’t likely to instruct agents to process files or invoke code routines, but engineers will often develop agents that do those things.

Agents interact in a chain. Writers will craft prompts that become agents and read the agents’ outputs. Agents can have conversations with other agents.  They can instruct applications and backend systems to execute tasks, then evaluate and interpret the results, and create a message indicating next steps. 

Writers might write a prompt telling an agent to do something general (find the best-performing blog post) and expect the agent to figure out how to do it.  Alternatively, the writer might write a prompt that includes a detailed procedure, telling the agent where to access information, the order for tasks, and the criteria to use. 

When writers use procedural instructions, they may need to understand the specifics of the options available. Some agents log in to SaaS applications and act as users. In such cases, the writer can base their prompt on the SaaS application’s UI, noting which options they want the agent to access.  

But many agents are not acting as proxy users of SaaS applications.  They are either coordinating with other agents (A2A) or accessing backend systems and data that have no UI and whose organization isn’t self-evident (A2M).  These kinds of agents must be created by developers because they depend on opaque knowledge that ordinary users (such as writers) or LLMs can’t discover on their own.

Source: Goose

The example above illustrates how agents can break a task into subtasks, each of which has dependencies on various data and systems. 

Now let’s return to the problem of ambiguity and ignorance of context.  Has the writer expressed things clearly?  Have they left out important information?  Have they included too many prescriptive details that might confuse?  

Ambiguity can relate to word meaning, but also to systems’ responsibilities.  Machine context is frequently a bigger source of ambiguity than confusion over wording. Backend issues are the responsibility of the engineer, not the writer.  Agents don’t know how to talk to databases or APIs.  They are unaware of protocols and their implicit assumptions, or of interoperability requirements. They aren’t prepared for various situations. They can’t cope with edge cases. 

These failures have little to do with how clearly or precisely writers draft prompts. Rather, they reflect inadequate engineering testing and overambitious automation.  Agents are given “skills,” but those skills don’t match the environment.

Agents can fail for multiple reasons.  They may crash because they are unable to complete a step.  Or they may return the wrong response because their decision-making was flawed. Those decisions may be made using procedural code or LLM “reasoning.” 

Agents promise to remove tedious work, but getting them to deliver that work can be stressful.

Having agents perform piecemeal tasks is more likely to succeed than complex, interrelated ones, but piecemeal tasks are less useful. 

Giving agents directed tasks may offer more control over decisions but might increase the likelihood of crashes compared with giving agents autonomy to decide how to respond to a request.  

What to delegate to agents

Large consultancies and systems integrators imply that AI agents are your new employees and teammates.  But it is not obvious what role they have.

Are AI agents like an intern on whom you foist a backlog of non-urgent tasks?  Are they like a coach or mentor who can advise you, filling the role you wish your boss did but never has time to?

Agents are a blank slate; organizations must decide how to use them.

Given the intricacies of agents, how much oversight do they need, and when should they be involved?  

Who will be delegating work to whom?  Do people always task agents, or will agents sometimes task people?

There are no simple answers to these questions, because they involve many variables and are subject to revision as people and bots learn from each other.

It’s useful to look at possibilities through various lenses:  

  • Agents that complete tasks faster
  • Agents that complete tasks better
  • Agents that complete tasks more cheaply
  • Agents that complete tasks that are not immediately relevant

Agents are often faster than humans. But not always, if they lack critical information or are poorly guided by prompts.  The speed of agents is most noticeable on large procedural tasks that involve many steps or batch actions. Many such tasks are unrewarding to people, who are happy to delegate them to agents.  They are considered “low value” because they don’t require special thought, even though they are important. 

Many writers hope agents will handle the tedious, time-consuming procedural tasks so they can focus on the important stuff.  But agents can play other roles, too.

Another possibility is to use agents to perform tasks that they are better at.  A common example is proofreading: while agents can make mistakes, they often detect small errors that would otherwise go unnoticed. Yet agents can also address higher-value tasks.  They can make decisions about the best information to incorporate in content or even the most relevant topics to write about. Because they can scan across high volumes of content and information, they can spot opportunities that wouldn’t be apparent to an individual writer.

Even though agents can be better at some tasks, they are not poised to replace the judgment of writers. Yet they can perform many tasks more cheaply than manual work or custom automations.

The cost-effectiveness of agents is a hot topic, as LLM use becomes a noticeable expense in organizations.  This issue has brought token cost efficiency into focus.

Token costs are reduced by eliminating verbosity. Shorter prompts and limiting the scope of relevant text to access lessen costs in many cases. But cheaper agents may be less flexible. Over-pruning – removing too much context – can be counterproductive, as agents struggle to match instructions to available resources. Token efficiency involves balancing the precision of outcomes, costs, and flexibility.  
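The arithmetic behind token budgeting can be sketched simply. The four-characters-per-token figure is a common rule of thumb for English text, and the price is a made-up illustration; real systems use the model’s own tokenizer and published rates:

```python
def rough_token_count(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text.
    Real systems use the model's tokenizer for exact counts."""
    return max(1, len(text) // 4)

def prompt_cost_usd(prompt: str, price_per_1k_tokens: float) -> float:
    """Estimated input cost for one prompt at a given (hypothetical) rate."""
    return rough_token_count(prompt) / 1000 * price_per_1k_tokens
```

Halving a prompt’s length roughly halves its input cost, which is why verbosity and over-broad context retrieval show up directly on the bill.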

Finally, agents may take over more content-adjacent tasks.  For example, agents are now participating in meetings.  Many meetings are time-wasting for content professionals because most of the agenda isn’t directly relevant to their responsibilities. Agents may act as surrogates, telling content professionals what they need to know from an all-hands meeting, or providing a 30-second status update for a division-wide project check-in.

Agents don’t have a fixed role

AI agents are moving in many directions. How they will be used will vary according to the organization’s priorities and maturity.  

Some will expect the agent’s output to be used to generate content, for example, by retrieving data to be incorporated into a narrative.  Others will see agents as inputs to another human-directed process by providing a status message indicating what has changed.  Still others seek to use agents to eliminate human involvement in content tasks as much as possible.

Given the diverging expectations for agents, it’s little surprise that content professionals have difficulty forming a clear mental model of how LLMs and agentic AI operate.  I hope this discussion helps make those contours more visible.

– Michael Andrews