Categories
Content Operations

Reusing ever-changing content in the AI era

Relevance is linked to change.  Any published content will need changes at some point to stay relevant.  Content management practices similarly need to be open to change to remain relevant. The time has come to rethink these practices.

This post examines how the concept of reusing content is changing with the rise of LLMs.  I’ll argue that these changes will necessitate a rethinking of practices such as content models and content structuring. They will prod organizations to abandon long-standing “best practices” around single sourcing and content reuse, but will ultimately simplify content development and make content more valuable. 

The fragility of single-source reuse and revision

For a long time, the advice about content reuse has been simple: just do it.  Reuse your content rather than creating duplicates, and update your content in one place.  Approaches such as DITA embodied this philosophy.

Reuse is encapsulated by the slogan “Don’t Repeat Yourself”, known as the DRY principle. Yet its surface simplicity belies its practical complications.

It’s hard to keep DRY.  Many large organizations are unaware of how often they repeat text online because their staff routinely copy and paste previous content when creating new content. Few employees draft content in their CMS, and even fewer bother to locate and use the original version in the repository. DRY content management is more of an ideal than a norm.

Copying and pasting your own content is perfectly legal under copyright law, but it can create problems if the repeated content becomes so inconsistent that various pages present factually conflicting statements. Automation can detect trouble spots. Software can help organizations find where they repeat text within pages or screens. Multiple code libraries can uncover text reuse. 
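As one illustration of how such detection can work, near-verbatim reuse is often flagged by comparing overlapping word sequences (“shingles”), a common technique in text-reuse tools. This is a minimal sketch, not the behavior of any specific library:

```python
def shingles(text: str, n: int = 5) -> set:
    """Break text into overlapping n-word sequences ("shingles")."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}

def jaccard(a: str, b: str, n: int = 5) -> float:
    """Jaccard similarity of two texts' shingle sets: 1.0 means identical wording."""
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0
```

Running `jaccard` across pairs of pages would surface spots where staff have pasted the same passage into multiple places, without anyone having to read every page.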

If you must repeat your text, then it makes sense to only have one copy of that text.  The intent behind single sourcing is sound. But putting it into practice depends on rigorous planning, processes, infrastructure, and buy-in. The experience of numerous organizations shows it’s a heavy lift. And it’s hard to fault enterprises for having trouble.

In practice, the rationale for single sourcing was never as straightforward as advertised. Content repetition shouldn’t always be consistent, or else you risk forcing a fixed text to fit all circumstances (as if written by a committee) rather than allowing circumstances to guide what wording is needed. Content changes won’t necessarily occur in tandem everywhere; an update won’t be applied everywhere simultaneously. 

The DRY attitude presumes you can anticipate the changes you will need to make in the future, which often isn’t realistic. Circumstances are sometimes sloshy and don’t change in an orderly manner. Sometimes you need to create new versions and variations to accommodate diversifying contexts; other times, a shift in context requires content changes so substantial that the revised content has drifted from its original intent.

The false dichotomy of single-source versus single-use content

Single-sourcing practices make an implicit assumption about content reusability.  Either the content is intended to be reusable and will be continuously updated, or it will never be reused and therefore has a short shelf life.  

Content is sorted into two types: either a single canonical version (single source) or a throwaway (single-use). Non-canonical content has no enduring value. 

The philosophy of single-source reuse and revision assumes a logical content model in which content is composed of recurring patterns of distinct pieces.  The content can be divided into chunks and strings that can be swapped out as needed. While some technical content is sufficiently formulaic to follow this behavior, most other content is not. 

Not all content changes can be reduced to discrete, predictable variables. Frequently, writers seek to reshape existing content rather than merely revise a few words. But the content model, by design, doesn’t allow them to make sweeping changes. It exists to enforce consistency in content and to prevent authors from improvising with it. 

The centrality of content modelling has been disrupted by the growing use of LLMs.  Many writers question the value of single-sourcing when LLMs can make on-the-fly changes of almost any kind. 

Content models bind content, while LLMs make content elastic and adaptable.  LLMs have exposed the limitations of structured content. 

Structured content practices that made sense in the pre-LLM era now look antiquated.  Structured content can facilitate simple, discrete changes and updates, but can’t, by itself, address the kinds of sweeping changes that LLMs are capable of. 

Exact repetition can be tedious

Reusing content can be beneficial when governance is paramount, such as for legal compliance purposes. Organizations don’t want inconsistent legal disclaimers, for example.

But too often, exact text reuse is lazy and unimaginative, and it hurts the user’s experience with the content. People don’t pay attention when they encounter the same message repeatedly. Airlines struggle to enliven their safety videos because they know passengers tune out messages they’ve already encountered.

Repeated text is evocatively described as boilerplate. It is standardized text you are not expected to read closely, if at all. It’s faceless text.

The problem arises when you repeat verbatim text that’s meant to express something original or novel, and you expect users to notice it.  In doing so, you treat the text like boilerplate. The result can sound like a tiresome advertising jingle. 

Reader attention is linked to a source’s credibility. In many situations, readers prefer content that reflects the perspective of a named author over content written by an anonymous corporate body. Readers expect more originality from individual authors. 

If people don’t have to read the content, then they need to want to read it. An author who repeats the same content again and again in blog posts won’t gain much traction. 

The growth of the subdiscipline of content design in enterprise applications shows that even anonymously created content should embody personality, delight users, and express empathy. Content of all kinds must hold the reader’s attention.

Academic content, like corporate content, is serious business. Academics prioritize originality in their content. Novelty attracts attention. Researchers don’t accrue influence or reputational points for republishing the same material repeatedly. They must publish new material.

There’s a taboo against what’s called “self-plagiarism”: reusing text you’ve already published elsewhere without acknowledging it.

But when is it appropriate to repeat statements made previously? Even the US government is interested in this issue. The National Science Foundation (NSF) funds substantial research and is interested in identifying which content in reports is new versus old. NSF funded the Text Recycling Project to develop guidelines on when it’s appropriate to reuse prior content.

The Text Recycling Project acknowledges that reusing text is sometimes necessary and even desirable, particularly when explaining prior work to provide background for new information.  If the exact same text is reused, such as a table from an earlier study, the original source should be cited so that readers understand what’s legacy information and what’s new. 

The Text Recycling Project developed a taxonomy of recycling. Its terminology is specific to academic researchers who release materials informally before formal publication, but it is suggestive nonetheless.

Source: Text Recycling Project

What’s worth noting is the distinction drawn between verbatim copying (duplication) and other kinds of reuse involving transformation:

  • Adaptation (shifting intent)
  • Generative recycling (reusing facts)
  • Developmental recycling (releasing in new ways)

Reusing doesn’t have to result in duplication. It can entail making changes, building new items from old ones, and even using prior work as a template for substantially new work.

Exact repetition is not the only, or even the best, way to reuse content.

The emerging paradigm: repeat yourself, but with variation

LLMs have reshaped possibilities. Both writers and readers can now transform existing content in numerous ways. 

Since the launch of the generative AI era, authors have had tools to modify content. Now readers are beginning to gain these capabilities, which are providing new insights into what readers want. One such tool is Elicit, which offers generative AI features to help scientists discover published research.

Elicit offers community-developed prompts for users, such as “Explain it Like I’m 14.”  That prompt represents a specific intent that diverges from how the content was presented in the original source.  LLMs let readers change the intent of a source to reflect their specific interests.

Source: Elicit

Elicit also offers tools that let readers convert text into other formats or discover related material.

Source: Elicit

Tools like Elicit show that readers want content in ways that differ from the default presentation, with:

  • Different focus (broader, narrower, adjacent)
  • Different pacing (incremental updates)
  • Different media (diagram, video)

As readers increasingly use such AI tools, the role of editors is called into question. If editors are still needed, they must add value beyond what readers can achieve on their own. 

Writers must assume a more editorial role, focused on developing a body of content rather than individual articles. They must actively leverage AI tools to shape the editorial perspective when developing content for readers. 

As I have argued previously, content professionals shouldn’t delegate editorial control to third-party AI platforms; they need to take ownership of these tools so that outputs faithfully reflect their organization’s editorial perspective. Unless they do so, AI-generated outputs will be mind-numbingly dull and won’t be read.

AI tools, on their own, don’t create a great editorial experience.  They won’t persuade readers to do something they weren’t already intending to do (sorry vendors).  AI tools require human oversight by experienced writers to produce outputs that people want to consume.

AI tools for readers provide fresh evidence that existing content, even when of interest, won’t necessarily be in the desired format or rendering. They highlight a gap between what’s available and how readers want to consume the information.  

Writers should identify which formats and renderings are in demand and create them for readers, without requiring readers to generate them themselves.  Writers must apply their editorial expertise to deliver a superior experience compared to autogenerated outputs based on third-party or community-contributed LLM prompts.  

The emerging approach to reusing existing content considers how to capitalize on content that’s proven successful in the past. If certain content is successful with certain groups in specific situations, how can that insight be extended to other groups and situations?

Human oversight remains critical. LLMs have a dubious reputation for being undisciplined. They can commit what’s known as unconscious plagiarism: they aren’t aware that they are lifting and repeating text verbatim.  LLMs become a “Plagiarius,” a Latin term meaning “kidnapper” that was historically used to refer to a literary thief.  Plagiarism is taking text as-is and claiming it as original.

But suppose writers use LLMs to appropriate existing content in a way that adds value rather than stealing it?  

LLMs can help adapt content to be more relevant in two ways.  First, they can modify content so that the level of detail and timing are more appropriate to a user’s context.  Second, they can adapt discussions to users based on their motivations. Do they have the same goals, parallel goals, or divergent goals compared to an existing editorial framing of a topic?  Some readers are experts and want deeper details, while other readers need convincing that a topic is worth paying attention to.

Source: Axios

The online news website Axios, which delivers stories in brief, modular form, illustrates how to adapt content to meet users where they are.  Many stories are updates of prior ones, but users’ memories of those earlier details may be foggy. Axios uses certain devices to help readers get the context, such as: 

  • Catch up quick – what’s happened in the past (without all the details)?
  • Flashback – what similar cases happened, and how did they turn out?

LLMs could easily generate “catch up quick” and “flashback” statements from past content. 
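A hedged sketch of how such a recap might be requested from an LLM: the prompt wording, function name, and sample stories below are all illustrative, not a production pattern.

```python
def build_catch_up_prompt(topic: str, past_stories: list) -> str:
    """Assemble a prompt asking an LLM for an Axios-style 'Catch up quick' recap
    drawn from an archive of prior coverage."""
    background = "\n".join(f"- {story}" for story in past_stories)
    return (
        f"Summarize the prior coverage below into a three-bullet "
        f"'Catch up quick' recap for a reader new to {topic}. "
        f"Keep each bullet under 20 words.\n\nPrior coverage:\n{background}"
    )
```

The resulting string would be sent to whatever LLM client the publisher uses; the point is that the recap is generated from legacy stories rather than written from scratch.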

LLMs change the context of content.

Neither dead nor alive: the value of legacy content

Old content tends to go to the proverbial digital landfill when it’s no longer exactly what users want.  But now, legacy content is appreciating in value, thanks to LLMs.

Managers of digital content should borrow (or steal) insights from the “circular economy” practices that are gaining adoption for physical products.  A great overview is in a recent MIT Sloan Review article, “A New Method for Assessing Circular Business Cases” (paywall).

Traditionally, product manufacturers approached products linearly.  Companies realized value only once, when products were made and sold. Once sold, the product was used and disposed of without the company’s involvement.  The consumer decided when the product was worn out and no longer useful.

A circular approach to products considers alternative pathways whereby products can have second lives, yielding new revenue streams for manufacturers.  A circular model identifies new opportunities for existing products through:

  • Sharing (enabling more parties to use the product)
  • Repairing (fixing a weakness in a product)
  • Recycling (making a new product from old ones)
  • Remanufacture (rebuilding an existing product)
  • Regeneration (breaking down an old product into raw source material usable for alternative goals) 

Products can have multiple lives after they are first made.  But to take advantage of these, the producer must plan ahead.

Content managers can draw inspiration from these new lifecycle management techniques. 

The archives as IP

Content professionals must shake off the perception that content has little or no value after publication. Executives see content as an expense, not an asset. Unlike branding assets, published content doesn’t have intangible value, say the accountants. Although copyrightable, online content isn’t treated as intellectual property worth legally defending. Predatory web scraping is rarely challenged in court. 

Why retain old content if it has no value after its publication? Even many content professionals have been unable to answer that question satisfactorily, which is why few organizations have a serious process for archiving their content after it is taken offline.

Part of content’s lowly status relates to its quick depreciation. Content’s relevance decays with age. The typical half-life of digital content (the point at which the content loses half its value) can be anywhere from a week (for an announcement) to a year (for an “evergreen” topic).  The content doesn’t necessarily become inaccurate.  It simply becomes less relevant as user priorities and the contextual environment change.  Fewer users access and consume the content.
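The half-life idea can be made concrete with a simple exponential-decay model. This is an illustrative formalization of the metaphor, not a measured law of content behavior:

```python
def remaining_relevance(age_days: float, half_life_days: float) -> float:
    """Exponential-decay model: the fraction of original relevance
    remaining once a piece of content reaches a given age."""
    return 0.5 ** (age_days / half_life_days)
```

Under this model, an announcement with a one-week half-life retains only about an eighth of its relevance after three weeks, while an evergreen piece with a one-year half-life still holds half its value a year on.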

The equivocal status of content value has created another false dichotomy: between live content (currently online) and dead content (content taken offline).

LLMs have blurred the distinction between living and dead content.  LLMs can easily revise content, sometimes radically, and bring dead content back to life.  Old content can now be reused in ways not originally intended. LLMs can address one of the chief reasons old content loses value: its relevance. Old content can gain new relevance.

Legacy content can take on new purposes by reaching new audiences, incorporating new developments, or supporting new initiatives.  

LLMs change how content can be transformed compared to older tactics, such as “content repurposing”, a mundane content marketing tactic to amplify a piece’s reach. For example, a marketing team might hold a webinar, then create video clips from it to use in social media posts or embed in a blog post summary of the webinar.  Such activities don’t really add value to the existing content; they simply spread existing value elsewhere. 

The value of LLMs is not their ability to reduce the busywork of making variations and spraying content everywhere. Rather, LLMs are radical because they can tap the latent value from old content to produce something new. 

Professionals who design physical objects, such as clothing or furniture, draw on design archives of past works for inspiration for developing new products.  Disney and other film studios draw on their film archives when developing new film releases.  Similarly, content strategists will draw on digital content archives to generate new content offerings. 

Deciding when legacy content has value

The mission of content strategists will be to determine which legacy content may be relevant to users in the future.  To do so, they need to rethink how content is valued.

Currently, content gets evaluated based on its external value. Are users reading the content?  If not, the content is purged.

In the future, content will be evaluated based on its internal value to the organization.  The current content may not be relevant to users as it is, but it could be used to create relevant content later. Just because the content has lost its current relevance doesn’t imply it won’t be useful later.  Outdated content may retain internal value even after losing external value.

Not all old content will have future value.  The legacy content must be unique.  Duplicative content won’t help LLMs. Some legacy content may be irrelevant to the organization’s future mission.  

The decision will center on what to purge (content with a single-use) versus what to archive (content with recycling potential).

Legacy content can hold different kinds of value.  It may have editorial value.  While the factual details are no longer relevant, the narrative framing of the content is powerful and can be applied to other topics. The content might have been highly successful at introducing a new topic to someone inclined to be skeptical of the idea. While the product featured might no longer be offered, the approach would be relevant for other products. 

Another example is a complex explanatory graphic that was successful in promoting understanding of a topic, but whose details are no longer current.

In these cases, generative AI can remove irrelevant details and enable the reuse of the editorial structure.

A different situation is when the editorial content is no longer needed, but the informational details are.  LLMs can extract information from legacy content, making it available to incorporate into future content.

Source: LandingAI + DeepLearning.AI

Generative AI encompasses more than text-oriented LLMs.  Visual Language Models can extract information from tables, graphics, photos, and PDFs.  Tools such as LandingAI (shown above) can identify implicit editorial structure through layout and textual cues.  

Generative AI can be used to modify existing content to maintain its relevance.  But more significantly, it can extend the relevance of legacy content through regeneration. It allows legacy content to be adapted and repurposed. 

Legacy content can serve as the organization’s institutional memory, providing examples of past efforts that can be leveraged in the future.

Rethinking the role of content models

I’ve long been an advocate of structured content, especially headless content management approaches.  Yet my thinking has evolved in light of the radical changes occurring outside the parochial world of content management.

I’ve concluded that long-cherished ideas about content models must change, because the realities they are meant to address have changed. Best practices have a shelf life too.

LLMs have made “unstructured content” more valuable, and in so doing, have made structured content less valuable. Long-established distinctions between structured and unstructured content are becoming less meaningful.

Structured content can no longer be regarded as the preferred solution for managing content in the LLM era. It might still play a tactical, supporting role, but it is no longer the all-in-one solution as it was positioned before the arrival of LLMs.

Structured content has historically been sold on the promise that it would reduce authors’ work. A single author could output multiple versions and formats from a common file, a task otherwise impossible to perform without structured content. Yet that benefit is no longer compelling for two reasons.  

First, authors’ experiences with structured content reveal that the approach creates extra work for writers even when it reduces other tasks, and many times, the burdens of authoring in structured content outweigh its benefits. For example, many technical communicators, who have been the primary targets of structured authoring, have abandoned it in favor of a docs-as-code approach. Structured content buyer’s remorse is a thing.

Second, LLMs have disrupted structured content’s monopoly on the complex assembly of content.  Rather than relying on nested XSLT transformations or complicated GraphQL queries, LLMs can perform complex content transformations using plain-language prompts. Computer code is an awkward tool for shaping narrative text. Often, written directives are more transformative than encoded software rules. 
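The contrast can be sketched in a few lines: instead of an XSLT pipeline, a transformation becomes a plain-language directive wrapped around the source text. Here `call_llm` is a stand-in for whatever client library a given stack actually provides, and the prompt wording is illustrative:

```python
def build_directive_prompt(content: str, directive: str) -> str:
    """Compose a plain-language transformation request for an LLM,
    pairing a directive with the source content."""
    return (
        f"Rewrite the content below according to this directive: "
        f"{directive}\n\n---\n{content}"
    )

# In practice, the prompt would be sent to an LLM client, e.g.:
# rewritten = call_llm(build_directive_prompt(article, "Explain it like I'm 14"))
```

The entire “transformation rule” is the directive string, which any author can read and revise, unlike an XSLT stylesheet.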

The adoption of structured content as the foundation of enterprise content management hit a wall because it overemphasized databases. Authors don’t think about creating content in terms of databases – they’re instead bewildered when content is divided into fragments, because they can’t see how the fragments fit together. Authors think imperatively, in words, and words are exactly the interface LLMs offer.

LLMs are changing how content is assembled and are decreasing reliance on a database of fragments to generate coherent output for readers. LLMs don’t draw a distinction between content and code; for them, it’s all just strings of text. They can write, format, and assemble text.

Content models have, at times, been idealized and granted superpowers they’ve never had.  Content models don’t represent the real world of people, things, and actions. They are not ontologies that describe the physical world conceptually. They are merely tools to help manage content, and their importance is now diminishing. 

The value of content models does not derive from simplifying content authoring. Structured content has always been challenging – even baffling – for authors, despite its advantages.  Writing with a database is confusing. The effort-saving benefits of content models can be outweighed by the time costs of learning and oversight.

At the same time, content models, by splitting content into modules with specific roles, can hard-code content intent, making it difficult to pivot to other purposes. Too much structure can inhibit transformative generation. Complex, token-heavy content models can be a barrier to LLMs performing transformations. LLMs are trained on web articles and prefer working with such outputs. 

As LLMs take over more authoring tasks, the role of the CMS is likely to change.  It will continue to store content for API-based delivery to websites and other channels. But CMSs won’t necessarily be where content is drafted or composed. Archived legacy content that LLMs might access could be stored separately, perhaps in a RAG database, and made available to the authoring interface. Customer-facing chatbots would also need access to a RAG database of content curated to work with customer-oriented prompts. The authoring interface could be something akin to Claude Code, where agents can pull resources from other systems as needed.  Controlled content and dynamic variables that require structured data management may be stored elsewhere, such as in a graph database like Neo4J. Agents will drive the orchestration of content, negotiating between prompts and code. 
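One piece of that hybrid setup, retrieving relevant legacy content from an archive for the authoring interface, can be sketched with a toy in-memory retriever. A real system would use a vector database with embeddings; the keyword-overlap scoring, function name, and sample archive here are purely illustrative:

```python
def retrieve(query: str, archive: dict, k: int = 2) -> list:
    """Rank archived documents by word overlap with the query,
    a toy stand-in for real vector-similarity search in a RAG store."""
    query_words = set(query.lower().split())
    scored = sorted(
        archive,
        key=lambda doc_id: len(query_words & set(archive[doc_id].lower().split())),
        reverse=True,
    )
    return scored[:k]
```

An agent in the authoring environment would call something like this to pull relevant legacy material into context before drafting, leaving the author free to ignore where the archive physically lives.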

The future of content management is likely to be a hybrid mix of systems rather than a single CMS.  It’s hard to speculate with any certainty what it will look like, given the rapid changes underway in technology. The biggest unknown is how quickly LLM content generation improves in terms of speed, cost, and accuracy.  There have been significant improvements in all these dimensions over the past year.  

How much content can realistically be generated on demand, and how much will need to be pre-generated with author oversight?  More content will be low-touch (generated without human oversight), but high-touch content (needing editorial oversight) will be important for content addressing high-stakes circumstances.

If LLMs are taking over more responsibility for generating content, how does this affect content models?  LLMs can generate content flexibly, but struggle to do so consistently. When consistency is needed, LLMs perform better when combined with a database.  

The new purpose of the content model will be to store variables required for LLM-generated content.

Content models will be viewed from the perspective of output delivery rather than their original purpose of authoring. Publishers will focus on which elements they must control. These might be elements with special accuracy requirements (such as numeric values like prices) or granular details that, for business reasons, must be optimized.

Content isn’t data in most cases. Content is like data only when data values are displayed within narratives (the customer name is inserted into a terms of service agreement), or when narratives are counted like data (the number of times a disclaimer appears is counted and compared to the number of times it is supposed to appear). 
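Both of those data-like cases fit in a few lines of code. The template wording, company name, and disclaimer text below are invented for illustration only:

```python
TERMS_TEMPLATE = "This agreement is between {customer} and Acme Corp."
DISCLAIMER = "Past performance does not guarantee future results."

def render_terms(customer: str) -> str:
    """Content as data, case one: a data value inserted into fixed narrative wording."""
    return TERMS_TEMPLATE.format(customer=customer)

def disclaimer_count(pages: list) -> int:
    """Content as data, case two: narrative counted like data, i.e. how many
    pages carry the required disclaimer verbatim."""
    return sum(DISCLAIMER in page for page in pages)
```

The count from `disclaimer_count` could then be compared against the number of pages that are supposed to carry the disclaimer, which is exactly the governance check described above.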

Similarly, composing content generally isn’t algorithmic in the sense of following formal decision trees. Compositional choices are more often based on subjective choices about what would appeal most to readers. 

Only certain kinds of content need to be part of a content model. Content structuring is required under two conditions.

First, the content wording must be invariant, meaning it must be treated like data. 

Second, the assembly of the content must be deterministic. The content shown depends on encoded rules rather than on instructional guidance.  For example, a rule might exist that the statement “free shipping” does not appear if the order total is less than $40. The message’s assembly is guided by if-then code.
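The free-shipping rule above can be written out directly as if-then code. The threshold and message come from the example in the text; the function name and the alternative wording are illustrative:

```python
FREE_SHIPPING_THRESHOLD = 40.00  # threshold taken from the example above

def shipping_message(order_total: float) -> str:
    """Deterministic assembly: which wording appears is decided by an
    encoded rule, not by an LLM's judgment."""
    if order_total >= FREE_SHIPPING_THRESHOLD:
        return "Free shipping"
    return f"Add ${FREE_SHIPPING_THRESHOLD - order_total:.2f} for free shipping"
```

Because the rule is deterministic, the same order total always yields the same message, which is precisely the guarantee an LLM cannot offer on its own.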

Instead of structuring everything in one’s content, the goal is to structure only what’s necessary.  These will be elements that authors rarely need to touch because they are largely fixed.  

The short list of structured content elements would include:

  • Data variables (allowed alternatives or dynamic values)
  • Fixed phrasing (required wording or allowed alternatives)
  • Templated boilerplate content (background – rather than foreground – content used to frame the essential information, such as explanations on how to understand a table of information)
  • Combinational elements (chunks that could be used in different sequences)

Some of the information stored within the content model will be details extracted by LLMs from legacy content.  
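A minimal sketch of such a pared-down content model might look like the following. The field names mirror the four element types listed above but are illustrative, not any standard schema:

```python
from dataclasses import dataclass, field

@dataclass
class StructuredElements:
    """A minimal content model holding only what must stay fixed or data-driven;
    everything else is left to LLM-mediated composition."""
    data_variables: dict = field(default_factory=dict)   # e.g. prices, dynamic values
    fixed_phrasing: dict = field(default_factory=dict)   # required or approved wording
    boilerplate: dict = field(default_factory=dict)      # framing/background text
    combinational: list = field(default_factory=list)    # chunks usable in any sequence

# Example: only the price and a required disclaimer are locked down.
model = StructuredElements(
    data_variables={"price": "$49"},
    fixed_phrasing={"disclaimer": "Terms apply."},
)
```

Anything not captured in these fields would be open to transformation by prompts, keeping the structured footprint as small as the two conditions above require.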

The promise of change

We can no longer divide content into single-source or single-use, live or dead, structured or unstructured.  

The developments I’ve outlined are already happening, though sometimes in isolation from one another and at different speeds.  I’m also aware that other parties are approaching content management differently, seeking a more all-in-one solution that fuses various AI technologies into a unified system, though I’m sceptical about the widespread adoption of this approach.  What’s easiest to implement from a corporate IT and employee learning perspective will be what gets adopted. Whether it is a perfect system is far less important.

The changes underway will take years to become widespread practice, but they will nonetheless be disruptive, not evolutionary as some believe. It’s important for content professionals to look beyond their immediate domain, because broader technology developments will determine future content practices more than the decisions of content management vendors. 

The future I’ve sketched entails an ecosystem that is more architecturally complex than a single CMS would be.  But content professionals should not have to worry about where information is stored – that will be the agent’s responsibility.  They gain the freedom to transform different dimensions of content, from broad ideas previously used that they want to rework to precise data that must be included exactly.

The emerging approach removes a major obstacle of structured content: the need to determine in advance which content will be reused. The prominent role of LLMs will give authors more flexibility and control over how to shape new content.  

— Michael Andrews

Categories
Content design

Separating content and presentation: Moving past FUD

The principle of separating content from its presentation is more critical than ever.  So why is it so hard to get buy-in for it?

This post takes a deep look at the FUD (fear, uncertainty, and doubt) surrounding separation. It will address why FUD is prevalent and why it’s misplaced:

  • How content and design separation is different today from how it was considered in the past 
  • Why tools make it difficult to separate content from design
  • The problems arising from design-defined content 
  • The dodgy reasons why visual editing tools and DIY design are popular 
  • How separation promotes clarity 
  • Why the meaning of content is independent of its presentation 
  • Why content’s meaning is persistent
  • How all kinds of content are becoming format free
  • The problems for users when relying on presentation to clarify content 
  • Why dependence on presentation leads to ambiguity for AI and assistive technologies 
  • The importance of supporting presentation changes that don’t require content changes 
  • Why “custom” pages still need separation 
  • How content assembly is different from content presentation
  • The two distinct kinds of assembly 
  • How content assembly gives authors missing control 
  • Why bad implementations generate FUD about separation 
  • Why trapped content will become the new worry

A concept’s long journey toward acceptance

The principle of separating content from its presentation is a powerful and useful idea that is also controversial and resisted by people in all roles. 

Resistance comes not just from writers accustomed to WYSIWYG editors. Developers can exaggerate the complexity of content-design separation or question its practicality. UX designers don’t always see its value.  Vendors also play on this fear and sell solutions that undermine implementing mature practices.

Even people who agree with the concept in principle often abandon it when it seems like it’s too much effort.  

Why doesn’t the concept of separating presentation from content get more love if it’s truly valuable? The simple answer is that the concept is so radical and powerful that it is easy to misunderstand.  FUD sets in and disrupts progress. 

Separation matters now more than ever. Discussions about separating content from presentation have a long history. Why revisit this topic now?

Past discussions, responding to changes happening in the early 2000s, don’t account for the current changes reshaping today’s digital ecosystems (e.g., the development of design systems, structured content, and the shift to composable and headless architectures). UX practices have lagged behind these changes, which are forcing teams to re-examine assumptions about the fundamentals of how user experiences are developed and implemented. 

What’s at stake is how we decide to create what we communicate: 

  • Does the content of web pages depend on their layout?  
  • Does an author need to work around a predefined design or change the design to match their content? 
  • Should the layout adjust to the content?

The renewed relevance of an old debate. The question of separating content from its presentation has assumed renewed significance. While previously debated issues remain relevant, the context of the discussion has shifted over time. 

Separating presentation from content is a long-established web design principle.  It has earned its own Wikipedia entry and Wikidata identifier (Q3511030).  The concept has an even older heritage as an extension of the principle of the “separation of concerns” used to design systems.  

Two decades ago, the W3C took a significant, if incremental, step when it decided to jettison presentational tags (such as bold and italic) in favor of semantic ones (like strong and emphasis). Even though presentational elements were not entirely abolished (underlining still exists), the decision signaled the expectation that presentation would be managed separately from content. 
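As a rough illustration of the distinction, a small script can flag presentational tags in markup so they can be replaced with semantic alternatives. This is a minimal sketch: the tag list and suggested alternatives are an illustrative subset, not an official W3C mapping.

```python
from html.parser import HTMLParser

# Illustrative subset of presentational tags and semantic/CSS alternatives.
PRESENTATIONAL = {
    "b": "strong",
    "i": "em",
    "font": "span styled with CSS",
    "center": "CSS text-align",
}

class PresentationalTagFinder(HTMLParser):
    """Collects presentational tags so they can be flagged for review."""
    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in PRESENTATIONAL:
            self.found.append((tag, PRESENTATIONAL[tag]))

finder = PresentationalTagFinder()
finder.feed("<p>This is <b>important</b> and <i>notable</i>.</p>")
for tag, alternative in finder.found:
    print(f"<{tag}> found; semantic alternative: {alternative}")
```

A linter like this can only detect markup-level mixing of presentation and content; it can’t judge whether the surrounding structure is semantic.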

Partisans debated the call to separate content from presentation as CSS began to displace presentational markup in HTML.  For many discussants, the debate was never about content or presentation.  It was about nothing more than CSS.  

But others viewed the issue more existentially and contested the desirability and feasibility of thinking about content separately from its presentation. Websites continued to be designed with wireframes before any content was created.  Developers crafted frontend frameworks composed of UI components that often defined the content presented on a website.  It was hard for some people to imagine content without being able to “see” how it would be presented. 

People settled into their conclusions and routines.  

Lately, the fault lines between content and presentation have been exposed again. Vendors have struggled (clumsily, in my view) with how to deliver “visual editing” while simultaneously supporting structured content, which has enjoyed a renaissance of interest. Vendors have been trying to graft UI layout components (front-end) and content blocks (back-end) into a “universal editor.” Some front-end frameworks turn every variable into a common pool of JSON data.

At the same time, technical developments are erasing prior distinctions between content, formats, and presentations. Computers are taking over many presentation decisions. All kinds of media can now be generated from text.

These developments have prompted a reexamination of core principles. Content and design are now governed by separate systems (content models and design systems) that have specific responsibilities.  The content expresses “what” information and messages contain, while the design expresses “how” messages and information are presented, typically layout and formatting, but not limited to those dimensions.
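To make the division of responsibilities concrete, here is a minimal sketch (the field names and design options are invented for illustration) of a content item governed by a content model and two presentation configurations governed by a design system:

```python
# The "what": a content item, governed by the content model.
article = {
    "title": "Reusing ever-changing content",
    "summary": "Why content reuse practices are changing.",
}

# The "how": presentation configurations, governed by the design system.
compact_design = {"show_summary": False, "title_case": "upper"}
expanded_design = {"show_summary": True, "title_case": "as-is"}

def render(content, design):
    """Applies presentation rules to content without altering the content itself."""
    title = content["title"].upper() if design["title_case"] == "upper" else content["title"]
    lines = [title]
    if design["show_summary"]:
        lines.append(content["summary"])
    return "\n".join(lines)

print(render(article, compact_design))
print(render(article, expanded_design))
```

Either side can change independently: editing the article never touches the design objects, and swapping design objects never mutates the article.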

Separating content and presentation brings transparency. The belief that what you say and how you say it are indivisible is an illusion. They are not bonded together in a hermetically sealed package, but are distinct ideas and goals.

That’s not to imply what content says and how it’s said are unrelated. Rather, the reality is that each side has independent power. The presentation can make trivial or even false details seem important, and poor presentation can bury critical information.  

The presentation does matter. But it’s a distinct dimension from content.

Content changes are explicit. The facts in content sometimes change, and messages may need to adapt to audiences. But the presentation is much more implicit and contextual. Presentations can change on a whim.  Even when the content remains consistent, the presentation may change radically depending on where and when it appears.

The audience experience is derived from both the content and its presentation.  It’s important to understand the contribution of each to that experience. 

Dealing with separation anxiety 

Loss aversion is a powerful motivation.  Because our thinking about situations is anchored in how we habitually experience them, it can be hard to embrace a different experience. What’s familiar is comforting; what’s novel is disruptive. When your child is leaving for a week-long camp away from home, he or she may have separation anxiety.  Similarly, when your content is separated from its design, it can feel disorienting.  

Content professionals have become accustomed to thinking about content and presentation together. They expect to see what the content will look like and often expect to change that appearance as well. 

Tools can lull us into believing content and presentation are inseparable. Two interaction paradigms have shaped authors’ expectations about how content and presentation interact. While very different, both imply that content should change based on the presentation chosen. 

The first approach is represented by WYSIWYG tools, such as the page builders in many CMSs, which allow authors to format text and graphics any way they please. This approach encourages authors to adjust their content and presentation concurrently.  

The second approach is represented by tools that use design templates that guide what content to create.  In traditional CMSs, the creation of the content is guided by how it will appear on a page.  A template defines what content is required. The content must adapt to the presentation defined by the template.  

When content depends on its presentation, the design decides the content’s details. Numerous online tools promote the perception that the development of content depends on its layout. The dramatic popularity of Figma in designing web pages is an extreme example. Writers play a junior role on UX teams, filling in words in a graphic design layout. While such tools may promise the freedom of self-expression, they tend to impose constraints on making changes to layouts. 

But content also needs to change. Whenever the content changes but is dependent on a fixed presentation, it creates a conflict. The presentation restricts what content is allowed. 

A major motivation for separating content and presentation is to make presentations more flexible and changeable. The separation of content from presentation has expanded with the decoupling of frontend and backend systems. This decoupling enables content to be presented in multiple ways and allows the presentation to change quickly. 

While the technical means to separate content from presentation are established and growing, the capacity of organizations to manage these dimensions remains immature. Some organizations avoid confronting change and favor expediency over improvement.

Separating concerns about tools from processes

Numerous online editing tools allow writers to change fonts, resize images, align text, change spacing, change the number of columns, and so on. Many provide more advanced layout features, such as the positioning of headings, the color of fonts, and entire color and layout themes.

“Visual editing” tools are a Band-Aid. Tools that allow authors to change both the content and its appearance are popular, and CMS vendors keep promoting them.  But an awkward question arises: Why should an author make decisions about a page’s layout?  The organization they work for likely publishes thousands of web pages.  Shouldn’t all these pages follow common presentation guidelines rather than have individual authors decide how individual pages appear?  Isn’t the UX design team supposed to be in charge of the presentation?

The desire of authors to decide the presentation is an old theme. DIY web design was once prevalent in organizations, and its problems prompted the emergence of design systems to rein in such patchwork design.

When DIY web design persists, it indicates a failure in an organization’s UX processes.

Some authors want presentation options out of necessity. They are given a generic blank page and are expected to fashion it into a meaningful experience. They hope that they can do that by dragging and dropping widgets on a screen. If effective UX design were truly so easy, millions of UX designers would be out of work.

Other times, authors are trying to override a rigid and poorly designed layout template that doesn’t support the presentation of the content they have developed. 

In both cases, the author has been shortchanged by their UX design colleagues, who failed to provide them with a serviceable layout for their content.  

Separating content from presentation forces organizations to confront how well they understand their publishing requirements. When large numbers of pages must be custom-designed because each is considered a “special case,” that’s an indication that the organization hasn’t planned its presentation adequately. Special cases, by definition, are exceptions, not defaults. No organization should feel overwhelmed by the volume of custom web pages it must design. 

The goal of separation is to enhance clarity, not enforce style. When the concept of separation first emerged with the use of CSS, it became linked to the notion of styling. But to view the presentation as merely styling is a crude understanding of the principle. 

Separation recognizes that there is no “one best way” to present content. What’s best is contextual to the situation and provisional until a newer presentation proves more effective. The same underlying content can be presented in multiple ways, which can shift how it is perceived, understood, or consumed. The goal of separating content and presentation is to allow multiple presentations of the same content, some of which will be better and clearer than others.

Separation allows the content to benefit from iterative design improvements. Presentation standards evolve to reflect learnings about what works most effectively. Separation allows UI components that are used across many content items, such as heroes or alerts, to be tested and improved. 

Yes, Virginia, content is still meaningful without presentation 

Separation requires a shift in mindset and practice. People may push back by proclaiming that it can’t be done – that the notion is nonsensical, that it threatens the magic content can offer.

A common objection contends that content can’t be stripped of its presentation and remain intelligible. This view holds that presentation is integral to the meaning of content, so it can’t be separated from the content. After all, if presentation supports the meaning of content, then content without presentation must be meaningless, right? 

To address categorical objections like this, it’s necessary to unpack beliefs about how experiences become meaningful.  Doing so helps to clear the cobwebs of unexamined assumptions and highlight the changes happening in digital practices.

The meaning of content is independent of its presentation. While the presentation is significant in conveying meaning in the broadest sense (by stressing emphasis or salience), it doesn’t follow that content depends on a specific presentation.  

Content may be harder to understand without its presentation, but it is rarely contingent on its presentation to convey its meaning. If it were, the presentation would materially change the meaning of the content, which should never be the case. The presentation can change without altering the meaning of the content.

The principle of independence has some radical implications:

  1. Authors must let go of preconceptions of how their content will appear, either now or later. The content’s appearance is subject to change.
  2. The content is independent of the media it may appear in as well.

Yet because content can exist in many forms (media), it’s sometimes difficult to distinguish what’s content from what’s presentation.

Simply put, the content represents the substance or essence of what’s presented. That essence should be defined precisely and not be subject to variable interpretation.  The substance doesn’t depend on its context: It will remain the same wherever it is presented.  

Content’s meaning is persistent, however or wherever it’s presented. The literal meaning of content is fixed by its encoding. Its presentation may influence its connotation but not its literal meaning.

Communication (the ability of different people to reproduce the same message) depends on distilling the essence of a message, its content, from how it is presented.  

Throughout history, people have encoded the meaning of content by using standardized notations. These standards allow people who do not know one another to interpret the content in a consistent way. 

The substance of content is typically defined as text, symbols, or structured data of some sort that can be composed or compiled into various presentations. As computer technology continues to advance, it is becoming easier to break down content presentations into constituent elements and separate the content from its presentation.

Writing started by using symbols to stand for things or concepts.  Then, writing developed symbols for the sounds of words – using letters, phonetic alphabets, and even shorthand symbols.

Later, people developed notation to represent music and even dance. The symbols don’t need to be visual. Braille can represent letters or sounds.

As symbols become formalized, they become independent of a specific presentation. At first, writing was handwritten, then engraved, and later typeset; with each step, the content became less tied to its original presentation.

Text is a surprisingly versatile way to represent content that can be transformed into presentations in all kinds of media. Even the richest content medium, the movie, is built from a text script.  AI has shown the possibilities of generating audio and video from text. 

The long-term trend has been to separate content from its presentation.  All kinds of content can be extracted and separated from their presentations, while presentations in many formats can be built from “raw” content. For example, an audio recording can be turned into a text transcript, and that text can be used to generate another audio presentation featuring a different voice or even a different language. 

Maps were historically considered content that was inseparable from its presentation. What value is a map outside of its presentation?  But maps today are databases of structured content that can be presented in multiple ways.  The same information can be presented as a street map or a satellite image or rely on text labels or icons, for example. Maps are manifested through their presentation but are not defined by any specific presentation. 

Skeptics may object that certain kinds of content always depend on their presentation.  But if something can be rendered in only one way, it is content, not presentation. Presentation, by definition, implies that there is more than one way to present something. The presentation is not fixed.  

Photos as media can be content or presentations, depending on their essence. The original source file of a photo image is content, but subsequent cropping, edits, and treatments of the image are presentations of the original content. The trend in image manipulation is toward non-destructive editing.  

Even visual content can be represented non-visually. Many content creators believe that visual content has a fixed presentation and thus can’t be separated from the content it represents. That assumption is being challenged in more and more domains.

Consider diagrams. While diagrams are meant to be visual, they do not have to be represented visually. There are multiple approaches to representing diagrams as text, which can generate alternative visual renderings of the diagram. Neither the format of a diagram nor its presentation is fixed.
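As a rough sketch of the idea (the workflow diagram and output formats here are invented for illustration), a diagram’s content can be stored as an edge list, from which different textual renderings are generated — for example, Graphviz DOT markup or a plain-text arrow list:

```python
# The diagram's content: an edge list describing a hypothetical workflow.
edges = [("Draft", "Review"), ("Review", "Publish"), ("Review", "Draft")]

def to_dot(edges):
    """Renders the diagram as Graphviz DOT markup."""
    body = "\n".join(f'  "{a}" -> "{b}";' for a, b in edges)
    return "digraph workflow {\n" + body + "\n}"

def to_arrows(edges):
    """Renders the same diagram as a plain-text arrow list."""
    return "\n".join(f"{a} --> {b}" for a, b in edges)

print(to_dot(edges))
print(to_arrows(edges))
```

The edge list is the single source of truth; each rendering is disposable and replaceable.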

What about music?  Since music relies on standard symbols positioned on a staff, it would seem to have a fixed presentation.  But while sheet music is the most popular representation of a music score, it is not the only option.  Music scores can also be represented as text using the ABC notation, which can generate a visual score. Electronic music compositions can also be represented using the MIDI protocol, which can be manipulated to generate alternative presentations of the composition.

Mathematics is another kind of content that is often presented visually but doesn’t need to be represented with a fixed presentation. Even though mathematics uses widely understood symbols, their presentation can be variable. Certain mathematical statements can be presented in more than one way.  Mathematics has developed two parallel markups: one for the content and one for its presentation. 
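A toy example of the parallel-markup idea: the content of a statement (dividing a by b) is stored once, and each presentation is generated from it — inline or as a stacked LaTeX fraction. The tuple encoding is invented for illustration, not MathML itself.

```python
# Content: the operation and its operands, with no presentation attached.
expr = ("divide", "a", "b")

def render_inline(expr):
    """Presents the division inline, e.g. for running text."""
    op, left, right = expr
    assert op == "divide"
    return f"{left}/{right}"

def render_latex(expr):
    """Presents the same division as a stacked LaTeX fraction."""
    op, left, right = expr
    assert op == "divide"
    return rf"\frac{{{left}}}{{{right}}}"

print(render_inline(expr))  # → a/b
print(render_latex(expr))   # → \frac{a}{b}
```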

The presentation should add meaning, not change meaning. Presentation supplies context to content, which can enhance its meaning.  The presentation helps define the intent for how readers will experience the content. 

The same content should always mean the same thing, however it is presented.  The one situation where a presentation will alter the intrinsic meaning is if it reinterprets the content’s original intent by changing the selection of details — the process of context shifting.  This may happen unintentionally when the content is poorly developed: a less detailed view of the content may give a different impression than views with full details.  Or it may occur when the content can support scenarios beyond what was originally envisioned, which shifts how the content is understood. Because these situations are possible with decoupling, it’s imperative to develop content that is not wedded to preconceptions of how it will be presented, since future presentations cannot be known in advance. 

One reason machines (whether assistive technology or AI bots) misinterpret content is that the content is ambiguous, relying on contextual cues to explain what it is meant to say.  The W3C has warned of the reliance on visual structure to convey the meaning of content: “While presentational features visually imply structure — users can determine headings, paragraphs, lists, etc. from the formatting conventions used — these features do not encode the structure unambiguously enough for assistive technology to interact with the page effectively.” 

Presentation can’t fix ambiguity in content. If your content depends on how it’s presented to be understood correctly, then the content itself is likely ambiguous and inherently confusing.   The role of presentation is to connect ideas that are intelligible on their own, not to make unintelligible ideas somehow discernible through hand-waving.

Some brands, unfortunately, publish fragments of content whose meaning is unintelligible without seeing the context in which it appears. These practices have become more prevalent in recent years, as the fetish of minimalism has been rationalized as promoting simplicity and usability, even when it often results in the opposite effect.  Readers are expected to guess the meaning of a hint or icon based on other content presented elsewhere.  These hidden meanings, while seemingly elegant, fail to inform the screen reader user or pass legal compliance reviews for clarity and the absence of potential misinterpretation. The ubiquity of bad practices does not legitimize them. Rather, they demonstrate the need for content to be explicit and clear independent of its presentation.

Treating communication as a “content design” package has resulted in numerous examples of deceptive design practices where essential information is suppressed.  These examples are misleading precisely because the content, on its own, does not fully or candidly convey the information users need to know to make an informed decision.

Humpty Dumpty and Alice, from Through the Looking-Glass. Illustration by John Tenniel.

Illusions of control 

How should decisions be made about how content appears? An individual’s latitude to make decisions about the presentation of content is not synonymous with the organization’s capacity to make these choices.  

Some authors protest when they don’t have options to change the styling or layout of their content. They jump to the conclusion that the presentation can’t be changed and believe that their input is necessary to decide how the content looks.  In essence, they assume if they don’t see an option to change the presentation, that option doesn’t exist. 

Even though authors are not in charge of the presentation, that doesn’t imply that the presentation is fixed. Organizations can change the presentation whenever they want to. Organizations generally aim to have the various content they publish presented in a consistent manner because such consistency promotes clarity and understanding. They don’t want to encourage helter-skelter redesigns of individual web pages. 

The presentation can change independently of the content. The presentation is not fixed and can change readily when the organization decides to do so.  

Yet, such changes are not the byproduct of content changes. They are separate decisions. What that means is:

  1. Changing the content does not alter its presentation or layout. For example, a longer title won’t necessarily shrink in font size to fit a fixed space.
  2. Changing the content and changing the presentation are not concurrent activities because separate systems manage them. If you want to adjust both the content and the presentation, you need to pivot between separate modes.

The second point raises a question: Could the same individual change both the content and its presentation?  In principle, yes. But in practice, the two sides are intended to be governed separately. Each has rules for what is allowed and changes must conform to these rules.  For example, the content can’t use nonstandard terms or punctuation.  Similarly, the presentation can’t incorporate nonstandard colors or fonts. 

The presentation is decided by rules that apply to multiple pages, not by individual choices for specific pages. Some individuals deride rules for constricting their expression or preventing them from configuring their web pages as they’d like.  But rules aren’t stifling.  They actually simplify processes and broaden the scope of possible changes by enabling global changes.  By having rules, organizations can change the content everywhere on a website without worrying that it will break the design and force fixes to the presentation.  They can also change the presentation globally without worrying about needing to adjust existing content. 

Singleton pages demonstrate the need for separation. Many objections to separation focus on singleton pages, which are one-off pages that have unique content and require a special layout because the nature of the content is unlike content elsewhere. An example would be a webpage presenting a timeline. While singleton pages seem to represent a tight correspondence between the content and its presentation, the presentation and content remain independent of each other.

The mistake some people make is to confuse design instances with design versions. Even if only a single page has a unique layout (one design instance), that does not imply the presentation is fixed (that there can only be one version of that instance). An alternative presentation could be developed and used. 

Because the organization could decide to change the design of a unique webpage later, it’s important that the content should lead the design, not follow it.  

Content is also subject to change, and presentations must be prepared to “flex” to adjust to content changes. The original author often won’t control the content over its lifespan. Authors switch jobs, meaning someone else might revise the content later. 

With online content, there’s no single author.  All online content appears alongside other online content that has been created by other individuals at different times. 

It’s necessary to distinguish the content context (what other content is adjacent) from the context of its presentation (layout, formatting, and other presentational choices). 

This gets into content assembly: How content is layered into larger experiences. 

Content assembly is not presentation 

Content assembly is increasingly important as organizations move away from presentation-defined content creation.  Presentation-driven templated content traditionally determined the content’s assembly.  As practices move away from using templates to define the content, the role of assembly is becoming more significant, though it remains poorly understood.  Developers often confuse content assembly and content presentation, especially if they have spent careers working with template-based CMSs.  

Because templates previously handled assembly, some people mistakenly consider content assembly as part of content presentation.  But assembly is distinct from presentation. The context of the content (the related content that appears together) is conceptually distinct from the presentation context (how those content items are presented).  

The layout is indifferent, while the assembly is opinionated. The layout is generic and agnostic about what content appears in a slot. Content assembly, by contrast, is specific about which content items are conceptually connected.  

Assembly determines which content pieces will appear together – though not how they will appear.

When assembly is subsumed by content presentation decisions, the construction of the content is fragile. 

Like Humpty Dumpty after his fall, poorly assembled content can’t be put back together. Once broken, it’s unusable.  

Fragile content that can’t be reassembled has typically been defined by its design.  

If content is assembled correctly, there should be no “breaking changes.”

Not all content assembly happens the same way.  The biggest barrier is how various people think about content assembly. They don’t make a distinction between two kinds of assembly:

  1. Intrinsic assembly, where units must be provided together to make sense and, therefore, should be preassembled during content development 
  2. Extrinsic assembly, where variable combinations could potentially be offered and are therefore best defined outside of the content development process

Both the goals and process for intrinsic and extrinsic assembly are different.

Intrinsic assembly connects content that is intrinsically related in meaning: The pieces together form the larger message.  The content is preassembled through two means:

  1. Linking (or referencing) items
  2. Ordering items (in lists or as arrays of items)

Intrinsic relationships are predefined: A goes with B, or A always precedes B. The pieces used in intrinsic assembly are generally broken apart to support content reuse or maintenance rather than to support variability in which pieces are combined.  Embedding items (images, for example) within another content item is another kind of intrinsic assembly, albeit less predefined than linking, since the decision of whether to embed is optional.
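A minimal sketch of intrinsic assembly (the item IDs and content are invented): the assembly records links between items and their order, while saying nothing about presentation:

```python
# Content items, each addressable by a stable ID.
items = {
    "intro":   "Welcome to the service.",
    "steps":   "Follow these three steps to sign up.",
    "support": "Contact support if you get stuck.",
}

# The assembly records references and order — relationships, not layout.
onboarding_guide = {"type": "guide", "parts": ["intro", "steps", "support"]}

def assemble(assembly, items):
    """Resolves references in their defined order; a missing ID raises KeyError."""
    return [items[ref] for ref in assembly["parts"]]

for text in assemble(onboarding_guide, items):
    print(text)
```

Because the pieces are linked by reference, updating an item in one place updates every assembly that uses it.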

Extrinsic assembly is used when the communication is more contextual and situationally dependent. It often draws on content variations that have been developed to address highly specific situations where the right combination can’t be preassembled easily because they involve too many scenarios.  

Extrinsic assembly relies on predefining evaluative rules or creating instructions that are not fixed. These rules select which pieces to include and what attributes they should have under specific conditions. This kind of programmatic assembly is often based on contextual rules relating to processes. 

Sometimes rules can be written into a schema such as JSON Schema when they persist as if/then/else statements.  Otherwise, the rules are written in code that matches specific variables and values. The rules or instructions could be written in GraphQL, JavaScript, or some other programming language.
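A simplified sketch of extrinsic assembly (the rules, variants, and context fields are all invented): first-match rules, evaluated against a delivery context at request time, choose which content variant to serve:

```python
# Content variants developed for different situations.
variants = {
    "offer_new":       "Get 20% off your first order.",
    "offer_returning": "Welcome back! Free shipping on your next order.",
    "offer_default":   "Browse our latest catalog.",
}

# Evaluative rules: (condition, variant ID), evaluated in order; first match wins.
rules = [
    (lambda ctx: ctx.get("customer") == "new", "offer_new"),
    (lambda ctx: ctx.get("customer") == "returning", "offer_returning"),
]

def select_variant(context, rules, default="offer_default"):
    """Returns the ID of the first variant whose rule matches the context."""
    for condition, variant_id in rules:
        if condition(context):
            return variant_id
    return default

print(variants[select_variant({"customer": "new"}, rules)])
```

Because the rules live outside the content, new scenarios can be handled by adding a rule and a variant — without touching the existing content or any presentation layer.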

Authors have control over the assembly. Once organizations embrace true separation of content from its presentation by ensuring that presentation isn’t defining the assembly of content, authors can regain control of important decisions.  

With intrinsic assembly, authors can connect content pieces within their editor.  Ideally, the content model behind the scenes has already defined relationships between various content types, so the author doesn’t need to figure out which types belong together. Instead, they can focus on associating related content items.  If items must appear in a specific order to make sense to users, they can indicate that.

Extrinsic assembly happens outside the editor in the API layer. Because extrinsic assembly instructions rely on code, developers have, until recently, been the ones responsible for defining extrinsic assembly. But in the past few years, a new category of content orchestration tools has emerged that allows authors and other business users to define rules for assembling content without needing to rely on a developer.  

Content assembly gives power to authors to decide which content pieces to deliver to audiences. 

By setting up content assembly correctly and disentangling it from presentation, organizations remove common “it can’t be done” objections.

Poor implementations are a barrier, not an excuse for FUD

Separating content from its presentation has triggered resistance for many years. Change management case studies teach that people have difficulty changing habits and adopting new practices. It’s much easier to stick with the familiar, even if it isn’t desirable in the long term.

Yet the imperative of implementing such a separation only keeps growing.  Planning and managing web pages whose content and designs are tangled together is simply not sustainable. And shifts in technology, from composable systems to AI, require that content be unencumbered by its formatting and presentation. Layout can’t signal what content means, if for no other reason than that machines won’t see it.

Given the longstanding resistance to separation, one may wonder how the concept will ever gain the traction necessary to become the default practice in organizations. 

The good news is that separation is a sound concept that provides multiple benefits. Fear, uncertainty, and doubt may conspire to cloud these benefits, but they don’t negate them. Separating content from presentation is essential to building and improving upon prior content and design work.

The biggest barrier to the universal adoption of content-presentation separation is poor implementation. Bad tools, weak requirements, and immature knowledge all contribute to poor implementations, which seem to validate the expectation that separation can’t be done.

Yet poor implementations, while common, are hardly inevitable. Many organizations are moving up the maturity ladder. They recognize that the stakes are too important to ignore essential transformation in UX practices. They will leave behind organizations that commingle their content and presentation. The fear will shift to being left behind. 

–Michael Andrews