
Separating content and presentation: Moving past FUD

The principle of separating content from its presentation is more critical than ever.  So why is it so hard to get buy-in for it?

This post takes a deep look at the FUD (fear, uncertainty, and doubt) surrounding separation. It will address why FUD is prevalent and why it’s misplaced:

  • How content and design separation is different today from how it was considered in the past 
  • Why tools make it difficult to separate content from design
  • The problems arising from design-defined content 
  • The dodgy reasons why visual editing tools and DIY design are popular 
  • How separation promotes clarity 
  • Why the meaning of content is independent of its presentation 
  • Why content’s meaning is persistent
  • How all kinds of content are becoming format free
  • The problems for users when relying on presentation to clarify content 
  • Why dependence on presentation leads to ambiguity for AI and assistive technologies 
  • The importance of supporting presentation changes that don’t require content changes 
  • Why “custom” pages still need separation 
  • How content assembly is different from content presentation
  • The two distinct kinds of assembly 
  • How content assembly gives authors missing control 
  • Why bad implementations generate FUD about separation 
  • Why trapped content will become the new worry

A concept’s long journey toward acceptance

The principle of separating content from its presentation is a powerful and useful idea that is also controversial and resisted by people in all roles. 

Resistance comes not just from writers accustomed to WYSIWYG editors. Developers can exaggerate the complexity of content-design separation or question its practicality. UX designers don’t always see its value. Vendors play on these fears and sell solutions that undermine mature practices.

Even people who agree with the concept in principle often abandon it when it seems like it’s too much effort.  

Why doesn’t the concept of separating presentation from content get more love if it’s truly valuable? The simple answer is that the concept is so radical and powerful that it is easy to misunderstand.  FUD sets in and disrupts progress. 

Separation matters now more than ever. Discussions about separating content from presentation have a long history. Why revisit this topic now?

Past discussions, responding to changes happening in the early 2000s, don’t account for the current changes reshaping today’s digital ecosystems (e.g., the development of design systems, structured content, and the shift to composable and headless architectures). UX practices have lagged behind these changes, which are forcing teams to re-examine assumptions about the fundamentals of how user experiences are developed and implemented. 

What’s at stake is how we decide to create what we communicate: 

  • Does the content of web pages depend on their layout?  
  • Does an author need to work around a predefined design or change the design to match their content? 
  • Should the layout adjust to the content?

The renewed relevance of an old debate. The question of separating content from its presentation has assumed renewed significance. While previously debated issues remain relevant, the context of the discussion has shifted over time. 

Separating presentation from content is a long-established web design principle.  It has earned its own Wikipedia entry and Wikidata identifier (Q3511030).  The concept has an even older heritage as an extension of the principle of the “separation of concerns” used to design systems.  

Two decades ago, the W3C took a significant, if incremental, step when it decided to jettison presentational tags (such as bold and italic) in favor of semantic ones (like strong and emphasis). Even though presentational elements were not entirely abolished (underlining still exists), the decision signaled the expectation that presentation would be managed separately from content. 

Partisans debated the call to separate content from presentation as CSS began to displace presentational markup in HTML.  For many discussants, the debate was never about content or presentation.  It was about nothing more than CSS.  

But others viewed the issue more existentially and contested the desirability and feasibility of thinking about content separately from its presentation. Websites continued to be designed with wireframes before any content was created.  Developers crafted frontend frameworks composed of UI components that often defined the content presented on a website.  It was hard for some people to imagine content without being able to “see” how it would be presented. 

People settled into their conclusions and routines.  

Lately, the fault lines between content and presentation have been exposed again. Vendors have struggled (poorly, in my view) with how to deliver “visual editing” while simultaneously supporting structured content, which has enjoyed a renaissance of interest. Vendors have been trying to graft UI layout components (front-end) and content blocks (back-end) into a “universal editor.” Some front-end frameworks turn every variable into a common pool of JSON data.

At the same time, technical developments are erasing the prior distinctions between content, formats, and presentations. Computers are taking over many presentation decisions. All kinds of media can now be generated from text.

These developments have prompted a reexamination of core principles. Content and design are now governed by separate systems (content models and design systems) that have specific responsibilities.  The content expresses “what” information and messages contain, while the design expresses “how” messages and information are presented, typically layout and formatting, but not limited to those dimensions.
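To make the division of responsibility concrete, here’s a minimal sketch in TypeScript (all type and field names are hypothetical). The content model defines what a message contains; the design system defines how it is shown. The same content item can pass through different presentation specs without being rewritten.

```typescript
// Content model: the "what" — a hypothetical content type
interface PressRelease {
  headline: string;
  summary: string;
  publishDate: string; // ISO 8601 date
}

// Design system: the "how" — a hypothetical component spec
interface CardStyle {
  layout: "stacked" | "sideBySide";
  headingLevel: 1 | 2 | 3;
  theme: "light" | "dark";
}

// The same content can be rendered through different presentations.
function renderCard(content: PressRelease, style: CardStyle): string {
  const tag = `h${style.headingLevel}`;
  return `<article class="${style.theme} ${style.layout}">
  <${tag}>${content.headline}</${tag}>
  <p>${content.summary}</p>
</article>`;
}
```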

Separating content and presentation brings transparency. The belief that what you say and how you say it are indivisible is an illusion. They are not bonded together in a hermetically sealed package, but are distinct ideas and goals.

That’s not to imply what content says and how it’s said are unrelated. Rather, the reality is that each side has independent power. The presentation can make trivial or even false details seem important, and it can bury important ones. Likewise, critical information can be overlooked because of poor presentation.

The presentation does matter. But it’s a distinct dimension from content.

Content changes are explicit. The facts in content sometimes change, and messages may need to adapt to audiences. But the presentation is much more implicit and contextual. Presentations can change on a whim.  Even when the content remains consistent, the presentation may change radically depending on where and when it appears.

The audience experience is derived from both the content and its presentation.  It’s important to understand the contribution of each to that experience. 

Dealing with separation anxiety 

Loss aversion is a powerful motivation.  Because our thinking about situations is anchored in how we habitually experience them, it can be hard to embrace a different experience. What’s familiar is comforting; what’s novel is disruptive. When your child is leaving for a week-long camp away from home, he or she may have separation anxiety.  Similarly, when your content is separated from its design, it can feel disorienting.  

Content professionals have become accustomed to thinking about content and presentation together. They expect to see what the content will look like and often expect to change that appearance as well. 

Tools can lull us into believing content and presentation are inseparable. Two interaction paradigms have shaped authors’ expectations about how content and presentation interact. While very different, both imply that content should change based on the presentation chosen. 

The first approach is represented by WYSIWYG tools, such as the page builders in many CMSs, which allow authors to format text and graphics any way they please. This approach encourages authors to adjust their content and presentation concurrently.  

The second approach is represented by tools that use design templates that guide what content to create.  In traditional CMSs, the creation of the content is guided by how it will appear on a page.  A template defines what content is required. The content must adapt to the presentation defined by the template.  

When content depends on its presentation, the design decides the content’s details. Numerous online tools promote the perception that the development of content depends on its layout. The dramatic popularity of Figma in designing web pages is an extreme example. Writers play a junior role on UX teams, filling in words in a graphic design layout. While such tools may promise the freedom of self-expression, they tend to impose constraints on making changes to layouts. 

But content also needs to change. Whenever the content changes but is dependent on a fixed presentation, it creates a conflict. The presentation restricts what content is allowed. 

A major motivation for separating content and presentation is to make presentations more flexible and changeable. The separation of content from presentation has expanded with the decoupling of frontend and backend systems. This decoupling enables content to be presented in multiple ways and allows the presentation to change quickly. 

While the technical means to separate content from presentation are established and growing, the capacity of organizations to manage these dimensions remains immature. Some organizations avoid confronting change and favor expediency over improvement.

Separating concerns about tools from processes

Numerous online editing tools allow writers to change fonts, resize images, align text, change spacing, change the number of columns, and so on. Many provide more advanced layout features, such as the positioning of headings, the color of fonts, and entire color and layout themes.

“Visual editing” tools are a Band-Aid. Tools that allow authors to change both the content and its appearance are popular, and CMS vendors keep promoting them. But an awkward question arises: Why should an author make decisions about a page’s layout? The organization they work for likely publishes thousands of web pages. Shouldn’t all these pages follow common presentation guidelines rather than have individual authors decide how individual pages appear? Isn’t the UX design team supposed to be in charge of the presentation?

The desire of authors to decide the presentation is an old theme. DIY web design was once prevalent in organizations, and its problems prompted the emergence of design systems to rein in such patchwork design.

When DIY web design persists, it indicates a failure in an organization’s UX processes.

Some authors want presentation options out of necessity. They are given a generic blank page and are expected to fashion it into a meaningful experience. They hope that they can do that by dragging and dropping widgets on a screen. If effective UX design were truly so easy, millions of UX designers would be out of work.

Other times, authors are trying to override a rigid and poorly designed layout template that doesn’t support the presentation of the content they have developed. 

In both cases, the author has been shortchanged by their UX design colleagues, who failed to provide them with a serviceable layout for their content.  

Separating content from presentation forces organizations to confront how well they understand their publishing requirements. When large numbers of pages must be custom-designed because each is considered a “special case,” that’s an indication that the organization hasn’t planned its presentation adequately. Special cases, by definition, are exceptions, not defaults. No organization should feel overwhelmed by the volume of custom web pages it must design. 

The goal of separation is to enhance clarity, not enforce style. When the concept of separation first emerged with the use of CSS, it became linked to the notion of styling. But to view the presentation as merely styling is a crude understanding of the principle. 

Separation recognizes that there is no “one best way” to present content. What’s best is contextual to the situation and provisional until a newer presentation proves more effective. The same underlying content can be presented in multiple ways, which can shift how it is perceived, understood, or consumed. The goal of separating content and presentation is to allow multiple presentations of the same content, some of which will be better and clearer than others.

Separation allows the content to benefit from iterative design improvements. Presentation standards evolve to reflect learnings about what works most effectively. Separation allows UI components that are used across many content items, such as heroes or alerts, to be tested and improved. 

Yes, Virginia, content is still meaningful without presentation 

Separation requires a shift in mindset and practice. People may push back by proclaiming that it can’t be done – that the notion is nonsensical, that it threatens the magic content can offer.

A common objection contends that content can’t be stripped of its presentation and remain intelligible. This view holds that presentation is integral to the meaning of content, so it can’t be separated from the content. After all, if presentation supports the meaning of content, then content without presentation must be meaningless, right? 

To address categorical objections like this, it’s necessary to unpack beliefs about how experiences become meaningful.  Doing so helps to clear the cobwebs of unexamined assumptions and highlight the changes happening in digital practices.

The meaning of content is independent of its presentation. While the presentation is significant in conveying meaning in the broadest sense (by stressing emphasis or salience), it doesn’t follow that content depends on a specific presentation.  

Content may be harder to understand without its presentation, but it is rarely contingent on its presentation to convey its meaning. Contingency would imply the presentation materially changes the meaning of the content, which should never be the case. The presentation can change without altering the meaning of the content.

The principle of independence has some radical implications:

  1. Authors must let go of preconceptions of how their content will appear, either now or later. The content’s appearance is subject to change.
  2. The content is independent of the media it may appear in as well.

Yet because content can exist in many forms (media), it’s sometimes difficult to distinguish what’s content from what’s presentation.

Simply put, the content represents the substance or essence of what’s presented. That essence should be defined precisely and not be subject to variable interpretation.  The substance doesn’t depend on its context: It will remain the same wherever it is presented.  

Content’s meaning is persistent, however or wherever it’s presented. The literal meaning of content is fixed by its encoding. Its presentation may influence its connotation but not its literal meaning.

Communication (the ability of different people to reproduce the same message) depends on distilling the essence of a message (its content) from how it is presented.

Throughout history, people have encoded the meaning of content by using standardized notations. These standards allow people who do not know one another to interpret the content in a consistent way. 

The substance of content is typically defined as text, symbols, or structured data of some sort that can be composed or compiled into various presentations. As computer technology continues to advance, it is becoming easier to break down content presentations into constituent elements and separate the content from its presentation.

Writing started by using symbols to stand for things or concepts.  Then, writing developed symbols for the sounds of words – using letters, phonetic alphabets, and even shorthand symbols.

Later, people developed notation to represent music and even dance. The symbols don’t need to be visual. Braille can represent letters or sounds.

As symbols become formalized, they become independent of a specific presentation. At first, writing was handwritten, then engraved, and later typeset; with each step, the content became less tied to its original presentation.

Text is a surprisingly versatile way to represent content that can be transformed into presentations in all kinds of media. Even the richest content medium, the movie, is built from a text script. AI has shown the possibilities of generating audio and video from text. 

The trend has evolved to separate content from its presentation.  All kinds of content can be extracted and separated from their presentations, while presentations in many formats can be built from “raw” content. For example, an audio recording can be turned into a text transcript, and that text can be used to generate another audio presentation featuring a different voice or even a different language. 

Maps were historically considered content that was inseparable from its presentation. What value is a map outside of its presentation?  But maps today are databases of structured content that can be presented in multiple ways.  The same information can be presented as a street map or a satellite image or rely on text labels or icons, for example. Maps are manifested through their presentation but are not defined by any specific presentation. 

Skeptics may object that certain kinds of content always depend on their presentation. But if something can only be presented in one way, then it is content, not presentation. Presentation, by definition, implies that there is more than one way to present something. The presentation is not fixed.

Photos as media can be content or presentations, depending on their essence. The original source file of a photo image is content, but subsequent cropping, edits, and treatments of the image are presentations of the original content. The trend in image manipulation is toward non-destructive editing.

Even visual content can be represented non-visually. Many content creators believe that visual content has a fixed presentation and thus can’t be separated from the content it represents. That assumption is being challenged in more and more domains.

Consider diagrams. While diagrams are meant to be visual, they do not have to be represented visually. There are multiple approaches to representing diagrams as text, which can generate alternative visual renderings of the diagram. Neither the format of a diagram nor its presentation is fixed.
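Text-based diagram languages such as Mermaid and Graphviz work this way. As a minimal sketch of the underlying idea (the structure shown is hypothetical, not any particular tool’s format), a diagram can be reduced to nodes and edges that different renderers can draw, or that a screen reader could describe:

```typescript
// A flowchart reduced to its substance: nodes and ordered connections.
// Any renderer (SVG, ASCII art, spoken description) can present it.
interface Diagram {
  nodes: { id: string; label: string }[];
  edges: { from: string; to: string }[];
}

const publishingFlow: Diagram = {
  nodes: [
    { id: "draft", label: "Draft" },
    { id: "review", label: "Review" },
    { id: "publish", label: "Publish" },
  ],
  edges: [
    { from: "draft", to: "review" },
    { from: "review", to: "publish" },
  ],
};
```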

What about music?  Since music relies on standard symbols positioned on a staff, it would seem to have a fixed presentation.  But while sheet music is the most popular representation of a music score, it is not the only option.  Music scores can also be represented as text using the ABC notation, which can generate a visual score. Electronic music compositions can also be represented using the MIDI protocol, which can be manipulated to generate alternative presentations of the composition.

Mathematics is another kind of content that is often presented visually but doesn’t need a fixed presentation. Even though mathematics uses widely understood symbols, their presentation can be variable. Certain mathematical statements can be presented in more than one way. Mathematics has developed two parallel markups: one for the content and one for its presentation. 

The presentation should add meaning, not change meaning. Presentation supplies context to content, which can enhance its meaning.  The presentation helps define the intent for how readers will experience the content. 

The same content should always mean the same thing, however it is presented. The one situation where a presentation will alter the intrinsic meaning is if it reinterprets the content’s original intent by changing the selection of details — the process of context shifting. This may happen unintentionally when the content is poorly developed. For example, a less detailed view of the content may give a different impression than views with full details. Or it may occur when the content can support scenarios beyond what was originally envisioned, which shifts how the content is understood. Because these situations are possible with decoupling, it’s imperative to develop content that is not wedded to preconceptions of how it will be presented. Future presentations cannot be known in advance. 

One reason machines (whether assistive technology or AI bots) misinterpret content is that the content is ambiguous, relying on contextual cues to explain what it is meant to say.  The W3C has warned of the reliance on visual structure to convey the meaning of content: “While presentational features visually imply structure — users can determine headings, paragraphs, lists, etc. from the formatting conventions used — these features do not encode the structure unambiguously enough for assistive technology to interact with the page effectively.” 

Presentation can’t fix ambiguity in content. If your content depends on how it’s presented to be understood correctly, then the content itself is likely ambiguous and inherently confusing.   The role of presentation is to connect ideas that are intelligible on their own, not to make unintelligible ideas somehow discernible through hand-waving.

Some brands, unfortunately, publish fragments of content whose meaning is unintelligible without seeing the context in which it appears. These practices have become more prevalent in recent years, as the fetish of minimalism has been rationalized as promoting simplicity and usability, even when it often results in the opposite effect.  Readers are expected to guess the meaning of a hint or icon based on other content presented elsewhere.  These hidden meanings, while seemingly elegant, fail to inform the screen reader user or pass legal compliance reviews for clarity and the absence of potential misinterpretation. The ubiquity of bad practices does not legitimize them. Rather, they demonstrate the need for content to be explicit and clear independent of its presentation.

Treating communication as a “content design” package has resulted in numerous examples of deceptive design practices where essential information is suppressed.  These examples are misleading precisely because the content, on its own, does not fully or candidly convey the information users need to know to make an informed decision.

Humpty Dumpty and Alice, from Through the Looking-Glass. Illustration by John Tenniel.

Illusions of control 

How should decisions be made about how content appears? An individual’s latitude to make decisions about the presentation of content is not synonymous with the organization’s capacity to make these choices.  

Some authors protest when they don’t have options to change the styling or layout of their content. They jump to the conclusion that the presentation can’t be changed and believe that their input is necessary to decide how the content looks.  In essence, they assume if they don’t see an option to change the presentation, that option doesn’t exist. 

Even though authors are not in charge of the presentation, that doesn’t imply that the presentation is fixed. Organizations can change the presentation whenever they want to. Organizations generally aim to have the various content they publish presented in a consistent manner because such consistency promotes clarity and understanding. They don’t want to encourage helter-skelter redesigns of individual web pages. 

The presentation can change independently of the content. The presentation is not fixed and can change readily when the organization decides to do so.  

Yet, such changes are not the byproduct of content changes. They are separate decisions. What that means is:

  1. Changing the content does not alter its presentation or layout. For example, a longer title won’t necessarily shrink in font size to fit a fixed space.
  2. Changing the content and changing the presentation are not concurrent activities because separate systems manage them. If you want to adjust both the content and the presentation, you need to pivot between separate modes.

The second point raises a question: Could the same individual change both the content and its presentation?  In principle, yes. But in practice, the two sides are intended to be governed separately. Each has rules for what is allowed and changes must conform to these rules.  For example, the content can’t use nonstandard terms or punctuation.  Similarly, the presentation can’t incorporate nonstandard colors or fonts. 

The presentation is decided by rules that apply to multiple pages, not by individual choices for specific pages. Some individuals deride rules for constricting their expression or preventing them from configuring their web pages as they’d like.  But rules aren’t stifling.  They actually simplify processes and broaden the scope of possible changes by enabling global changes.  By having rules, organizations can change the content everywhere on a website without worrying that it will break the design and force fixes to the presentation.  They can also change the presentation globally without worrying about needing to adjust existing content. 

Singleton pages demonstrate the need for separation. Many objections to separation focus on singleton pages, which are one-off pages that have unique content and require a special layout because the nature of the content is unlike content elsewhere. An example would be a webpage presenting a timeline. While single pages seem to represent a tight correspondence between the content and its presentation, the presentation and content remain independent of each other.

The mistake some people make is to confuse design instances with design versions. Even if only a single page has a unique layout (one design instance), that does not imply the presentation is fixed (that there can be only one version of that instance). An alternative presentation could be developed and used. 

Because the organization could decide to change the design of a unique webpage later, it’s important that the content should lead the design, not follow it.  

Content is also subject to change, and presentations must be prepared to “flex” to adjust to content changes. The original author often won’t control the content over its lifespan. Authors switch jobs, meaning someone else might revise the content later. 

With online content, there’s no single author.  All online content appears alongside other online content that has been created by other individuals at different times. 

It’s necessary to distinguish the content context (what other content is adjacent) from the context of its presentation (layout, formatting, and other presentational choices). 

This gets into content assembly: how content is layered into larger experiences. 

Content assembly is not presentation 

Content assembly is increasingly important as organizations move away from presentation-defined content creation.  Presentation-driven templated content traditionally determined the content’s assembly.  As practices move away from using templates to define the content, the role of assembly is becoming more significant, though it remains poorly understood.  Developers often confuse content assembly and content presentation, especially if they have spent careers working with template-based CMSs.  

Because templates previously handled assembly, some people mistakenly consider content assembly as part of content presentation. But assembly is distinct from presentation. The context of the content (the related content that appears together) is conceptually distinct from the presentation context (how those content items are presented).

The layout is indifferent, while the assembly is opinionated. The layout is generic and agnostic about what content appears in a slot. Content assembly, by contrast, is specific about which content items are conceptually connected.  

Assembly determines which content pieces will appear together – though not how they will appear.

When assembly is subsumed by content presentation decisions, the construction of the content is fragile and brittle. 

Like Humpty Dumpty after his fall, poorly assembled content can’t be put back together. Once broken, it’s unusable.

Fragile content that can’t be reassembled typically has been defined by its design.  

If content is assembled correctly, there should be no “breaking changes.”

Not all content assembly happens the same way.  The biggest barrier is how various people think about content assembly. They don’t make a distinction between two kinds of assembly:

  1. Intrinsic assembly, where units must be provided together to make sense and, therefore, should be preassembled during content development 
  2. Extrinsic assembly, where variable combinations could potentially be offered and, therefore, is best defined outside of the content development process

Both the goals and process for intrinsic and extrinsic assembly are different.

Intrinsic assembly connects content that is intrinsically related in meaning: The pieces together form the larger message.  The content is preassembled through two means:

  1. Linking (or referencing) items
  2. Ordering items (in lists or as arrays of items)

Intrinsic relationships are predefined: A goes with B, or A always precedes B. The pieces used in intrinsic assembly are generally broken apart to support content reuse or maintenance rather than support variability in which pieces are combined.  Embedding items (images, for example) within another content item is another kind of intrinsic assembly, albeit less predefined than linking, since the decision of whether to embed is optional.
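Here’s a minimal sketch of intrinsic assembly in TypeScript (the content types and identifiers are hypothetical). The content itself declares which items are linked and in what order; no layout is implied:

```typescript
// A reference to another content item, resolved at delivery time
interface ContentRef {
  id: string;
}

// Hypothetical content type whose pieces only make sense together
interface Recipe {
  id: string;
  title: string;
  heroImage: ContentRef; // linking: A goes with B
  steps: ContentRef[];   // ordering: A always precedes B
}

const recipe: Recipe = {
  id: "recipe-042",
  title: "Sourdough starter",
  heroImage: { id: "img-starter-jar" },
  steps: [
    { id: "step-mix-flour" },
    { id: "step-feed-daily" },
    { id: "step-watch-for-bubbles" },
  ],
};
```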

Extrinsic assembly is used when the communication is more contextual and situationally dependent. It often draws on content variations that have been developed to address highly specific situations where the right combination can’t be preassembled easily because they involve too many scenarios.  

Extrinsic assembly relies on predefining evaluative rules or creating instructions that are not fixed. These rules define which pieces to select and what attributes they should have under specific conditions. This kind of programmatic assembly is often based on contextual rules relating to processes. 

Sometimes rules can be written into a schema, such as JSON Schema, when they persist as if/then/else statements. Otherwise, the rules are written into code that matches specific variables and values. The rules or instructions could be written in GraphQL, JavaScript, or some other programming language.
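As an illustration, a simple extrinsic assembly rule might look like the following TypeScript sketch (the context fields, variant attributes, and matching logic are all hypothetical). The rule lives outside the content and picks the right variant at delivery time:

```typescript
// Hypothetical conditions describing the delivery situation
interface AssemblyContext {
  audience: "new-customer" | "returning-customer";
  region: string;
  channel: "web" | "email" | "kiosk";
}

// Content variants carry attributes the rule matches against
interface ContentVariant {
  id: string;
  audience?: string;
  region?: string;
}

// If/then/else rule: prefer the most specific match, fall back to a default
function selectVariant(
  variants: ContentVariant[],
  ctx: AssemblyContext
): ContentVariant {
  return (
    variants.find(v => v.audience === ctx.audience && v.region === ctx.region) ??
    variants.find(v => v.audience === ctx.audience) ??
    variants[0]
  );
}
```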

Authors have control over the assembly. Once organizations embrace true separation of content from its presentation by ensuring that presentation isn’t defining the assembly of content, authors can regain control of important decisions.  

With intrinsic assembly, authors can connect content pieces within their editor.  Ideally, the content model behind the scenes has already defined relationships between various content types, so the author doesn’t need to figure out which types belong together. Instead, they can focus on associating related content items.  If items must appear in a specific order to make sense to users, they can indicate that.

Extrinsic assembly happens outside the editor in the API layer. Because extrinsic assembly instructions rely on code, developers have, until recently, been the ones responsible for defining extrinsic assembly. But in the past few years, a new category of content orchestration tools has emerged that allows authors and other business users to define rules for assembling content without needing to rely on a developer.

Content assembly gives power to authors to decide which content pieces to deliver to audiences. 

By setting up content assembly correctly and disentangling it from presentation, organizations remove common “it can’t be done” objections.

Poor implementations are a barrier, not an excuse for FUD

Separating content from its presentation has triggered resistance for many years. Change management case studies teach that people have difficulty changing habits and adopting new practices. It’s much easier to stick with the familiar, even if it isn’t desirable in the long term.

Yet the imperative of implementing such a separation only keeps growing. Planning and managing web pages whose content and designs are tangled together is simply not sustainable. And shifts in technology, from composable systems to AI, require that content be unencumbered by its formatting and presentation. Layout can’t signal what content means, if for no other reason than that machines won’t see it.

Given the longstanding resistance to separation, one may wonder how the concept will ever gain the traction necessary to become the default practice in organizations. 

The good news is that separation is a sound concept that provides multiple benefits. Fear, uncertainty, and doubt may conspire to cloud these benefits, but they don’t negate them. Separating content from presentation is essential to building and improving upon prior content and design work.

The biggest barrier to the universal adoption of content-presentation separation is poor implementation. Bad tools, weak requirements, and immature knowledge all contribute to poor implementations, which seem to validate the expectation that separation can’t be done.

Yet poor implementations, while common, are hardly inevitable. Many organizations are moving up the maturity ladder. They recognize that the stakes are too important to ignore essential transformation in UX practices. They will leave behind organizations that commingle their content and presentation. The fear will shift to being left behind. 

–Michael Andrews


Bridging the divide between structured content and user interface design

Decoupled design architectures are becoming common as more organizations embrace headless approaches to content delivery. Yet many teams encounter issues when implementing a decoupled approach. What needs to happen to get them unstuck?

Digital experts have long advocated for separating or decoupling content from its presentation. This practice is becoming more prevalent with the adoption of headless CMSs, which decouple content from UI design. 

Yet decoupling has been held back by UI design practices. Enterprise UX teams rely too heavily on design systems as the basis for organizing UIs, creating a labor-intensive process for connecting content with UI components.

Why decoupled design is hard

Decoupled design, where content and UI are defined independently, represents a radical break from incumbent practices used by design teams. Teams have been accustomed to defining UI designs first before worrying about the content. They create wireframes (or more recently, Figma files) that reflect the UI design, whether that’s a CMS webpage template or a mobile app interface.  Only after that is the content developed.

Decoupled design is still unfamiliar to most enterprise UX teams. It requires UX teams to change their processes and learn new skills. It requires robust conceptual thinking, proactively focusing on the patterns of interactions rather than reactively responding to highly changeable details.

The good news: Decoupling content and design delivers numerous benefits. A decoupled design architecture brings teams flexibility that hasn’t been possible previously. Content and UI design teams can each focus on their tasks without generating bottlenecks arising from cross-dependencies. UI designs can change without requiring that the content be rewritten. UI designers can understand what content needs to be presented in the UI before they start their designs. Decoupling reduces uncertainty and shortens the iteration cycles caused by content and UI design changes needing to adjust to each other.

It’s also getting easier to connect content to UI designs. New tools, such as RealContent, can connect structured content in a headless CMS directly to a UI design in Figma. Because decoupled design is API-centric, UX teams have the flexibility to present content in almost any tool or framework they want.

The bad news: Decoupled design processes still require too much manual work. While they are not more labor intensive than existing practices, decoupled design still requires more effort than it should.

UI designers need to focus on translating content requirements into a UI design. They first need to look at the user story or job to be done and translate that into an interaction flow. Then, they need to consider how users will interact with content, screen by screen. They need to map the UI components presented in each screen to fields defined in the content model.

When UX teams need to define these details, they are commonly starting from scratch. They map UI to the content model on a case-by-case basis, making the process slow and potentially inconsistent. That’s hugely inefficient and time-consuming.

Decoupled design hasn’t been able to realize its full potential because UX design processes need more robust ways of specifying UI structure. 

UI design processes need greater maturity

Design systems are limited in their scope. In recent years, much of the energy in UI design processes has centered around developing design systems. Design systems have been important in standardizing UI design presentations across products. They have accelerated the implementation of UIs.  

Design systems define specific UI components, allowing their reusability. 

But it’s essential to recognize what design systems don’t do. They are just a collection of descriptions of the UI components that are available for designers to use if they decide to. I’ve previously argued that design systems don’t work unless they talk to content models.

Design systems, to a large extent, are content-agnostic. They are a catalog of empty containers, such as cards or tiles, that could be filled with almost anything. They don’t know much about the meaning of the content their components present, and they aren’t very robust in defining how the UI works. They aren’t a model of the UI. They are a style guide. 

Design systems define the UI components’ presentation, not the UI components’ role in supporting user tasks. They define the styling of UI components but don’t direct which component must be used. Most of these components are boxes constructed from CSS. 

Unstructured design is a companion problem to unstructured content. Content models arose because unstructured content is difficult for people and machines to manage. The same problem arises with unstructured UI designs.

Many UI designers mistakenly believe that their design systems define the structure of the UI. In reality, they define only the structure of the presentation: which box is embedded in another box.  While they sometimes contain descriptive annotations explaining when and how the component can be used, these descriptions are not formal rules that can be implemented in code. 

Cascading Style Sheets do not specify the UI structure; they only specify the layout structure. No matter how elaborately a UI component layout is organized in CSS or how many layers of inheritance design tokens contain, the CSS does not tell other systems what the component is about.

Designers have presumed that the Document Object Model in HTML structures the UI.  Yet, the structure that’s defined by the DOM is rudimentary, based on concepts dating from the 1990s, and cannot distinguish or address a growing range of UI needs. The DOM is inadequate to define contemporary UI structure, which keeps adding new UI components and interaction affordances. Although the DOM enables the separation of content from its presentation (styling), the DOM mixes content elements with functional elements. It tries to be both a content model and a UI model but doesn’t fulfill either role satisfactorily.

Current UIs lack a well-defined structure. It’s incredible that after three decades of the World Wide Web, computers can’t really read what’s on a webpage. Bots can’t easily parse the page and know with confidence the role of each section.  IT professionals who need to migrate legacy content created by people at different times in the same organization find that there’s often little consistency in how pages are constructed. Understanding the composition of pages requires manual interpretation and sleuthing. 

Even Google has trouble understanding the parts of web pages.  The problem is acute enough that a Google research team is exploring using machine vision to reverse engineer the intent of UI components.  They note the limits of DOMs: “Previous UI models heavily rely on UI view hierarchies — i.e., the structure or metadata of a mobile UI screen like the Document Object Model for a webpage — that allow a model to directly acquire detailed information of UI objects on the screen (e.g., their types, text content and positions). This metadata has given previous models advantages over their vision-only counterparts. However, view hierarchies are not always accessible, and are often corrupted with missing object descriptions or misaligned structure information.” 

The lack of UI structure interferes with the delivery of structured content. One popular attempt to implement a decoupled design architecture, the Block Protocol spearheaded by software designer Joel Spolsky, also notes the unreliability of current UI structures. “Existing web protocols do not define standardized interfaces between blocks [of content] and applications that might embed them.”

UI components should be machine-readable

Current UI designs aren’t machine-readable – they aren’t intelligible to systems that need to consume the code. Machines can’t understand the idiosyncratic terminology added to CSS classes. 

Current UIs are coded for rendering by browsers. They are not well understood by other kinds of agents.  The closest they’ve come is the addition of WAI-ARIA code that adds explicit role-based information to HTML tags to help accessibility agents interpret how to navigate contents without audio, visual, or haptic inputs and outputs. Accessibility code aims to provide parity in browser experiences rather than describe interactions that could be delivered outside of a browser context. Humans must still interpret the meaning of widgets and rely on browser-defined terminology to understand interaction affordances. 

The failure of frontend frameworks to declare the intent of UI components is being noticed by many parties.  UI needs a model that can specify the purpose of the UI component so that it can be connected to the semantic content model.  

A UI model will define interaction semantics and rules for the functional capabilities in a user interface. A UI model needs to define rules relating to the functional purpose of various UI components and when they must be used.  A UI model will provide a level of governance missing from current UI development processes, which rely on best-efforts adherence to design guidelines and don’t define UI components semantically. 

When HTML5 was introduced, many UI designers hailed the arrival of “semantic HTML.” But HTML tags are not an adequate foundation for a UI model. HTML tags are limited to a small number of UI elements that are overly prescriptive and incomplete. HTML tags describe widgets like buttons rather than functions like submit or cancel. While historically, actions were triggered by buttons, that’s no longer true today. Users can invoke actions using many UI affordances. UI designers may change the UI element supporting an action from a button to a link if they change the context where the action is presented, for example. Hard-coding the widget name to indicate its purpose is not a semantic approach to managing UIs. This issue becomes more problematic as designers must plan for multi-modal interaction across interfaces. 

UI specifications must transcend the widget level. HTML tags and design system components fall short of being viable UI models because they specify UI instances rather than UI functions.  A button is not the only way for a user to submit a request. Nor is a form the only way for a user to submit input.

When a designer needs to present a choice to users, the design system won’t specify which UI component to use. Rather, it will describe a range of widgets, and it is up to the designer to figure out how they want to present the choice.

Should user choices be presented as a drop-down menu? A radio button? A slider? Design systems only provide descriptive guidance. The UI designer needs to read and interpret them. Rarely will the design system provide a rule based on content parameters, such as: if the number of choices is greater than three and the choice text is less than 12 characters, use a drop-down.
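Such a rule is easy to express in code once the content parameters are known. Here is a hedged sketch in TypeScript, following the example above (the widget names and thresholds are illustrative):

```typescript
type ChoiceWidget = "drop-down" | "radio-group" | "autocomplete";

// Pick a widget from the content parameters, not designer judgment alone
function chooseWidget(options: string[]): ChoiceWidget {
  const allShort = options.every(o => o.length < 12);
  if (options.length <= 3) {
    return "radio-group"; // few options: show them all at once
  }
  if (allShort) {
    return "drop-down"; // the rule quoted above
  }
  return "autocomplete"; // many long options: let users type instead
}
```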

UIs should be API-ready. As content becomes more structured, semantically defined, and queryable via APIs, the content needs the UI designs that present it to be structured, too. Content queries need to be able to connect to UI objects that will present the content and allow interaction with the content. Right now, this is all done on an ad hoc basis by individual designers.

Let’s look at the content and UI sides from a structural perspective.

On the content side, a field may have a series of enumerated values: predefined values such as a controlled vocabulary, taxonomy terms, ordinal values, or numeric ranges. Those values are tracked and managed internally and are often connected to multiple systems that process information relating to the values. 

On the UI side, users face a range of constrained choices. They must pick from among the presented values. The values might appear as a pick list (or a drop-down menu or a spinner). The first issue, noted by many folks, is the naming problem in design systems. Some systems talk about “toasts,” while other systems don’t refer to them. UI components that are essentially identical in their outward manifestations can operate under different names. 

Why is this component used? The bigger structural problem is defining the functional purpose of the UI component.  The component chosen may change, but its purpose will remain persistent. Currently, UI components are defined by their outward manifestation rather than their purpose. Buttons are defined generically as being primary or secondary – expressed in terms of the visual attention they draw – rather than the kind of actions the button invokes (confirm, cancel, etc.)

Constrained choice values can be presented in multiple ways, not just as a drop-down menu. They could appear as a slider (especially if values are ranked in some order) or even as free text, where the user enters anything they wish and the system decides the closest match to the enumerated values it manages.

A UI model could define the component as a constrained value option. The UI component could change as the number of values offered to users changed. In principle, the component updating could be done automatically, provided there were rules in place to govern which UI component to use under which circumstances.  

The long march toward UI models

A design system specifies how to present a UI component: its colors, size, animation behaviors, and so on.  A UI model, in contrast, will specify what UI component to present: the role of the component (what it allows users to do) and the tasks it supports. 

Researchers and standards organizations have worked on developing UI models for the past two decades. Most of this work is little known today, eclipsed by the attention in UI design to CSS and JavaScript frameworks.  

In the pre-cloud era, at the start of the millennium, various groups looked at how to standardize descriptions of the WIMP (windows, icons, menu, pointers) interface that was then dominant. The first attempt was Mozilla’s XUL. A W3C group drafted a Model-Based User Interfaces specification (MBUI).  Another coalition of IBM, Fujitsu, and others developed a more abstract approach to modeling interactions, the Software & Systems Process Engineering Meta-Model Specification.

Much of the momentum for creating UI models slowed down as UI shifted to the browser with the rise of cloud-based software. However, the need for platform-independent UI specification continues.

Over the past decade, several parties have pursued the development of a User Interface Description Language (UIDL).  “A User Interface Description Language (UIDL) is a formal language used in Human-Computer Interaction (HCI) in order to describe a particular user interface independently of any implementation….meta-models cover different aspects: context of use (user, platform, environment), task, domain, abstract user interface, concrete user interface, usability (including accessibility), workflow, organization, evolution, program, transformation, and mapping.”

Another group defines UIDL as “a universal format that could describe all the possible scenarios for a given user interface.”

Task and scenario-driven UI modeling. Source: OpenUIDL

Planning beyond the web. The key motivation has been to define the user interface independently of its implementation. But even recent work at articulating a UIDL has largely been web-focused. 

Providing a specification that is genuinely independent of implementation requires that it not be specific to any delivery channel.  Most recently, a few initiatives have sought to define a UI model that is channel agnostic.  

One group has developed OpenUIDL, “a user interface description language for describing omnichannel user interfaces with its semantics by a meta-model and its syntax based on JSON.”

UI models should work across platforms. Much as content models have allowed content to be delivered to many channels via APIs, UI models are needed to specify user interaction across various channels. While responsive design has helped allow a design to adapt to different devices that use browsers, a growing range of content is not browser-based. In addition to emerging channels such as mixed reality (XR) promoted by Apple and Meta and generative AI chatbots promoted by Microsoft, Google, OpenAI, and others, the IoT revolution is creating more embedded UIs in devices of all kinds. 

The need for cross-platform UI models isn’t only a future need. It shapes companies’ ability to coordinate decades-old technologies such as ATMs, IVRs, and web browsers. 

A model can support a ‘portable UI.’  A prominent example of the need for portable UIs comes from the financial sector, which relies on diverse touchpoints to service customers.  One recent UI model focused on the financial industry is called Omni-script. It provides “a basic technique that uses a JSON based user interface definition format, called omni-script, to separate the representation of banking services in different platforms/devices, so-called channels….the target platforms that the omnichannel services span over contains ATMs, Internet banking client, native mobile clients and IVR.”

The ideal UI model will be simple enough to implement but flexible enough to address many modes of interaction (including natural language interfaces) and UI components that will be used in various interfaces. 

Abstraction enables modularity.  UI models share a level of abstraction that is missing in production-focused UI specifications.  

The process of abstraction starts with an inventory of UI components a firm has deployed across channels and touchpoints. Ask what system and user functionality each component supports.  Unlike design systems development, which looks to standardize the presentation of components, UI models seek to formalize how to describe the role of each component in supporting a user or system task.  

The abstraction of UI components according to the tasks they support. Source: W3C Model-Based UI XG 

Suppose the functionality is intended to provide help for users. Help functionality can be further classified according to the kind of help offered. Will the functionality diagnose a problem, guide users in making a decision, disambiguate an instruction, introduce a new product feature, or provide an in-depth explanation of a topic?  

A UI model maps relationships. Consider functionality that helps users disambiguate the meaning of content.  We can refer to UI components as disambiguation elements in the UI model (a subset of help elements) whose purpose is to clarify the user’s understanding of terms, statements, assertions, or representations. They would be distinct from confirmation elements that are presented to affirm that the user has seen or heard information and acknowledges or agrees to it.  The model would enumerate different UI elements that the UI design can implement to support disambiguation.  Sometimes, the UI element will be specific to a field or data type. Some examples of disambiguation elements are:

  • Tooltips used in form instructions or labels
  • “Explain” prompt requests used in voice bots
  • Annotations used in text or images
  • Visual overlays used in photos, maps, or diagrams
  • Did-you-mean counter-suggestions used in text or voice search
  • See-also cross-references used in menus, indexes, and headings

The model can further connect the role of the UI element with:

  1. When it could be needed (the user tasks such as content navigation, information retrieval, or providing information) 
  2. Where the elements could be used (context of application, such as a voice menu or a form)

The model will show the M:N relationships between UI components, UI elements, UI roles and subroles, user tasks, and interaction contexts. Providing this traceability will facilitate a rules-based mapping between structured content elements defined in the content model and cross-channel UX designs delivered via APIs. As these relationships become formalized, it will be possible to automate much of this mapping to enable adaptive UI designs across multiple touchpoints. 
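No standard schema for UI models exists yet, so the following TypeScript sketch is speculative; every name in it is illustrative. It shows how a disambiguation element might be declared with its role, tasks, contexts, candidate components, and a mapping to a content model field:

```typescript
// Roles describe purpose; components are just candidate implementations
interface UIModelElement {
  role: "disambiguation" | "confirmation"; // functional purpose
  parentRole?: "help";                     // subrole relationship
  userTasks: string[];            // when the element could be needed
  contexts: string[];             // where the element could be used
  candidateComponents: string[];  // UI components able to fulfill the role
  contentField?: string;          // mapping to the content model
}

const formHint: UIModelElement = {
  role: "disambiguation",
  parentRole: "help",
  userTasks: ["providing information"],
  contexts: ["form"],
  candidateComponents: ["tooltip", "inline-hint"],
  contentField: "fieldInstruction", // structured content supplies the text
};
```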

The model modularizes functionality based on interaction patterns.  Designers can combine functional modules in various ways. They can provide hybrid combinations when functional modules are not mutually exclusive, as in the case of help. They can adapt and adjust them according to the user context: what information the user knows or has available, or what device they are using and how readily they can perform certain actions. 

What UI models can deliver that’s missing today

A UI model allows designers to focus on the user rather than the design details of specific components, recognizing that multiple components could be used to support users. It can provide critical information before designers choose a specific UI component from the design system to implement for a particular channel.

Focus the model on user affordances, not widgets. When using a UI model, the designer can focus on what the user needs to know before deciding how users should receive that information. They can focus on the user’s task goals – what the user wants the computer to do for them – before deciding how users must interact with the computer to satisfy that need. As interaction paradigms move toward natural language interfaces and other non-GUI modalities, defining the interaction between users, systems, and content will be increasingly important.  Content is already independent of a user interface, and interaction should become unbound to specific implementations as well.  Users can accomplish their goals by interacting with systems on platforms that look and behave differently. 

Both content and interactions need to adapt to the user context. 

  • What the user needs to accomplish (the user story)
  • How the user can achieve this task (alternative actions that reflect the availability of resources such as user or system information and knowledge, device capabilities, and context constraints)
  • The class of interaction objects that allow the user to convey and receive information relating to the task

Much of the impetus for developing UI models has been driven by the need to scale UI designs to address complex domains. For UI designs to scale, they must be able to adapt to different contexts.

UI models enable UX orchestration. A UI model can represent interactions at an abstract level so that content can be connected to the UI layer independently of which UI is implemented or how the UI is laid out.

For example, users may want to request a change, specify the details of a change, or confirm a change. All these actions will draw on the same information. But they could be done in any order and on various platforms using different modalities. 

Users live in a multi-channel, multi-modal world. Even a simple action, such as confirming one’s identity while online, can be done through multiple pathways: SMS, automated phone call, biometric recognition, email, authenticator apps, etc. 

When firms specify interactions according to their role and purpose, it becomes easier for systems to hand off and delegate responsibilities to different platforms and UIs that users will access.  Currently, this orchestration of the user experience across touchpoints is a major challenge in enterprise UX.  It is difficult to align channel-specific UI designs with the API layer that brokers the content, data, and system responses across devices.

UI models can make decoupled design processes work better

UI models can bring greater predictability and governance to UI implementations. Unlike design systems, UI models do not rely on voluntary opt-in by individual developers. They become an essential part of the fabric of the digital delivery pipeline and remove the inconsistent ways developers may decide to connect UI components to the content model – sometimes derisively referred to as “glue code.” Frontend developers still have options about which UI components to use, provided the UI component matches the role specified in the UI model.  

UI governance is a growing challenge as new no-code tools allow business users to create their UIs without relying on developers. Non-professional designers could use components in ways not intended or even create new “rogue” containers. A UI model provides a layer to govern UIs so that the components are consistent with their intended purpose. 

UI models can link interaction feedback with content. A UI model can provide a metadata layer for UIs. It can, for example, connect state-related information associated with UI components, such as allowed, pending, or unavailable, with content fields. This can reduce the manual work of mapping these states, making implementation more efficient.
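As a small hypothetical sketch of such a metadata layer in TypeScript (component, state, and field names are invented for illustration):

```typescript
type ComponentState = "allowed" | "pending" | "unavailable";

// Binds an interaction state to the content field that explains it
interface StateBinding {
  component: string;    // UI component identifier
  state: ComponentState;
  messageField: string; // content model field supplying the message
}

const bindings: StateBinding[] = [
  { component: "submitAction", state: "pending", messageField: "processingMessage" },
  { component: "submitAction", state: "unavailable", messageField: "outOfServiceMessage" },
];
```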

An opportunity to streamline API management. API federation is currently complex to implement and difficult to understand.  The ad hoc nature of many federations often means that there can be conflicting “sources of truth” for content, data, and transactional systems of record.

Many vendors are offering tools providing composable front-ends to connect with headless backends that supply content and data.  However, composable frontends are still generally opinionated about implementation, offering a limited way to present UIs that don’t address all channels or scenarios. A UI model could support composable approaches more robustly, allowing design teams to implement almost any front end they wish without difficulty. 

UI models can empower business end-users. Omnichannel previews are challenging, especially for non-technical users. By providing a rule-based encoding of how content is related to various presentation possibilities in different contexts and on various platforms, UI models can enable business users to preview different ways customers will experience content. 

UI models can future-proof UX.  User interfaces change all the time, especially as new conventions emerge. The decoupling of content and UI design makes redesign easier, but it is still challenging to adapt a UI design intended for one platform to present on another. When interactions are grounded in a UI model, this adaptation process becomes simpler.

The work ahead

While a few firms are developing UI models, and a growing number are seeing the need for them, the industry is far from having an implementation-ready model that any firm can adopt and use immediately. Much more work is needed.

One lesson of content models is that the need to connect systems via APIs drives the model-making process. It prompts a rethinking of incumbent practices and a willingness to experiment. While the scope of creating UI models may seem daunting, we have more AI tools to help us locate common interaction patterns and catalog how they are presented.  It’s becoming easier to build models.  

–Michael Andrews