Categories
Content Engineering

Where does Content Structuring Happen?

Many useful approaches are available to support the structuring of content.  It’s important to understand their differences, and how they complement each other.  I want to consider content structuring in terms of a spectrum of approaches that address different priorities. 

Only a few years ago most discussion about structuring content focused on the desirability of doing it.  Now we are seeing more written about how to do it, spawning discussion around topics such as content models, design patterns, templates, message hierarchies, vocabulary lists, and other approaches.  All these approaches contribute to structuring content.  Structuring content involves a combination of human decisions, design methods, and automated systems.

Lately I’ve been thinking about how to unlock the editorial benefits of content models, which are generally considered a technical topic. I realized that discussing this angle could be challenging because I would need to separate general ideas about structuring content from concepts that are specific to content models.  While content models provide content with structure, so do other activities and artifacts.  If the goal is to structure content, what’s unique about content models?   We need to unpack the concept of structuring content to clarify what it means in practice.

Yet structuring content is not the true end goal.  Structuring content is simply a means to an end.  It’s the benefits of structuring content that are the real goal.  The expected benefits include improved consistency, flexibility, scalability, and efficiency. Ultimately, the goal should be to deliver more unique and distinctive content: tailored and targeted to user needs, and reflecting the brand’s strategic priorities.

In reality, content structuring is not a thing.  It’s an umbrella term that covers a range of things that promote benefits, such as content consistency and so on.  Many approaches contribute.  But no single approach can claim “mission accomplished.”

What are we talking about, exactly?

People with different roles and responsibilities talk about structuring content.  It sometimes seems like they are talking about different surface features of a giant pachyderm.

The Guardian newspaper earlier this month published an article on how they use “a content model to create structured content”.  

“Structured content works well for a media company like The Guardian, but the same approach can work for any organization that publishes large amounts of data on multiple platforms. Whether you’re a government department, art gallery, retailer or university.”

The Guardian

I applaud these sentiments, and endorse them enthusiastically.  Helpfully, the article provided tangible examples of how structuring content can make publishing content easier.  But the article unintentionally highlighted how terminology on this topic can be used in different ways.  The article mentions content reuse as a benefit of content structuring.  But the examples related more to republishing finished articles with slight modification, rather than reusing discrete components of content to build new content. When the writer, a solutions architect, refers to a content type, he identifies video as an example.  Most content strategists would consider video a content format, not a content type.  Similarly, when the article illustrates the Guardian’s content model, it looks very limited in its focus (a generic article) — much more like a content type than a full content model.  

Mike Atherton commented on Twitter that the article, like many discussions of content structuring, didn’t address distinctions between “presentation structure vs semantic structure, how the two are compatible or, indeed, different, and whether they can or should be captured in the same model.”

Mike raises a fair point: we often talk about different aspects of structure, without being explicit about what aspect is being addressed.

I think about structure as a spectrum. As yet there’s no Good Housekeeping Seal Of Approval on the one right way to structure content.  Even people who are united in enthusiasm for content structure can diverge in how they discuss it — as the Guardian article shows.  I know other people use different terminology, define the same terminology in different ways, and follow slightly different processes.  That doesn’t imply others are wrong.  It merely suggests that practices are still far from settled.  How an organization uses content structuring will partly depend on the kind of content they publish, and their specific goals.  The Guardian’s approach makes sense for their needs, but may not serve the needs of other publishers.  

For me, it helps to keep the focus on the value that each distinct kind of decision offers.  For those who write simple articles, or write copy for small apps that don’t need to be coordinated with other content, some of these distinctions won’t be as important.  Structure becomes increasingly important for enterprises trying to coordinate different web-related tasks.  The essence of structure is repeatability.

The Spectrum of content structuring

The structuring of content needs to support different decisions.   

Structure brings greater precision to content. It can influence five dimensions:

  1. How content is presented
  2. What content is presented
  3. Where content is presented
  4. What content is required
  5. What content is available

Some of these issues involve audience-facing aspects, and others involve aspects handled by backend systems.  

Different aspects of content structure

Content doesn’t acquire its structure at one position along this spectrum.  The structuring of content happens in many places.  Each decision on the spectrum has a specific activity or artifact associated with it. The issues addressed by each decision can be inter-related.  But they shouldn’t become entangled, where it is difficult to understand how each influences another.  

UI Design or Interaction Design

UI design is not just visual styling.  Interaction design shapes the experience by structuring micro-tasks and the staging of information. From a content perspective, it’s not so much about surface behaviors such as animated transitions, but how to break up the presentation of content into meaningful moments.  For example, progressive disclosure, which can be done using CSS, both paces the delivery of content and directs attention to specific elements within the content.  Increasingly, UX writers are designing content within the context of a UI design or prototype.  They need to understand the cross-dependencies between the behavior of content and how it is understood and perceived.

The design of behavior involves the creation of structure.  Content needs to behave predictably to be understandable.  UI design leverages structure by utilizing design patterns and design libraries.    

Content Design

Content design encompasses the creation and arrangement of different long and short messages into meaningful experiences. It defines what is said.  

Content design is not just about styling words.  It involves all textual and visual elements that influence the understanding and perception of messages, including the interaction between different messages over time and in different scenarios.  Words are central to content design; some professionals involved with content design refer to themselves as UX writers. Terminology is finely tuned and controlled to be consistent, clear, and on-brand. 

Writers commonly break content into blocks of text.  They may use a simple tool like Dropbox Paper to provide a “distraction free” view of different text elements that’s unencumbered by the visual design.  It may look a bit like a template (and is sometimes referred to as one), but its purpose is to help writers to plan their text, rather than to define how the text is managed.  The design of content relies heavily on the application of implicit structure.  Audiences understand better when they are comfortable knowing what they can expect.  The design may utilize a message hierarchy (identifying major and minor messages), or voice and tone guidelines that depend on the scenario in which the writing appears.  For the most part these implicit structures are managed offline through guidelines for writers, rather than through explicit formal online systems.  But some writers are looking to operationalize these guidelines into more formal design systems that are easier and more reliable to use.

Content design involves delivering a mix of the fresh and the familiar.  The content that’s fresh, that talks about novel issues or delivers unique or distinctive messages, is unstructured: it doesn’t rely on pre-existing elements.   Messages that are familiar (recycled in some way) have the possibility of becoming structured elements.  Content design thus involves both the creation of elements that will be reused (such as feedback messaging), and ad hoc content that will be specific to a given screen.  But even ad hoc elements present the opportunity to reuse certain phrases and terminology so that they are consistent with the content’s tone of voice guidelines.   Some publishers are even managing strings of phrases to reuse across different content.

Page Templates

Templates provide organizational structure for the content, for example by prioritizing the order of content and creating a hierarchy between primary and secondary content.  The template defines the elements that will be consistent for any content using the template, in contrast to the interaction design, which defines the elements that will be fluid and will change and respond to users as they consume the content.

Templates provide slots to fill with content. Page templates specify HTML structure, in contrast to the drafting templates writers use to design specific content elements.   Page templates express organizational structure, such as where an image should be placed, or where a heading is needed. The template doesn’t indicate what each heading says, which will vary according to the specifics of the content.  Templates can sometimes incorporate fixed text elements, such as a copyright notice in the footer of the page, if they are specific to that page and are unlikely to change.  The critical role that templates play is that they define what’s fixed about a page that the audience will see.  Templates provide the framework for the layout of the content, allowing other aspects of the content to adjust.
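
The idea of slots and fixed elements can be shown with a minimal sketch in Python. Everything here is an assumption made for illustration: the slot names, the footer text, and the `render_page` helper are hypothetical, and a real CMS templating layer will look different.

```python
from html import escape
from string import Template

# Hypothetical page template: the HTML structure and the footer text are fixed,
# while $heading, $image_url and $body are slots filled for each page.
PAGE_TEMPLATE = Template(
    '<article>\n'
    '  <h1>$heading</h1>\n'
    '  <img src="$image_url" alt="">\n'
    '  <div class="body">$body</div>\n'
    '  <footer>Copyright Example Publisher</footer>\n'
    '</article>'
)

def render_page(heading: str, image_url: str, body: str) -> str:
    """Fill the template's slots; the fixed parts stay the same on every page."""
    return PAGE_TEMPLATE.substitute(
        heading=escape(heading),
        image_url=escape(image_url, quote=True),
        body=escape(body),
    )

page = render_page(
    "Renew your subscription",
    "/img/renew.png",
    "Your plan expires soon.",
)
print(page)
```

The template says where the heading goes, not what it says. The footer, by contrast, is identical on every page that uses the template.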

Layout has a subtle effect on how content is delivered and is accessed across different screens.  Elements that are obvious on some screen sizes may not be so on other screen sizes — for example, a list of related articles, or a cross-promotion.  Page templates must address how to make core information consistently available.  

Content Types

Content types indicate what kinds of information and messages audiences need to see to satisfy their goals.  The more specific the audience goal, the more specific the content type is likely to be. For example, many websites have an “article” content type that has only a few basic attributes, such as title, author and body.  Such types aren’t associated with any specific goal.  But a product profile on an e-commerce website will be much more specific, since different elements are important to satisfying user needs for them to decide to buy the product.  The more specific a content type, the more similar each screen of content based on it will seem, even though the specific messages and information will vary. Content types provide consistency in the kinds of information presented for a given scenario.

A content type is designed for a specific audience who has a specific goal. It specifies: to support this purpose, this information must be presented.  It answers: what elements of content need to be delivered here for this scenario?  One of the benefits of a content type is that it can provide options to show more details, fewer details, or different details, according to the audience and scenario.

Content types also encode business rules about the display of content. In doing so, they provide the logical structure of content.   If the content model already has defined the specifics of required information, it can pre-populate the information — enabling the reuse of content elements.  
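
To make the contrast between a generic and a specific content type concrete, here is a minimal sketch in Python. The `ContentType` class, the element names, and the sample draft are all hypothetical; the article attributes (title, author, body) are the only ones taken from the discussion above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContentType:
    """Names the elements a piece of content must or may include."""
    name: str
    required: frozenset[str]
    optional: frozenset[str] = frozenset()

    def missing(self, content: dict[str, str]) -> set[str]:
        """Return the required elements that are absent or empty."""
        return {key for key in self.required if not content.get(key)}

# A generic type with only a few basic attributes.
ARTICLE = ContentType("article", frozenset({"title", "author", "body"}))

# A more specific type: the elements a shopper needs before deciding to buy.
PRODUCT_PROFILE = ContentType(
    "product_profile",
    required=frozenset({"title", "price", "description", "availability"}),
    optional=frozenset({"reviews"}),
)

draft = {"title": "Trail shoe", "price": "89.00", "description": "Lightweight."}
print(PRODUCT_PROFILE.missing(draft))
```

The rule "this information must be presented" becomes something a system can check, rather than something each author has to remember.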

Content Models

Content models indicate the elements of content that are available to support different audiences and scenarios.  They specify the specific kinds of messages the publisher has planned to use across different content.  They specify the semantic structure of the content — or put more simply, how different content elements are related to each other in their meaning.

Content is built from various kinds of messages associated with different topics and having different roles, such as extended descriptions, instructions, calls-to-action, value propositions, admonitions, and illustrations.  The content model provides an overview of the different kinds of essential messages that are available to build different versions and variations of content.

In some respects, a content model is analogous to a site map.  A site map provides external audiences and systems a picture of the content published on a website.  A content model provides a map of the internal content resources that are available for publication.  But instead of representing a tree of web pages like a site map, the content model presents a constellation of “nodes” that indicate available information resources.  A node is a basic unit of content that is part of and connected to the larger structure of content.  Nodes correspond to content elements within published content: the units of content described within a pair of HTML tags.

Each node in a content model represents a distinct unit of content covering a discrete message or statement of information. Nodes are connected to other nodes elsewhere.  A node may be empty (authors can supply any message provided it relates to the expected meaning), or a node may be pre-populated with one or more values (indicating that the meaning will have a certain predefined message).  

Content models connect nodes by identifying the relationships between them: how one element relates to another.  They can show how different nodes are associated, such as what role one node has to another.  For example, one node could be part of another node because it is a detail relating to a larger topic.  The relationships provide pathways between different nodes of content.
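
The nodes and relationships described above can be sketched as a small graph. This is only an illustration under stated assumptions: the `Node` and `ContentModel` classes, the `part_of` relationship name, and the sample node ids are all hypothetical, not a description of any particular CMS.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    node_id: str
    role: str                    # e.g. "instruction" or "call-to-action"
    value: Optional[str] = None  # None means an empty node that authors fill in

@dataclass
class ContentModel:
    nodes: dict[str, Node] = field(default_factory=dict)
    # Relationships stored as (source id, relationship name, target id).
    relations: list[tuple[str, str, str]] = field(default_factory=list)

    def add(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def relate(self, source: str, relation: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both nodes must exist before they are related")
        self.relations.append((source, relation, target))

    def related(self, target: str, relation: str) -> list[Node]:
        """Nodes that point at `target` through the given relationship."""
        return [
            self.nodes[s]
            for s, r, t in self.relations
            if r == relation and t == target
        ]

model = ContentModel()
model.add(Node("renewal", "topic"))
model.add(Node("renewal-steps", "instruction"))                    # empty node
model.add(Node("renew-cta", "call-to-action", value="Renew now"))  # pre-populated
model.relate("renewal-steps", "part_of", "renewal")
model.relate("renew-cta", "part_of", "renewal")

parts = [node.node_id for node in model.related("renewal", "part_of")]
print(parts)
```

Following the `part_of` pathway from the topic node finds every detail that belongs to it, whether the node is empty or already holds a predefined message.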

Content models are more abstract than other approaches to structuring content, and can therefore be open to wider interpretation about what they do.  The content model represents perhaps the deepest level of content structure, capturing all reusable and variable content elements. 

No single model, template or design system

No single representation of content structure can effectively depict all its different aspects.  I haven’t seen any single view representation that supports the different kinds of design decisions required.  For example, wireframes mix together fixed structures defined by templates with dynamic structures associated with UI design.  When content is embedded within screen comps, it is hard to see which elements are fixed and which are fluid.  Single views promote a tunnel focus on a specific decision, but block visibility into larger considerations that may be involved.  I’ve seen various attempts to improve wireframes to make them more interactive and content-friendly, but the basic limitations remain.

Consider a simple content element: an alert that tells a customer that their subscription is expiring and that they need to submit new payment details.  UI design needs to consider how the alert is delivered where it is noticed but not annoying.  Content design needs to decide on whether to use an existing alert, or write a new one.  The template must decide where within a  page or screen the alert appears.  The content type will specify the rules triggering delivery of the alert: who gets it, and when. And the content model may hold variations of the alert, and their mappings to different content types that use them.  You need a better alert, but what do you need to change?  What should stay the same, so you don’t mess up other things you’ve worked hard to get right?
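
Two of those decisions, the delivery rule and the available variations, can be separated in a minimal sketch. The wording of the alerts, the day thresholds, and the `choose_alert` function are hypothetical values chosen only to illustrate the split.

```python
from datetime import date
from typing import Optional

# Content model: the available variations of the alert (hypothetical wording).
ALERT_VARIANTS = {
    "reminder": "Your subscription expires soon. Update your payment details.",
    "urgent": "Your subscription expires in a few days. Update your payment details now.",
}

# Content type rule: who gets the alert, and when (hypothetical thresholds, in days).
REMINDER_WINDOW = 14
URGENT_WINDOW = 3

def choose_alert(expires_on: date, today: date) -> Optional[str]:
    """Return the key of the alert variant to deliver, or None if no alert is due."""
    days_left = (expires_on - today).days
    if days_left < 0 or days_left > REMINDER_WINDOW:
        return None
    return "urgent" if days_left <= URGENT_WINDOW else "reminder"

variant = choose_alert(date(2030, 6, 10), today=date(2030, 6, 1))
if variant is not None:
    print(ALERT_VARIANTS[variant])
```

Rewriting the alert text changes only `ALERT_VARIANTS`. Changing who sees it, and when, changes only the rule.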

Such decisions require coordination; different people may be responsible for different aspects. Not only must decisions and tasks be coordinated across people, they must be coordinated across time.  Those involved need to be aware of past decisions, easily reuse these when appropriate, and be able to modify them when not.  Agility is important, but so is governance.

A benefit of content structure is that it can accelerate the creation and delivery of content.  The challenge of content structure is that it’s not one thing.  There are different approaches, and each has its own value to offer.   Web publishers have more tools than ever to solve specific problems. But they still need truly integrated platforms that help web teams coordinate different kinds of decisions relating to specifying and choosing content elements. 

— Michael Andrews

Where Domain Models Meet Content Models

A model is supposed to be a simplification of reality.  It is meant to help people understand and act.  But sometimes models do the opposite and cause confusion.  If the model becomes an artifact of its own, it can be hard to see its connection to what it is supposed to represent.  Over recent months, several people have raised questions on Twitter about the relationship between a domain model, a content model, and a data model.  We may also encounter the terms ontology or vocabulary, which are also models. With so many models out there, it’s small wonder that people might be confused.

From what I can see, no consensus yet exists on how to blend these perspectives in a way that’s both flexible and easy to understand.  I want to offer my thinking about how these different models are related.  

Part of the source of confusion is that all these models were developed by different parties to solve different problems.   Only recently has the content strategy community started to focus on how to integrate the different perspectives offered by these models.

The Different Purposes of Models

Domain models have a geeky pedigree.  They come from a software development approach known as domain-driven design (DDD).  DDD is an approach to developing applications rather than to publishing content.  It’s focused on tasks that a software application must support (behavior), and maps tasks to domain objects that are centered on entities (data).  The notion of a domain model was subsequently adopted by ontology engineers (people who design models of information).  Again, these ontology engineers weren’t focused on the needs of web publishers: they just wanted a way to define the relationship between different kinds of information to allow the information to be queried. From these highly technical origins, domain models attracted attention in the content strategy community as a tool to model the relationships of entities that will appear in one’s content.  The critical question is, so what?  What value does a domain model offer to online publishers?  This question can elicit different and sometimes fuzzy answers.  I’ll offer my perspective in a moment.

A content model sounds similar to a domain model, but the two are different.  A content model is an abstract picture of the elements that content creators must create, which are managed by a CMS.  When content strategists talk about structuring content, they are generally referring to the elements that comprise a content model.  Where a domain model is concerned with data or facts, a content model is concerned with expressive content — the text, images, videos and other material that audiences consume.  Compared with a domain model, a content model is more focused on the experience of audiences.  Unsurprisingly, content strategists talk about content models more than they talk about domain models.  

Content models can serve two roles: representing what the audience is interested in consuming, and representing how that content is managed.   The content model can become confusing when it tries to indicate both what the machine delivering content needs to know about and what the audience needs to see.

Regrettably, the design of CMSs has trained authors to think about content elements in a certain way.  Authors decompose text articles into chunks, presented as fields in a CMS.  The content model can start to look like a massive form, with many fields available to address different aspects of a topic or theme.  Not all fields will display in all scenarios, and fields may be shared across different views of content (hence rules are needed to direct what’s shown when). It may look like a data model.  But the content model doesn’t impose strict rules about what types of values are allowed for the fields.  The values of some fields are numbers, some are pick list values.  Many fields are multiple paragraphs of text representing thousands of characters.  Some fields are links to images, audio, or to videos.  Some fields may involve values that are phrases, such as the text used on a button.  While all these values are “data” in the sense of being ones and zeros, they don’t add up to a robust data model.  That’s one reason that many developers consider content as unstructured — the values of content defy any uniformity.  

A content model is not a solid foundation for a data model about the content. The structure represented in a content model is not semantic (machine intelligible) — contrary to the beliefs of many content strategists.  Creating a content model doesn’t make the content semantic.   Structured authoring helps authors plan how different pieces of content can fit together. But author-defined structures don’t mean anything to outside parties, and most machines won’t automatically understand what the chunks of content mean.  A content model can inform a schematic of the content’s architecture, such as what content is needed and from where it will be sourced (it could come from other systems, or even external sources).  That’s useful for internal purposes.  The content model is implemented with custom code.  

The primary value of content models is to guide editorial decisions.  The content model defines content types — distinct profiles of content that address specific user purposes and goals.  A content model can specify many details, such as a short and a long description to accommodate different kinds of devices, or alternative text for different audiences in different regions.   A detailed content model can help the content adapt to different contexts.  

Domain models are strong where content models are weak. Although domain models did not originally rely on metadata standards (e.g., in DDD), domain models increasingly have become synonymous with metadata vocabularies or ontologies.  Domain models define data models: how factual information is stored so it can be accessed. They supply one source of truth for information, in contrast to the many expressive variations represented in a content model.  Domain models represent the relationships of the data or information relating to a domain or broad subject area.  Domain models can be precise about the kinds of values expected.  Precise values are required in order to allow the information to be understood and reused in different contexts by different machines.  Because a domain model is based on metadata standards, the information can be used by different parties.  Content defined by a content model, in contrast, is primarily of use to the publisher only.   

The core value of a domain model is to represent entities — the key things discussed in content.  Metadata vocabularies define entity types that provide properties for all the important values that would provide important information.  Some entity types will reference other entity types.  For example, an event (entity type 1) takes place at a location (entity type 2).  The relationships between different entities are already defined by the vocabulary, which reduces the need for the publisher to set up special rules defining these relationships.  The domain model can suggest the kinds of information that authors need to include in content delivered to audiences.  In addition, the domain model can also support non-editorial uses of the information.  For example, it can provide information to a functional app on a smartphone.  Or it can provide factual information to bots or to search engines.  
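
As an illustration of one entity type referencing another, here is a minimal sketch that expresses an event and its location using the schema.org vocabulary. The `Event` and `Place` types and the `name`, `startDate`, `location`, and `address` properties come from schema.org. The Python classes and the sample values are hypothetical.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class Place:
    name: str
    address: str

    def to_jsonld(self) -> dict:
        return {"@type": "Place", "name": self.name, "address": self.address}

@dataclass(frozen=True)
class Event:
    name: str
    start_date: str   # ISO 8601 date or date-time
    location: Place   # entity type 1 references entity type 2

    def to_jsonld(self) -> dict:
        return {
            "@context": "https://schema.org",
            "@type": "Event",
            "name": self.name,
            "startDate": self.start_date,
            "location": self.location.to_jsonld(),
        }

# Sample values invented for the example.
event = Event(
    "Content Strategy Meetup",
    "2030-05-14T18:30",
    Place("City Library", "1 Main Street"),
)
markup = json.dumps(event.to_jsonld(), indent=2)
print(markup)
```

Because the property names come from a shared vocabulary rather than from the publisher, a search engine or calendar app can interpret the same record without custom rules.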

The Boundary between Domain and Content Models

What’s the boundary between a domain model and a content model?

A common issue I’ve noticed is that model makers try to use a content type to represent an entity type. Certain CMSs aren’t too clear about the difference between content types and entity types.  Be careful not to let your CMS force you to think in certain ways.

Let’s consider a common topic: events.  Some content strategists consider events as a distinct content type.  That would seem to imply the content model manages all the information relating to events. But an event is actually an entity type.  Metadata standards already define all the common properties associated with an event.  There’s little point replicating that information in the content model.  The event information may need to travel to many places: to a calendar on someone’s phone, in search results, as well as on the publisher’s website which has a special webpage for events.    But how the publisher wants to promote the event could still be productively represented in the content model.  The publisher needs to think about editorial elements associated with the event, such as images and calls-to-action.

Event content contains both structured editorial content, as well as structured metadata

The domain model represents what something is, while the content model can represent what is said or how it is said.  Let’s return to the all important call-to-action (CTA).  A CTA is a user action that is monitored in analytics.  The action itself can be represented as metadata — for example, there is a “buy action” in schema.org.  Publishers can use metadata to track what products are being bought according to the product’s properties, for example, color.  But the text on the buy button is part of the content model.  The CTA phrasing can be reused on different buttons.  The value of the content model is to facilitate the reuse of expressive content rather than the reuse of information.  Content models will change, as different elements gain or lose their mojo when presented to audiences.  The elements in a content model can be tested.  The domain model, centered on factual information, is far more stable.  The values may change, but the entities and properties in the model will rarely change.

When information is structured semantically with metadata standards, a database designed around a domain model can populate information used in content.  In such cases, the domain model supports the content model.  But in other cases, authors will be creating loosely structured information, such as long narrative texts that discuss information.  In these cases, authors can annotate the text to capture the core facts that should be included.  The annotation allows these facts to be reused later for different contexts.  

Over time, more editorial components are becoming formalized as structured data defined by metadata vocabulary standards.  As different publishers face similar needs and borrow from each other’s approaches, the element in the content model becomes a design pattern that’s widely used, and therefore a candidate for standardization.  For example, simple how-to instructions can be specified using metadata standards.

The Layered Cake

How domain models can support content models

One simple way to think about the two models is as layers of a cake.  The domain model is the base layer.  It manages the factual information that’s needed by the content and by machines for applications.  The content model is the layer above the domain model.  It manages all the relevant content assets (thumbnails, video trailers, diagrams, etc.), all the sections of copy (introductions, call outs, quotes, sidebars, etc.) and all the messaging (button text, alternative headlines, etc.).  On top of these layers is the icing on the cake: the presentation layer.  The presentation layer is not about the raw ingredients, or how the ingredients are cooked.  It’s about how the finished product looks.
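
The three layers can be sketched in a few lines of Python. This is a simplified illustration, not a recommended architecture: the dictionaries, their keys, the sample values, and the `render_card` function are all hypothetical.

```python
from html import escape

# Base layer (domain model): factual information. Hypothetical values.
event_facts = {
    "name": "Content Strategy Meetup",
    "startDate": "2030-05-14",
    "location": "City Library",
}

# Middle layer (content model): expressive elements, keyed by role.
event_content = {
    "headline": "Meet the people who structure content for a living",
    "cta_text": "Save my seat",
}

def render_card(facts: dict[str, str], content: dict[str, str]) -> str:
    """Presentation layer: decide how the facts and the messaging look together."""
    return (
        '<section class="event-card">'
        f"<h2>{escape(content['headline'])}</h2>"
        f"<p>{escape(facts['name'])}, {escape(facts['startDate'])}, "
        f"{escape(facts['location'])}</p>"
        f"<button>{escape(content['cta_text'])}</button>"
        "</section>"
    )

card = render_card(event_facts, event_content)

# Trying different button text changes the content layer only; the facts are untouched.
variant = render_card(event_facts, {**event_content, "cta_text": "Reserve a place"})
print(variant)
```

The button text can be tested and swapped without touching the event's date or location, and the same facts could feed a calendar entry that never uses the editorial layer at all.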

The distinctions I’ve made between the domain model and content model may not align with how your content management systems are set up.  But such decoupling of data and content is becoming more common.   If factual information is kept separate from expressive content, publishers can gain more flexibility when configuring how they deliver content and information to audiences.

— Michael Andrews