
Orchestrating the assembly of content

Structured content enables online publishers to assemble pieces of content in multiple ways.  However, the process by which this assembly happens can be opaque to authors and designers. Read on to learn how orchestration is evolving and how it works.

To many people, orchestration sounds like jargon or a marketing buzzword. Yet orchestration is no gimmick. It is increasingly vital to developing, managing, and delivering online content. It transforms how publishers make decisions about content, bringing flexibility and learning to a process hampered in the past by short-term planning and jumbled, ad-hoc decisions.  

Revealing the hidden hand of orchestration

Orchestration is both a technical term in content management and a metaphor. Before discussing the technical aspects of orchestration, let’s consider the metaphor.  Orchestration in music is how you translate a tune into a score that involves multiple instruments that play together harmoniously. It’s done by someone referred to as an arranger, someone like Quincy Jones. As the New Yorker once wrote: “Everyone knows Quincy Jones’s name, even if no one is quite sure what he does. Jones got his start in the late nineteen-forties as a trumpeter, but he soon mastered the art of arranging jazz—turning tunes and melodies into written music for jazz ensembles.”

Much like music arranging, content orchestration happens off stage, away from the spotlight. It doesn’t get the attention given to UI design. Despite its stealthy profile, numerous employees in organizations become involved with orchestration, often through small-scale A/B testing by changing an image or a headline. 

Orchestration typically focuses on minor tweaks to content, often cosmetic changes. But orchestration can also address how to assemble content on a bigger scale. The emergence of structured content makes intricate, highly customized orchestration possible.

Content assembly requires design and a strategy. Few people consider orchestration when planning how content is delivered to customers. They generally plan content assembly by focusing on building individual screens or a collection of web pages on a website. The UI design dictates the assembly logic and reflects choices made at a specific time.  While the logic can change, it tends to happen only in conjunction with changes to the UI design. 

Orchestration allows publishers to specify content assembly independently of its layout and presentation. It does so by approaching the assembly process abstractly: evaluating the roles and purposes of content pieces in addressing specific user scenarios.

Assembly logic is becoming distributed. Content assembly logic doesn’t happen in one place anymore. Originally, web teams created content for assembly into web pages using templates defined by a CMS on the backend. In the early 2000s, frontend developers devised ways to change the content of web pages presented in the browser using an approach known initially as Ajax, a term coined by the information architect Jesse James Garrett. Today, content assembly can happen at any stage and in any place. 

Assembly is becoming more sophisticated. At first, publishers focused on selecting the right web page to deliver. The pages were preassembled – often hand-assembled. Next, the focus shifted to showing or hiding parts of that web page by manipulating the DOM (document object model).  

Nowadays, content is much more dynamic. Many web pages, especially in e-commerce, are generated programmatically and have no permanent existence.  “Single page applications” (SPAs) have become popular, and their content morphs continuously. 

The need for sophisticated approaches for assembling content has grown with the emergence of API-accessible structured content. When content is defined semantically, rather than as web pages, the content units are more granular. Instead of simply matching a couple of web page characteristics, such as a category tag and a date, publishers now have many more parameters to consider when deciding what to deliver to a user.

Orchestration logic is becoming decoupled from applications. While orchestration can occur within a CMS platform, it is increasingly happening outside the CMS to take advantage of a broader range of resources and capabilities. With APIs playing a growing role in coordinating web content, much content assembly now occurs in a middle layer between the back-end storing the content and the front-end presenting it. The logic driving assembly is becoming decoupled from both the back-end and front-end. 

Publishers have a growing range of options outside their CMS for deciding what content to deliver.  Tools include:

  • Digital experience, composition, and personalization orchestration engines (e.g., Conscia, Ninetailed)
  • Graph query tools (e.g., PoolParty)
  • API federation management tools (e.g., Apollo Federation)

These options vary in their aims and motivations, and they differ in their implementations and features. Their capabilities are sometimes complementary, which means they can be used in combination. 

Orchestration inputs that frame the content’s context

Content structuring supports extensive variation in the types of content to present and what that content says. 

Orchestration involves more than retrieving a predefined web page.  It requires considering many kinds of inputs to deliver the correct details. 

Content orchestration will reflect three kinds of considerations:

  1. The content’s intent – the purpose of each content piece
  2. The organization’s operational readiness to satisfy a customer’s need
  3. The customer or user’s intent – their immediate or longer-term goal

Content characteristics play a significant role in assembly. Content characteristics define variations among and within content items. An orchestration layer will account for characteristics of available content pieces, such as:

  • Editorial role and purpose, such as headings, explanations, or calls to action
  • Topics and themes, including specific products or services addressed
  • Intended audience or customer segment
  • Knowledge level such as beginner or expert
  • Intended journey or task stage
  • Language and locale
  • Date of creation or updating
  • Author or source
  • Size, length, or dimensions
  • Format and media
  • Campaign or announcement cycle
  • Product or business unit owner
  • Location information, such as cities or regions that are relevant or mentioned
  • Version 

Each of these characteristics can serve as a useful variable when deciding what content to assemble. They indicate the compatibility between pieces and their suitability for specific contexts.
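
To make these characteristics concrete, here is a minimal sketch, in TypeScript, of the kind of metadata record an orchestration layer might read for each content piece. The field names are illustrative assumptions, not drawn from any particular CMS.

// Illustrative metadata for a content piece, as an orchestration layer might see it.
// Field names are hypothetical, not tied to any specific CMS schema.
interface ContentPieceCharacteristics {
  editorialRole: "heading" | "explanation" | "callToAction"; // editorial role and purpose
  topics: string[];                 // topics and themes, including products or services addressed
  audience?: string;                // intended audience or customer segment
  knowledgeLevel?: "beginner" | "intermediate" | "expert";
  journeyStage?: string;            // intended journey or task stage
  locale: string;                   // language and locale, e.g., "en-US"
  updatedAt: Date;                  // date of creation or last update
  author?: string;                  // author or source
  format: "text" | "image" | "video" | "audio";
  campaign?: string;                // campaign or announcement cycle
  owner?: string;                   // product or business unit owner
  locations?: string[];             // cities or regions that are relevant or mentioned
  version?: string;
}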

Other information in the enterprise IT ecosystem can help decide what content to assemble that will be most relevant for a specific context of use. This information is external to the content but relevant to its assembly.

Business data is also an orchestration input. Content addresses something a business offers. The assembled content should link to business operations to reflect what’s available accurately.

The assembled content will be contextually relevant only if the business can deliver to the customer the product or services that the content addresses. Customers want to know which pharmacy branches are open now or which items are available for delivery overnight.  The assembled content must reflect what the business can deliver when the customer seeks it.

The orchestration needs to combine content characteristics from the CMS with business data managed by other IT systems. Many factors can influence what content should be presented, such as:

  • Inventory management data
  • Bookings and orders data
  • Facilities’ capacity or availability
  • Location hours
  • Pricing information, promotions, and discount rules
  • Service level agreement (SLA) rules
  • Fulfillment status data
  • Event or appointment schedules
  • Campaigns and promotions schedule
  • Enterprise taxonomy structure defining products and operating units

Business data are governed by complex rules managed by the IT system of record, not the CMS or the orchestration layer.  For content orchestration, sometimes it is only necessary for that system to provide a “flag” indicating whether a condition is satisfied, which determines which content option to show.
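
As a hedged illustration, the TypeScript sketch below shows how an orchestration layer might consult a business system only for a yes/no flag and then pick between two prepared content options. The endpoint and function names are assumptions for this example.

// Ask the system of record whether a branch is open right now.
// The orchestration layer never sees the underlying scheduling rules, only the flag.
async function isBranchOpenNow(branchId: string): Promise<boolean> {
  const response = await fetch(`https://ops.example.com/branches/${branchId}/open`); // illustrative endpoint
  const data = await response.json();
  return Boolean(data.open);
}

// Use the flag to decide which content option to show.
async function selectBranchMessage(branchId: string): Promise<string> {
  const open = await isBranchOpenNow(branchId);
  return open
    ? "This branch is open now."
    : "This branch is currently closed. Items can be delivered overnight instead.";
}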

Customer context is the third kind of orchestration input. Ideally, the publisher will tailor the content to the customer’s needs – the aim of personalization.  The orchestration process must draw upon relevant known information about the customer: the customer’s context.

The customer context encompasses their identity and their circumstances. A customer’s circumstances can change, sometimes in a short time.  And in some situations, the customer’s circumstances dictate the customer’s identity. People can have multiple identities, for example, as consumers, business customers at work, or parents overseeing decisions made by their children.

Numerous dimensions will influence a customer’s opinions and needs, which in turn will influence the most appropriate content to assemble. Some common customer dimensions include:

  • Their location
  • Their personal characteristics, which might include their age, gender, and household composition, especially when these factors directly influence the relevance of the content, for example, with some health topics
  • Things they own, such as property or possessions, especially for content relating to the maintenance, insurance, or buying and selling of owned things
  • Their profession or job role, especially for content focused on business and professional audiences
  • Their status as a new, loyal, or churned customer
  • Their purchase and support history

The chief challenge in establishing the customer context is having solid insights.  Customers’ interactions on social media and with customer care provide some insights, but publishers can tap a more extensive information store.  Various sources of customer data could be available:

  • Self-disclosed information and preferences to the business (zero-party data or 0PD)
  • The history of a customer’s interactions with the business (first-party data or 1PD) 
  • Things customers have disclosed about themselves in other channels such as social media or survey firms (second-party data or 2PD)
  • Information about a cohort they are categorized as belonging to, using aggregated data originating from multiple sources (third-party data or 3PD)

Much of this information will be stored in a customer data platform (CDP), but other data will be sourced from various systems.  The data is valid only to the extent it is up-to-date and accurate, which is not always a safe assumption.
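
A rough sketch of how these sources might be merged into one customer-context record follows (TypeScript; the grouping and field names are assumptions for illustration):

// Customer context federated from multiple sources; comments note each block's origin.
interface CustomerContext {
  // Zero-party data: self-disclosed information and preferences
  preferences?: { newsletterTopics?: string[]; preferredLocale?: string };
  // First-party data: the customer's interaction history with the business
  purchaseHistory?: { sku: string; purchasedAt: Date }[];
  supportInteractions?: number;
  // Second-party data: disclosed in other channels, such as social media or surveys
  declaredInterests?: string[];
  // Third-party data: cohort-level attributes from aggregated sources
  cohortSegment?: string;
  // Staleness matters: old or inaccurate data is not a safe basis for decisions
  lastRefreshedAt: Date;
}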

Content behavior can shape the timing and details assembled in orchestration. Users signal their intent through the decisions they make while interacting with content.  Some behavioral variables include:

  • Source of referral 
  • Previously viewed content 
  • Expressed interest in topics or themes based on prior content consumed
  • Frequency of repeat visits 
  • Search terms used 
  • Chatbot queries submitted
  • Subscriptions chosen or online events booked
  • Downloads or requests for follow-up information
  • The timing of their visit in relation to an offer 

The most valuable and reliable signals will be specific to the context. Many factors can shape intent, so many potential factors will not be relevant to individual customers. Just because some factors could be relevant in certain cases does not imply they will be applicable in most cases. 

Though challenging, leveraging customer intent offers many opportunities to improve the relevance of content. A rich range of possible dimensions is available. Selecting the right ones can make a difference. 

Don’t rely on weak signals to overdetermine intent. When the details about individual content behavior or motivations are scant, publishers sometimes rely on collective behavioral data to predict individual customer intentions.  While occasionally useful, predictive inputs about customers can be based on faulty assumptions that yield uneven results. 

Note the difference between tailoring content to match an individual’s needs and the practice of targeting. Targeting differs from personalization because it aims to increase average uptake rather than satisfy individual goals. It can risk alienating customers who don’t want the proffered content.

Draw upon diverse sources of input. By utilizing a separate layer to manage orchestration, publishers, in effect, create a virtual data tier that can federate and assess many distinct and independent sources of information to support decisions relating to content delivery. 

An orchestration layer gives publishers greater control over choosing the right pieces of content to offer in different situations. Publishers gain direct control over parameters to select,  unlike many AI-powered “decision engines” that operate like a black box and assume control over the content chosen.
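
As a rough illustration of such a federated decision, the TypeScript sketch below combines content characteristics, a business flag, and customer context to narrow the candidate pieces. The types and the filtering logic are deliberately simplistic assumptions, not a production design.

// Minimal local types; a real system would federate these from the CMS,
// business systems of record, and a customer data platform.
type Piece = { id: string; audience?: string; journeyStage?: string; locale: string };
type BusinessFlags = { offerAvailable: boolean };
type Customer = { segment?: string; journeyStage?: string; locale: string };

// Keep only the pieces that fit the customer and the business situation.
function selectCandidatePieces(pieces: Piece[], flags: BusinessFlags, customer: Customer): Piece[] {
  return pieces.filter((piece) => {
    if (!flags.offerAvailable && piece.journeyStage === "purchase") return false; // don't promote what can't be delivered
    if (piece.audience && customer.segment && piece.audience !== customer.segment) return false; // respect segment targeting
    return piece.locale === customer.locale; // match language and locale
  });
}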

The orchestration score

If the inputs are the notes in orchestration, the score is how they are put together – the arrangement. A rich arrangement will sometimes be simple but often will be sophisticated. 

Orchestration goes beyond web search and retrieval. In contrast to an ordinary web search, which retrieves a list of relevant web pages, orchestration queries must address many more dimensions. 

In a web search, there’s a close relationship between what is requested and what is retrieved. Typically, only a few terms need matching. Web search queries are often loose, and the results can be hit or miss. The user is both specifying and deciding what they want from the results retrieved.

In orchestration, what is requested needs to anticipate what will be relevant and exclude what won’t be. The request may refer to metadata values or data parameters that aren’t presented in the content that’s retrieved. The results must be more precise. The user will have limited direct input into the request for assembled content and limited ability to change what is provided to them.

Unlike a one-shot web search, content assembly in orchestration is a multistage process.  

The orchestration of structured content is not just choosing a specific web page based on a particular content type.  It differs in two ways:

  1. You may be combining details from two (or more) content types.  
  2. Instead of delivering a complete web page associated with each content type (and potentially needing to hide parts you don’t want to show), you select specific details from content items to deliver as an iterative procedure.

Unpacking the orchestration process. Content orchestration consists of three stages:

  1. FIND stage: Choose which content items have relevant material to support a user scenario
  2. MATCH stage: Combine content types that, if presented together, provide a meaningful, relevant experience
  3. SELECT and RETURN stage: Choose which elements within the content items will be most relevant to deliver to a user at a given point in time
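
Expressed as a rough pipeline, the three stages might have the following shape (TypeScript signatures only; the type names are placeholders for this sketch):

// Placeholder types for the sketch.
type ContentItem = Record<string, unknown>;
type ContentBundle = { primary: ContentItem; companions: ContentItem[] };
type ContentElement = { field: string; value: unknown };

declare function findItems(criteria: Record<string, unknown>): ContentItem[];                  // 1. FIND
declare function matchCompanions(items: ContentItem[]): ContentBundle[];                       // 2. MATCH
declare function selectElements(bundles: ContentBundle[], context: unknown): ContentElement[]; // 3. SELECT and RETURN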

Find relevant content items. Generally, this involves searching metadata tags such as taxonomy terms or specific values such as dates. Sometimes, specific words in text values are sought. If we have content about events, and all the event descriptions have a field with the date, it is a simple query to retrieve descriptions for events during a specified time period.
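
For instance, the FIND stage for the events example might look like this sketch, with a simple in-memory filter standing in for a CMS query (the field names are assumptions):

// Retrieve event descriptions whose date falls within a specified period.
type EventDescription = { title: string; date: Date; city: string };

function findEvents(events: EventDescription[], from: Date, to: Date): EventDescription[] {
  return events.filter(
    (event) => event.date.getTime() >= from.getTime() && event.date.getTime() <= to.getTime()
  );
}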

Typically, a primary content type will provide most of the essential information or messages. However, we’ll often also want to draw on information and messages from other content types to compose a content experience. We must associate different types of items to be able to combine their details.

Match companion content types. What other topics or themes will provide more context to a message? The role of matching is to associate related topics or tasks so that complementary information and messages can be included together.

Graph queries are a powerful approach to matching because they allow one to query “edges” (relationships) between “nodes” (content types).  For example, if we know a customer is located in a specific city, we might want to generate a list of sponsors of events happening in that city.  The event description will have a field indicating the city. It will also reference another content type that provides a profile of event sponsors.  It might look like this in a graph query language like GQL, with the content types in round brackets and the relationships in square brackets.

MATCH (:Event WHERE location = "My City")-[:SponsoredBy]->(:SponsorProfile)

We have filtered events in the customer’s city (fictitiously named My City) and associated content items about sponsors who have sponsored those events. Note that this query hasn’t indicated what details to present to users. It only identifies which content types would be relevant so that various types of details can be combined. 

Unlike in a common database query, what we are looking for and want to show are not the same. 

Select which details to assemble. We need to decide which details within a relevant content type will be of greatest interest to a user. Customers want enough details for the pieces to provide meaningful context. Yet they probably won’t want to see everything available, especially all at once – that’s the old approach of delivering preassembled web pages and expecting users to hunt for relevant information themselves.

Different users will want different details, necessitating decisions about which details to show. This stage is sometimes referred to as experience composition because the focus is on which content elements to deliver. We don’t have to worry about how these elements will appear on a screen, but we will be thinking about what specific details should be offered.

GraphQL, a query language used in APIs, is very direct in allowing you to specify what details to show. The GraphQL query mirrors the structure of the content so that one can decide which fields to show after seeing which fields are available. We don’t want to show everything about a sponsor, just their name, logo, and how long they’ve been sponsoring the event.  A hypothetical query named “local sponsor highlights” would extract only those details about the sponsor we want to provide in a specific content experience.

query LocalSponsorHighlights {
  # "sponsors" is a hypothetical field assumed to return the matched event's sponsors
  sponsors {
    ... on SponsorProfile {
      name
      logo
      sponsorSince
    }
  }
}

The process of pulling out specific details will be repeated iteratively as customers interact with the content.

Turning visions into versions

Now that we have covered the structure and process of orchestration, let’s look at its planning and design. Publishers enjoy a broad scope for orchestrating content. They need a vision for what they aim to accomplish. They’ll want to move beyond the ad hoc orchestration of page-level optimization and develop a scenario-driven approach to orchestration that’s repeatable and scalable.

Consider what the content needs to accomplish. Content can have a range of goals. It can explicitly encourage a reader to do something immediately or in the future. Or it can implicitly encourage behavior by showing goodwill and being helpful enough that the customer wants to act without being told what to do.

Content goal | Immediate (Action outcome) | Consequent (Stage outcome)
Explicit (stated in the content) | CTA (call to action) conversion | Contact sales or visit a retail outlet
Implicit (encouraged by the content) | Resolve an issue without contacting customer support | Renew their subscription

Content goals must be congruent with the customer’s context. If customers have an immediate goal, then the content should be action-oriented. If their goal is longer-term, the content should focus on helping the customer move from one stage to another.

Orchestration will generate a version of the content representing the vision of what the pieces working together aim to accomplish.

Specify the context.  Break down the scenario and identify which contextual dimensions are most critical to providing the right content. The content should adapt to the user context, reflect the business context, and provide users with viable options. The context includes:

  • Who is seeking content (the segment, especially when the content is tailored for new or existing customers, or businesses or consumers, for example)
  • What they are seeking (topics, questions, requests, formats, and media)
  • When they are seeking it (time of day, day of the week, month, season, or holiday, all can be relevant)
  • Where they are seeking it (region, country, city, or facility such as an airport if relevant)
  • Why (their goal or intent as far as can be determined)
  • How (where they started their journey, channels used, how long they have been pursuing the task)
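
One way to make the scenario explicit is to capture these dimensions in a single context object, as in this hedged sketch (TypeScript; field names are illustrative):

// The who/what/when/where/why/how of a content request, gathered for orchestration.
interface RequestContext {
  who: { segment: string; isExistingCustomer?: boolean };                 // who is seeking content
  what: { topics: string[]; format?: string };                            // what they are seeking
  when: { requestedAt: Date; seasonOrHoliday?: string };                  // when they are seeking it
  where?: { country?: string; city?: string; facility?: string };         // where they are seeking it
  why?: string;                                                           // their goal or intent, as far as known
  how?: { referrer?: string; channel?: string; minutesOnTask?: number };  // how they got here and how long they have been at it
}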

Perfecting the performance: testing and learning

Leonard Bernstein conducts the New York Philharmonic in a Young People’s Concert. Image: Library of Congress


An orchestral performance is perfected through rehearsal. The performance realized is a byproduct of practice and improvisation.

Pick the correct parameters. With hundreds of parameters that could influence the optimal content orchestration, it is essential that teams not lock themselves into a few narrow ones. The learning will arise from determining which factors deliver the right experience and results in which circumstances. 

Content parameters can be of two kinds:

  1. Necessary characteristics tell us what values are required 
  2. Contingency characteristics indicate values to try in order to find which ones work best

Parameter kind | Specifies in the orchestration | Determines in the content | Outcome expected
Necessary characteristics (tightly defined scenarios) | What values are required in a given situation | Which categorical version or option the customer gets | The right details to show to a given customer in a given situation
Contingency characteristics (loosely defined scenarios) | What values are allowed in a given situation | Which versions could be presented | Candidate options to present to learn which most effectively matches the customer’s needs

The two approaches are not mutually exclusive. More complex orchestration (sometimes referred to as “multihop” queries) will involve a combination of both approaches.

Necessary characteristics reflect known and fixed attributes of the customer or business context that determine the correct content to show. For example, if the customer has a particular characteristic, then a specific content value must be shown. The goal should be to test that the orchestration is working correctly – that the assumptions about the context are correct, with none wrong or missing. This dimension is especially important for aspects that are fixed and non-negotiable. The content needs to adapt to these circumstances, not ignore them. 

Contingency characteristics reflect uncertain or changeable attributes relating to the customer’s context. For example, if the customer has had any one of several characteristics now or in the past, try showing any one of several available content values to see which works best given what’s known. The orchestration will prioritize variations randomly or in some ranked order based on what’s available to address the situation.

You can apply the approach to other situations involving uncertainty. When there are information gaps or delays, contingency characteristics can apply to business operations variables and to the content itself.  The goal of using contingency characteristics is to try different content versions to learn what’s most effective in various scenarios.  
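
A simplified TypeScript sketch of how the two kinds of parameters might be expressed in an orchestration rule follows; the names and the random selection are illustrative assumptions:

// Necessary characteristics: hard requirements a variant must satisfy.
// Contingency characteristics: allowed values to experiment with.
interface OrchestrationRule {
  required: { locale: string; audience: string };                              // tightly defined scenario
  candidates: { heroImage: string[]; headlineTone: ("formal" | "playful")[] }; // loosely defined, to be tested
}

function pickVariant(rule: OrchestrationRule): { heroImage: string; headlineTone: string } {
  // Required values act as filters elsewhere; here we only choose among allowed candidates,
  // randomly in this sketch, though a ranked order would also work.
  const randomFrom = <T>(values: T[]): T => values[Math.floor(Math.random() * values.length)];
  return {
    heroImage: randomFrom(rule.candidates.heroImage),
    headlineTone: randomFrom(rule.candidates.headlineTone),
  };
}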

Be clear on what content can influence. We have mostly looked at the customer’s context as an input into orchestration. Customers will vary widely in their goals, interests, abilities, and behaviors. A large part of orchestration concerns adapting content to the customer’s context. But how does orchestration impact the customer? In what ways might the customer’s context be the outcome of the content?

Consider how orchestration supports a shift in the customer’s context. Orchestration can’t change the fixed characteristics of the customer. It can sway ephemeral characteristics, especially content choices, such as whether the customer has requested further information.  And the content may guide customers toward a different context. 

Context shifting involves using content to meet customers where they are so they can get where they want to be. Much content exists to change the customer’s context by enabling them to resolve a problem or encouraging them to take action on something that will improve their situation. 

The orchestration of content needs to connect to immediate and downstream outcomes.  Testing orchestration entails looking at its effects on online content behavior and how it influences interactions with the business in other areas. Some of these interactions will happen offline.  

The task of business analytics is to connect orchestration outputs with customer outcomes. The migration of orchestration to an API layer should open more possibilities for insights and learning. 

– Michael Andrews


Content models: lessons from LEGO

Content models can be either a friend or foe.  A content model can empower content to become a modular and flexible resource when developed appropriately. Models can liberate how you develop content and accelerate your momentum.

However, if the model isn’t developed correctly, it can become a barrier to gaining control over your content.  The model ends up being hard to understand, the cause of delay, and eventually an albatross preventing you from pivoting later.  

Models are supposed to be the solution, not the problem

Content modeling can be challenging. Those involved with content modeling have likely heard stories about teams wrestling with their content model because it was too complex and difficult to implement. 

Some common concerns about content modeling include:

  • There’s not enough time to develop a proper content model.
  • We don’t know all our requirements yet.
  • People don’t understand why we need a content model or how to develop one.
  • Most of the content model doesn’t seem relevant to an individual writer.
  • Someone else should figure this out.

These concerns reflect an expectation that a content model is “one big thing” that needs to be sorted out all at once in the correct way, what might be called the monolithic school of content modeling. 

Rather than treat content models as monolithic plans, it is more helpful to think of them as behaving like LEGO. They should support the configuration of content in multiple ways.

Yet, many content models are designed to be monolithic. They impose a rigid structure on authors and prevent organizations from addressing a range of needs.  They become the source of stress because how they are designed is brittle.

In an earlier post, I briefly explored how LEGO’s design supports modularity through what’s called “clutch power.” LEGO offers lessons about bringing modularity to content models. Contrary to what some believe, content models don’t automatically make content modular, especially when they are missing clutch power. But it’s true that content models can enable modularity. The value of a content model depends on its implementation. 

A complex model won’t simplify content delivery.  Some folks mistakenly think that the content model can encapsulate complexity that can then be hidden from authors, freeing them from the burdens of details and effort. That’s true only to a point.  When the model gets too complex for authors to understand and when it can’t easily be changed to address new needs, its ability to manage details deteriorates. The model imposes its will on authors rather than responding to the desire of authors.

The trick is to make model-making a modular process instead of a top-down, “here’s the spec, take it or leave it” approach. 

Don’t pre-configure your content

LEGO pioneered the toy category of multipurpose bricks. But over time, they have promoted the sale of numerous single-purpose kits, such as one for a typewriter that will “bring a touch of nostalgia to your home office.”  For $249.99, buyers get the gratification of knowing exactly what they will assemble before opening the package.  But they never experience the freedom of creating their own construction.

LEGO typewriter, 2079 pieces (screenshot)

The single-purpose kits contain numerous single-purpose bricks. The kit allows you to build one thing only. Certain pieces, such as the typewriter keys and a black and red ink spool ribbon, aren’t useful for other applications. When the meme gets stale, the bricks are no longer useful.

One of the most persistent mistakes in design is building for today’s problems with no forethought as to how your solution will accommodate tomorrow’s needs. 

Many content models are designed to be single-purpose.  They reflect a design frozen at the moment the model was conceived – often by an outside party such as a vendor or committee. Authors can’t change what they are permitted to create, so to avoid overly constraining authors, the model is often left vague and unhelpful. Such models miss out on the power of true modularity. 

When you keep your model fluid, you can test and learn.

Make sure you can take apart your model (and change it)

Bent Flyvbjerg and Dan Gardner recently published a book on the success or failure of large projects called How Big Things Get Done.  They contrast viewing projects as “one big thing” versus “many small things.”

Flyvbjerg and Gardner cite LEGO as a model for how to approach large projects. They say: “Get a small thing, a basic building block. Combine it with another and another until you have what you need.”

The textbook example they cite of designing “one big thing” all at once is a nuclear power plant.  

Flyvbjerg and Gardner comment: “You can’t build a nuclear power plant quickly, run it for a while, see what works and what doesn’t, then change the design to incorporate the lessons learned. It’s too expensive and dangerous.”

Building a content model can seem like designing a nuclear power plant: a big project with lots of details that can go wrong. Like designing a nuclear power plant, building a content model is not something most of us do very often.  Most large content models are developed in response to a major IT re-platforming.  Since re-platformings are infrequent, most folks don’t have experience building content models, and when a re-platform is scheduled, they are unprepared. 

Flyvbjerg and Gardner note that “Lacking experimentation and experience, what you learn as you proceed is that the project is more difficult and costly than you expected.” They note an unwanted nuclear power plant costs billions to decommission.  Similarly, a content model developed all at once in a short period will have many weaknesses and limitations. Some content types will be too simple, others will be too complex.  Some will ultimately prove unnecessary, while others will be overlooked.  In the rush to deliver a project, it can be difficult even to reflect on how various decisions impact the overall project.

You will make mistakes with your initial content model and will learn what best supports authoring or frontend interaction needs, enables agile management of core business messages and data, or connects to other IT systems.  And these requirements will change too. 

Make sure you can change your mind.  Don’t let your IT team or vendor tell you the model is set in stone.  While gratuitous changes should be avoided – they generate superfluous work for everyone – legitimate revisions to a content model should be expected.  Technical teams need to develop processes that can address the need to add or remove elements in content types, create new types, and split or merge types. 

Allow authors to construct a range of outputs from basic pieces

In LEGO, the color of bricks gives them expressive potential. Even bricks of the same size and shape can be combined to render an experience.  I recently visited an exhibit of LEGO creations at the National Building Museum in Washington, DC.  

Mona Lisa in LEGO by Warren Elsmore at National Building Museum

Much online content is developed iteratively.  Writers pull together some existing text, modify or update it, add some images, and publish it.  Publishing is often a mixture of combining something old with something new.  A good content model will facilitate that process of composition. It will allow authors to retrieve parts they need, develop parts they don’t yet have, and combine them into a content item that people read, watch, or listen to.

Content modeling is fundamentally about developing an editorial schema.  Elements in content types represent the points authors want to make to audiences. The content types represent larger themes. 

A content model developed without the input of authors isn’t going to be successful. Ideally, authors will directly participate in the content modeling process. Defining a content type does not require any software expertise, and at least one CMS has a UI that allows non-technical users to create content types. However, it remains true that modeling is not easy to understand quickly for those new to the activity. I’m encouraged by a few vendor initiatives that incorporate AI and visualization to generate a model that could be edited. But vendors need to do more work to empower end-users so they can take more control over the content model they must rely on to create content. 

Keep logic out of your model 

LEGO is most brilliant when users supply the creativity rather than rely on instructions from the LEGO corporation.  

The DIY ethos of LEGO is celebrated on the Rebrickable website, where LEGO enthusiasts swap concepts for LEGO creations.

Rebrickable website (screenshot)

Rebrickable illustrates the principle of decoupling content from its assembly. LEGO bricks have an independent existence from assembly instructions. Those who develop plans for assembling LEGO are not the same people who created the bricks.  Bricks can be reused to be assembled in many ways – including ways not foreseen. Rebrickable celebrates the flexibility of LEGO bricks.

A content model should not contain logic. Logic acts like glue that binds pieces together instead of allowing them to be configured in different ways. Decouple your content from any logic used to deliver that content. Logic in content models gets in the way: it complicates the model, making it difficult for authors (and even developers) to understand, and it makes the model less flexible.  

Many Web CMS and XML content models mix together the structure of types with templating or assembly logic. When the model includes assembly instructions, it has already predetermined how modules are supposed to fit together, therefore precluding other ways they might connect. Content models for headless CMS implementations, in contrast, define content structure in terms of fields that can be accessed by APIs that can assemble content in any manner.  

A brittle content model that can’t be modified is also expensive.  Many publishers are saddled with legacy content models that are difficult to change or migrate. They hate what they have but are afraid to decommission it because they are unsure how it was put together. They are unable to migrate off of a solution someone built for them long ago that doesn’t meet their needs.  This phenomenon is referred to as “lock-in.”

A flexible content model will focus only on the content, not how to assemble the content. It won’t embed “views” or navigation or other details that dictate how the content must be delivered. When the model is focused solely on the content, the content is truly modular and portable.  

Focus on basics and learn

LEGO encourages play.  People don’t worry about assembling LEGO bricks the wrong way because there are many valid ways to connect them. There’s always the possibility of changing your mind.

Each LEGO brick is simple.  But by trying combinations over time, the range of possibilities grows.  As Flyvbjerg and Gardner say in their book, “Repetition is the genius of modularity; it enables experimentation.”

Start your model with something basic. An author biography content type would be a good candidate. You can use it anytime you need to provide a short profile of an author. It seems simple enough, too.  A name, a photo, and a paragraph description might be all you need.  Congratulations, you have created a reusable content type and are on the path toward content modularity.
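
For instance, a first pass at the type might look like this minimal sketch (expressed here as a TypeScript interface; a CMS would capture the same fields in its own schema UI):

// A minimal "author biography" content type.
interface AuthorBio {
  name: string;
  photoAssetId: string; // reference to a media asset managed elsewhere
  bio: string;          // one-paragraph description
}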

Over time, you realize there are other nuances to consider. Your website also features presenters at live events.  Is a presenter biography different from an author biography?  Someone in IT suggests that the author bio can be prepopulated with information from the employee directory, which includes the employee’s job title. The HR department wants to run profiles of employees on their hiring blog and wonders if the employee profile would be like the author bio.  

As new requirements and requests emerge, you start to consider the variations and overlaps among them. You might try to consolidate variations into a single content type, extend a core content type to handle specialized needs, or decide that consolidating everything into a single content type wouldn’t simplify things at all. Only through experimentation and learning will the correct choice become apparent.
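
To make the trade-off concrete, here are two hypothetical directions, sketched in the same style as the author-bio example; neither is the “right” answer without experimentation:

// Assumed starting point from the earlier sketch.
interface AuthorBio {
  name: string;
  photoAssetId: string;
  bio: string;
}

// Option 1: extend the core type for a specialized need.
interface PresenterBio extends AuthorBio {
  upcomingSessionTitle?: string;
}

// Option 2: consolidate into one richer type with optional fields.
interface PersonProfile {
  name: string;
  photoAssetId?: string;
  bio: string;
  jobTitle?: string;     // could be prepopulated from the employee directory
  isPresenter?: boolean;
}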

It’s a good idea to document your decisions and share what you’ve learned, including with outside teams who might be curious about what you are working on – encourage them to steal your ideas.  You can revisit your rationale when you are faced with new requirements to evaluate.

Evolve your model

Embrace the LEGO mindset of starting small and thinking big.

The more experience your team gains with content modeling, the more comprehensive and capable your content model will become.

Flyvbjerg and Gardner note: “Repetition also generates experience, making your performance better. This is called ‘positive learning.’”  They contrast it with negative learning, where “the more you learn, the more difficult and costly it gets.” When the model starts off complex, it gets too big to understand and manage. Teams may realize only one person ever understood how the model was put together, and that person moved to another company. 

Content modeling is about harnessing repeating patterns in content. The job of content modeling is to discern these patterns.

Content modeling should be a continuous activity. A content model isn’t a punch-list task that, once launched, is done. The model isn’t frozen. 

The parts that work right will be second nature, while the parts that are still rough will suggest refinement.

While content modeling expertise is still far from mainstream, there are growing opportunities to gain it. Teams don’t have to wait for a big re-platforming project.  Instead, they should start modeling with smaller projects. 

New low-cost and/or low-code tools are making it easier to adopt modular approaches to content. Options include static site generators (SSGs) like Jekyll or Hugo, open-source headless CMSs like Strapi, and no-code web-builders like Webflow. Don’t worry if these tools don’t match your eventual needs.  If you build a model that supports true modularity, it will be easy to migrate your content to a more sophisticated tool later. 

With smaller projects, it will be more evident that the content model can change as you learn new things and want to improve it. Like other dimensions of software in the SaaS era, content models can continuously be released with improvements.  Teams can enhance existing content types and add new ones gradually. 

The evolution of content models will also include fixing issues and improving performance, similar to the refactoring process in software.  You may find that some elements aren’t much used and can be removed. You can simplify or enrich your model.  Sometimes you’ll find similar types that can be folded into a single type – simplification through abstraction.  Other times, you’ll want to extend types to accommodate details that weren’t initially identified as being important. 

Continual iteration may seem messy and inefficient, and in some ways, it is. Those who approach a content model as one big thing wish the model to be born fully mature, timeless, and resistant to aging. In practice, content models require nurture and maintenance. Changing details in a large content model will seem less scary once you’ve had experience making changes on a smaller model.  And by working on them continuously, organizations learn how to make them better serve their needs. 

Regard changes to content models as a learning experience, not a fearful one.

Allow your model to scale up

Flyvbjerg and Gardner note: “A block of Lego is a small thing, but by assembling more than nine thousand of them, you can build one of the biggest sets Lego makes, a scale model of the Colosseum in Rome. That’s modularity.”

My visit to the National Building Museum revealed how large a structure small LEGO bricks can build, as shown in this model of London’s St Pancras rail station. 

St Pancras rail station in LEGO by Warren Elsmore at National Building Museum

The magic of LEGO is that all bricks can connect to one another. The same isn’t necessarily true of content models. Many models reflect the idiosyncrasies of specific CMS platforms.  Each model becomes an isolated island that can’t be connected to easily.  That’s why content silos are a pervasive problem.

However, a modular content model focused on content and not assembly logic can easily connect to other modular models. 

A content model can enable content to scale.  But how does one scale the content model?

The good news is that the same process for developing a small model applies to developing a larger one.  When you embrace an iterative, learning-driven approach to modeling, scaling your model is much easier.  You understand how models work: which decisions deliver benefits and which ones can cause problems. You understand tradeoffs and can estimate the effort involved with alternatives.

One key to scaling a model is to play well with other models. In large organizations, different teams may be developing content models to support their respective web properties.  If these models are modular, they can connect.  Teams can share content.

It’s likely that when there are two models, there will be overlaps in content types. Each model will have a content type defining a blog post, for example. Such situations offer an opportunity to rationalize the content type and standardize it across the teams. Eventually, separate teams may use a common content model supporting a single CMS. But until then, they can at least be using the same content type specifications.  They can learn from each other.

– Michael Andrews


Making it easier to build structured content


A quick note about an amazing UI from Writefull called Sentence Palette that shows how to combine structured content with prompts to write in a structured way. It nicely fleshes out the granular structure of a complex document down to the sentence level. I see broad application for this kind of approach for many kinds of content.

The deep analysis of large content corpora reveals common patterns not just in the conceptual organization of topics but also in the narrative patterns used to discuss them.

Writefull has assessed the academic writing domain and uncovered common text patterns used, for example, in article abstracts. LLMs can draw upon these text fragments to help authors develop new content. This opens up new possibilities for the development of structured content.

— Michael Andrews