Categories
Content Integration

Digital transformation for content workflows

Content workflows remain a manually intensive process. Content staff face the burden of deciding what to do and who should do it. How can workflow tools evolve to reduce burdens and improve outcomes? 

Content operations are arguably one of the most backward areas of enterprise business operations. They have been largely untouched by enterprise digital transformation. They haven’t “change[d] the conditions under which business is done, in ways that change the expectations of customers, partners, and employees” – even though business operations increasingly rely on online content to function. Compared with other enterprise functions, such as HR or supply chain management, content operations rely little on process automation or big data. Content operations depend on content workflow tools that haven’t modernized significantly.  Content workflow has become a barrier to digital transformation.

The missing flow 

Water flows seamlessly around any obstacle, downward toward a destination below.  Content, in contrast, doesn’t flow on its own. Content items get stuck or bounce around in no apparent direction. Content development can resemble a game of tag, where individuals run in various directions without a clear sense of the final destination.  Workflow exists to provide direction to content development.

Developing content is becoming more complex, but content workflow capabilities remain rudimentary. Workflow functionality has limited awareness of what’s happened previously or what should (or could) happen later. It requires users to perform actions and decisions manually. It doesn’t add value.

Workflow functionality has largely stayed the same over the years, whether in a CMS or a separate content workflow tool. Vendors are far removed from the daily issues the content creators face managing content that’s in development. All offer similar generic workflow functionality. They don’t understand the problem space.  

Vendors consider workflow problems to be people problems, not software problems. Because people are prone to be “messy” (as one vendor puts it), the problem the software aims to solve is to track people more closely. 

To the extent workflow functionality has changed in the past decade, it has mainly focused on “collaboration.” The vendor’s solution is to make the workflow resemble the time-sucking chats of social media, which persistently demand one’s attention. By promoting open discussion of any task, tools encourage the relitigation of routine decisions rather than facilitating their seamless implementation. Tagging people for input is often a sign that the workflow isn’t clear. Waiting on responses from tagged individuals delays tasks. 

End users find workflow tools kludgy. Workflows trigger loads of notifications, which result in notification fatigue and notification blindness. Individuals can be overwhelmed by the lists and messages that workflow tools generate. 

Authors seek ways to compensate for tool limitations. Teams often supplement CMS workflow tools with project management tools or spreadsheets. Many end users skirt the built-in CMS workflow by avoiding optional features. 

Workflow optimization—making content workflows faster and easier—is immature in most organizations. Ironically, writers are often more likely to write about improving other people’s workflows (such as those of their customers or their firm’s products and services) than to dedicate time to improving their own content workflows.  

Content workflows must step up to address growing demands.  The workflow of yesterday needs reimagining.

Deane Barker wrote in his 2016 book on content management: “Workflow is the single most overpurchased aspect of any CMS…I fully believe that 95% of content approvals are simple, serial workflows, and 95% of those have a single step.”

Today, workflow is not limited to churning out simple static web pages. Content operations must coordinate supply chains of assets and copy, provide services on demand, create variants to test and optimize, plan delivery across multiple channels, and produce complex, rich media. 

Content also requires greater coordination across organizational divisions. Workflows could stay simple when limited to a small team. But as enterprises work to reduce silos and improve internal integration, workflows have needed to become more sophisticated. Workflows must sometimes connect people in different business functions, business units, or geographic regions. 

Current content workflows are hindered by:

  • Limited capabilities, missing features, and closed architectures that preclude extensions
  • Unutilized functionality that suffers from poor usability or misalignment with work practices

Broken workflows breed cynicism. Because workflow tools are cumbersome and avoided by content staff, some observers conclude workflow doesn’t matter. The opposite is true: workflows are more consequential than ever and must work better. 

While content workflow tools have stagnated, other kinds of software have introduced innovations to workflow management. They address the new normal: teams that are not co-located but need to coordinate distinct responsibilities. Modern workflow tools include IT service management workflows and sophisticated media production toolchains that coordinate the preproduction, production, and postproduction of rich media.

What is the purpose of a content workflow?

Workflow isn’t email. Existing workflow tools don’t solve the right problems. They are tactical solutions focused on managing indicators rather than substance. They reflect a belief that if everyone achieves a “zero inbox” with no outstanding tasks, then the workflow is successful.  But a workflow queue shouldn’t resemble an email box stuffed with junk mail, unsolicited requests, and extraneous notices, with a few high-priority action items buried within the pile. Workflows should play a role in deciding what’s important for people to work on.

Don’t believe the myth that having a workflow is all that’s needed. Workflow problems stem from the failure to understand why a workflow is necessary. Vendors position the issue as a choice of whether or not to have a workflow instead of what kind of workflow enterprises should have.  

Most workflow tools focus on tracking content items by offering a fancy checklist. The UI covers up an unsightly sausage-making process without improving it. 

Many tools prioritize date tracking. They equate content success with being on time. While content should be timely, its success depends on far more than the publication date and time. 

A workflow in itself doesn’t ensure content quality. A poorly implemented workflow can even detract from quality, for example, by specifying the wrong parties or steps. A robust workflow, in contrast, will promote consistency in applying best practices.  It will help all involved with doing things correctly and making sound decisions.  

As we shall see, workflow can support the development of high-quality content if it:

  • Validates the content for correctness
  • Supports sound governance

A workflow won’t necessarily make content development more productive. Workflows can be needlessly complex, time-consuming, or confusing. They are often not empowering and don’t allow individuals to make the best choices because they constrain people in counterproductive ways.  

Contrary to common belief, the primary goal of workflow should not be to track the status of content items. If all a workflow tool does is shout in red that many tasks are overdue, it doesn’t help. The tool behaves like an airport arrival and departure board that tells you flights are delayed without revealing why.  

Status-centric workflow tools simply present an endless queue of tasks with no opportunity to make the workload more manageable. 

Workflows should improve content quality and productivity.  Workflow tools contribute value to the extent they make the content more valuable. Quality and productivity drive content’s value. 

Yet few CMS workflow tools can seriously claim they significantly impact either the quality or productivity of the content development process. Administratively focused tools don’t add value.

Workflow tools should support people and goals –  the dimensions that ultimately shape the quality of outcomes. Yet workflow tools typically delegate all responsibility to people to ensure the workflow succeeds. Administratively focused workflows don’t offer genuine support. 

A workflow will enhance productivity – making content more valuable relative to the effort applied – only if it: 

  • Makes planning more precise
  • Accelerates the completion of tasks
  • Focuses on goals, not just activities
Elements of content workflow

Generic workflows presume generic tasks

Workflow tools fail to be “fit for purpose” when they don’t distinguish activities according to their purpose. They treat all activities as similar and equally important. Everything is a generic task: the company lawyer’s compliance review is no different than an intern’s review of broken links.  

Workflows track and forward tasks in a pass-the-batton relay. Each task involves a chain of dependencies. Tasks are assigned to one or more persons. Each task has a status, which determines the follow-on task.

CMS workflow tools focus on configuring a few variables:

  • Stage in the process
  • Task(s) associated with a stage
  • Steps involved with a task
  • Assigned employees required to do a step or task
  • Status after completing a task
  • The subsequent task or stage

From a coding perspective, workflow tools implement a series of simple procedural loops. The workflow engine resembles a hampster wheel. 

Hamster wheel
Like a hamster wheel, content workflow “engines” require manual pushing. Image: Wikimedia

A simple procedural loop would be adequate if all workflow tasks were similar. However, generic tasks don’t reflect the diversity of content work.

Content workflow tasks vary in multiple dimensions, involving differing priorities and hierarchies. Simple workflow tools flatten out these differences by designing for generic tasks rather than concrete ones. 

Variability within content workflows

Workflows vary because they involve different kinds of tasks.  Content tasks can be:

  • Cognitive (applying judgment)
  • Procedural (applying rules)
  • Clerical (manipulating resources) 

Tasks differ in the thought required to complete them.  Workflow tools commonly treat tasks as forms for users to complete.  They highlight discrete fields or content sections that require attention. They don’t distinguish between:

  1. Reflexive tasks (click, tap, or type)
  2. Reflective tasks (pause and think)

The user’s goal for reflexive tasks is to “Just do it” or “Don’t make me think.” They want these tasks streamlined as much as possible.  

In contrast, their goal for reflective tasks is to provide the most value when performing the task. They want more options to make the best decision. 

Workflows vary in their predictability. Some factors (people, budget, resources, priorities) are known ahead of time, while others will be unknown. Workflows should plan for the knowns and anticipate the unknowns.

Generic workflows are a poor way to compensate for uncertainty or a lack of clarity about how content should proceed. Workflows should be specific the content and associated business and technical requirements.  

Many specific workflows are repeatable. Workflows can be classified into three categories according to their frequency of use:

  1. Routine workflows 
  2. Ad hoc, reusable workflows
  3. Ad hoc, one-off workflows 

Routine workflows recur frequently. Once set, they don’t need adjustment. Because tasks are repeated often, routine workflows offer many opportunities to optimize, meaning they can be streamlined, automated, or integrated with related tasks. 

Ad hoc workflows are not predefined. Teams need to decide how to shape the workflow based on the specific requirements of a content type, subject matter, and ownership. 

Ad hoc workflows can be reusable. In some cases, teams might modify an existing workflow to address additional needs, either adding or eliminating tasks or changing who is responsible. Once defined, the new workflow is ready for immediate use. But while not routinely used, it may be useful again in the future, especially if it addresses occasional or rare but important requirements.  

Even when a content item is an outlier and doesn’t fit any existing workflow, it still requires oversight.  Workflow tools should make it easy to create one-off workflows. Ideally, generative AI could help employees state in general terms what tasks need to be done and who should be involved, and a bot could define the workflow tasks and assignments.

Workflows vary in the timing and discretion of decisions.  Some are preset, and some are decided at the spur of the moment.  

Consider deadlines, which can apply to intermediate tasks in addition to the final act of publishing.  Workflow software could suggest the timing of tasks – when a task should be completed – according to the operational requirements. It might assign task due dates:

  • Ahead of time, based on when actions must be completed to meet a mandatory publication deadline. 
  • Dynamically, based on the availability of people or resources.

Similarly, decisions associated with tasks have different requirements. Content task decisions could be 

  • Rules-driven, where rules predetermine the decision   
  • Discretionary and dependent on the decisionmaker’s judgment.

Workflows for individual items don’t happen in isolation. Most workflows assume a discrete content item. But workflows can also apply to groups of related items.  

Two common situations exist where multiple content items will have similar workflows:

  • Campaigns of related items, where items are processed together
  • A series of related items, where items are processed serially

In many cases, the workflow for related items should follow the same process and involve the same people.  Tools should enable employees to reuse the same workflow for related items so that the same team is involved.

Does the workflow validate the content for correctness?

Content quality starts with preventing errors. Workflows can and should prevent errors from happening.  

Workflows should check for multiple dimensions of content correctness, such as whether the content is:

  • Accurate – the workflow checks that dates, numbers, prices, addresses, and other details are valid.
  • Complete – the workflow checks that all required fields, assets, or statements are included.
  • Specific – the workflow accesses the most relevant specific details to include.
  • Up-to-date – the workflow validates that the data is the most recent available.
  • Conforming – the workflow checks that terminology and phrasing conform to approved usage.
  • Compliant – the workflow checks that disclaimers, warranties, commitments, and other statements meet legal and regulatory obligations.

Because performing these checks is not trivial, they are often not explicitly included in the workflow.  It’s more expeditious to place the responsibility for these dimensions entirely on an individual.  

Leverage machines to unburden users. Workflows should prevent obvious errors without requiring people to check themselves if an error is present. They should scrutinize text entry tasks to prevent input errors by including default or conditional values and auto-checking the formatting of inputs. In more ambiguous situations, they can flag potential errors that require an individual to look at. But they should never act too aggressively, where they generate errors through over-correction.

Error preemption is becoming easier as API integrations and AI tools become more prevalent. Many checks can be partially or fully automated by:

  • Applying logic rules and parameter-testing decision trees
  • Pulling information from other systems
  • Using AI pattern-matching capabilities 

Workflows must be self-aware. Workflows require hindsight and foresight. Error checking should be both reactive and proactive.  They must be capable of recognizing and remediating problems.

One of the biggest drivers of workflow problems is delays. Many delays are caused by people or contributions being unavailable because:

  • Contributors are overbooked or are away
  • Inputs are missing because they were never requested

Workflows should be able to anticipate problems stemming from resource non-availability.  Workflow tools can connect to enterprise calendars to know when essential people are unavailable to meet a deadline.  In such situations, it could invoke a fallback. The task could be reassigned, or the content’s publication could be a provisional release, pending final input from the unavailable stakeholder.

Workflows should be able to perform quality checks that transcend the responsibilities of a single individual to ensure these issues are not so dependent on one person. Before publication, it can monitor and check what’s missing, late, or incompatible. 

Automation promises to compress workflows but also carries risks. Workflows should check automation tasks in a staging environment to ensure they will perform as expected. Before making automation functionality generally available, the workflow staging will monitor discrete automation tasks and run batch tests on the automation of multiple items. Teams don’t want to discover that the automation they depend on doesn’t work when they have a deadline to meet. 

Does the workflow support sound governance?

Governance, risk, and compliance (GRC) are growing concerns for online publishers, particularly as regulators introduce more privacy, transparency, and online safety requirements. 

Governance provides reusable guidelines for performing tasks. It promotes consistency in quality and execution. It enables workflows to run faster and more smoothly by avoiding repeated questions about how to do things.  It ensures compliance with regulatory requirements and reduces reputation, legal, and commercial risks arising from a failure to vet content adequately.  

Workflow tools should promote three objectives:

  • Accountability (who is supposed to do what)
  • Transparency (what is happening compared to what’s supposed to happen)
  • Explainability (why tasks should be done in a certain way)

These qualities are absent from most content workflow functionality.

Defining responsibilities is not enough. At the most elemental level, a generic workflow specifies roles, responsibilities, and permissions. It controls access to content and actions, determining who is involved with a task and what they are permitted to do.  This kind of governance can prevent the wrong actors from messing up work, but they don’t help people responsible for the work from making unintended mistakes.

Assigned team members need support. The workflow should make it easier for them to make the correct decisions.  

Workflows should operationalize governance policies. However, if guidance is too intrusive, autocorrecting too aggressively, or making wrong assumptions, team members will try to short-circuit intrusive it.  

Discretionary decisions need guardrails, not enforcement. When a decision is discretionary, the goal should be to guide employees to make the most appropriate decision, not enforce a simple rule.  

Unfortunately, most governance guidance exists in documentation that is separated from workflow tools. Workflows fail to reveal pertinent guidance when it is needed. 

Incorporate governance into workflows at the point of decision. Bring guidance to the task so employees don’t need to seesaw between governance documents and workflow applications.  

Workflows can incorporate governance guidance in multiple ways by providing:

  • Guided decisions incorporating decision trees
  • Screen overlays highlighting areas to assess or check
  • Hints in the use interface
  • Coaching prompts from chatbots

When governance guidance isn’t specific enough for employees to make a clear decision, the workflow should provide a pathway to resolve the issue for the future. Workflows can include Issue management that triggers tasks to review and develop additional guidelines.

Does the workflow make planning more precise?

Bad plans are a common source of workflow problems.  Workflow planning tools can make tasks difficult to execute.

Planning acts like a steering wheel for a workflow, indicating the direction to go. 

Planning functionality is loosely integrated with workflow functionality, if at all. Some workflow tools don’t include planning, while those that do commonly detach the workflow from the planning.  

Planning and doing are symbiotic activities.  Planning functionality is commonly a calendar to set end dates, which the workflow should align with. 

But calendars don’t care about the resources necessary to develop the content. They expect that by choosing dates, the needed resources will be available.

Calendars are prevalent because content planning doesn’t follow a standardized process. How you plan will depend on what you know. Teams know some issues in advance, but other issues are unknown.  

Individuals will have differing expectations about what content planning comprises.  Content planning has two essential dimensions:

  • Task planning that emphasizes what tasks are required
  • Date planning that emphasizes deadlines

While tasks and dates are interrelated, workflow tools rarely give them equal billing.  Planning tools favor one perspective over the other.  

Task plans focus on lists of activities that need doing. The plan may have no dates associated with discrete tasks or have fungible dates that change.  One can track tasks, but there’s limited ability to manage the plan. Many workflows provide no scheduling or visibility into when tasks will happen.  At most, they show a Kanban board showing progress tracking.  They focus on if a task is done rather than when it should be done.

Design systems won’t solve workflow problems. Source: Utah design system

Date plans emphasize calendars. Individuals must schedule when various tasks are due. In many cases, those assigned to perform a task are notified in real time when they should do something. The due date drives a RAG (red-amber-green) traffic light indicator, where tasks are color-coded as on-track, delayed, or overdue based on dates entered in the calendar.

Manually selecting tasks and dates doesn’t provide insights into how the process will happen in practice.  Manual planning lacks a preplanning capability, where the software can help to decide in advance what tasks will be completed at specific times based on a forecast of when these can be done. 

Workflow planning capabilities typically focus on setting deadlines. Individuals are responsible for setting the publication deadline and may optionally set intermediate deadlines for tasks leading to the final deadline. This approach is both labor-intensive and prone to inaccuracies. The deadlines reflect wishes rather than realistic estimates of how long the process will take to complete. 

Teams need to be able to estimate the resources required for each task. Preplanning requires the workflow to: 

  1. Know all activities and resources that will be required  
  2. Schedule them when they are expected to happen.  

The software should set task dates based on end dates or SLAs. Content planning should resemble a project planning tool, estimating effort based on task times and sequencing—it will provide a baseline against which to judge performance.

For preplanning to be realistic, dates must be changeable. This requires the workflow to adjust dates dynamically based on changing circumstances. Replanning workflows will assess deadlines and reallocate priorities or assignments.

Does the workflow accelerate the completion of tasks?

Workflows are supposed to ensure work gets done on schedule. But apart from notifying individuals about pending dates, how much does the workflow tool help people complete work more quickly?  In practice, very little because the workflow is primarily a reminder system.  It may prevent delays caused by people forgetting to do a task without helping people complete tasks faster. 

Help employees start tasks faster with task recommendations. As content grows in volume, locating what needs attention becomes more difficult. Notifications can indicate what items need action but don’t necessarily highlight what specific sections need attention. For self-initiated tasks, such as evaluating groups of items or identifying problem spots, the onus is on the employee to search and locate the right items. Workflows should incorporate recommendations on tasks to prioritize.

Recommendations are a common feature in consumer content delivery. But they aren’t common in enterprise content workflows. Task recommendations can help employees address the expanding atomization of content and proliferation of content variants more effectively by highlighting which items are most likely relevant to an employee based on their responsibilities, recent activities, or organizational planning priorities.

Facilitate workflow streamlining. When workflows push manual activities from one person to another, they don’t reduce the total effort required by a team. A more data-driven workflow that utilizes semantic task tagging, by contrast, can reduce the number of steps necessary to perform tasks by:

  • Reducing the actions and actors needed 
  • Allowing multiple tasks to be done at the same time 

Compress the amount of time necessary to complete work. Most current content workflows are serial, where people must wait on others before being told to complete their assigned tasks. 

Workflows should shorten the path to completion by expanding the integration of: 

  1. Tasks related to an item and groups of related items
  2. IT systems and platforms that interface with the content management system

Compression is achieved through a multi-pronged approach:

  • Simplifying required steps by scrutinizing low-value, manually intensive steps
  • Eliminating repetition of activities through modularization and batch operations  
  • Involving fewer people by democratizing expertise and promoting self-service
  • Bringing together relevant background information needed to make a decision.

Synchronize tasks using semantically tagged workflows. Tasks, like other content types, need tags that indicate their purpose and how they fit within a larger model. Tags give workflows understanding, revealing what tasks are dependent on each other.  

Semantic tags provide information that can allow multiple tasks to be done at the same time. Tags can inform workflows:

  • Bulk tasks that can be done as batch operations
  • Tasks without cross-dependencies that can be done concurrently
  • Inter-related items that can be worked on concurrently

Automate assignments based on awareness of workloads. It’s a burden on staff to figure out to whom to assign a task. Often, task assignments are directed to the wrong individual, wasting time to reassign the task. Otherwise, the task is assigned to a generic queue, where the person who will do it may not immediately see it.  The disconnection between the assignment and the allocation of time to complete the task leads to delays.

The software should make assignments based on:

  • Job roles (responsibilities and experience) 
  • Employee availability (looking at assignments, vacation schedules, etc.) 

Tasks such as sourcing assets or translation should be assigned based on workload capacity. Content workflows need to integrate with other enterprise systems, such as employee calendars and reporting systems, to be aware of how busy people are and who is available.

Workload allocation can integrate rule-based prioritization that’s used in customer service queues. It’s common for tasks to back up due to temporary capacity constraints. Rule-based prioritization avoids finger-pointing. If the staff has too many requests to fulfill, there is an order of priority for requests in the backlog.  Items in backlog move up in priority according to their score, which reflects their predefined criticality and the amount of time they’ve been in the backlog. 

Automate routine actions and augment more complex ones. Most content workflow tools implement a description of processes rather than execute a workflow model, limiting the potential for automation. The system doesn’t know what actions to take without an underlying model.

A workflow model will specify automatic steps within content workflows, where the system takes action on tasks without human prompting. For example, the software can automate many approvals by checking that the submission matches the defined criteria. 

Linking task decisions to rules is a necessary capability. The tool can support event-driven workflows by including the parameters that drive the decision.

Help staff make the right decisions. Not all decisions can be boiled down to concrete rules. In such cases, the workflow should augment the decision-making process. It should accelerate judgment calls by making it easier for questions to be answered quickly.  Open questions can be tagged according to the issue so they can be cross-referenced with knowledge bases and routed to the appropriate subject matter expert.

Content workflow automation depends on deep integration with tools outside the CMS.  The content workflow must be aware of data and status information from other systems. Unfortunately, such deep integration, while increasingly feasible with APIs and microservices, remains rare. Most workflow tools opt for clunky plugins or rely on webhooks.  Not only is the integration superficial, but it is often counterproductive, where trigger-happy webhooks push tasks elsewhere without enabling true automation.

Does the workflow focus on goals, not just activities?

Workflow tools should improve the maturity of content operations. They should produce better work, not just get work done faster. 

Tracking is an administrative task. Workflow tracking capabilities focus on task completion rather than operational performance. With their administrative focus, workflows act like shadow mid-level managers who shuffle paper. Workflows concentrate on low-level task management, such as assignments and dates.

Workflows can automate low-level task activities; they shouldn’t force people to track them.   

Plug workflows’ memory hole. Workflows generally lack memory of past actions and don’t learn for the future. At most, they act like habit trackers (did I remember to take my vitamin pill today?) rather than performance trackers (how did my workout performance today compare with the rest of the week?)

Workflow should learn over time. It should prioritize tracking trends, not low-level tasks.

Highlight performance to improve maturity. While many teams measure the outcomes that content delivers, few have analytic tools that allow them to measure the performance of their work. 

Workflow analytics can answer: 

  • Is the organization getting more efficient at producing content at each stage? 
  • Is end-to-end execution improving?  

Workflow analytics can monitor and record past performance and compare it to current performance. They can reveal if content production is moving toward:

  • Fewer revisions
  • Less time needed by stakeholders
  • Fewer steps and redundant checks

Benchmark task performance. Workflows can measure and monitor tasks and flows, observing the relationships between processes and performance. Looking at historical data, workflow tools can benchmark the average task performance.

The most basic factor workflows should measure is the resources required. Each task requires people and time, which are critical KPIs relating to content production, 

Analytics can:

  1. Measure the total time to complete tasks
  2. Reveal which people are involved in tasks and the time they take.

Historic data can be used to forecast the time and people needed, which is useful for workflow planning. This data will also help determine if operations are improving.  

Spot invisible issues and provide actionable remediation.  It can be difficult for staff to notice systemic problems in complex content systems with multiple workflows. But a workflow system can utilize item data to spot recurring issues that need fixing.  

Bottlenecks are a prevalent problem. Workflows that are defined without the benefit of analytics are prone to develop bottlenecks that recur under certain circumstances. Solving these problems requires the ability to view the behavior of many similar items. 

Analytics can parse historical data to reveal if bottlenecks tend to involve certain stages or people. 

Historical workflow data can provide insights into the causes of bottlenecks, such as tasks that frequently involve:

  • Waiting on others
  • Abnormal levels of rework
  • Approval escalations

The data can also suggest ways to unblock dependencies through smart allocation of resources.  Changes could include:

  • Proactive notifications of forecast bottlenecks
  • Re-scheduling
  • Sifting tasks to an alternative platform that is more conducive

Utilize analytics for process optimization. Workflow tools supporting other kinds of business operations are beginning to take advantage of process mining and root cause analysis.  Content workflows should explore these opportunities.

Reinventing workflow to address the content tsunami

Workflow solutions can’t be postponed.  AI is making content easier to produce: a short prompt generates volumes of text, graphics, and video. The problem is that this content still needs management.  It needs quality control and organization. Otherwise, enterprises will be buried under petabytes of content debt.

Our twentieth-century-era content workflows are ill-equipped to respond to the building tsunami. They require human intervention in every micro decision, from setting due dates to approving wording changes. Manual workflows aren’t working now and won’t be sustainable as content volumes grow.

Workflow tools must help content professionals focus on what’s important. We find some hints of this evolution in the category of “marketing resource management” tools that integrate asset, work, and performance management. Such tools recognize the interrelationships between various content items, and what they are expected to accomplish.  

The emergence of no-code workflow tools, such as robotic process automation (RPA) tools, also points to a productive direction for content workflows. Existing content workflows are generic because that’s how they try to be flexible enough to handle different situations. They can’t be more specific because the barriers to customizing them are too high: developers must code each decision, and these decisions are difficult to change later. 

No code solutions give the content staff, who understand their needs firsthand, the ability to implement decisions about workflows themselves without help from IT. Enterprises can build a more efficient and flexible solution by empowering content staff to customize workflows.

Many content professionals advocate the goal of providing Content as a Service (CaaS).  The content strategist Sarah O’Keefe says, “Content as a Service (CaaS) means that you make information available on request.” Customers demand specific information at the exact moment they need it.  But for CaaS to become a reality, enterprises must ensure that the information that customers request is available in their repositories. 

Systemic challenges require systemic solutions. As workflow evolves to handle more involved scenarios and provide information on demand, it will need orchestration.  While individuals need to shape the edges of the system, the larger system needs a nervous system that can coordinate the activities of individuals.  Workflow orchestration can provide that coordination.

Orchestration is the configuration of multiple tasks (some may be automated) into one complete end-to-end process or job. Orchestration software also needs to react to events or activities throughout the process and make decisions based on outputs from one automated task to determine and coordinate the next tasks.”  

Orchestration is typically viewed as a way to decide what content to provide to customers through content orchestration (how content is assembled) and journey orchestration (how it is delivered).  But the same concepts can apply to the content teams developing and managing the content that must be ready for customers.  The workflows of other kinds of business operations embrace orchestration. Content workflows must do the same. 

Content teams can’t pause technological change; they must shape it.  A common view holds that content operations are immature because of organizational issues. Enterprises need to sort out the problems of how they want to manage their people and processes before they worry about technology. 

We are well past the point where we can expect technology to be put on hold while sorting out organizational issues. These issues must be addressed together. Other areas of digital transformation demonstrate that new technology is usually the catalyst that drives the restructuring of business processes and job roles. Without embracing the best technology can offer, content operations won’t experience the change it needs.

– Michael Andrews

Categories
Content Engineering Personalization

Orchestrating the assembly of content

Structured content enables online publishers to assemble pieces of content in multiple ways.  However, the process by which this assembly happens can be opaque to authors and designers. Read on to learn how orchestration is evolving and how it works.

To many people, orchestration sounds like jargon or a marketing buzzword. Yet orchestration is no gimmick. It is increasingly vital to developing, managing, and delivering online content. It transforms how publishers make decisions about content, bringing flexibility and learning to a process hampered in the past by short-term planning and jumbled, ad-hoc decisions.  

Revealing the hidden hand of orchestration

Orchestration is both a technical term in content management and a metaphor. Before discussing the technical aspects of orchestration, let’s consider the metaphor.  Orchestration in music is how you translate a tune into a score that involves multiple instruments that play together harmoniously. It’s done by someone referred to as an arranger, someone like Quincy Jones. As the New Yorker once wrote: “Everyone knows Quincy Jones’s name, even if no one is quite sure what he does. Jones got his start in the late nineteen-forties as a trumpeter, but he soon mastered the art of arranging jazz—turning tunes and melodies into written music for jazz ensembles.”

Much like music arranging, content orchestration happens off stage, away from the spotlight. It doesn’t get the attention given to UI design. Despite its stealthy profile, numerous employees in organizations become involved with orchestration, often through small-scale A/B testing by changing an image or a headline. 

Orchestration typically focuses on minor tweaks to content, often cosmetic changes. But orchestration can also address how to assemble content on a bigger scale. The emergence of structured content makes intricate, highly customized orchestration possible.

Content assembly requires design and a strategy. Few people consider orchestration when planning how content is delivered to customers. They generally plan content assembly by focusing on building individual screens or a collection of web pages on a website. The UI design dictates the assembly logic and reflects choices made at a specific time.  While the logic can change, it tends to happen only in conjunction with changes to the UI design. 

Orchestration allows publishers to specify content assembly independently of its layout presentation. It does so by approaching the assembly process abstractly: evaluating content pieces’ roles and purposes that address specific user scenarios.

Assembly logic is becoming distributed. Content assembly logic doesn’t happen in one place anymore. Originally, web teams created content for assembly into web pages using templates defined by a CMS on the backend. In the early 2000s, frontend developers devised ways to change the content of web pages presented in the browser using an approach known initially as Ajax, a term coined by the information architect Jesse James Garrett. Today, content assembly can happen at any stage and in any place. 

Assembly is becoming more sophisticated. At first, publishers focused on selecting the right web page to deliver. The pages were preassembled – often hand-assembled. Next, the focus shifted to showing or hiding parts of that web page by manipulating the DOM (document object model).  

Nowadays, content is much more dynamic. Many web pages, especially in e-commerce, are generated programmatically and have no permanent existence.  “Single page applications” (SPAs) have become popular, and the content will morph continuously. 

The need for sophisticated approaches for assembling content has grown with the emergence of API-accessible structured content. When content is defined semantically, rather than as web pages, the content units are more granular. Instead of simply matching a couple of web page characteristics, such as a category tag and a date, publishers now have many more parameters to consider when deciding what to deliver to a user.

Orchestration logic is becoming decoupled from applications. While orchestration can occur within a CMS platform, it is increasingly happening outside the CMS to take advantage of a broader range of resources and capabilities. With APIs growing in coordinating web content, much content assembly now occurs in a middle layer between the back-end storing the content and the front-end presenting it. The logic driving assembly is becoming decoupled from both the back-end and front-end. 

Publishers have a growing range of options outside their CMS for deciding what content to deliver.  Tools include:

  • Digital experience, composition, and personalization orchestration engines (e.g., Conscia, Ninetailed)
  • Graph query tools (e.g., PoolParty)
  • API federation management tools (e.g., Apollo Federation)

These options vary in their aims and motivations, and they differ in their implementations and features. Their capabilities are sometimes complementary, which means they can be used in combination. 

Orchestration inputs that frame the content’s context

Content structuring supports extensive variation in the types of content to present and what that content says. 

Orchestration involves more than retrieving a predefined web page.  It requires considering many kinds of inputs to deliver the correct details. 

Content orchestration will reflect three kinds of considerations:

  1. The content’s intent – the purpose of each content piece
  2. The organization’s operational readiness to satisfy a customer’s need
  3. The customer or user’s intent – their immediate or longer-term goal

Content characteristics play a significant role in assembly. Content characteristics define variations among and within content items. An orchestration layer will account for characteristics of available content pieces, such as:

  • Its editorial role and purpose, such as headings, explanations, or calls to action
  • Topics and themes, including specific products or services addressed
  • Intended audience or customer segment
  • Knowledge level such as beginner or expert
  • Intended journey or task stage
  • Language and locale
  • Date of creation or updating
  • Author or source
  • Size, length, or dimensions
  • Format and media
  • Campaign or announcement cycle
  • Product or business unit owner
  • Location information, such as cities or regions that are relevant or mentioned
  • Version 

Each of these characteristics can be a variable and useful when deciding what content to assemble. They indicate the compatibility between pieces and their suitability for specific contexts.

Other information in the enterprise IT ecosystem can help decide what content to assemble that will be most relevant for a specific context of use. This information is external to the content but relevant to its assembly.

Business data is also an orchestration input. Content addresses something a business offers. The assembled content should link to business operations to reflect what’s available accurately.

The assembled content will be contextually relevant only if the business can deliver to the customer the product or services that the content addresses. Customers want to know which pharmacy branches are open now or which items are available for delivery overnight.  The assembled content must reflect what the business can deliver when the customer seeks it.

The orchestration needs to combine content characteristics from the CMS with business data managed by other IT systems. Many factors can influence what content should be presented, such as:

  • Inventory management data
  • Bookings and orders data
  • Facilities’ capacity or availability
  • Location hours
  • Pricing information, promotions, and discount rules
  • Service level agreement (SLA) rules
  • Fulfillment status data
  • Event or appointment schedules
  • Campaigns and promotions schedule
  • Enterprise taxonomy structure defining products and operating units

Business data have complex rules managed by the IT system of record, not the CMS or the orchestration layer.  For content orchestration, sometimes it is only necessary to provide a “flag,” checking whether a condition is satisfied to determine which content option to show.

Customer context is the third kind of orchestration input. Ideally, the publisher will tailor the content to the customer’s needs – the aim of personalization.  The orchestration process must draw upon relevant known information about the customer: the customer’s context.

The customer context encompasses their identity and their circumstances. A customer’s circumstances can change, sometimes in a short time.  And in some situations, the customer’s circumstances dictate the customer’s identity. People can have multiple identities, for example, as consumers, business customers at work, or parents overseeing decisions made by their children.

Numerous dimensions will influence a customer’s opinions and needs, which in turn will influence the most appropriate content to assemble. Some common customer dimensions include:

  • Their location
  • Their personal characteristics, which might include their age, gender, and household composition, especially when these factors directly influence the relevance of the content, for example, with some health topics
  • Things they own, such as property or possessions, especially for content relating to the maintenance, insurance, or buying and selling of owned things
  • Their profession or job role, especially for content focused on business and professional audiences
  • Their status as a new, loyal, or churned customer
  • Their purchase and support history

The chief challenge in establishing the customer context is having solid insights.  Customers’ interactions on social media and with customer care provide some insights, but publishers can tap a more extensive information store.  Various sources of customer data could be available:

  • Self-disclosed information and preferences to the business (zero-party data or 0PD)
  • The history of a customer’s interactions with the business (first-party data or 1PD) 
  • Things customers have disclosed about themselves in other channels such as social media or survey firms (second-party data or 2PD)
  • Information about a cohort they are categorized as belonging to, using aggregated data originating from multiple sources (third-party data or 3PD)

Much of this information will be stored in a customer data platform (CDP), but other data will be sourced from various systems.  The data is valid only to the extent it is up-to-date and accurate, which is only sometimes a safe assumption.

Content behavior can shape the timing and details assembled in orchestration. Users can signal their intent through their interaction with content. The user’s decisions while interacting with content can signal their intentions.  Some behavior variables include:

  • Source of referral 
  • Previously viewed content 
  • Expressed interest in topics or themes based on prior content consumed
  • Frequency of repeat visits 
  • Search terms used 
  • Chatbot queries submitted
  • Subscriptions chosen or online events booked
  • Downloads or requests for follow-up information
  • The timing of their visit in relation to an offer 

The most valuable and reliable signals will be specific to the context. Many factors can shape intent, so many potential factors will not be relevant to individual customers. Just because some factors could be relevant in certain cases does not imply they will be applicable in most cases. 

Though challenging, leveraging customer intent offers many opportunities to improve the relevance of content. A rich range of possible dimensions is available. Selecting the right ones can make a difference. 

Don’t rely on weak signals to overdetermine intent. When the details about individual content behavior or motivations are scant, publishers sometimes rely on collective behavioral data to predict individual customer intentions.  While occasionally useful, predictive inputs about customers can be based on faulty assumptions that yield uneven results. 

Note the difference between tailoring content to match an individual’s needs and the practice of targeting. Targeting differs from personalization because it aims to increase average uptake rather than satisfy individual goals. It can risk alienating customers who don’t want the proffered content.

Draw upon diverse sources of input. By utilizing a separate layer to manage orchestration, publishers, in effect, create a virtual data tier that can federate and assess many distinct and independent sources of information to support decisions relating to content delivery. 

An orchestration layer gives publishers greater control over choosing the right pieces of content to offer in different situations. Publishers gain direct control over parameters to select,  unlike many AI-powered “decision engines” that operate like a black box and assume control over the content chosen.

The orchestration score

If the inputs are the notes in orchestration, the score is how they are put together – the arrangement. A rich arrangement will sometimes be simple but often will be sophisticated. 

Orchestration goes beyond web search and retrieval. In contrast to a ordinary web search, which retrieves a list of relevant web pages, orchestration queries must address many more dimensions. 

In a web search, there’s a close relationship between what is requested and what is retrieved. Typically, only a few terms need matching. Web search queries are often loose, and the results can be hit or miss. The user is both specifying and deciding what they want from the results retrieved.

In orchestration, what is requested needs to anticipate what will be relevant and exclude what won’t be. The request may refer to metadata values or data parameters that aren’t presented in the content that’s retrieved. The results must be more precise. The user will have limited direct input into the request for assembled content and limited ability to change what is provided to them.

Unlike a one-shot web search process, in orchestration, content assembly involves a multistage process.  

The orchestration of structured content is not just choosing a specific web page based on a particular content type.  It differs in two ways:

  1. You may be combining details from two (or more) content types.  
  2. Instead of delivering a complete web page associated with each content type (and potentially needing to hide parts you don’t want to show), you select specific details from content items to deliver as an iterative procedure.

Unpacking the orchestration process. Content orchestration consists of three stages:

  1. FIND stage: Choose which content items have relevant material to support a user scenario
  2. MATCH stage: Combine content types that, if presented together, provide a meaningful, relevant experience
  3. SELECT and RETURN stage: Choose which elements within the content items will be most relevant to deliver to a user at a given point in time

Find relevant content items. Generally, this involves searching metadata tags such as taxonomy terms or specific values such as dates. Sometimes, specific words in text values are sought. If we have content about events, and all the event descriptions have a field with the date, it is a simple query to retrieve descriptions for events during a specified time period.

Typically, a primary content type will provide most of the essential information or messages. However, we’ll often also want to draw on information and messages from other content types to compose a content experience. We must associate different types of items to be able to combine their details.

Match companion content types. What other topics or themes will provide more context to a message? The role of matching is to associate related topics or tasks so that complementary information and messages can be included together.

Graph queries are a powerful approach to matching because they allow one to query “edges” (relationships) between “nodes” (content types.)  For example, if we know a customer is located in a specific city, we might want to generate a list of sponsors of events happening in that city.  The event description will have a field indicating the city. It will also reference another content type that provides a profile of event sponsors.  It might look like this in a graph query language like GQL, with the content types in round brackets and the relationships in square brackets.

MATCH (:Event WHERE location=”My City”) - [:SponsoredBy] -> (:SponsorProfile)

We have filtered events in the customer’s city (fictiously named My City) and associated content items about sponsors who have sponsored those events. Note that this query hasn’t indicated what details to present to users. It only identifies which content types would be relevant so that various types of details can be combined. 

Unlike in a common database query, what we are looking for and want to show are not the same. 

Select which details to assemble. We need to decide what information within a relevant content type which details will be of greatest interest to a user. Customers want enough details for the pieces to provide meaningful context. Yet they probably won’t want to see everything available, especially all at once – that’s the old approach of delivering preassembled web pages and expecting users to hunt for relevant information themselves.

Different users will want different details, necessitating decisions about which details to show. This stage is sometimes referred to as experience composition because the focus is on which content elements to deliver. We don’t have to worry about how these elements will appear on a screen, but we will be thinking about what specific details should be offered.

GraphQL, a query language used in APIs, is very direct in allowing you to specify what details to show. The GraphQL query mirrors the structure of the content so that one can decide which fields to show after seeing which fields are available. We don’t want to show everything about a sponsor, just their name, logo, and how long they’ve been sponsoring the event.  A hypothetical query named “local sponsor highlights” would extract only those details about the sponsor we want to provide in a specific content experience.

Query LocalSponsorHighlights {
… on SponsorProfile {
name
logo
sponsorSince
} }

The process of pulling out specific details will be repeated iteratively as customers interact with the content.

Turning visions into versions

Now that we have covered the structure and process of orchestration let’s look at its planning and design. Publishers enjoy a broad scope for orchestrating content. They need a vision for what they aim to accomplish. They’ll want to move beyond the ad hoc orchestration of page-level optimization and develop a scenario-driven approach to orchestration that’s repeatable and scaleable.

Consider what the content needs to accomplish. Content can have a range of goals. They can explicitly encourage a reader to do something immediately or in the future. Or they encourage a reader’s behavior by showing goodwill and being helpful enough that the customer wants to do something without being told what to do.

Content goalImmediate 
(Action outcome)
Consequent
 (Stage outcome)
Explicit 
(stated in the content)
CTA (call to action) conversionContact sales or visit a retail outlet
Implicit 
(encouraged by the content)
Resolve an issue without contacting customer supportRenew their subscription

Content goals must be congruent with the customer’s context. If customers have an immediate goal, then the content should be action-oriented. If their goal is longer-term, the content should focus on helping the customer move from one stage to another.

Orchestration will generate a version of the content representing the vision of what the pieces working together aim to accomplish.

Specify the context.  Break down the scenario and identify which contextual dimensions are most critical to providing the right content. The content should adapt to the user context, reflect the business context, and provide users with viable options. The context includes:

  • Who is seeking content (the segment, especially when the content is tailored for new or existing customers, or businesses or consumers, for example)
  • What they are seeking (topics, questions, requests, formats, and media)
  • When they are seeking it (time of day, day of the week, month, season, or holiday, all can be relevant)
  • Where they are seeking it (region, country, city, or facility such as an airport if relevant)
  • Why (their goal or intent as far as can be determined)
  • How (where they started their journey, channels used, how long have they pursuing task)

Perfecting the performance: testing and learning

Leonard Bernstein conducts the New York Philharmonic in a Young People’s Concert. Image: Library of Congress


An orchestral performance is perfected through rehearsal. The performance realized is a byproduct of practice and improvisation.

Pick the correct parameters. With hundreds of parameters that could influence the optimal content orchestration, it is essential that teams not lock themselves into a few narrow ones. The learning will arise from determining which factors deliver the right experience and results in which circumstances. 

Content parameters can be of two kinds:

  1. Necessary characteristics tell us what values are required 
  2. Contingency characteristics indicate values to try to find which ones work best
Specifies in the orchestrationDetermines in the contentOutcome expected
Necessary characteristics

(tightly defined scenarios)
What values are required in a given situationWhich categorical version or option the customer getsThe right details to show to a given customer in a given situation
Contingency characteristics

(loosely defined scenarios)
What values are allowed in a given situationWhich versions could be presentedCandidate options to present to learn which most effectively matches the customer’s needs

The two approaches are not mutually exclusive. More complex orchestration (sometimes referred to as “multihop” queries) will involve a combination of both approaches.

Necessary characteristics reflect known and fixed attributes in the customer or business context that will affect the correct content to show. For example, if the customer has a particular characteristic, then a specific content value must be shown. The goal should be to test that the orchestration is working correctly – that the assumptions about the context are correct.  For example, there are no wrong assumptions or missing ones. This dimension is especially important for aspects that are fixed and non-negotiable. The content needs to adapt to these circumstances, not ignore them. 

Contingency characteristics reflect uncertain or changeable attributes relating to the customer’s context. For example, if the customer has had any one of several characteristics now or in the past, try showing any one of several available content values to see which works best given what’s known. The orchestration will prioritize variations randomly or in some ranked order based on what’s available to address the situation.

You can apply the approach to other situations involving uncertainty. When there are information gaps or delays, contingency characteristics can apply to business operations variables and to the content itself.  The goal of using contingency characteristics is to try different content versions to learn what’s most effective in various scenarios.  

Be clear on what content can influence. We have mostly looked at the customer’s context as an input into orchestration. Customers will vary widely in their goals, interests, abilities, and behaviors. A large part of orchestration concerns adapting content to the customer’s context. But how does orchestration impact the customer? In what ways might the customer’s context be the outcome of the content?

Consider how orchestration supports a shift in the customer’s context. Orchestration can’t change the fixed characteristics of the customer. It can sway ephemeral characteristics, especially content choices, such as whether the customer has requested further information.  And the content may guide customers toward a different context. 

Context shifting involves using content to meet customers where they are so they can get where they want to be. Much content exists to change the customer’s context by enabling them to resolve a problem or encouraging them to take action on something that will improve their situation. 

The orchestration of content needs to connect to immediate and downstream outcomes.  Testing orchestration entails looking at its effects on online content behavior and how it influences interactions with the business in other areas. Some of these interactions will happen offline.  

The task of business analytics is to connect orchestration outputs with customer outcomes. The migration of orchestration to an API layer should open more possibilities for insights and learning. 

– Michael Andrews

Categories
Content Engineering

Content models: lessons from LEGO

Content models can be either a friend or foe.  A content model can empower content to become a modular and flexible resource when developed appropriately. Models can liberate how you develop content and accelerate your momentum.

However, if the model isn’t developed correctly, it can become a barrier to gaining control over your content.  The model ends up being hard to understand, the cause of delay, and eventually an albatross preventing you from pivoting later.  

Models are supposed to be the solution, not the problem

Content modeling can be challenging. Those involved with content modeling have likely heard stories about teams wrestling with their content model because it was too complex and difficult to implement. 

Some common concerns about content modeling include:

  • There’s not enough time to develop a proper content model.
  • We don’t know all our requirements yet.
  • People don’t understand why we need a content model or how to develop one.
  • Most of the content model doesn’t seem relevant to an individual writer.
  • Someone else should figure this out.

These concerns reflect an expectation that a content model is “one big thing” that needs to be sorted out all at once in the correct way, what might be called the monolithic school of content modeling. 

Rather than treat content models as monolithic plans, it is more helpful to think of them as behaving like LEGO. They should support the configuration of content in multiple ways.

Yet, many content models are designed to be monolithic. They impose a rigid structure on authors and prevent organizations from addressing a range of needs.  They become the source of stress because how they are designed is brittle.

In an earlier post, I briefly explored how LEGO’s design supports modularity through what’s called “clutch power.” LEGO can teach us insights about bringing modularity to content models. Contrary to what some believe, content models don’t automatically make content modular, especially when they are missing clutch power. But it’s true that content models can enable modularity. The value of a content model depends on its implementation. 

A complex model won’t simplify content delivery.  Some folks mistakenly think that the content model can encapsulate complexity that can then be hidden from authors, freeing them from the burdens of details and effort. That’s true only to a point.  When the model gets too complex for authors to understand and when it can’t easily be changed to address new needs, its ability to manage details deteriorates. The model imposes its will on authors rather than responding to the desire of authors.

The trick is to make model-making a modular process instead of a top-down, “here’s the spec, take it or leave it” approach. 

Don’t pre-configure your content

LEGO pioneered the toy category of multipurpose bricks. But over time, they have promoted the sale of numerous single-purpose kits, such as one for a typewriter that will “bring a touch of nostalgia to your home office.”  For $249.99, buyers get the gratification of knowing exactly what they will assemble before opening the package.  But they never experience the freedom of creating their own construction.

LEGO typewriter, 2079 pieces (screenshot)

The single-purpose kits contain numerous single-purpose bricks. The kit allows you to build one thing only. Certain pieces, such as the typewriter keys and a black and red ink spool ribbon, aren’t useful for other applications. When the meme gets stale, the bricks are no longer useful.

One of the most persistent mistakes in design is building for today’s problems with no forethought as to how your solution will accommodate tomorrow’s needs. 

Many content models are designed to be single-purpose.  They reflect a design frozen in time when the model was conceived – often by an outside party such as a vendor or committee. Authors can’t change what they are permitted to create, so the model will often be vague and unhelpful to not overly constrain the author. They miss out on the power of true modularity. 

When you keep your model fluid, you can test and learn.

Make sure you can take apart your model (and change it)

Bent Flyvbjer and Dan Gardner recently published a book on the success or failure of large projects called How Big Things Get Done.  They contrast viewing projects as “one big thing” versus “many small things.”

Flyvbjer and Gardner cite LEGO as a model for how to approach large projects. They say: “Get a small thing, a basic building block. Combine it with another and another until you have what you need.”

The textbook example they cite of designing “one big thing” all at once is a nuclear power plant.  

Flyvbjer and Gardner comment: “You can’t build a nuclear power plant quickly, run it for a while, see what works and what doesn’t, then change the design to incorporate the lessons learned. It’s too expensive and dangerous.”

Building a content model can seem like designing a nuclear power plant: a big project with lots of details that can go wrong. Like designing a nuclear power plant, building a content model is not something most of us do very often.  Most large content models are developed in response to a major IT re-platforming.  Since re-platformings are infrequent, most folks don’t have experience building content models, and when a re-platform is scheduled, they are unprepared. 

Flyvbjer and Gardner note that “Lacking experimentation and experience, what you learn as you proceed is that the project is more difficult and costly than you expected.” They note an unwanted nuclear power plant costs billions to decommission.  Similarly, a content model developed all at once in a short period will have many weaknesses and limitations. Some content types will be too simple, others will be too complex.  Some will be unnecessary ultimately, while others will be overlooked.  In the rush to deliver a project, it can be difficult to even reflect on how various decisions impact the overall project.

You will make mistakes with your initial content model and will learn what best supports authoring or frontend interaction needs, enables agile management of core business messages and data, or connects to other IT systems.  And these requirements will change too. 

Make sure you can change your mind.  Don’t let your IT team or vendor tell you the model is set in stone.  While gratuitous changes should be avoided – they generate superfluous work for everyone – legitimate revisions to a content model should be expected.  Technical teams need to develop processes that can address the need to add or remove elements in content types, create new types, and split or merge types. 

Allow authors to construct a range of outputs from basic pieces

In LEGO, the color of bricks gives them expressive potential. Even bricks of the same size and shape can be combined to render an experience.  I recently visited an exhibit of LEGO creations at the National Building Museum in Washington, DC.  

Mona Lisa in LEGO by Warren Elsmore at National Building Museum

Much online content is developed iteratively.  Writers pull together some existing text, modify or update it, add some images, and publish it.  Publishing is often a mixture of combining something old with something new.  A good content model will facilitate that process of composition. It will allow authors to retrieve parts they need, develop parts they don’t yet have, and combine them into a content item that people read, watch, or listen to.

Content modeling is fundamentally about developing an editorial schema.  Elements in content types represent the points authors want to make to audiences. The content types represent larger themes. 

A content model developed without the input of authors isn’t going to be successful. Ideally, authors will directly participate in the content modeling process. Defining a content type does not require any software expertise, and at least one CMS has a UI that allows non-technical users to create content types. However, it remains true that modeling is not easy to understand quickly for those new to the activity. I’m encouraged by a few vendor initiatives that incorporate AI and visualization to generate a model that could be edited. But vendors need to do more work to empower end-users so they can take more control over the content model they must rely on to create content. 

Keep logic out of your model 

LEGO is most brilliant when users supply the creativity rather than rely on instructions from the LEGO corporation.  

The DIY ethos of LEGO is celebrated on the Rebrickable website, where LEGO enthusiasts swap concepts for LEGO creations.

Rebrickable website (screenshot)

Rebrickable illustrates the principle of decoupling content from its assembly. LEGO bricks have an independent existence from assembly instructions. Those who develop plans for assembling LEGO are not the same people who created the bricks.  Bricks can be reused to be assembled in many ways – including ways not foreseen. Rebrickable celebrates the flexibility of LEGO bricks.

A content model should not contain logic. Logic acts like glue that binds pieces together instead of allowing them to be configured in different ways. Decouple your content from any logic used to deliver that content. Logic in content models gets in the way: it complicates the model, making it difficult for authors (and even developers) to understand, and it makes the model less flexible.  

Many Web CMS and XML content models mix together the structure of types with templating or assembly logic. When the model includes assembly instructions, it has already predetermined how modules are supposed to fit together, therefore precluding other ways they might connect. Content models for headless CMS implementations, in contrast, define content structure in terms of fields that can be accessed by APIs that can assemble content in any manner.  

A brittle content model that can’t be modified is also expensive.  Many publishers are saddled with legacy content models that are difficult to change or migrate. They hate what they have but are afraid to decommission it because they are unsure how it was put together. They are unable to migrate off of a solution someone built for them long ago that doesn’t meet their needs.  This phenomenon is referred to as “lock-in.”

A flexible content model will focus only on the content, not how to assemble the content. It won’t embed “views” or navigation or other details that dictate how the content must be delivered. When the model is focused solely on the content, the content is truly modular and portable.  

Focus on basics and learn

LEGO encourages play.  People don’t worry about assembling LEGO bricks the wrong way because there are many valid ways to connect them. There’s always the possibility of changing your mind.

Each LEGO brick is simple.  But by trying combinations over time, the range of possibilities grows.  As Flyvbjer and Gardner say in their book, “Repetition is the genius of modularity; it enables experimentation.”

Start your model with something basic. An author biography content type would be a good candidate. You can use it anytime you need to provide a short profile of an author. It seems simple enough, too.  A name, a photo, and a paragraph description might be all you need.  Congratulations, you have created a reusable content type and are on the path toward content modularity.

Over time, you realize there are other nuances to consider. Your website also features presenters at live events.  Is a presenter biography different from an author biography?  Someone in IT suggests that the author bio can be prepopulated with information from the employee directory, which includes the employee’s job title. The HR department wants to run profiles of employees on their hiring blog and wonders if the employee profile would be like the author bio.  

As new requirements and requests emerge, you start to consider the variations and overlaps among them. You might try to consolidate variations into a single content type, extend a core content type to handle specialized needs or decide that consolidating everything into a single content type wouldn’t simplify things at all. Only through experimentation and learning will the correct choice become apparent.

It’s a good idea to document your decisions and share what you’ve learned, including with outside teams who might be curious about what you are working on – encourage them to steal your ideas.  You can revisit your rationale when you are faced with new requirements to evaluate.

Evolve your model

Embrace the LEGO mindset of starting small and thinking big.

The more experience your team gains with content modeling, the more comprehensive and capable your content model will become.

Flyvbjer and Gardner note: “Repetition also generates experience, making your performance better. This is called ‘positive learning.’”  They contrast it with negative learning, where “the more you learn, the more difficult and costly it gets.” When the model starts off complex, it gets too big to understand and manage. Teams may realize only one person ever understood how the model was put together, and that person moved to another company. 

Content modeling is about harnessing repeating patterns in content. The job of content modeling is to discern these patterns.

Content modeling should be a continuous activity. A content model isn’t a punch-list task that, once launched, is done. The model isn’t frozen. 

The parts that work right will be second nature, while the parts that are still rough will suggest refinement.

While content modeling expertise is still far from mainstream, there are growing opportunities to gain it. Teams don’t have to wait for a big re-platforming project.  Instead, they should start modeling with smaller projects. 

New low-cost and/or low-code tools are making it easier to adopt modular approaches to content. Options include static site generators (SSGs) like Jekyll or Hugo, open-source headless CMSs like Strapi, and no-code web-builders like Webflow. Don’t worry if these tools don’t match your eventual needs.  If you build a model that supports true modularity, it will be easy to migrate your content to a more sophisticated tool later. 

With smaller projects, it will be more evident that the content model can change as you learn new things and want to improve it. Like other dimensions of software in the SaaS era, content models can continuously be released with improvements.  Teams can enhance existing content types and add new ones gradually. 

The evolution of content models will also include fixing issues and improving performance, similar to the refactoring process in software.  You may find that some elements aren’t much used and can be removed. You can simplify or enrich your model.  Sometimes you’ll find similar types that can be folded into a single type – simplification through abstraction.  Other times, you’ll want to extend types to accommodate details that weren’t initially identified as being important. 

Continual iteration may seem messy and inefficient, and in some ways, it is. Those who approach a content model as one big thing wish the model to be born fully mature, timeless, and resistant to aging. In practice, content models require nurture and maintenance. Changing details in a large content model will seem less scary once you’ve had experience making changes on a smaller model.  And by working on them continuously, organizations learn how to make them better serve their needs. 

Regard changes to content models as a learning experience, not a fearful one.

Allow your model to scale up

Flyvbjer and Gardner note: “A block of Lego is a small thing, but by assembling more than nine thousand of them, you can build one of the biggest sets Lego makes, a scale model of the Colosseum in Rome. That’s modularity.”

My visit to the National Building Museum revealed how big a scale small LEGO bricks can build, as shown in this model of London’s St Pancras rail station. 

St Pancras rail station in LEGO by Warren Elsmore at National Building Museum

The magic of LEGO is that all bricks can connect to one another. The same isn’t necessarily true of content models. Many models reflect the idiosyncrasies of specific CMS platforms.  Each model becomes an isolated island that can’t be connected to easily.  That’s why content silos are a pervasive problem.

However, a modular content model focused on content and not assembly logic can easily connect to other modular models. 

A content model can enable content to scale.  But how does one scale the content model?

The good news is that the same process for developing a small model applies to developing a larger one.  When you embrace an iterative, learning-driven approach to modeling, scaling your model is much easier.  You understand how models work: which decisions deliver benefits and which ones can cause problems. You understand tradeoffs and can estimate the effort involved with alternatives.

One key to scaling a model is to play well with other models. In large organizations, different teams may be developing content models to support their respective web properties.  If these models are modular, they can connect.  Teams can share content.

It’s likely when there are two models, there will be overlaps in content types. Each model will have a content type defining a blog post, for example. Such situations offer an opportunity to rationalize the content type and standardize it across the teams. Eventually, separate teams may use a common content model supporting a single CMS. But until then, they can at least be using the same content type specifications.  They can learn from each other.

– Michael Andrews