Categories
Content Engineering

Landscape of Content Variation

Publishers understandably want to leverage what they’ve already produced when creating new content.  They need to decide how to best manage and deliver new content that’s related to — but different from — existing content. To create different versions of content, they have three options, which I will refer to as the template-based, compositional, and elastic approaches.

To understand how the three approaches differ, it is useful to consider a critical distinction: how content is expressed, as distinct from the details the content addresses.

When creating new content, publishers face a choice of what existing material to reuse and what to change. Should they change the expression of existing content, or the details of that content? The answer will depend on whether they are seeking to amplify an existing core message, or to extend the message to cover additional material. That core message straddles expression (how something is said) and details (specifics), which is one reason both aspects, the style and the substance, get lumped together into a generic idea of “content”. Telling an author to simply “change the content” does not indicate whether to change the connotation or the denotation of the content. Authors need more clarity on the goal of the change.

Content variation results from the interaction of the two dimensions:

  1. The expression of the content (how it is conveyed, whether in written prose or in another form such as video)
  2. The details (facts and concrete information)

Both expression and details can vary.  Publishers can change both the expression and the details of content, or they can focus on just one of the dimensions.

The interplay of content expression and details can explain a broad range of content variation. Content management professionals commonly explain content variation by referring to a more limited concept, content structure: the inclusion and arrangement of chunk-sized components or sections. Content structure does influence content variation in many cases, but not in all cases. Expressive variation can result when content is made up of different structural components. Variation in detail can take place within a common structural component. But rearranging content structure is not the only, or even necessarily the preferred, way to manage content variation. Much content lacks formal structure, even though the content follows distinguishable variations that are planned and managed.

The expression of content (for example, the wording used) can be either fixed (static, consistent or definitive) or fluid (changeable or adaptable).  A fixed expression is present when all content sounds alike, even if the particulars of the content are different.  As an example, a “form” email is a fixed expression, where the only variation is whether the email is addressed to Jack or to Jill.  When the expression of content is fluid,  in contrast, the same basic content can exist in many forms.  For example, an anecdote could be expressed as a written short story, as a dramatized video clip, or as a comic book.

Details in content can also be either fixed, or they can vary.  Some details are fixed, such as when all webpages include the same contact details.  Other content is entirely about the variation of the details.  For example, tables often look similar (their expression is fixed), though their details vary considerably.

Diagram showing how both expression and details in content can vary (revised). NB: elastic content can also fluidly address a diverse range of details, but its unique power comes from its ability to express the same fixed details in different ways.

Now let’s look at three approaches for varying content.  Only one relies on leveraging structures within content, while the other two exist without using structure.

Template-based content has a fixed expression.  Think of a form letter, where details are merged into a fixed body of text.  With template-based content, the details vary, and are frequently what’s most significant about the content.   Template-based content resembles a “mad libs” style of writing, where the basic sentence structure is already in place, and only certain blanks get filled in with information.  Much of the automated writing referred to as robo-journalism relies on templates.  The Associated Press will, for example, feed variables into a template to generate thousands of canned sports and financial earnings reports.  Needless to say, the rigid, fixed expression of template-based writing rates low on the creativity scale.  On the other hand, fixed expression is valuable when even subtle changes in wording might cause problems, such as in legal disclaimers.
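The form-letter pattern described above can be sketched in a few lines of Python. This is only an illustration, not how the Associated Press or any vendor builds its reports: the template sentence, the field names (`company`, `eps`, `quarter`) and the sample values are all invented for the example.

```python
from string import Template

# Fixed expression: the wording never changes between reports.
# The sentence and the field names are invented for this sketch.
EARNINGS_REPORT = Template(
    "$company reported earnings of $eps dollars per share for $quarter."
)


def render_report(details: dict) -> str:
    """Merge one set of details into the fixed template.

    substitute() raises KeyError if a required detail is missing,
    so an incomplete record never produces a half-filled sentence.
    """
    return EARNINGS_REPORT.substitute(details)


# Hypothetical feed of details; only these values vary.
feed = [
    {"company": "Acme Corp", "eps": "1.20", "quarter": "Q2"},
    {"company": "Globex", "eps": "0.85", "quarter": "Q2"},
]

for row in feed:
    print(render_report(row))
```

Every report produced this way reads identically apart from the merged values, which is exactly the strength and the limitation of a fixed expression.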

Compositional content relies on structural components. It is assembled from fixed components through a process known as transclusion, in which a component is included by reference rather than copied. These components may include informational variables, but most often do not. The expression of the content will vary according to which components are selected and included in the delivered content. Compositional content allows some degree of customization, to reflect variations in interests and the level of detail desired. Content composed from different components can offer both expressive variation and consistency to some degree, though there is ultimately an intrinsic tradeoff between those goals. Generally the biggest limitation of compositional content is that its range of variation is limited. Compositional variation increases complexity, so publishers tend to prioritize consistency over variation. Compositional content can’t generate novel variation, since it must rely on existing structures to create new variants.
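A minimal sketch of composition by reference might look like the following. The component library, its IDs and its wording are hypothetical, and real component content management systems handle this with far more machinery; the point is only that each deliverable is a selection from existing components.

```python
# Hypothetical component library: each component is a fixed, reusable chunk.
COMPONENTS = {
    "intro": "Our router sets up in minutes.",
    "specs": "It supports dual-band Wi-Fi.",
    "warranty": "It is covered by a two-year warranty.",
    "legal": "Terms and conditions apply.",
}


def compose(component_ids: list[str]) -> str:
    """Assemble a deliverable by pulling in components by their IDs."""
    return "\n".join(COMPONENTS[cid] for cid in component_ids)


# Two variants built from the same library, differing only in selection.
brief = compose(["intro", "legal"])
detailed = compose(["intro", "specs", "warranty", "legal"])

print(brief)
print("---")
print(detailed)
```

Note that `compose` can never return a sentence that is not already in the library, which illustrates why this approach cannot generate novel variation.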

Elastic content is content that can be expressed in a multitude of ways.  With elastic content, the core informational details stay constant, but how these details are expressed will change. None of the content is fixed, except for the details.  In fact, so much variation in expression is possible that publishers may not notice how they can reuse existing informational details in new contexts.  Elastic content can even morph in form, by changing media.

Authors tend to repeat facts in content they create.  They may want to keep mentioning the performance characteristic of a product, or an award that it has won. Such proof points may appeal to the rational mind, but don’t by themselves stimulate  much interest.  To engage the reader’s imagination, the author creates various stories and narratives that can illustrate or reinforce facts they want to convey.  Each narrative is a different expression, but the core facts stay constant.  Authors rely on this tactic frequently, but sometimes unconsciously.  They don’t track how many separate narratives draw on the same facts. They can’t tell if a story failed to engage audiences because its expression was dull, or because the factual premise accompanying the narrative had become tired, and needs changing.  When authors track these informational details with metadata, they can monitor which stories mention which facts, and are in a better position to understand the relationships between content details and expression.
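As a rough sketch of that kind of tracking, assume each story carries metadata listing the IDs of the facts it draws on. The story names and fact IDs below are invented; the index simply inverts the relationship so a publisher can see which narratives lean on which fact.

```python
from collections import defaultdict

# Hypothetical story records: each lists the IDs of the facts it draws on.
STORIES = {
    "customer-road-trip": ["battery-life-12h", "design-award"],
    "founder-interview": ["design-award"],
    "field-test-diary": ["battery-life-12h"],
}


def stories_by_fact(stories: dict) -> dict:
    """Invert story -> facts into fact -> stories."""
    index = defaultdict(list)
    for story_id, fact_ids in stories.items():
        for fact_id in fact_ids:
            index[fact_id].append(story_id)
    return dict(index)


index = stories_by_fact(STORIES)
for fact_id, story_ids in index.items():
    print(fact_id, "is used in", len(story_ids), "stories:", ", ".join(story_ids))
```

Joined with engagement figures per story, an index like this would let an author compare different expressions of the same fact, or spot a fact that underperforms in every telling.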

Machines can generate elastic content as well.   When information details are defined by metadata, machines can use the metadata to express the details in various ways.  Consider content indicating the location of a store or an event.  The same information, captured as a geo-coordinate value in metadata, can be expressed multiple ways.  It can be expressed as a text address, or as a map.  The information can also be augmented, by showing a photo of the location, or with a list of related venues that are close by.  The metadata allows the content to become versatile.
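The location example can be sketched as one metadata record rendered several ways. The `Venue` fields, the sample venues and the one-kilometre radius are assumptions made for illustration. The street address is stored alongside the coordinates here, since deriving an address from coordinates alone would require a geocoding service. The `geo:` form follows the standard URI scheme for geographic locations, and the distance uses the common haversine approximation.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Venue:
    # Hypothetical metadata record for a place.
    name: str
    street_address: str
    latitude: float
    longitude: float


def as_text(venue: Venue) -> str:
    """Expression 1: a plain text address."""
    return f"{venue.name}, {venue.street_address}"


def as_geo_uri(venue: Venue) -> str:
    """Expression 2: a geo: URI that a mapping application can open."""
    return f"geo:{venue.latitude},{venue.longitude}"


def distance_km(a: Venue, b: Venue) -> float:
    """Great-circle distance using the haversine formula."""
    earth_radius_km = 6371.0
    phi1, phi2 = radians(a.latitude), radians(b.latitude)
    dphi = phi2 - phi1
    dlmb = radians(b.longitude - a.longitude)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(h))


def nearby(origin: Venue, candidates: list, within_km: float) -> list:
    """Augmentation: names of other venues close to the origin."""
    return [
        c.name
        for c in candidates
        if c != origin and distance_km(origin, c) <= within_km
    ]


store = Venue("Main Street Store", "12 Main Street", 51.5007, -0.1246)
cafe = Venue("Corner Cafe", "3 Bridge Road", 51.5010, -0.1250)
depot = Venue("Overseas Depot", "1 Example Avenue", 48.8584, 2.2945)

print(as_text(store))
print(as_geo_uri(store))
print(nearby(store, [store, cafe, depot], within_km=1.0))
```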

As real-time information becomes more important in the workplace, individuals are discovering they want that information in different ways. Some people want spreadsheet-like tools they can use to process and refine the raw alphanumeric values. Others want data summarized in graphic dashboards. And a growing number want the numbers and facts translated into narrative reports that highlight, in sentences, what is significant about the information. Companies are now offering software that assesses information, contextualizes it, and writes narratives discussing the information. In contrast to the fill-in-the-blank feeding of values into a template, this content is not fixed. The content relies on metadata (rather than a blind feed as used in templates); the description changes according to the information involved. The details of the information influence how the software creates the narrative. By capturing key information as metadata, publishers have the ability to amplify how they express that information in content. Readers can choose the medium through which they access the information.
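The difference from a fill-in-the-blank template can be shown with a toy example in which the values themselves decide what gets said. This is a deliberately simple sketch, not how commercial narrative-generation software works: the thresholds (1% and 20%) and the phrasings are arbitrary assumptions.

```python
def describe_change(metric: str, previous: float, current: float) -> str:
    """Choose the wording according to what the numbers show."""
    if previous == 0:
        return f"{metric} stands at {current:g}, with no earlier figure to compare."

    change = (current - previous) * 100 / previous  # percent change
    size = f"{abs(change):.0f}%"

    # Assumed thresholds: under 1% is noise, 20% or more is a big move.
    if abs(change) < 1:
        return f"{metric} held steady at {current:g}."
    if change >= 20:
        return f"{metric} surged {size} to {current:g}."
    if change > 0:
        return f"{metric} edged up {size} to {current:g}."
    if change <= -20:
        return f"{metric} dropped sharply, down {size} to {current:g}."
    return f"{metric} slipped {size} to {current:g}."


print(describe_change("Weekly sales", 100, 130))
print(describe_change("Weekly sales", 100, 104))
print(describe_change("Weekly sales", 100, 100.5))
print(describe_change("Weekly sales", 200, 150))
```

Even at this scale, the sentence structure and emphasis depend on the details, whereas a template would say the same thing regardless of whether the change mattered.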

The next frontier in elastic content will be conversational interfaces, where natural language generation software will use informational details described with metadata to generate a range of expressive statements on topics. The success of conversational interfaces will depend on the ability of machines to break free from robotic, canned, template-based speech and move toward more spontaneous, natural-sounding language that adapts to the context.

Weighing Options

How can publishers leverage existing content, so they don’t have to start from scratch? They need to understand which dimensions of their content might change. They also need to be realistic about what future needs can be anticipated and planned for. Sometimes publishers over-estimate how much of their content will stay consistent, because they don’t anticipate the circumstantial need for variation.

Information details that don’t change often, or may be needed in the future, should be characterized with metadata.  In contrast, frequently changing and ephemeral details could be handled by a feed.

Standardized communications lend themselves to templates, while communications that require customization lend themselves to compositional approaches using different structural components.  Any approach that relies on a fixed expression of content can be rendered ineffective when the essence of the communication needs to change.

The most flexible and responsive content, with the greatest creative possibilities, is elastic content that draws on a well-described body of facts. Publishers will want to consider how they can reuse information and facts to compose new content that will engage audiences.

— Michael Andrews


Your Content Needs a Metadata Strategy

What’s your metadata strategy?  So few web publishers have an articulated metadata strategy that a skeptic may think I’ve made up the concept, and coined a new buzzword.  Yet almost a decade ago, Kristina Halvorson explicitly cited metadata strategy as one of “a number of content-related disciplines that deserve their own definition” in her seminal  A List Apart article, “The Discipline of Content Strategy”.   She also cites metadata strategy in her widely read book on content strategy.  It’s been nearly a decade since Kristina’s article, but the discipline of content strategy still hasn’t given metadata strategy the attention it deserves.

A content strategy, to have a sustained impact, needs a metadata strategy to back it up.  Without metadata strategy, content strategy can get stuck in a firefighting mode.  Many organizations keep making the same mistakes with their content, because they ask overwhelmed staff to track too many variables.  Metadata can liberate staff from checklists, by allowing IT systems to handle low level details that are important, but exhausting to deal with.  Staff may come and go, and their enthusiasm can wax and wane.  But metadata, like the Energizer bunny, keeps performing: it can keep the larger strategy on track. Metadata can deliver consistency to content operations, and can enhance how content is delivered to audiences.

A metadata strategy is a plan for how a publisher can leverage metadata to accomplish specific content goals.  It articulates what metadata publishers need for their content, how they will create that metadata, and most importantly, how both the publisher and audiences can utilize the metadata.  When metadata is an afterthought, publishers end up with content strategies that can’t be implemented, or are implemented poorly.

The Vaporware Problem: When you can’t implement your Plan

A content strategy may include many big ideas, but translating those ideas into practice can be the hardest part.  A strategy will be difficult to execute when its documentation and details are too much for operational teams to absorb and follow.  The group designing the content strategy may have done a thorough analysis of what’s needed.  They identified goals and metrics, modeled how content needs to fit together, and considered workflows and the editorial lifecycle.  But large content teams, especially when geographically distributed, can face difficulties implementing the strategy.  Documentation, emails and committees are unreliable ways to coordinate content on a large scale.  Instead, key decisions should be embedded into the tools the team uses wherever possible.  When their tools have encoded relevant decisions, teams can focus on accomplishing their goals, instead of following rules and checklists.

In the software industry, vaporware is a product concept that’s been announced, but not built. Plans that can’t be implemented are vaporware. Content strategies are sometimes conceived with limited consideration of how to implement them consistently.  When executing a content strategy, metadata is where the rubber hits the road.  It’s a key ingredient for turning plans into reality.  But first, publishers need to have the right metadata in place before they can use it to support their broader goals.

Effective large-scale content governance is impossible without effective metadata, especially administrative metadata. Without a metadata strategy, publishers tend to rely on what their existing content systems offer them, instead of asking first what they want from their systems. Your existing system may provide only some of the key metadata attributes you need to coordinate and manage your content. That metadata may be in a proprietary format, meaning it can’t be used by other systems. The default settings offered by your vendors’ products are unlikely to provide the coordination and flexibility required.

Consider all the important information about your content that needs to be supported with metadata.  You need to know details about the history of the content (when it was created, last revised, reused from elsewhere, or scheduled for removal), where the content came from (author, approvers, licensing rights for photos, or location information for video recordings), and goals for the content (intended audiences, themes, or channels).  Those are just some of the metadata attributes content systems can use to manage routine reporting, tracking, and routing tasks, so web teams can focus on tasks of higher value.
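To make this concrete, here is a small sketch of how a system might use such attributes for routine tracking and routing. The `ContentRecord` fields and the sample records are hypothetical and cover only a few of the attributes mentioned above; an actual content system would define its own schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ContentRecord:
    # Hypothetical administrative metadata for one content item.
    content_id: str
    author: str
    created: date
    remove_on: Optional[date] = None  # scheduled removal, if any
    audiences: list[str] = field(default_factory=list)
    channels: list[str] = field(default_factory=list)


def due_for_removal(records: list, today: date) -> list:
    """Tracking: IDs of items whose scheduled removal date has arrived."""
    return [
        r.content_id
        for r in records
        if r.remove_on is not None and r.remove_on <= today
    ]


def for_channel(records: list, channel: str) -> list:
    """Routing: IDs of items intended for a given channel."""
    return [r.content_id for r in records if channel in r.channels]


records = [
    ContentRecord("promo-spring", "J. Doe", date(2017, 3, 1),
                  remove_on=date(2017, 6, 1), channels=["web", "email"]),
    ContentRecord("about-us", "A. Smith", date(2016, 1, 15),
                  channels=["web"]),
]

print(due_for_removal(records, today=date(2017, 6, 2)))
print(for_channel(records, "email"))
```

Once details like these live in metadata, nobody needs to remember to pull the spring promotion in June; the system can report it.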

If you have grander visions for your content, such as making your content “intelligent”, then having a metadata strategy becomes even more important. Countless vendors are hawking products that claim to add AI to content. Just remember: metadata is what makes content intelligent, ready for applications (user decisions), algorithms (machine decisions) and analytics (assessment). Don’t buy new products without first having your own metadata strategy in place. Otherwise you’ll likely be stuck with the vendor’s proprietary vision and roadmap, instead of your own.

Lack of Strategy creates Stovepipe Systems

A different problem arises when a publisher tries to do many things with its content, but does so in a piecemeal manner.  Perhaps a big bold vision for a content strategy, embodied in a PowerPoint deck, gets tossed over to the IT department.  Various IT members consider what systems are needed to support different functionality.  Unless there is a metadata strategy in place, each system is likely to operate according to its own rules:

  • Content structuring relies on proprietary templates
  • Content management relies on proprietary CMS data fields
  • SEO relies on meta tags
  • Recommendations rely on page views and tags
  • Analytics rely on page titles and URLs
  • Digital assets rely on proprietary tags
  • Internal search uses keywords and not metadata
  • Navigation uses a CMS-defined custom taxonomy or folder structure
  • Screen interaction relies on custom JSON
  • Backend data relies on a custom data model.

Sadly such uncoordinated labeling of content is quite common.

Without a metadata strategy, each area of functionality is considered as a separate system.  IT staff then focus on systems integration: trying to get different systems to talk to each other.  In reality, they have a collection of stovepipe systems, where metadata descriptions aren’t shared across systems.  That’s because various systems use proprietary or custom metadata, instead of using common, standards-based metadata.  Stovepipe systems lack a shared language that allows interoperability.  Attributes that are defined by your CMS or other vendor system are hostage to that system.

Proprietary metadata is far less valuable than standards-based metadata.  Proprietary metadata can’t be shared easily with other systems and is hard or impossible to migrate if you change systems.  Proprietary metadata is a sunk cost that’s expensive to maintain, rather than being an investment that will have value for years to come. Unlike standards-based metadata, proprietary metadata is brittle — new requirements can mess up an existing integration configuration.
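One way to picture the difference is a crosswalk that translates each system’s private field names into a shared vocabulary. In this sketch the CMS and asset-library field names are invented, while the target terms (`dcterms:title`, `dcterms:creator`, `dcterms:modified`) come from the Dublin Core terms vocabulary. Any field without a mapping is dropped, which hints at what is lost when metadata stays proprietary.

```python
# Hypothetical proprietary field names mapped to shared Dublin Core terms.
CMS_CROSSWALK = {
    "headline": "dcterms:title",
    "byline": "dcterms:creator",
    "last_edit": "dcterms:modified",
}
DAM_CROSSWALK = {
    "asset_name": "dcterms:title",
    "photographer": "dcterms:creator",
    "updated_ts": "dcterms:modified",
}


def to_shared(record: dict, crosswalk: dict) -> dict:
    """Translate a system-specific record into the shared vocabulary.

    Fields with no mapping are omitted from the result.
    """
    return {crosswalk[k]: v for k, v in record.items() if k in crosswalk}


cms_record = {"headline": "Spring Sale", "byline": "J. Doe",
              "last_edit": "2017-03-01", "layout_variant": "B"}
dam_record = {"asset_name": "Spring Sale banner", "photographer": "A. Smith",
              "updated_ts": "2017-02-20"}

shared_cms = to_shared(cms_record, CMS_CROSSWALK)
shared_dam = to_shared(dam_record, DAM_CROSSWALK)

# Both systems can now be queried with the same term.
print(shared_cms["dcterms:title"], "|", shared_dam["dcterms:title"])
```

A crosswalk is still integration work that has to be maintained. Adopting the shared terms natively in each system removes the translation step altogether.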

Metadata standards are like an operating system for your content.  They allow content to be used, managed and tracked across different applications.  Metadata standards create an ecosystem for content.  Metadata strategy asks: What kind of ecosystem do you want, and how are you going to develop it, so that your content is ready for any task?

Who is doing Metadata Strategy right?

Let’s look at how two well-known organizations are doing metadata strategy. One example is current and newsworthy, while the other has a long backstory.

eBay

eBay decided that the proprietary metadata they used in their content wasn’t working, as it was preventing them from leveraging metadata to deliver better experiences for their customers. They embarked on a major program called the “Structured Data Initiative”, migrating their content to metadata based on schema.org, the open web vocabulary developed through a W3C community group. Wall Street analysts have been following eBay’s metadata strategy closely over the past year, as it is expected to improve the profitability of the ecommerce giant. The adoption of metadata standards has allowed for a “more personal and discovery-based buying experience with highly tailored choices and unique selection”, according to eBay. eBay is leveraging the metadata to work with new AI technologies to deliver a personalized homepage to each of its customers. It is also leveraging the metadata in its conversational commerce product, the eBay ShopBot, which connects with Facebook Messenger. eBay’s experience shows that a company shouldn’t try to adopt AI without first having a metadata strategy.

eBay’s strategy for structured data (metadata). Screenshot via eBay

Significantly, eBay’s metadata strategy adopts the schema.org vocabulary for their internal content management, in addition to using it for search engine consumers such as Google and Bing. Plenty of publishers use schema.org for search engine purposes, but few have taken the next step like eBay to use it as the basis of their content operations. eBay is also well positioned to take advantage of any new third party services that can consume their metadata.
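For readers who haven’t seen schema.org metadata, the sketch below builds a small product description and serializes it as JSON-LD, one of the formats search engines accept. It is not a representation of eBay’s internal data model: the product values are invented, and only a handful of schema.org properties (`name`, `sku`, `offers`, `price`, `priceCurrency`) are shown.

```python
import json


def product_jsonld(name: str, sku: str, price: str, currency: str) -> dict:
    """Describe a product using schema.org types and properties."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
    }


# Invented sample values.
description = product_jsonld("Vintage Film Camera", "CAM-001", "149.00", "USD")

# The same structure can be embedded in a page for search engines
# or consumed directly by internal systems.
print(json.dumps(description, indent=2))
```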

Australian Government

From the earliest days of online content, the Australian government has been concerned with how metadata can improve online content availability. The Australian government isn’t a single publisher, but comprises a federation of many government websites run by different government organizations. The governance challenges are enormous. Fortunately, metadata standards can help coordinate diverse activity. The AGLS metadata standard has been in use for nearly 20 years to classify services provided by different organizations within the Australian government.

The AGLS metadata strategy is unique in a couple of ways. First, it adopts an existing standard and builds upon it. The government adopted the widely used Dublin Core metadata standard, identified where it lacked attributes they needed, and added elements specific to those needs (for example, indicating the “jurisdiction” that the content relates to). Starting from an existing standard, they extended it and got the W3C to recognize their extension.
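The extend-a-standard pattern can be sketched as follows. The base element names (`title`, `creator`, `date`) and the `DCTERMS.` prefix follow the common convention for embedding Dublin Core in HTML meta tags. The `EXT.` prefix is a placeholder invented for this example and is not the actual AGLS namespace; the real element names should be taken from the AGLS standard itself.

```python
from html import escape

BASE_ELEMENTS = {"title", "creator", "date"}  # from Dublin Core
EXTENSION_ELEMENTS = {"jurisdiction"}         # added by the extension


def meta_tags(description: dict) -> list:
    """Render a description as HTML meta tags, keeping base and
    extension elements in separate, clearly labeled namespaces."""
    tags = []
    for element, value in description.items():
        if element in BASE_ELEMENTS:
            prefix = "DCTERMS"
        elif element in EXTENSION_ELEMENTS:
            prefix = "EXT"  # hypothetical prefix, for illustration only
        else:
            raise ValueError(f"Unknown element: {element}")
        tags.append(f'<meta name="{prefix}.{element}" content="{escape(value)}">')
    return tags


# Invented sample description.
page = {
    "title": "Recreational fishing licences",
    "creator": "Department of Example",
    "jurisdiction": "Victoria",
}

for tag in meta_tags(page):
    print(tag)
```

Because the base elements are untouched, any tool that understands Dublin Core can still read the description and simply ignore the additions it doesn’t recognize.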

Second, the AGLS strategy addresses implementation at different levels in different ways. The metadata standard allows different publishers to describe their content consistently. It ensures all published content is interoperable. Individual publishers, such as the state government of Victoria, have their own government website principles and requirements, but these mandate the use of the AGLS metadata standard. The common standard has also promoted the availability of tools to implement the standard. For example, Drupal, which is widely used for government websites in Australia, has a plugin that provides support for adding the metadata to content. Currently, over 700 sites use the plugin. But significantly, because AGLS is an open standard, it can work with any CMS, not just Drupal. I’ve also seen a plugin for Joomla.

Australia’s example shows how content metadata isn’t an afterthought, but is a core part of content publishing.  A well-considered metadata strategy can provide benefits for many years.  Given its long history, AGLS is sure to continue to evolve to address new requirements.

Strategy focuses on the Value Metadata can offer

Occasionally, I encounter someone who warns of the “dangers” of “too much” metadata. When I try to uncover the source of the perceived concern, I learn that the person thinks about metadata as a labor-intensive activity. They imagine they need to hand-create the metadata serially. They think that metadata exists so they can hunt and search for specific documents. This sort of thinking is dated but still quite common. It reflects how librarians and database administrators approached metadata in the past, as a tedious form of record keeping. The purpose of metadata has evolved far beyond record keeping. Metadata is no longer primarily about “findability,” powered by clicking labels and typing within form fields. It is now more about “discovery”: revealing relevant information through automation. Leveraging metadata depends on understanding the range of uses for it.

When someone complains about too much metadata, it also signals to me that a metadata strategy is missing.  In many organizations, metadata is relegated to being an electronic checklist, instead of positioned as a valuable tool.   When that’s the case, metadata can seem overwhelming.  Organizations can have too much metadata when:

  • Too much of their metadata is incompatible, because different systems define content in different ways
  • Too much metadata is used for a single purpose, instead of serving multiple purposes.

Siloed thinking about metadata results in stovepipe systems. New metadata fields are created to address narrow needs, such as tracking or locating items for specific purposes. Fields proliferate across various systems. And everyone is confused about how anything relates to anything else.

Strategic thinking about metadata considers how metadata can serve all the needs of the publisher, not just the needs of an individual team member or role. When teams work together to develop requirements, they can discuss what metadata is useful for different purposes. They can identify how a single metadata item can be used in different contexts. If the metadata describes when an item was last updated, for example, how might it be used by content creators, by the analytics team, by the UX design team, and by the product manager?
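As a sketch of one metadata item doing several jobs, the functions below all read the same last-updated date. The function names, the 180-day review window and the reporting buckets are assumptions chosen for illustration; each team would set its own rules.

```python
from datetime import date


def reader_label(last_updated: date, today: date) -> str:
    """UX design: a freshness label shown to readers."""
    days = (today - last_updated).days
    if days <= 0:
        return "Updated today"
    if days == 1:
        return "Updated yesterday"
    return f"Updated {days} days ago"


def needs_editorial_review(last_updated: date, today: date,
                           max_age_days: int = 180) -> bool:
    """Content creators: flag items that have gone too long without review."""
    return (today - last_updated).days > max_age_days


def analytics_bucket(last_updated: date, today: date) -> str:
    """Analytics: group items by age when comparing performance."""
    days = (today - last_updated).days
    if days <= 30:
        return "0-30 days"
    if days <= 180:
        return "31-180 days"
    return "181+ days"


last_updated = date(2017, 1, 1)  # a single value, stored once
today = date(2017, 12, 1)

print(reader_label(last_updated, today))
print(needs_editorial_review(last_updated, today))
print(analytics_bucket(last_updated, today))
```

None of these uses requires a new field. The productivity comes from agreeing on one well-defined attribute that every team can rely on.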

Publishers should ask themselves how they can do more for their customers by using metadata.  They need to think about the productivity of their metadata: making specific metadata descriptions do more things that can add value to the content.  And they need a strategy to make that happen.

— Michael Andrews