Author: Michael Andrews

Wine, Content, and Domain Models

Post author By Michael Andrews
Post date October 7, 2014

Suppose your organization wants to become the preeminent source of information about a topic. It aims to give audiences the ability to look at any dimension of a topic they might be interested in. How would you offer this?

To deliver informationally rich content, numerous content items need to be associated to one another. Content needs to be modular, with components that work together. But how do these things relate to each other? Where does one start?

Content models define how units of content should interact. Content modelling can be difficult to grasp and practice, partly because it is not a single uniform method. It encompasses a spectrum of related approaches that can be adapted to different needs.

People sometimes start to model their content before they know all the content they really need. They focus on what content has been already created, and not explore what content is not yet available that might be of interests to users.

Content models are often more robust when they are backed by a domain model. A domain model enables content designers to untangle a messy topic and explore and define requirements and design solutions.

The role of content modelling

A content model is the end goal of a domain model. Rachel Lovinger has been instrumental in developing and advocating the practice of content modelling, so I will rely on her definition. She states: “A content model documents all the different types of content you will have for a given project. It contains detailed definitions of each content type’s elements and their relationships to each other.” She recommends using content models to bridge perspectives on a team.

“A content model helps clarify requirements and encourages collaboration between the designers, the developers creating the CMS, and the content creators.” — Rachel Lovinger

In addition to facilitating project delivery, content models improve how content is delivered to audiences. Content models can enable personalization, adaptive content, and content APIs. Cleve Gibbon, a collaborator of Rachel Lovinger in evangelizing content models, notes: “Great APIs are founded upon solid models. So if you’re building a Content API, be sure to create a content model FIRST that conveys the required level of structure and meaning.”

The spectrum of content modelling

Models can represent different dimensions of a topic: either conceptual, or formal and structural. Content models can indicate how to assemble content components. But first one needs to know how solidly your content types are defined.

On one end of the spectrum, you may have well defined, fixed content. In such cases, one can develop what Deane Barker calls a relational content model. He defines it as “the concept of how different, separately-managed pieces of content relate to each other. (This is distinct from ‘discrete content modeling,’ which is how you structure a single piece of content.” He explains the goal as “the idea of taking multiple discrete content objects (articles, sections, issues) and ‘rolling them up’ into a more complex content object (publication).”

On the other end of the spectrum, you may have fluid content, where the exact requirements are still emerging and many different hubs of content are possible. In such cases, a domain focused, ontology based form of modelling can be helpful. This approach has been used by the BBC for several large projects. Mike Atherton emphasizes the importance of the domain of the topic in content models: “A content model maps our subject domain, not our website structure.” He advises: “Concentrate on modelling real (physical and metaphysical) things not web pages.”

One way to consider the differences in a content model and a domain model is the metadata they emphasize. Rachel Lovinger states: “The Content Model is primarily concerned with structural metadata, while the Domain Model is largely concerned with descriptive metadata.”

A domain model and content model are complementary. A domain model helps you describe things that will be represented by content, while the content model helps you structure the content. Using both allows you to understand the relationship of a real world entity with a content entity.

A domain model is a useful place to start when content does not yet exist, or one is looking for a fresh redesign of content. Domain models may be considered as the prequel to content models. By focusing on entities in the real world, and the relationships between these entities, one can see opportunities to develop content associated with these entities, and what elements would be needed for that content. The correspondence of domain entity type, and content type, is illustrated in the table.

The relationship between a domain model and content model

Domain models in the real world: Italian wine

Domain models can clarify one’s understanding of a topic, and offer insights into how different items of information relate to each other. Domain modelling emerged as strategy in software development to bridge analysis and design of complex business domains by using a shared verbal and visual language between experts, endusers and developers. Domain models can be especially useful for complicated and messy topics. They would seem perfect for understanding Italian wine.

When you live in Italy, as I do, understanding Italian wine is a practical problem. Wine is ubiquitous, but understanding Italian wine is not self-evident. Walk into an Italian wine store and you are confronted with walls of bottles whose contents are largely unrecognizable. It’s not that all wine is difficult to understand. When I lived in New Zealand, I had a good idea what different wines were about. It’s Italian wine that is the challenge.

The famous wine critic Hugh Johnson once wrote: “the already bewildering complexity of Italian wines has become tangled enough to drive a critic to drink.” Italian wine is particularly hard to understand because of its heterogeneity. Even the imposition of standardized nomenclature to designate where a wine is from results in a bewildering array of non-standard implementations of these standards. Idiosyncratic traditions, politics, and rogue approaches mean that wines are described in great detail, but in richly differing ways.

At the core of why Italian wine is difficult to decipher is its product architecture: how specific wines are labelled. Consumers need an easy way to know the basic characteristics of a wine based on its label. Do consumers think of wine in terms of where it’s from (Burgundy) or what grapes it is made from (Chardonnay)?[1] High volume wine producers have attempted to solve the product architecture problem by promoting brand awareness of a grape variety or a region. What happens when consumers are not familiar with either the origin name or the grape name?

Unfortunately, Italian wine labels are uncharacteristically difficult to decipher. Italian labels will show the producer + (grape variety and/or geographic indication) + year. That seems reasonable enough, until the consumer realizes that the only items on the label they might recognize are the digits of the year. Even if they have familiarity with another proper name on the label, that is not sufficient to make a selection decision.

The most significant piece of information about the kind of wine is indicated by the grape variety and/or the geographic indication (a regional designation similar to an appellation in France). Between these two items, there are nearly 1000 different varietals and zones that indicate the basic composition of the wine. [2] To get a sense of how good the wine is, the most reliable information is the producer, and the year of vintage. Yet there are many thousands of wine producers in Italy of varying abilities, and the correlation of product quality to year of vintage is very specific to the variety of wine and where it was produced.

The complexity of Italian wine would seem tailored for digital content. But existing digital-only information sources on the web tend to be shallow — both in terms of their range of attributes, and their selective coverage.

Good information about wines, producers, and regions are available from several well known printed guides, such as those by L’Espresso, Gambero Rosso, Touring Club, Bibenda, and Slow Food. Despite the editorial quality of the content, the information is not as usable as it could be. Depending on the specific organization of the book, the information is stovepiped in one way or another. The editors of each guide assumes a fixed path of entry that generally leads to a producer profile. Users are expected to think like the editors to uncover information of interest to them.

In some cases there are iPad versions of these printed guides, but they don’t feel natively digital, and require lots of tapping to move around from screen to screen. They are less usable than the print version, because they are slower to move through, and one’s orientation can get lost when hopping between screens. The content, while structured editorially, is not structured digitally with digital metadata. There is no ability to move laterally through the content: navigation is hierarchical. Unfortunately shovelware that ports a printed product and dumps it into a tablet format is too common, due to the false promises embedded in Adobe InDesign.

What users need is not simply a catalog of items, but a way to make sense of the bigger picture, in addition to exploring the detail. The heavy focus on profiles means that the user doesn’t see easily how these items relate to other things. They also miss seeing collective behaviors of similar items, which is possible when one digitally aggregates items sharing the same metadata. Thinking through these relationships and behaviors is one benefit of domain modelling.

Understanding the domain

How do people think about a subject? Mike Atherton suggests: “Experts map the world, users mark points of interest.” It helps to know how experts think about a topic like wine, and then during design, figure out what more typical users consider high priority goals. What aspects of wine do people consider significant? How might different aspects be pulled together into interesting items of content?

The topic of wine is distinctive because many people want to become experts, in contrast to other products. Getting information about the product is rarely a perfunctory task, but a connoisseurial pastime. Some people want to develop a broad knowledge about all styles of wine, while other people want to have a deep knowledge about a few specific producers or product areas, perhaps tied to places they go on holiday. Many things people might be interested in are non-obvious. For example, soil characteristics can influence how a grape variety tastes. Others may be interested in the environmental credentials of a producer.

How to break things down so they can be managed

The most important task when developing a domain model is to identify appropriate entities. An entity is a thing, either tangible or conceptual, with a distinct identity. It’s not the same as an existing item of content — the content may not exist yet. Entities, to use the words of Cleve Gibbon, are “first class citizens in the business domain” — they are the actors in the drama on the stage.

Entities have attributes — characteristics. Attributes do not necessarily become a field in the content, but they often do. That decision needs to be made when the content is designed. Taste is certainly an attribute of wine, but is not necessarily a field in a description of a wine.

Once entities have been identified, it is necessary to determine where to put attributes, and whether to break entities into smaller units. Often, one discovers intermediate zones that straddle two entities. The horticultural characteristics of the vineyard reflect the interaction of the producer and the wine produced. The interplay between region and varietal defines the vintage for a given year. These intermediate areas may not deserve to be entities themselves, but one should consider how to make sure their role remains visible.

What a domain model for Italian wine looks like

It is helpful to first consider the relationships between entities, then examine the attributes associated with each entity.

When looking at entities, two things are important. First, how many instances are there for each entity type? The entity map shows that most of the entities, there are hundreds or even thousands of instances. This large number suggests that establishing meaningful relationships between entities will be important if users are to be successful navigating through such a large volume of content. Second, what is be essential character of relationships between entities? We want to know how many connections there are between entities: the more connections to other entities, the richer the potential interaction of information. We also want to know if the relationship between entities is a one-to-one relationship, a one-to-many relationship, or a many-to-many relationship. The “crow’s feet” in our entity map indicates numerous many-to-many relationships. That may make the design of content a bit more challenging, but it also indicates many interesting connections. Our content is a valuable resource when it’s not easy to see these connections in one’s head.

Relationships between different entity types associated with the domain of Italian wine.

Next, explore the attributes associated with each entity. The goal is to identify and associate attributes of entities. Each entity has a number of attributes. Some will be short fields, others will involve longer text descriptions. There is no right number of attributes, provided all attributes are meaningful. The number of attributes to implement in design will depend on both business and design decisions. There will be a business decision concerning the cost of acquiring the information related to the attribute, and the usefulness to consumers of that information. There will also be a design decision relating to which attributes to expose to which audiences.

Typical attributes of each entity type relating to domain of Italian wine

Our model shows attributes that are commonly associated with the domain of Italian wine. For example, it can be interesting to know the number of bottles produced of a wine. That can indicate how widely available the wine is to buy, or perhaps its scarcity (that one needs to reserve purchase). Some wine guides will indicate the total number of bottles according to producer, while others will indicate total number of bottles by label. This difference means that one can answer different questions, such as who is the largest producer within a geographic indication zone, or who is the largest producer of a specific kind of wine. Ideally, one would like data at both the producer and product levels, but that may not be easy to obtain for all producers.

Lessons from domain modelling

Even though domain modelling attempts to represent the real world, reality is often less orderly than we would like it to be.

Not everything can be easily expressed as a regularized attribute. Audiences will want to know: What does the wine taste like? It would be wonderful to provide a reliable, easy-to-understand way to explain taste that allows easy comparison between wines, zones, and producers. Sadly, taste is — surprise — a bit subjective. Different experts will say different things about the same wine, even when they agree on an overall judgment. Terminology is not standard either. The same words can mean different things. Critics may use the word “cherry” to describe a taste as “spicy black cherry” or as “cherry rhubarb.” There is no controlled vocabulary for wine, no limited set of descriptors with precisely defined and agreed meanings.

Example of geographic designation zones within a single region. Screenshot from a certification body website. — Map of geographic designation zones within a single region. Screenshot from a certification body website.

By their nature, models simplify reality. The geographic indication signifies where a wine is made, and the criteria by which it is made. Whereas most geographical entities are based on either political administrative geography or physical geography, geographic indications exist outside these frameworks. A geographic indication can straddle two administrative regions. It can exist in two different, discontinuous locations. Some geographic indication zones have subzones. Wine producers also can behave in complex ways. Sometimes a wine producer is a brand “house” that has vineyards in several locations, or a consortium that sources from different vineyards. The informational details associated with these exceptions may not be important to users, and can add design complexity.

The identity of items can be constructed in several ways. One needs to be able to distinguish one entity from others belonging to the same entity type — items need to be uniquely identified. Despite the challenges of deciphering Italian wine, specific entities fortunately are identified with meaningful, human readable names, rather than numeric product codes. The domain model can use existing identifiers, which are based on several approaches:

Collectively defined names (the names of regions, geographic indications, and grape varieties), though some producers use alternate names for grape varieties.
Self described (the name of producer), though sometimes producers choose to use both a house and proprietor name
Inherited identity (the environmental profile for a producer)
Names composed of compound attributes , such as dry sparkling rosato as a wine category entity.

Thinking about design

The domain model can support early design discussions. Many questions that are interesting to audiences will span two or more different entities. For example:

What year produced the best wine from a region?
What geographic indication commands the highest average prices?
What grape varieties produce the most wine?
What wines for a given year and geographic designation are ready to drink?

Some answers require computations of structured data. Questions of interest to audiences need to be translated into content types that will be represented in the content model.

In addition to supporting interesting exploration, the design needs to support common tasks. The domain model helps to identify information available to support common tasks. Some common points of entry audiences will seek when exploring wine include:

By rating
By price
By category
By variety

Users often focus on one specific criteria when starting the process of seeking information. In some cases, these are entities, in others, these are attributes. Considering task starting points can help identify potential groupings of content elements. Depending on the depth of content, these groupings may not be manageable for users without providing additional parameters to narrow the pool of candidate content. The most salient criteria is not the only factor that’s important to the user.

In contrast to starting points, another perspective is to consider the end goal of the task. Examining the end goal, the content designer can consider the orientation of different users. Users of wine information may be:

Bottle centric — interested in the characteristics of specific bottles of wine
Producer centric — interested in the story of the producer, perhaps with an intention of visiting them
Food centric — mostly interested in wine styles as a complement to food dishes.

Domain depth and domain scope

The depth of a domain reflects both the number of attributes for an entity type, and quantity of items. Both aspects can impact the design. The quantity of items will influence content types that presents lists and links. The number of attributes will impact content type structures for content items.

Content designers decide how much of the domain model to present to users. A fixed content type may show all attributes as part of in content type. With a flexible content type, attributes may be optionally available, or have serval variations. Designers may choose progressive disclosure of content that hides details, which are revealed only when wanted. Or they may implement an adaptive approach, where different variations of content types are shown depending on the interests of an audience segment, or device formats.

The other aspect of the domain model, thus far unmentioned, is how it might connect with other domains. The domain model offers the possibility of enlarging the scope addressed by considering related domains. Different variations of content may draw on common content, while including different content as well (see diagram). Three different apps may share common core content. But they provide different functionality depending on their focus (touring vineyards, pairing wine with food, or knowledge enhancement of wine). The domain model can also be used to guide the planning of releases of content and functionality.

Relationship between the depth of a domain, and its scope. Content can be deep, covering many attributes. And content can we wide, connecting with other domains.

Relating entities: Comparisons to other approaches

Domain modelling is not the only approach to sorting through complex content. Before closing this discussion, it is worth talking about two other well known approaches that look and behavior similarly, but have some differences.

Faceted search, an approach popular in library science and information architecture, allows users to locate specific content by filtering on facets. Facets can be attributes or entities. The idea is that users can locate content that has the qualities of A & B & C. Faceted search is a popular technique, common on ecommerce sites, and is often helpful. The utility of the technique rests on several assumptions. First, faceted search assumes users know the two to four most important criteria, and will get a manageable set of results. If the set of results is large, users generally take a satisficing approach, happy with the first result encountered that is minimally acceptable. Second, faceted search presumes that each facet is independent of each other, which in the case of wine isn’t true. It is possible to get null sets if facets aren’t deep. While faceted search has been implemented on some wine ecommerce sites, it is not an effective approach for helping users discover content they might be interested in but not know about, and tends to focus on a limited range of aspects.

Linked data is an approach to modelling content that has close associations to domain modelling, thanks to the BBC’s integration of the two approaches. To simplify, linked data allows users to find content with characteristic A that has B, which has C. Organizing content using a linked data approach has both benefits and drawbacks. One drawback is that queries can be path dependent. Whether results appear promising or discouraging depends on how you construct the query. Linked data queries are generally more open ended than predefined structured queries that answer fixed questions with predictable sets of results. A bigger concern is that linked data treats all aspects of an entity as other entities, and each entity gets its own page. But not all attributes are meaningful entities — things worthy of their own content destination. On the positive side, linked data is good for what-else questions. One can link outside of a domain to other domains, such as to geophysical data.

Model behavior

Models aren’t reality, according to the cliche. Domain models may appear esoteric to some people, given that they aren’t actually something implemented directly, but are an input to other deliverables. To get buy-in for domain models, it may be best to use it as a discussion document, and note that it will evolve into the content model. While it l lacks the appeal of being code-ready, a domain model can play an important role on a project. It can uncover hidden requirements and opportunities, help align different stakeholders around a common vision, and accelerate the design process.

— Michael Andrews

Chardonnay grapes originated in Burgundy. Even though most people associate Burgundy with red wine, there are also white Burgundy wines made from Chardonnay. ↩
A canonical list of varietals and zones is available from the databases of the intergovernmental wine organization OIV http://www.oiv.int/oiv/info/enbasededonneesIG ↩

Tags content model, domain model

Content Efficiency

Four approaches to content reuse

Post author By Michael Andrews
Post date September 12, 2014
1 Comment on Four approaches to content reuse

How organizations approach reusing content impacts their publishing efficiency, and their ability to serve audience needs. Four distinct approaches to content reuse exist, each of which focuses on different goals. Due to specialization in the content profession, content professionals may be familiar with only some content reuse approaches. To support broader organizational objectives effectively, content strategists should become familiar with all four alternative approaches to reuse, since each offers each unique benefits.

Why content reuse matters

While content reuse is a topic of active discussion in the content profession, no one definition for content reuse adequately captures its various meanings. In practice, there are four distinct types of content reuse:

Ad hoc reuse of assets
The planned reuse of content components
Enabling reuse of content across channels
Selective reuse through adaptive content

Nearly everyone agrees reusing content is a good thing. Content professionals sometimes invoke the phrase “single sourcing” to suggest the notion that one “source” can serve all needs, both internally and for audiences. What is being reused, exactly? Is the source a database? A file? A finished piece of content?

Many different specialities work with content. Each specialty is working to solve an aspect of reuse and will tend to promote its approach as a solution the core problems associated with poor content reuse. But specialists are not always aware of the larger picture needs of complex organizations or multidimensional audiences. Solution advocacy can sometimes create own silo problems!

When discussing content reuse, it is important to distinguish between reusing as-is content, recycling (repurposing) content, and providing on-demand, customized content. Is the source granular or whole? For example, is the source a whole video recording, or a collection of video snippets? Is the source a document, or a library of documents?

Different reuse approaches reflect different goals. All are valid, but none are complete. At present, no one approach will address all needs faced by enterprise scale publishers.

Specifying content

The term content is abstract and fuzzy, open to various interpretations. Content may be raw or finish, partial or complete. We need to understand different levels or states of content. Fortunately, we can draw on insights from library science to distinguish different levels of specificity by using a concept called the FRBR. [1]

The FRBR model provides levels to analyze content, divided according to how explicit the description of the content is. The key levels of concern to us are work, expression, and manifestation. If the content item is a book, it might be described as follows:

Work (Bible)
Expression (King James translation)
Manifestation (1994 Oxford University Press edition)

The work is the raw content, the underlying intellectual property. It might be a class of content such as a novel or symphony. It describes the content or asset.

The expression identifies a version of the content.

The manifestation specifies the content’s specific revision or a rendition, for example, the edition, format, mode of access, or date of publication.

The table below illustrates the hierarchy, with rough equivalents in content strategy.

FRBR Concept	Level of Identification	Rough Equivalent in Content Strategy	Example
Work	Described by a Title	Assets relating to a topic	Long, unedited video file
Expression	Uniquely ID’ed	Collection of content components relating to a topic	Tagged video clip highlights
Manifestation	Versioned	Finished content about topic	Linked series of transcript-captioned video segments

Different levels of content reflect different frequencies of change and target audiences. Assets don’t change; they are repurposed. Components can be revised, but there will only be one version of a component at a given time. Content composites seen by audiences may come in multiple versions, which can exist simultaneously.

Rather than describe everything as content, it is more helpful to separate different notions:

content (items audiences consume)
content components (recurring elements incorporated in audience facing content)
assets (intellectual property used to create finished content)

Delivering equivalent content to different platforms: COPE

As content channels have multiplied, publishers have needed to make their content available to different devices and different kinds of content customers. The approach known as COPE (Create Once, Publish Everywhere) addresses the issue. Rather than recreate multiple versions of the same content for different devices or platforms, publishers can use standards and structure to provide the same content through an API that can be accessed by a variety of applications. The same content is used in multiple contexts, often distributed simultaneously. Since reuse can imply using the same content at different points in time, the notion of “content once” being published everywhere may be better thought of as multi-use content distribution.

One goal of COPE is the wide dissemination of content across different channels. COPE started as a technology solution to address point-of-failure concerns when publishing to multiple parties from a single database of content. Over time, it has evolved into an approach to syndicate content to other parties.

What COPE does

In the COPE approach, a central content database provides multiple versions of the same content to different people and devices. The original idea didn’t foresee revisions to the content (hence: create once), and also presumed that core essence within content items pushed to different endpoints would be essentially the same. Different technical packages (formats and associated metadata) allow endusers to consume the version of content they want. Technical endusers (content partners and third party app developers) are able to choose which content items they want, but generally lack the ability to request specific components of content from within an item. The API disseminates a large, structured chunk, but not finely defined, reconfigurable chunks. Content consumers choose which content host to use to access the content. They might use their local radio station’s website, or NPR’s own app to access the same content.

Benefits and limitations of COPE

COPE is an effective approach to disseminate articles to multiple partners and platforms. Because of its push orientation, it is not optimized to offer personalized content that responds to specific requests from content consumers. As originally conceived, the body of the content is static.

Reusing common elements in different content products: the DITA model

While COPE is largely focused on formats and metadata, another reuse approach is focused on reusing components of content within the body-field of an item.

Publishers of technical content have championed reusing specific content components in different items of content. Technical documentation is repetitive. Much writing is redundant, where the same text is being repeated in many places. Technical writers sometimes speak about the ideal of WOOO: Write Once and Once Only.

Component reuse is closely associated with an approach called DITA (Darwin Information Typing Architecture), an XML schema originally developed by IBM. DITA is designed to address specific publishing issues with user assistance for technical products, though many DITA proponents argue it can be successfully used for other kinds of content.

For the most part, the motivations behind DITA have been writing efficiency and consistency, rather than audience needs. Few individuals will ever read the many minor variations of content possible with a DITA document, and content variations are largely defined by topic variants rather than reflect audience preferences.

Reusing Components through Transclusion

Most approaches that reuse content components rely on transclusion. Transclusion is the process of incorporating content into an item of content from another source by use of a link to that source. In its most simple form, it is similar to when one embeds an item of content in another, such as embedding a slideshow or YouTube video hosted elsewhere in an article you’ve written. In DITA, the process is called a conref or content reference. Transclusion is a core concept not only in DITA but also in MediaWiki, which powers Wikipedia among other sites. Transclusion allows the same content to be used in multiple locations in Wikipedia.

Transclusion can be applied to any item of content: a word or phrase, a paragraph, or a large section.

A related approach is to show and hide components depending on certain criteria, perhaps intended audience segment. Business customers might see a certain paragraph, while consumers wouldn’t see that paragraph. The process of showing and hiding XML nodes is called profiling in DITA. It allows the output of multiple documents (variations on the master document) from a single source.

Benefits and limitations of Transclusion

Reusing components is effective when there is a repetition of messages, and regular variations among specific components. It can provide efficiencies and consistency for content that is highly regular and needs to be delivered in a uniform manner. If business requirements mandate that all customers see the same terms and conditions in the content regardless of what content they see, transclusion can be an effective approach.

The weakness of transclusion is that it is not very flexible. DITA, for example, assumes a linear flow of content from the publisher to the content consumer. It presupposes content elements can be planned and compiled into well-structured formats. That vision implies the presence of regular content entities and that one can anticipate the exact circumstances of when these entities are required by endusers.

Embedding content through link referencing, or hiding content through profiling, is not very dynamic. The process can groan when the variations become complex. It is also difficult for the publisher to confidently say precisely what an audience wants, and so there is a tendency to deliver too much content because it is easy to include it. Transclusion, by itself, doesn’t adapt to specific audience demands for information, or marketers’ desire to change the messaging in response to CRM and real time analytics data. The motivation to write once only doesn’t accord with audience desires to pick and choose what content they want to see at a given time. It is not clear if the XML-based structure of DITA will be up to the demands of real time personalization associated with performance-based marketing.

Mark Baker noted recently some other shortcomings of transclusion:

“Reusing text where you would have been writing substantially the same text anyway is clearly the right thing to do. But taking all the various ways in which you might express an important idea and combining them into one expression is a bad idea. Your idea will have more impact and more reach if it is expressed in different ways and in different media for different audiences, different purposes, and different occasions.”

Asset Reuse: the DAM model

A third approach to content reuse relates to assets. Reusing assets allows organizations to exact more value from their intellectual property. It recognizes that rich assets can be potentially applicable to different contexts at different times. A systematic approach to asset reuse requires a centralized repository for the raw material that authors draw upon to create audience-facing content.

How Asset Reuse works

A growing number of web publishers — though still a minority — have repositories to hold digital assets that are used to create content for audiences. They may use:

A digital asset management (DAM) system for videos, audio, graphics and photos, including brand assets and templates
An enterprise content management (ECM) system for complex documents, such as legal documentation
A database or file server to store code or data files that can be repurposed

Such repositories differ in purpose for content management systems, which are geared toward the creation and management of content for audiences. Unlike a CMS, a DAM may contain content that is neither currently published, nor being readied for publication.

The varied types of assets that can be stored in a repository share certain characteristics. Assets frequently involve complex workflows. They may involve substantial editorial oversight, to produce and prepare for publication. Unique approvals may be required, such as for branding assets stored in a DAM, or legal copy stored in an ECM. Data, perhaps from a periodic customer survey, may be stored in databases that require running structured queries and reports before they can be made available for content authors to use. Photo archives may have permissions and licensing requirements that must be vetted before items are available for publication.

When considering asset reuse, it helps to know how stable the asset is. Elizabeth Keathley distinguishes between static assets and living assets.

Static assets are generally stable and don’t change often. If they do change, there will only be one version at a time, with a persistent ID. These assets may have associated use rights governing when and how they are used, and by whom. The asset creator may have an explicit goal of preventing derivative reuse, such as prohibiting unapproved modifications of brand assets.

Living assets can be repurposed to support different goals, and are sometimes converted into different formats. Living assets are commonly composed of compound asset parts and have elaborate workflows to produce them. They are not simply derivative of other assets but are substantially original. A living asset is broadly equivalent to a work in FRBR terminology. Other items of content are derived from a living asset, and these will have identities separate from the master asset. Because the structure of living assets is complex and irregular, they are not as readily broken into content components, especially if an exact need for elements in the asset cannot be predicted in advance. Also, the nature of repurposing content means that the approval process will be different than it is for content components involving planned reuse for defined purposes.

Benefits and limitations of DAMs

DAMs and other asset repositories can offer authors a richer library of content than available in CMSs. Unlike with a CMS, authors are not restricted to a narrow perspective where they only see and have access to currently published content.

DAMs have challenges as well. Unless actively managed, metadata descriptions can be poor, hindering asset retrieval. Some DAM systems are improving auto tagging of assets to reduce the burden on contributors. Another limitation is that DAM assets are generally not directly accessible by audiences, so audience requirements for access to this content needs to be understood and planned in advance.

A framework for content reuse

The conceptual diagram reflects different content reuse activities according to their purpose. It is not meant to show specific platforms or systems, which vary considerably in practice. Only a few publishers perform all these activities as part of an integrated end-to-end process. The path from potential assets to ready-to-consume content resembles a waterfall: one is dependent on what content is available upstream.

The limits of specialized solutions

Relying on one approach entails various potential pitfalls. Not having a DAM means that potentially valuable content assets are siloed within different organizational departments and not available to authors. A failure to plan for modular reuse of content components hinders efficiency and consistency, and hurts the audience experience as well. Relying on responsive web design might be effective to reach immediate consumers, but won’t allow partners to reuse your content the way an API would allow, and might therefore reduce the total reach of your content.

Many aggravations arise from a poor conceptual understanding of the granularity of content, and how frequently different elements change and are used within the organization. Authors may try to reuse content that is actually a compound object made up of different assets and components. They may actually need to only reuse some parts of the content.

A core issue with reuse is whether the content continues to be up-to-date and accurate. Unfortunately, just because something is currently published does not indicate it should be reused elsewhere. A table that complements an article might be sufficiently current to stay on a website, but really shouldn’t be incorporated in new content without updating. Content created for one audience may seem to offer a good blueprint for new but similar content for another audience. But in the course of repurposing this content, the authors may conclude that revisions are needed for content that is being reused. What is sufficiently current is often a judgment call based on resources and mission importance.

Publishers face another challenge: the tension between content modularity and integration. While technical documentation can generally be disaggregated into modular components, other content is more powerful when tightly integrated. Ideally content elements should support one another, rather than simply be presented together. But cross-dependency among elements make them less attractive candidates to manage as separate components. A reusable, adaptable template may be a better approach when elements tend to occur together in an integrated manner. Authors may want to reuse the structure of the body of the content without reusing the actual content components.

Adaptive content and reuse

The newest approach to content reuse is known as adaptive content. Unfortunately, there is no widely accepted definition of adaptive content, and content professionals tend to speak about adaptive content in different ways. The phrase provokes two obvious questions:

What adapts?
To what does it adapt?

Sometimes people will speak about “the content” adapting to “the device” the individual is using. That interpretation is not much different from responsive web design, and is not very ambitious. It should be possible to have the content itself change based on any number of criteria, such as contextual factors (location, time of day, user status), and various user preferences or behaviors. I would rather define adaptive content in terms of the goal it supports.

Adaptive content: content that changes what is presented to reflect the intentions of the content consumer.

How Adaptive Content works

Adaptive content relies on the use of algorithms and audience data to change the content. There are significant differences between preplanned content variations such as are specified in DITA, and enabling dynamic, on-demand variations associated with adaptive content. Adaptive content builds on transclusion and COPE, but extends it.

Content reuse to support adaptive content must accommodate on-demand access to content by individuals, to deliver content composed of components that reflect the interests and needs of an individual when they ask for them.

An early example of adaptive content is the NPR One app for audio content. Individuals indicate what kinds of programming they want, rather than having the publisher deciding that for them. NPR extends its API not only to content partners (local radio stations who add local content), but also to the end consumer of the content, giving them control over what content they receive through likes and shares. The app is adaptive, but not entirely a content on-demand solution, since it is based on streaming.

Benefits and limitations of adaptive content

To realize the goal of having content components available on demand, responding to user preferences in real time, will remove the problems associated with publishers making wrong guesses about what someone wants to view. The limitation of this approach is the complexity it introduces for publishers. They need to think even harder about where the value of their content resides, based on actual use analytics, and structure the content elements to allow retrieval. Web searchers can now cherry-pick information in the search results to get the exact content items they want from articles marked up in schema.org. Such behavior provides a preview of how content will need to become adaptive to user needs.

Conclusion

Content reuse is rich with possibilities. Different content specializations are working to improve reuse. It is useful to understand different approaches. By combining approaches, one can support an integrated strategy that improves both internal goals such as efficiency and governance, and external goals such as personalization and engagement.

— Michael Andrews

FRBR stands for Functional Requirements for Bibliographic Records. FRBR’s focus is on bibliographic records for long-form content such as books, sound recordings, and films. Its focus is different from that of content strategy, so it will not be exactly equivalent. It offers helpful insights as long as we don’t expect literal compliance to its terminology. My apologies to librarians if I run roughshod over these concepts. ↩

Tags adaptive content, content reuse, COPE, DAM, DITA