
Predicting Content Attention and Behavior

Audiences can seem inscrutable.  We want to know how they will respond to content, but they don’t behave consistently.  Sometimes they skim content; sometimes they read it closely.  Even if we wish it were otherwise, we can’t escape the messy reality that audiences in many respects don’t behave in a single consistent way.  But do they behave predictably?  Could we predict what kind of content will engage online audiences if we accounted for the known variables?  To date, progress untangling this problem has been limited.  But we have reason to be optimistic it won’t always be this way.  A more data-centric approach to content strategy could help us understand which variables influence audience behavior.

The biggest weakness in content strategy today is that it lacks predictive explanatory power.  Whenever someone advances a proposition about what audiences want or will do, it is easy to find counterexamples where it doesn’t hold.  Nearly all categorical assertions about what people want from content fail to survive even minimal scrutiny.  Do audiences want more or less content?  Do they want simple or detailed explanations?  Do they want to engage with the content, or get it in the most digestible form possible?  Such binary questions seem reasonable to ask, and call for reasonable answers in return.  But they too often prompt simplistic answers that offer a false sense of certainty.  Content behavior is complex — just like human behavior in general.  Yet that doesn’t mean it is impossible to learn some deeper truths — truths that may not be complete and exhaustive, but are nonetheless accurate and robust.  What we need is better data that can explain complexity.

To provide predictive explanatory power, content strategy guidelines should be based on empirical data that can be reproduced by others.  Guidelines should be based on data that covers a breadth of situations, and has a depth of description.  That’s why I was so excited to read the new study presented last week at the 2018 World Wide Web Conference by Nir Grinberg of Northeastern University, entitled “Identifying Modes of User Engagement with Online News and Their Relationship to Information Gain in Text.”  The research provides a rare large-scale empirical analysis of content, which reveals many hidden dimensions that will be useful to apply and build on.  I encourage you to read the study, though I caution that it can be dense at times, filled with academic and statistical terminology.  I will summarize some of its highlights, and how they can be useful to content strategy practitioners.

Grinberg’s study looked at “a large, client-side log dataset of over 7.7 million page views (including both mobile and non-mobile devices) of 66,821 news articles from seven popular news publishers.”  By looking at content on such a large scale (nearly 8 million page views), we can transcend the quirks of the content we deal with in our own projects.  We want to understand whether the features of our content are typical of content generally, or are characteristics that apply to only some kinds of content.

The study focused on content from news websites that specialize in different topics.  It does not represent the full spectrum of content that professionals in content strategy address, but it does cover a range of genres that are commonly discussed.  The study covered seven distinct genres:

  • Financial news
  • Technology
  • How To
  • Science
  • Women
  • Sports
  • Magazine features

Grinberg was motivated by a desire to improve the value of content.  “Post-hoc examination of the extent to which readers engaged with articles can enable editors to better understand their audience interests, and inform both the coverage and writing style of future articles.”

Why do analytics matter?  Content that audiences use is content that audiences value.  The question is how to measure audiences’ use of content after they click on a link.  Page views are not a meaningful metric, since many views “bounce.”  Other metrics draw controversy.  Is a long time on a page desirable or not?  With simple metrics, the metric can become hostage to one’s own ideological worldview about what’s best for users, instead of being a resource to learn what users are really trying to accomplish.

First, how can we measure attention?  The study considered six metrics available in analytics relating to attention:

  1. Depth — how far the user scrolled in an article, a proxy for how much of the content was viewed or read
  2. Dwell time — total user time on a page (good for non-reading engagement such as watching a video)
  3. Engagement — how much interaction happens on a page (for example, cursor movements, highlighting)
  4. Relative depth — how much of an article was visible on a user’s screen
  5. Speed — speed of scrolling, a proxy for how quickly readers “read” the content
  6. Normalized engagement — engagement relative to article length

The metrics that are “relative” and “normalized” attempt to control for differences between the absolute values of shorter and longer content.  
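To make these definitions concrete, here is a minimal sketch of how the six metrics might be derived from a client-side event log.  The field names and units are hypothetical, and the study’s exact operational definitions differ in detail.

```python
# A minimal sketch, assuming a hypothetical client-side event log;
# field names are invented, and the study's exact definitions differ.

def attention_metrics(events, article_length_px):
    """Compute the six attention metrics for one page view.

    events: time-ordered list of dicts like
        {"t": seconds_since_load, "scroll_px": ..., "interactions": ...}
    article_length_px: rendered length of the article in pixels
    """
    max_scroll = max(e["scroll_px"] for e in events)
    dwell_time = events[-1]["t"] - events[0]["t"]        # total time on page
    engagement = sum(e["interactions"] for e in events)  # cursor moves, highlights, etc.
    return {
        "depth": max_scroll,
        "dwell_time": dwell_time,
        "engagement": engagement,
        "relative_depth": max_scroll / article_length_px,         # controls for length
        "speed": max_scroll / dwell_time if dwell_time else 0.0,  # scroll pixels per second
        "normalized_engagement": engagement / article_length_px,  # controls for length
    }
```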

Next, what might these metrics say about audience behavior?  Through a cluster analysis, the study found these indicators interact to form five content engagement patterns:

  • Shallow  (not getting far in an article)
  • Idle (short period of activity followed by period of inactivity, followed by more activity)
  • Scan (skimming an article quickly)
  • Read (reading the article for comprehension)
  • Long read (engaging with supplementary materials such as comments)
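How does raw data turn into patterns like these?  The sketch below is illustrative only; it uses k-means from scikit-learn rather than the paper’s own procedure, which differs in detail.  The general shape is the same: standardize the six metrics, cluster the page views, and label the resulting clusters (shallow, idle, scan, and so on) by inspecting their centroids.

```python
# An illustrative clustering of page views into engagement patterns;
# the paper's actual procedure differs in detail.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

metric_names = ["depth", "dwell_time", "engagement",
                "relative_depth", "speed", "normalized_engagement"]

def engagement_clusters(X, k=5):
    """X: (n_pageviews, 6) array of the metrics above."""
    Xz = StandardScaler().fit_transform(X)  # put metrics on a common scale
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xz)
    return km.labels_, km.cluster_centers_  # label each centroid by inspection

# Random data standing in for a real log:
labels, centers = engagement_clusters(np.random.rand(1000, 6))
```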

So how do specific behaviors relate to engagement patterns?  The study showed that each indicator was associated with specific engagement patterns.

Depth (ranked from low to high depth of scrolling)

  1. Shallow
  2. Idle
  3. Scan
  4. Read
  5. Long read

Dwell time (ranked from short to long dwell time)

  1. Scan
  2. Read 
  3. Long read 
  4. Idle
  5. Shallow

Engagement (ranked from low to high engagement)

  1. Shallow
  2. Scan
  3. Idle
  4. Read
  5. Long read

Relative depth (ranked from low to high relative depth)

  1. Shallow
  2. Idle
  3. Scan
  4. Read
  5. Long read

Speed (ranked from fast to slow)

  1. Scan
  2. Read
  3. Long read
  4. Idle
  5. Shallow

Normalized engagement (ranked from low to high)

  1. Shallow
  2. Idle
  3. Scan
  4. Read
  5. Long read

So what does this mean for different kinds of content?  “We found substantially more scanning in Sports, more idling in “How To”, and more extensive reading for long-form magazine content.”  That may not sound like a profound conclusion, but it feels valid, and it’s backed by real-world data.  This gives us markers to plan with.  We have patterns to compare.  Is your content more like sports news, a how-to, or a feature?

For sports, readers scan, often just checking scores or other highlights rather than reading the full text.  They are looking for specific information, rather than a complete explanation.  Sports is a genre closely associated with scanning.  When sports is presented in long form, as on the now-defunct Grantland website, it appeals only to a niche.  ESPN found Grantland unprofitable.  Grantland violated the expectations of the genre.

Magazine features were the most likely to be read shallowly, where only the first few sentences are read, as well as the most likely to be read thoroughly, where even comments are read.  This shows that readers make an investment decision about whether the content looks sufficiently interesting to read in depth.  They may leave a tab open, hoping to get back to the article, but never do.  And sometimes a preview summary such as an abstract can provide sufficient detail for most people, and only some will want to read the entire text.

The study found a “relatively high percent of Idle engagements in How To articles. The few articles we examined from this site gave instructions for fixing, making, or doing something in the physical world. It is therefore plausible that people disengage from their digital devices to follow instructions in the physical world.”

How the Study Advances our Practice

The study considers how reading characteristics converge into common reading patterns, and how different genres are related to distinct reading patterns.  

The study brings a more sophisticated use of metrics to infer content attention.  It shows how features of content influence attention and behavior.  For example, “total dwell time on a page is associated with longer articles, having more images and videos.”  Not all content is text.  How to measure the use of video, images, or data exploration is an important consideration.

We now have concrete parameters to define engagement patterns.  We may casually talk about skimming, but what does that mean exactly?  Once we define it and have a way to measure it, we can test whether content is skimmable, and compare it to less skimmable content.
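For illustration, an operational definition of skimming could be as simple as a pair of thresholds on the metrics defined earlier.  The cutoff values below are invented; in practice they would come from benchmark data such as the study’s cluster centroids.

```python
# A sketch of an operational definition of "scan" (skimming), reusing
# the metrics dictionary computed earlier. Thresholds are invented.

def is_scan(metrics, min_relative_depth=0.5, min_speed_px_per_s=400):
    # Got through much of the article, but moved through it quickly.
    return (metrics["relative_depth"] >= min_relative_depth
            and metrics["speed"] >= min_speed_px_per_s)
```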

Solid, detailed data helps us separate what is happening from why it may be happening.  Slow reading speed is not necessarily an indication that the material is difficult to read.  Fast reading speed doesn’t necessarily indicate the topic is boring.  Readers may be involved with other activities.  They may already have knowledge that allows them to skim.  Instead of debating what is happening, we can focus on the more interesting question of why it might be happening, and how to address it.  And with benchmark data, teams can test alternative content designs and see how performance changes.

How Content Strategy Can Build on the Study

The study shows that more robust analytics can allow us to directly compare the utilization characteristics of content from different sources, and of different genres and formats of content.  Standardized data allows for comparisons.

The study suggests more sophisticated ways to measure attention, and shows that attention patterns can depend on the genre of content.  It identified six attention metrics and five engagement patterns that could be useful for classifying content utilization.  These elements could contribute to a more rigorous approach to using analytics to assess audience content needs.

A framework using detailed metrics and patterns can help us baseline what’s actually happening, and compare it with what might be desirable.

For example, what kinds of content elicit shallow engagement?  Is shallow engagement ever good, or at least an opportunity?  Perhaps people start and then abandon an article because it is the wrong time for them to view it.  Maybe they’d benefit from a “save for later” feature.  Or maybe the topic is valuable, but the content is uninviting, which grinds engagement to a halt.  With a more sophisticated ability to describe content behavior, we can consider alternative explanations and scenarios.

The study also opens up the issue of whether content should conform to typical behavior, or whether content should try to encourage a more efficient behavior.  If How To content involves idle periods, should the content be designed so that people can start and stop reading it easily?  Or should the content be designed so that the viewer knows everything they need to do before they begin (perhaps by watching a video that drills how to do the critical steps), so they can complete the task without interruption?   I’m sure many people already have opinions about this issue.   More precise analytics can allow those opinions to become testable hypotheses.

The big opportunity is the ability to compare data between content professionals, something that’s not possible with qualitative feedback.  Today, we have conferences where different people present case studies.  But it is hard to compare the lessons of these case studies because there are no common metrics.  Case studies can also be hard to generalize, because they tend to focus on process rather than on common features of content.  Two people can follow the same process but have different outcomes, if the features of their content are different.

Like the field of usability, content strategy has the opportunity to build a set of evidence-based best practices.  For example, how does having a summary paragraph at the start of an article influence whether the entire article is read?  Different content professionals, looking at different content topics, could each test such a question and compare their results.  That could lead to evidence-backed advice concerning how audiences will likely react to a summary.  
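Here is a sketch of what such a shared test might look like, using a standard two-proportion test.  The counts are invented; the point is that if different teams define “read the entire article” the same way, their results become directly comparable.

```python
# A sketch of a shared, repeatable test: did adding a summary paragraph
# change the share of visits that read the whole article? Counts invented.
from statsmodels.stats.proportion import proportions_ztest

full_reads = [412, 380]  # visits classified as "read" or "long read"
visits = [5000, 5000]    # total visits per variant (with / without summary)

z, p = proportions_ztest(full_reads, visits)
print(f"z = {z:.2f}, p = {p:.3f}")  # comparable across sites if metrics match
```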

The first step toward realizing this vision is having standard protocols for common analytics tools like Google Analytics, so that data from different websites are comparable.  It’s a fascinating opportunity for someone in the content strategy community to move forward with.  I’m too deep in the field of metadata to work on it myself, but I hope others will become interested in developing a common analytics framework.
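As a hypothetical illustration of what such a protocol might standardize, every participating site could agree to emit events with the same fields and units, so that the derived metrics mean the same thing across publishers.  The shape below is invented, not an existing standard:

```python
# One hypothetical shape a shared protocol could take: every site emits
# scroll/engagement events with the same fields and units.
STANDARD_EVENT = {
    "page_id": "article-1234",
    "article_length_px": 18000,  # agreed unit: rendered pixels
    "t": 42.5,                   # seconds since page load
    "scroll_px": 9200,           # current scroll position
    "interactions": 3,           # cursor moves, highlights, etc. this tick
}
```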

— Michael Andrews


Untangling Content ROI

How to measure content ROI is a recurring question in forums and at conferences.  It’s a complex topic — I wish it were simple.  Some people present the topic in a simple way, or claim only one kind of measurement matters.  I don’t want to judge what other people care about: only they know what’s most important to their needs.  But broad, categorical statements about content ROI tend to mislead, because content is complicated and organizational goals are diverse.  I can’t provide a simple formula to calculate the value of content, but I hope to offer ideas on how to evaluate its impact.

The Bad News: The ROI of Content is Zero

First, I need to share some bad news about content that nearly everyone is hiding from you.  There is no return on investment from content.  If you don’t believe me, ask your CPA how to depreciate your content.

A widespread misconception about content ROI is that content is an investment.  Yet accountants don’t consider most content an investment; they consider it an expense.  The corporations that hire the accountants treat content as an expense as well.  In the eyes of accountants, content isn’t an asset that will provide value over many years.  It is a cost to be charged in the current year.  From a financial accounting perspective, you can’t have a return on investment on an item that’s treated as an expense rather than an investment.

Many years ago I took an accounting class at Columbia Business School. I remember having a strong dislike of accounting.  Accounting operates according to its own definitions.  It may use words that we use in everyday conversation, but it has specific ideas about what those words mean.  Take the word “asset.”  Many of us in the content strategy field love to talk about content assets.  Our content management systems manage content assets.  We want to reuse content assets.  The smart use of content assets can deliver greater value to organizations.  But what we refer to as a content asset is not an asset in an accounting sense.  When we speak of value, we are not necessarily using the word in the way an accountant would.

I warned that broad statements about ROI are dangerous, and that content is complicated.  There are cases where accountants will consider content as an investment — if you happen to work at Disney.  Disney creates content that delivers monetary value over many years.  They defy the laws of content gravity, creating content that often makes money over generations.  Most of us don’t work for Disney.  Most of us make content that has a limited shelf life.  Until we can demonstrate content value over multiple years, our content will be treated as an expense.

So the first task toward gaining credibility in the CFO office is to talk about the return on content in a broader way.  Just because content is an expense, that doesn’t mean it doesn’t offer value.  Advertising is an expense that corporations willingly spend billions on.  Few people talk about advertising as an investment: it’s a cost of business, accepted as necessary and important.

The Good News: Content Influences Profitability

Content is financially valuable to businesses.  It can be an asset — in the commonsense meaning of the word.  It’s entirely appropriate to ask what the payoff of content is, because creating content costs money.  We need ways to talk about the relationship between the costs of content and the revenues it might influence.

Profitability is determined by the relationship between revenues and costs.  Content can influence revenues in multiple ways.  Content is a cost, but that cost can vary widely according to how the content is created and managed.  The overall goal is to use content to increase revenues while reducing the costs of producing content, where possible.  The major challenge is that the costs associated with producing content are often not directly linked to the revenue value associated with the content.  As a result, it can be hard to see the effects on revenues of content creation costs.  Content’s influence on profitability is often indirect.

Various stakeholders tend to focus on different financial elements when evaluating the value of content. Some will seize on the costs of the content. How can it be done more cheaply?  Others will focus exclusively on the revenue that’s related to a set of content items. How many sales did this content produce?  These are legitimate concerns for any business.  But narrowly framed questions can have unintended consequences.  They can lead to optimizing one aspect of content to the detriment of other aspects.  Costs and revenues can involve tradeoffs, where cost savings hurt potential revenue.  Costs also involve choices about what kinds of content to produce: a choice to spend on content supporting one revenue opportunity can entail a decision not to produce content supporting another.  For example, a firm might prioritize content for current customers over content for future customers, especially if revenue associated with current customers is easier to measure.

The key to knowing the value of content is to understand its relationship to profitability.

Customers Generate Profits, Not Content

Spreadsheets tend to represent things and not people.  There are costs associated with different activities, or different outputs, such as content.  There are revenues, actual and forecast, associated with products and services.  The customers that actually spend money for these products and services are often represented only indirectly.  But they are the link between one set of numbers (expenses involved with stuff they see and use) and another set (revenues associated with stuff they buy, which is generally not the content they see).

Unfortunately, the financial scrutiny of content items tends to obscure the more important issue of customer value.  Content is not valuable or costly in its own right.  Its financial implications are meaningful only with respect to the value of the customers using the content, and their needs.  The financial value of content is intrinsically related to the expected profitability of the customer.

The financial value of content is clear only when seen from the perspective of the customer.  Let’s look at a very simplified customer lifecycle.  The customer first enters a stage of awareness of a brand and its products.  Then the customer may move to a stage where she considers the product.  Finally, if all goes well, she may become an advocate for the brand and its products.  At each stage, content is important to how and what the customer feels, and how likely she may be to take various actions.  So what kind of content is most important?  Content to support awareness, content to support consideration, or content to support advocacy?  Asked as an abstract hypothetical, the question poses false choices.  The business context is vital as well.  Is it more important to get a specific sale, or to acquire a new customer?  Such questions involve many other issues, such as buying frequency, brand loyalty, purchase lead times, product margins, etc.

There can be no consideration of a product without awareness, and no advocacy without favorable consideration (and use).  And awareness is diminished without advocacy by other customers.   The lifecycle shows that the customer’s value is not tied to one type of content — it is cultivated by many types.  At the same time, it is clear that content is only playing a supporting role.  The customer is not evaluating the content: she is evaluating the brand and its products.  Content is an amplifier of customer perception.  The content doesn’t create the sale — the product needs to fulfill a customer need.  While bad content can hurt revenues for otherwise excellent companies, content doesn’t have the power to make a bad company overcome poor quality products and services. Content’s role is to bring focus to what customers are interested in learning about.

Conversion is a Process, Not an Event

Marketing has become more focused on metrics, so it is not surprising that content is being measured by how it supports sales. A/B testing is widely used to measure which content performs better in supporting sales.  Marketers are looking at how content can increase revenue conversion. This has often resulted in tunnel vision: a focus on the content of product pages. Conversion is seen as an event, rather than as a process.

Below is a landing page for a product I heard about, and was interested in possibly purchasing.  It represents a fairly common pattern for page layouts for cloud-based subscription services.  The page is simple, and unambiguous about what the brand wants you to do.  The page is little more than a button asking you to sign up (and, in the unlikely event you missed the button on the page, a second button is provided at the top).  I presume that this page has been tested, and the designers decided that less content resulted in more conversions per session.  If people have few places to go, they are more likely to sign up than if they get distracted by other pages. What’s harder to judge is how many people didn’t sign up because of the dearth of information.

A product page that is entirely about a Call-To-Action. The product remains a mystery until the prospect agrees to sign up.

Some online purchases are impulsive.  Impulse online purchases tend to be for inexpensive items, or from brands the customer has used before and knows what to expect from.  Most other kinds of purchases involve some level of evaluation of the product, or of the seller, sometimes over different sessions.  In the case of this product, the brand decided that it could encourage impulsive sign-ups by offering a two-week free trial.  This model is known as “buy before you try”: you are presumed to have bought the product at sign-up, since your subscription renews automatically until you say otherwise.

A focus on conversion will often result in offering trials in lieu of content.  Free trials can be wonderful ways to experience a product. I enjoy sampling a new food item in a grocery, knowing I can walk away. But trials often involve extra work for prospective customers.  Online, my trial comes with strings attached.  I need to supply my email address.  I need to create an account, and make up a new password for a product I don’t know I want.  If it is a buy-before-you-try type trial, I’ll be asked for my credit card, and hope there is no drama if I do decide to cancel.  And I’m being forced to try the product on their schedule, and not my own.

Paradoxically, content designed to convert may end up not converting.  The brand provides little information about its service, such as what one could expect after signing up.  The only information available is hidden in a FAQ (how we love those), where you learn that the service will cost $100 a year — not an impulse buy for most people.  When prospective customers feel information is hidden, they are less likely to buy.

Breaking the Taboo of Non-Actionable Content

There is a widespread myth that all content must be designed to produce a specific action by the audience. If the content didn’t produce an action, then nothing happened, and the content is worthless.  It’s a seductive argument that appeals to our desire to be pragmatic.  We want to see clear outcomes from our content.  We don’t want to waste money creating content that doesn’t deliver results for our organization. So the temptation is to purge all content that doesn’t have an action button on it. And if we decide we have to keep the content, we should add action buttons so we have something we can measure.

I don’t want to minimize the problem of useless content that offers no value to either the organization or to audiences.  But it is unrealistic to expect all pages of content to contribute directly to a revenue funnel.  By all means weed out pages that aren’t being viewed.  But audiences do look at content with no intention to take action right away.  And that’s fine.

Creating content biased for action only makes sense when the content is discussing the object of the action.  Otherwise, the call to action is incongruous with the content.  A UX consultant may tell a nonprofit that people have trouble seeing the “donate now” button. But the nonprofit shouldn’t compensate by putting a “donate now” button on every page of their website — it looks pushy, and is unlikely to increase donations.

Conversion metrics measure an event, and can miss the broader process.  Most analytics are poor at tracking behavior across different sessions.  It is hard to know what happened between sessions — we only see events, and not the whole process.  Even sophisticated CRM technology can only tell part of the story.  It can’t tell us why people drop out, and if inadequate content played a role. It can’t tell us if people who bought supplemented their knowledge of the product with other sources of information — talking to colleagues or friends, or seeing a third party evaluation. To compensate for these gaps in our knowledge of customer behavior, businesses often try to force customers to make a decision, before they seem to disappear.

By far the biggest limitation of analytics is that they can’t measure mental activity easily.  We don’t know what customers are thinking as they view content, and therefore we tend to care only about what they do.  The opacity of mental activity leads some people to believe that the opinions of customers aren’t important, and that only their behavior counts.

The Financial Value of Customer Opinion

Customers have an opinion of a brand before they buy, and after they buy.  Those opinions have serious revenue implications. They shape whether a person will buy a product, whether they will recommend it, and whether they will buy it again.  Content plays an important role in helping customers form an opinion of a brand and product.  But it’s hard to know precisely what content is responsible for what opinions that in turn result in revenue-impacting decisions.  Humans just aren’t that linear in their behavior.  Often many pieces of content will influence an opinion, sometimes over a period of time.

Just because one can’t measure the direct revenue impact of content items does not mean these items have no revenue impact.  A simple example will illustrate this.  Most organizations have an “about us” page.  This page doesn’t support any revenue generating activity.  It doesn’t even support any specific customer task.  Despite not having a tightly defined purpose, these pages are viewed.  They may not get the highest traffic, but they can be important for smaller or less well known organizations.  People view these pages to learn who the organization is, and to assess how credible they seem.  People may decide whether or not to contact an organization based on the information on the “about us” page.

Non-transactional content is often more brand-oriented than product-oriented.  Such content doesn’t necessarily talk about the brand directly, but will often provide an impression of the brand in the context of talking about something of interest to current and potential customers.  These impressions influence how much trust a customer feels, and their openness to any persuasive messaging.  Overall content also shapes how loyal customers feel.  Do they identify with being a customer of a brand, or do they merely identify as someone who is shopping, or as a past purchaser of a product?

Another type of non-transactional content is post-purchase product information.  A focus on content for conversion can overlook the financial implications of the post-purchase experience. People often make purchase decisions based on a general feeling about a brand, plus one or two key criteria used to select a specific product.  If they are looking to book a hotel, they have an expectation about the hotel chain, and may look for the price and location of available rooms.  They may not want to deal with too many details while booking.  But after booking, they may focus on the details, such as the availability of WiFi and hairdryers.  If information about these needs isn’t available, the customer may be disappointed with their decision.  Other forms of post-purchase product information include educational materials relating to using a product or service, on-boarding materials for new customers, and product help information.

The financial value of non-transactional content will vary considerably, for two reasons.  First, no single item of content will be decisive in shaping a customer’s opinion; many items, of different content types, can be involved. Second, the level of content offered can be justified only in terms of the customer’s value to the organization.  Content that’s indirectly related to revenues is easiest to justify when it’s important to developing customer loyalty. Perhaps the product is high value, has high rates of repurchase, or involves a novel approach to the product category that requires some coaching to encourage adoption. Developing non-transactional content makes the most financial sense when aimed at customers who will have a high lifetime value.
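As a rough illustration of that last point, a simple lifetime-value calculation (with invented numbers) shows why the same content spend can be easy to justify for one group of customers and hard to justify for another:

```python
# A rough lifetime-value sketch; all numbers are invented.

def lifetime_value(margin_per_year, retention_rate, discount_rate, years=10):
    """Discounted margin, weighted by the odds the customer is still around."""
    return sum(margin_per_year * (retention_rate ** t) / ((1 + discount_rate) ** t)
               for t in range(years))

loyal = lifetime_value(margin_per_year=500, retention_rate=0.9, discount_rate=0.08)
one_off = lifetime_value(margin_per_year=500, retention_rate=0.3, discount_rate=0.08)
# Post-purchase content is far easier to justify for the first group.
print(f"loyal: ${loyal:,.0f}  one-off: ${one_off:,.0f}")
```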

Measuring the impact of content that influences customer opinions is hard — much harder than measuring content designed around defined outcomes, such as the conversions on product pages.  But with clear goals, sound measurement is possible.  Content that’s not created to support a concrete customer action needs to be linked to specific brand and customer goals.  Customer goals will consider broader customer journeys where the brand and product are relevant, and where there is a realistic opportunity to present content around these moments.  Appropriate timing is often critical for content to have an impact.  The goals of a brand will reflect a detailed examination of the customer lifecycle, and a full understanding of the future revenue implications of different stages and the brand’s delivery of services prior to and following revenue events.

The Ultimate Goal: Content that Supports Higher Margins

The two most common approaches to “Content ROI” involve improving conversion rates, and reducing content costs.  These tactics are incremental approaches — useful when done properly, although potentially counterproductive if done poorly.

To realize the full revenue potential of content, one can’t be a prisoner of one’s metrics. The things that are easiest to quantify financially are not necessarily the most important financial factors.  Many organizations fine-tune their landing pages with A/B testing.  Many of the changes they make are superficial: small visual and wording changes.  They are important, and have real consequences, lifting conversions.  But they only scratch the surface of the content customers consider.  The placement and color of buttons gets much attention partly because they are relatively simple things to measure.  That does not imply they are the most important things — only that their measurement is simple to do, and the results are tangible.

Conversion metrics measure the bottom of the marketing funnel: making sure people don’t drop out after they’ve reached the point-of-purchase page.  What’s harder to do, but potentially more financially valuable, is to expand the funnel by focusing on who enters it.  Content can attract more people to consider a brand and its products, and attract more profitable customers as well.

The biggest opportunity to increase revenues is by attracting people who would be unlikely to ever reach your product landing page.  How to do this is no mystery — it’s just hard to measure, and so gets de-emphasized by many metrics-driven organizations.  The first approach is to offer educational content, so that prospective buyers can learn about the benefits of a product or service without all those pesky calls-to-action.  People interested in educational content are often skeptics, who need to be convinced a solution or a brand is the right fit.  The second approach is through personalization.  The approach of intelligent content points to many ways in which content can be made less generic, and more relevant to specific customers. Many potential customers can’t see the relevance of the product or brand, and accordingly don’t even consider them in any detail, because existing content is too generic.

But profitability is not just about units of sale.  Profitability is about margins.

The first avenue to improving margins is reducing the cost of service.  Many content professionals focus on reducing the cost of producing content, which can potentially harm content quality if done poorly.  The bigger leverage can come from using content to reduce the cost of servicing customers.  Well-designed and targeted content can reduce support costs — a big win, provided the quality is high, and customers prefer to use self-service channels, instead of feeling forced to use them.

The second avenue to improving margins involves pricing.  Earlier I noted that the financial value of content depends on the financial value of the customers for which the content is intended.  A corollary holds true as well: the financial value of prospective customers is influenced by the content they see.  Valuable content can attract valuable customers.  It’s not only the volume of sales, it’s about the margin each sale results in.

Customers who see a brand as credible and as a leader are prepared to pay a premium over brands they see as generic.  This effect is most pronounced in the service industry, where experience is important to customer satisfaction, and content is important to experience.  Imagine you are looking to hire a professional services firm: a lawyer, an accountant (who appreciates the value of content), or perhaps a content strategist (maybe me!).  What you read about them online affects how you view their competency.  And those impressions will impact how much you are prepared to pay for their services.

These effects are real, but require a longer period to realize. Long-term projects may not be appealing to organizations that only care about quarterly numbers, or to product managers who are plotting their next job hop.  But for those committed to improving the utility of content offered to prospective customers, the financial opportunity is big.

Discovering Value

When seen from the perspective of how brand credibility affects margins, content marketing that often doesn’t seem linked to any specific outcome now matters significantly.  It is not simply who knows about your firm that matters: it is how they evaluate your capabilities, and what they are prepared to pay for your product or service.  Potential customers not only need to be aware of a firm, and have a correct understanding of what it offers, they need to have a favorable impression of it as well.

Content that provides a distributed rather than direct financial contribution needs its own identity. Perhaps we should call it margin-enhancing content.  Such content enables brands to be more profitable, but does so indirectly.  The task of modeling and monitoring the impact of such content requires a deep awareness of how pieces may interact with and influence each other.  By its nature, estimating the strength of these relationships will be inexact.  But the upside of endeavoring to measure them is great.  And through experience and experimentation, the possibilities for more reliable measurements can only improve.

Measurement is important, but it’s not always obvious how to do it. For much of human history, people were unaware of radiation, because it could not be directly seen.  Eventually, the means to detect and measure it were developed.  The process of measuring the financial value of content involves a similar process of investigation: looking for evidence of its effects, and experimenting with ways to measure it more accurately.

— Michael Andrews


Connecting Organizations Through Metadata

Metadata is the foundation of a digitally-driven organization. Good data and analytics depend on solid metadata.  Executional agility depends on solid metadata. Yet few organizations manage metadata comprehensively.  They act as if they can improvise their way forward, without understanding how all the pieces fit together.  Organizational silos think about content and information in different ways, and are unable to trace the impact of content on organizational performance, or fully influence that performance through content. They need metadata that connects all their activities to achieve maximum benefit.

Babel in the Office

Let’s imagine an organization that sells a kitchen gadget.


The copywriter is concerned with how to attract interest from key groups.  She thinks about the audience in terms of personas, and constructs messages around tasks and topics of interest to these people.

The product manager is concerned with how different customer segments might react to different combinations of features. She also tracks the features and price points of competitors.

The data analyst pores over shipment data of product stock-keeping units (SKUs) to see which ZIP codes buy the most, and which ones return the product most often.

Each of these people supports the sales process.  Each, however, thinks about the customer in a different way.  And each defines the product differently as well.  They lack a shared vocabulary for exchanging insights.

A System-generated Problem

The different ways of considering metadata are often embedded in the various IT systems of an organization.  Systems are supposed to support people. Sometimes they trap people instead. How an organization implements metadata too often reveals how bad systems create suboptimal outcomes.

Organizations generate content and data to support a growing range of purposes. Data is everywhere, but understanding is stove-piped. Insights based on metadata are not easy to access.

We can broadly group the kinds of content that audiences encounter into three main areas: media, data, and service information.

External audiences encounter content and information supplied by many different systems

Media includes articles, videos and graphics designed to attract and retain customers and encourage behaviors such as sharing, sign-ups, inquiries, and purchases.  Such persuasive media is typically the responsibility of marketing.

Customer-facing data and packaged information support pre- and post-sales operations. This information can be diverse and will reflect the purpose of the organization.  Ecommerce firms have online product catalogs.  Membership organizations such as associations or professional groups provide events information relating to conferences, and may offer modular training materials to support accreditation.  Financial, insurance and health maintenance organizations supply data relating to a customer’s account and activities.  Product managers specify and supply this information, which is often the core of the product.

Service-related information centers on communicating and structuring tasks, and indicating status details.  Often this dimension has a big impact on the customer experience, such as when the customer is undergoing a transition such as learning how to operate something new, or resolving a problem.  Customer service and IT staff structure how tasks are defined and delivered in automated and human support.

Navigating between these realms is the user. He or she is an individual with a unique set of preferences and needs.  This individual seeks a seamless experience, and at times, a differentiated one that reflects specific requirements.

Numerous systems and databases supply bits of content and information to the user, and track what the user does and requests.  Marketing uses content management and digital asset management systems. Product managers feed into a range of databases, such as product information systems or event management systems. Customer service staff design and maintain their own systems to support training and problem resolution, and diagnose issues. Customer Relationship Management software centralizes information about the customer to track their actions and identify cross-selling and upselling opportunities.  Customer experience engines can draw on external data sources to monitor and shape online behaviors.

All these systems are potential silos.  They may “talk” to the other systems, but they don’t all talk in a language that all the human stakeholders can understand.  The stakeholders instead need to learn the language of a specific ERP or CRM application made by SAP, Oracle or Salesforce.

Metadata is Too Important for IT to Own

Data grows organically.  Business owners ask to add a field, and it gets added.  Data can be rolled up and cross tabulated, but only to an extent.  Different systems may have different definitions of items, and coordination relies on the matching of IDs between systems.

To their credit, IT staff can be masterful in pulling data from one system and pushing it into another.  Data exchange — moving data between systems — has been the solution to de-siloing.  APIs have made the task easier, as tight integration is not necessary.  But just because data are exchanged does not mean data are unified.

The answer to inconsistent descriptions of customers and content has been data warehousing. Everything gets dumped in the warehouse, and then a team sorts through the dump to try to figure out patterns.  Data mining has its uses, but it is not a helpful solution for people trying to understand the relationships between users and items of content.  It is often selective in what it looks at, and may be at a level of aggregation that individual employees can’t use.

Employees want visibility into the content they define and create, and want to know how customers are using it.  They want to track how content is performing, and change content to improve performance.  Unfortunately, the perspectives of data architects and data scientists are not well aligned with those of operational staff.  An analyst at Gartner noted that businesses “struggle to govern properly the actual data (and its business metadata) in the core business systems.”

A Common Language to Address Common Concerns

Too much measurement today concerns vaguely defined “stuff”: page views, sessions, or short-lived campaigns.

Often people compare variants A and B without defining precisely what is different between them.  If the A and B variations differ in several properties, one doesn’t learn which aspects made the winning variant perform better.  They learn which variant did better, but not which attributes of the content performed better.  It’s like watching a horse race: you see which horse won, but not why.

A lot of A/B testing is done because good metadata isn’t in place, so variations need to be consciously planned and crafted in an experiment.  If you don’t have good metadata, it is difficult to look retrospectively to see what had an impact.

In the absence of shared metadata, the impact of various elements isn’t clear.  Suppose someone wanted to know how much the color of the gadget shown in a promotional video matters to sales.  Did featuring the kitchen gadget in red in a how-to promotional video increase sales compared to other colors?  Do content creators know which color to feature in a video, based on past viewing stats, or past sales?  Some organizations can’t answer these questions.  Others can, but have to tease out the answer.  That’s because the metadata of the media asset, the digital platform, and the ordering system aren’t coordinated.
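Here is a hypothetical sketch of the kind of join that becomes possible once the video metadata, the web platform, and the ordering system share a vocabulary for an attribute like color.  The tables and field names are invented:

```python
# A sketch of linking media metadata to sales through a shared
# color vocabulary; all tables and fields are hypothetical.
import pandas as pd

videos = pd.DataFrame({"video_id": [1, 2], "featured_color": ["red", "black"]})
views = pd.DataFrame({"session": ["a", "b", "c"], "video_id": [1, 1, 2]})
orders = pd.DataFrame({"session": ["a", "c"], "sku_color": ["red", "red"]})

funnel = views.merge(videos, on="video_id").merge(orders, on="session", how="left")
conversion_by_color = (funnel.assign(bought=funnel["sku_color"].notna())
                             .groupby("featured_color")["bought"].mean())
print(conversion_by_color)  # purchase rate by color featured in the video
```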

Metadata lets you do some forensics: to explore relationships between things and actions.  It can help with root cause analysis.  Organizations are concerned with churn: customers who decide not to renew a service or membership, or stop buying a product they had purchased regularly.  While it is hard to trace all the customer interactions with an organization, one can at least link different encounters together to explore relationships.  For example, do the customers who leave tend to have certain characteristics?  Do they rely on certain content — perhaps help or instructional content?  What topics were people who leave most interested in?  Is there any relationship between usage of marketing content about a topic, and subsequent usage of self-service content on that topic?
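A sketch of that kind of forensic question, with invented data: link each customer’s content usage (via shared metadata) to whether they later churned.

```python
# Churn forensics sketch: all fields and values are hypothetical.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "churned": [True, False, True, False],
    "viewed_help_on_setup": [True, False, True, True],
})

churn_rate = customers.groupby("viewed_help_on_setup")["churned"].mean()
print(churn_rate)  # do customers who needed setup help leave more often?
```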

There is a growing awareness that how things are described internally within an organization needs to relate to how they are encountered outside the organization.  Online retailers are grappling with how to synchronize the metadata in product information management systems with the metadata they must publish online for SEO.  These areas are starting to converge, but not all organizations are ready.

Metadata’s Connecting Role

Metadata provides meaningful descriptions of elements and actions.  Connecting people and content through metadata entails identifying the attributes of both the people and the content, and the relationships between them.  Diverse business functions need uniform ways to describe important attributes of people and content, using a common vocabulary to indicate values.

The end goal is a unified description that provides a single view of the customer, and gives the customer a single, unified view of the organization.

Challenges

Different stakeholders need different levels of detail.  These differences involve both the granularity of facets covered, and whether information is collected and provided at the instance level or in aggregation.  One stakeholder wants to know about general patterns relating to a specific facet of content or type of user.  Another stakeholder wants precise metrics about a broad category of content or user.  Brands need to establish a mapping between the interests of different stakeholders to allow a common basis to trace information.

Much business metadata is item-centric.  Customers and products have IDs, which form the basis of what is tracked operationally.  Meanwhile, much content is described rather than ID’d.  These descriptions may not map directly to operational business metadata.  Operational business classifications such as product lines and sales and distribution territories don’t align with content description categories involving lifestyle-oriented product descriptions and personas.  Content metadata sometimes describes high level concepts that are absent in business metadata, which are typically focused on concrete properties.

The internal language an enterprise uses to describe things doesn’t match the external language of users.  We can see how terminology and focus differ in the diagram below.

Businesses and audiences have different ways of thinking

Not only do the terminologies not match, the descriptors often address different realms.  Audience-centric descriptions are often associated with outside sources such as user generated content, social media interactions, and external research.  Business centric metadata, in contrast, reflects information captured on forms, or is based on internal implicit behavioral data.

Brands need a unified taxonomy that the entire business can use.  They need to become more audience-centric in how they think about and describe people and products.  Consider the style of products.  Some people choose products based on how they look: after they buy one modern stainless-style product, they are more inclined to buy an unrelated product that happens to have the same modern stainless style, because the two go together in their home.  While some marketing copy and imagery might feature these items together, they aren’t associated in the business systems, since they represent different product categories.  From the perspective of sales data, any follow-on sales appear as statistical anomalies, rather than as opportune cross-selling.  The business doesn’t track products by style in any detail, which limits its ability to curate how products are featured in marketing content.

The gap between the business’s definition of the customer and the audience’s self-definition can be even wider.  Firms have solid data about what a customer has done, but may not manage information relating to people’s preferences.  Admittedly, it is difficult to know the preferences of individuals in detail, but there are opportunities to infer them.  By considering content as an expression of individual preferences and values, one can infer some preferences of individuals based on the content they look at.  For example, how likely are people who look at information on a product’s environmental impact to buy the product, compared with people who don’t view this content?

Steps toward a Common Language

Weaving together different descriptions is not a simple task. I will suggest four approaches that can help to connect metadata across different business functions.

Approaches to building unified metadata

First, the entire business should use the same descriptive vocabulary wherever possible.  Mutual understanding increases as jargon decreases.  If business units need to use precise, technical terminology that isn’t audience-friendly, then a synonym list can provide a one-to-one mapping of terms.  Avoid having different parties talk in different ways about things that are related and similar, but not identical.  Saying something is “kind of close” to something else doesn’t help people connect different domains of content easily.

Second, one should cross-map the different levels of detail of concern to various business units.  Copywriters would be overwhelmed having to think about 30 customer segments, though that number might be right for various marketing analysis purposes.  One should map the 30 segments to the six personas the copywriter relies on, as sketched below.  Figure out how to roll up items into larger conceptual categories, or break things down into subcategories according to different metadata properties.
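A hypothetical sketch of such a cross-mapping: an explicit, many-to-one table lets the analyst’s segments roll up to the copywriter’s personas, so both groups can trace the same customers.  All names are invented.

```python
# A many-to-one rollup from analyst segments to copywriter personas;
# segment and persona names are invented for illustration.
SEGMENT_TO_PERSONA = {
    "urban-gourmet-renter": "Aspiring Cook",
    "suburban-family-bulk-buyer": "Busy Parent",
    "retired-gadget-collector": "Kitchen Hobbyist",
    # ... one entry per segment, many segments per persona
}

def persona_for(segment):
    return SEGMENT_TO_PERSONA.get(segment, "Unmapped")
```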

Third, identify crosscutting metadata topics that aren’t the primary attributes of products and people, but can play a role in the interaction between them.  These might be secondary attributes such as the finish of a product, or more intangible attributes such as environmental friendliness.  Think about themes that connect unrelated products, or values that people have that products might embody.  Too few businesses think about the possibility that unrelated things might share common properties that connect them.

Fourth, brands should try to capture and reflect the audience-centric perspective as much as possible in their metadata.  One probably doesn’t have explicit data on whether someone enjoys preparing elaborate meals in the kitchen, but there could be scattered indications relating to this.  People might view pages about fancy or quick recipes: the metadata about the content, combined with viewing behavior, provides a signal of audience interest.  Visitors might post questions about a product suggesting concern about the complexity of a device, which indicates perceptions audiences have about things discussed in content, and suggests additional content and metadata to offer.  Behavioral data can combine with metadata to provide another layer of metadata.  These kinds of approaches are used in recommender systems for users, but could be adapted to provide recommendations to brands about how to change content.
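As a simple illustration of that layering, content tags plus viewing behavior can be turned into an inferred-interest signal.  The tags and counting scheme below are invented:

```python
# Layering behavior onto content metadata to infer interests;
# page IDs and tags are hypothetical.
PAGE_TAGS = {
    "recipe-42": {"elaborate-meals"},
    "recipe-7": {"quick-meals"},
    "product-faq-3": {"ease-of-use"},
}

def inferred_interests(pages_viewed):
    """Count tag exposures across a visitor's viewing history."""
    counts = {}
    for page in pages_viewed:
        for tag in PAGE_TAGS.get(page, ()):
            counts[tag] = counts.get(tag, 0) + 1
    return counts

print(inferred_interests(["recipe-42", "recipe-42", "product-faq-3"]))
```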

An Ambitious Possibility

Metadata is a connective tissue in an organization, describing items of content, as well as products and people in contexts not related to content.  As important as metadata is for content, it will not realize its full potential until content metadata is connected to and consistent with metadata used elsewhere in the organization.  Achieving such harmonization represents a huge challenge, but it will become more compelling as organizations seek to understand how content impacts their overall performance.

—Michael Andrews