A recent post on Google’s webmaster blog illustrates how metadata needs to address both the structure of web content, and the meaning of that content.
People who work in SEO talk about structured data a lot, while those who work in content strategy talk about structured content. These topics are obviously related, but the terminology used by each party obscures how each topic relates to the other. My take: both structured data and structured content are different dimensions of metadata. Structured data is generally descriptive metadata identifying entities discussed in the content. Structured content provides the foundation for structural metadata that indicates the logic and organization of the content. Both descriptive and structural metadata are important in content, and they should ideally be integrated together.
The Google blog advises publishers to include structured data in their content. The below screenshot shows how this advice is presented.
The advice presented follows a pattern:
Advice to follow
Best practices to implement advice (shown in green)
Actions not to do (shown in pink)
Some other items of advice in the post include another element:
Practices to avoid when implementing advice (shown in yellow)
We can see that the post follows good structure that is easy to scan and understand, and provides a foundation to reuse the information in other contexts. Now, let’s look at the post’s source code. This is where we’d expect to see the structured data associated with the content.
Disappointingly, no structured data is associated with the specific items of advice. The details of the advice are marked up with “class” attributes intended to style the content, but not to identify the meaning of the content. The only structured data on the page relates to the blog post in general (such as its author).
Imagine how the content could be reused if structured data identified the meaning of the advice. Someone might type a search looking for tips on “mistakes when using schema.org,” “why use schema.org,” or “schema.org best practices” and get specific bullets of content relating to their query.
In this example, the post’s author has done nothing wrong, though an opportunity has been missed nonetheless. Currently, schema.org doesn’t have any entity types that address advice statements that would contain sub-elements such as Rationale, Do, Avoid, and Don’t. The closest types are related to Questions and Answers, which are slightly different in their structure.
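As a rough illustration of the gap, here is a minimal Python sketch that builds JSON-LD for one item of advice using the closest existing schema.org types, Question and Answer. The advice wording and the choice to phrase the advice as a question are assumptions made for illustration only. The fit is imperfect: the Do and the Don't end up in the same text field.

```python
import json

# Hypothetical item of advice, mapped onto schema.org's Question/Answer types
# because no dedicated advice type exists. The wording is illustrative only.
advice_markup = {
    "@context": "https://schema.org",
    "@type": "Question",
    "name": "Why use structured data?",
    "acceptedAnswer": {
        "@type": "Answer",
        # Rationale, Do, Avoid, and Don't all collapse into one text field.
        "text": "Do: describe the main entity of the page. "
                "Don't: mark up content that is not visible to readers.",
    },
}

json_ld = json.dumps(advice_markup, indent=2)
print(json_ld)
```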
Because the structured data used in SEO, particularly schema.org, tends to focus on descriptive metadata, it has less coverage of other dimensions of metadata such as structural metadata indicating the role of content elements, or technical, administrative and rights metadata. All these kinds of metadata are important to address, to allow content to be shared and reused across different platforms and in different contexts. Fortunately, schema.org has been evolving quickly, and its coverage is improving every month. This expansion will allow for genuinely integrated metadata that indicates both the meaning and the structure of the content.
Metadata is a rich and important topic for everyone concerned with content published on the web. If you are interested in learning more about the many dimensions of metadata, you may be interested in my forthcoming book, Metadata Basics for Web Content, which will be available in early 2017 on Amazon.
The act of sorting seems so familiar you may think little about it. We sort our possessions to organize them. Some people even sort the socks in their drawer according to color or occasion. We sort to organize things, and more fundamentally, to prioritize our content. When considered in relation to IT, sorting is the programmatic prioritization of content using a simple ordering procedure. Simple sorting routines can offer much value, even if more sophisticated techniques to prioritize content are available.
Most discussion of sorting focuses on its technical dimensions: the rules for sorting correctly. Developers study various algorithms to optimize sorting. Editors must follow detailed rules to correctly sort entries appearing in indexes. Interaction designers focus on how to implement sorting options on user interfaces. Sorting is also utilized in statistical operations, though the strict criteria applied in statistical analysis are different from how people will think about sorting content.
In contrast to its technical dimensions, the experiential dimensions of sorting receive less attention. Besides considering how to sort items, we must also consider why audiences want items sorted in different scenarios. Every day, when we write lists, we make decisions that reflect our understanding of the purpose of sorting. Do we present the list as unordered bullets, or do we number the list items? If we number the items, what do the numbers represent? All of our content faces such existential dilemmas about indicating to others how items are prioritized.
Interaction designers often assume that users will want to sort items appearing in a list. Such an assumption confuses a want with a need. In many cases users don’t want to interact to get specific views of content. They may need different views, but don’t want unnecessary work to get those views. Consider the below screenshot, from a website devoted to user interface design patterns. A spreadsheet paradigm is imposed on web content. Audiences are given the option to sort content on any criteria, but have no guidance about which criteria are important.
This pattern is widely implemented. The user interface presents the illusion of control, but offers little to help audiences understand insights in the content. It presents unnecessary work for audiences, and provides an uncertain payoff. Screens like this continue getting made because sorting functionality is ubiquitous and easy to implement. Adding it seems cost-free. The publisher never did the hard work of asking why the audience wanted the information sorted. What value does a sort offer the audience? How can computers provide such sorting without requiring the audiences to specify it manually?
Discussions of sorting tend to treat it as a widget and labeling issue, debating the options that are possible and the user confusion that can result from having those options. Instead of worrying about the clarity of the widgets and labels, a better approach is to make editorial decisions that remove that complexity, and focus on the value sorting can offer.
Sorting practices can be explored in terms of five goals they support:
Sorting to Locate
Sorting to Rank
Sorting to Sequence
Sorting to Sample
Sorting to Profile and Evaluate
While these approaches differ in emphasis, they share a common goal of prioritizing content by making a judgment about what’s important to the audience. Users prefer sorted content because it reduces the amount of content they need to view and consider. Sorting doesn’t remove content: it highlights certain content, allowing other content to be ignored. Sorting should make using content easier for audiences, which is why that task needs to be delegated to computers whenever possible.
Prioritizing by Index Value: Sorting to Locate
Indexes are markers used to locate content. With indexes, the chief role of sorting is “findability”.
Alphabetic sorting is the most widely used form of sorting. Though clearly useful, alphabetic sorting’s value is sometimes presumed when it has none. In many situations, alphabetic sorting merely provides a semblance of order without providing any true value beyond psychological comfort. The chief value of alphabetic sorting arises when the user already knows about an item and expects to find it on the list. It can help locate the item, and confirm its inclusion. Developers have use for reverse alphabetic sorting, but audiences rarely benefit from reverse alphabetic ordering. Except in rare cases, it makes no sense to offer audiences a choice of sorting either in ascending or descending alphabetical order.
Another kind of locational sorting involves sorting items into nominal (named) categories. For example, the fact-checking website PolitiFact identifies statements made by American politicians to see if they have “flipped” their previous position. One can sort statements according to whether a statement is: 1. Partially (half) flipped; 2. Fully “flopped”; or 3. Not flipped. Such sorting helps audiences locate statements they might be interested in, to judge if a view is simply getting new attention, or whether it is a new position.
Prioritizing by Frequency: Sorting to Rank
Sorting is especially useful with quantitative values. Sorting can rank items on the basis of numeric values associated with the item. Sorting on the quantity determines the ordinal ranking.
In contrast to alphabetic sorting, descending order (from high to low) is frequently useful for quantitative values. People are often looking for content relating to the highest rated item, or best performing one.
In addition to sorting by explicit quantitative values such as price, less obvious applications of rank-based sorting exist that utilize hidden or implicit information. Behavioral values can summarize activity relating to the content, such as when articles are sorted by number of comments. The most common form of behavioral value is the consumption-related popularity value. These come in many forms, and are widely used to sort content. Examples include sorting by:
Most frequently bought together
Another use of quantitative sorts is to create “buckets” that rank content in summary form. Buckets are derived data that are not explicitly visible in the content, where clustering (summarizing frequency) is performed in tandem with sorting. Publishers create buckets covering ranges of values to summarize how frequently items appear. Items are sorted into buckets defined by ranges (e.g., Number of items “below $100,” “$100 – 499,” “$500 and above”). When applied to the sorting of content, these intervals don’t need to be equal in size.
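A minimal Python sketch of bucketed sorting follows. The bucket boundaries, the sample prices, and the function names are all assumptions for illustration; the point is only that items are assigned to unequal ranges and the counts are reported in range order.

```python
from collections import Counter

# Hypothetical price buckets as (label, lower bound, upper bound); the upper
# bound is exclusive, and the ranges are deliberately unequal in size.
BUCKETS = [
    ("below $100", 0, 100),
    ("$100 to $499", 100, 500),
    ("$500 and above", 500, float("inf")),
]

def bucket_for(price):
    """Return the label of the bucket whose range contains the price."""
    for label, low, high in BUCKETS:
        if low <= price < high:
            return label
    raise ValueError(f"no bucket for price {price}")

def summarize(prices):
    """Count items per bucket, keeping the buckets in ascending price order."""
    counts = Counter(bucket_for(p) for p in prices)
    return [(label, counts[label]) for label, _, _ in BUCKETS]

prices = [25, 99, 100, 250, 499, 500, 1200]  # made-up sample data
print(summarize(prices))
```

Using a lower bound that is inclusive and an upper bound that is exclusive means every price falls into exactly one bucket.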
In addition to ranking by a single criterion, publishers can provide different ranking perspectives that consider alternative criteria. Alternative-criteria rankings can be complex, so care is needed to hide the complexity from audiences. Imagine that content addresses five products: A, B, C, D, and E. A consumer ratings website might sort-rank items according to different criteria:
Best picks for budget buyers: A, D
Best picks for power users: E, C
Best picks for novices: A, B
In some respects, this kind of sorting is similar to the sorting of columns in a spreadsheet. In a spreadsheet, one can rank alternative criteria by selecting different columns to reorder rows, to see how rankings change when various criteria are considered. What’s different here is that editorial choices are being made instead of forcing the reader to decide what’s important to focus on. Each category represents a theme rather than a formal attribute. For example, what’s best for power users might be a combination of the number of features a product offers, and the extent to which the performance of key features is above average. That kind of score can be computed behind the scenes. Once items are ranked, only the top two items are presented, to keep the focus on what’s best in each category.
Another ranking pattern is the “Top N by category” pattern. Many publishers will curate content according to how the content ranks within different categories. A news site might present a list of the top five articles in sports, in health, in politics and in business. The curatorial decisions relate to what categories are most important in the larger body of content, how many items (N) to show, and on what basis (number of views, comments, etc.).
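Here is one way such a behind-the-scenes score could be sketched in Python. The product data and the scoring formula (feature count plus the amount by which performance exceeds the average) are invented for illustration, not a recommended weighting.

```python
from statistics import mean

# Made-up data for the five products: feature count and a performance score.
products = {
    "A": {"features": 4, "performance": 60},
    "B": {"features": 5, "performance": 65},
    "C": {"features": 9, "performance": 80},
    "D": {"features": 3, "performance": 55},
    "E": {"features": 10, "performance": 90},
}

def power_user_score(product, avg_performance):
    """Hypothetical score: feature count plus how far performance exceeds average."""
    above_average = max(0, product["performance"] - avg_performance)
    return product["features"] + above_average

def top_picks(products, n=2):
    """Rank products by the score and return the names of the top n."""
    avg = mean(p["performance"] for p in products.values())
    ranked = sorted(
        products,
        key=lambda name: power_user_score(products[name], avg),
        reverse=True,
    )
    return ranked[:n]

print(top_picks(products))
```

The reader only ever sees the two names returned by `top_picks`; the weighting stays an editorial decision inside the function.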
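The “Top N by category” pattern can be sketched in a few lines of Python. The articles, categories, and view counts below are made up, and ranking by views is just one of the possible bases mentioned above.

```python
from collections import defaultdict

# Made-up articles: (title, category, view count).
articles = [
    ("Cup final recap", "sports", 900),
    ("Transfer rumors", "sports", 400),
    ("Marathon guide", "sports", 650),
    ("Flu season tips", "health", 300),
    ("Sleep and memory", "health", 720),
    ("Budget vote", "politics", 510),
]

def top_n_by_category(items, n):
    """Group items by category, then keep the n most viewed titles in each."""
    grouped = defaultdict(list)
    for title, category, views in items:
        grouped[category].append((views, title))
    return {
        category: [title for _, title in sorted(entries, reverse=True)[:n]]
        for category, entries in grouped.items()
    }

print(top_n_by_category(articles, n=2))
```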
A rank sorting can be applied to any content that involves a scale. While most often rankings are based on numeric scores, they can also be used with qualitative scales such as good, better, and best. Sorts based on qualitative ranking are known as enumerations.
Prioritizing by Time: Sorting to Sequence
Much content has a time dimension, and can be compared to other content according to various time frames. These comparisons are made through sequence-related sorting.
Chronological sorts use dates to sort content. They can be valuable in either ascending (earlier to later) or descending (recent to older) order. Ascending chronological order is useful when content items reference and build on each other. For example, I find live blogged stories easier to follow when they are in chronological order, since many statements will depend on what was said in prior statements. Many live blogs choose reverse chronological order, however. The benefit of reverse chronological order is to highlight the newest information, which is assumed to be more important than older information. Reverse chronological order works best when each item is independent of the others, so some live bloggers try to make their statements be understandable without referring to other content items.
An important way to sort content is by how stale or fresh it is. Computer scientists refer to stale content as LRU or “Least Recently Used,” where LRU content is considered obsolete and is purged from the computer’s cache memory. The concept of freshness is captured by a term borrowed from accounting known as LIFO or “Last In, First Out”. Content that is new or changed is generally more valuable than older content. People look for new content, or content that’s been updated.
LIFO is useful as a way to sort behavioral data. In many situations, the content someone most recently viewed will be the most likely content they will want to use again. Imagine someone constantly checking some pages relating to the stock performance of different companies they own or are considering buying. Because they routinely check these pages, the sorting would present these pages as a list ordered according to how recently they were viewed, with the most recently viewed at the top of the list. LIFO behavioral sorting is dynamic, changing as audience interests do, so that new items get added immediately and fading interests soon disappear from the list. It is more flexible, and less work, than having the audience create a custom list.
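A recently viewed list of this kind might be sketched as follows. The class name, the limit of three pages, and the company names are assumptions for illustration.

```python
from collections import OrderedDict

class RecentlyViewed:
    """Keeps the most recently viewed pages first, up to a fixed limit."""

    def __init__(self, limit=3):
        self.limit = limit
        self._pages = OrderedDict()

    def view(self, page):
        # Re-viewing a page moves it to the newest position instead of duplicating it.
        self._pages.pop(page, None)
        self._pages[page] = True
        # Drop the least recently viewed page once the list is over its limit.
        while len(self._pages) > self.limit:
            self._pages.popitem(last=False)

    def listing(self):
        """Return pages ordered from most to least recently viewed."""
        return list(reversed(self._pages))

recent = RecentlyViewed(limit=3)
for page in ["ACME", "Globex", "Initech", "ACME", "Umbrella"]:
    recent.view(page)
print(recent.listing())
```

Because the oldest entry is dropped whenever the limit is exceeded, fading interests leave the list without the reader having to manage it.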
Sequences are a special kind of list where the order is predefined. Content relating to a sequence should be automatically sorted. Sequences can take various forms:
Procedural with antecedent dependencies, such as “Step 1, Step 2”
A big use of sequences is to position content in time, providing a context for the sorted content. Three kinds of content sequences are:
Before/After sequences position a topic within a spectrum of time. When items are classified according to their stage, they can be compared. Items at the same stage are similar, and one can locate content about events that preceded that stage, and identify content about other entities that are in a more advanced stage. An example of such a sequence sorting would be articles about a class of new drugs, where different companies are in different stages of market introduction.
Now/Next sequences are similar to Before/After, but are more focused on content relating to a single person or entity, rather than a range of entities. Content can be sorted according to what matches the current context, and list content that will be relevant to the subsequent stage. An example of Now/Next sorting is content about repairing a product.
Lower/Higher sequences define time in terms of proficiency required. The content is ranked according to the abilities of the reader. Publishers frequently classify content according to the level of expertise needed to understand the content. Content might be classified as Beginner, Basic, Intermediate, Advanced, or Expert. The sequence associated with those labels is generally easy to understand. Alternatively, the publisher could rank the difficulty of the content according to a color code:
White belt (= Beginner)
Yellow belt (= Basic)
Blue belt (= Intermediate)
Purple belt (= Advanced)
Black belt (= Expert)
Such rankings can be useful provided the audience knows the level they are at. They might graduate from one tier after viewing all content in that tier. The content sorting might identify and show all Yellow belt content that the reader has not yet seen.
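As a sketch, the belt levels can be modeled as an ordered enumeration in Python, which makes both the difficulty sort and the "unseen Yellow belt content" filter straightforward. The content titles and the function name are hypothetical.

```python
from enum import IntEnum

class Belt(IntEnum):
    """Difficulty tiers from the list above, ordered from easiest to hardest."""
    WHITE = 1   # Beginner
    YELLOW = 2  # Basic
    BLUE = 3    # Intermediate
    PURPLE = 4  # Advanced
    BLACK = 5   # Expert

# Made-up content items: (title, belt level).
content = [
    ("Tuning queries", Belt.BLUE),
    ("First steps", Belt.WHITE),
    ("Writing filters", Belt.YELLOW),
    ("Saving reports", Belt.YELLOW),
]

def unseen_at_level(items, level, seen):
    """Return titles at the given level that the reader has not viewed yet."""
    return [title for title, belt in items if belt == level and title not in seen]

# Sorting on the belt value orders content from easiest to hardest.
by_difficulty = [title for title, belt in sorted(content, key=lambda item: item[1])]
print(by_difficulty)
print(unseen_at_level(content, Belt.YELLOW, seen={"Writing filters"}))
```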
Prioritizing by Novelty: Sorting to Sample
While sorting is generally thought of as involving either ascending or descending order, publishers can also sort items in a statistically random order. This makes it likely that the items presented will differ each time a page loads.
For audiences, random sorting provides novelty, offering something that may not have been encountered previously. Random sorting and selection can make content viewing more interesting provided the item pool represents content of potential interest.
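A random selection can be sketched with Python's standard library. The item pool and function name are made up. Note that sampling guarantees distinct items within one selection, while two page loads can still happen to show the same items.

```python
import random

def random_selection(items, k, seed=None):
    """Pick k distinct items in random order; a seed makes the result repeatable."""
    rng = random.Random(seed)
    return rng.sample(items, k)

pool = ["Article A", "Article B", "Article C", "Article D", "Article E"]  # made-up pool
print(random_selection(pool, k=3))
```

Passing a seed is useful only for testing; in production the seed would normally be left unset so that each page load draws a fresh sample.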
Random presentation of content can be used to discover if certain content receives greater than expected attention. Publishers might discover through random promotion of content that audiences are interested in topics that previously did not receive much attention when prioritized by frequency of views.
Prioritizing by Relationships: Sorting to Profile and Evaluate
Sometimes audiences need to sort content to see the relationships between items. One example would be to see what’s more general and what’s more specific. For example, Wikipedia uses a hierarchy based on categories, subcategories, and pages. The entry for a category will be broader in scope than an entry for a page that’s not a category.
For many topics, audiences understand which items are broader than others. But for more specialized fields, automated sorting of topics from broader to narrower is useful. Suppose someone encountered content relating to enzymes. They see content on the following topics:
The list has no order. Unless the reader is a specialist, they would not know which topic is the most specific. The sorting order from broader to narrower would be Digestives > Acid preparations > Betaine hydrochloride. Hierarchies provide context for the content, indicating what is background and what is detail.
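One way to automate a broader-to-narrower sort is to record each topic's parent and sort by depth in the hierarchy. The sketch below assumes the parent links implied by the ordering just described; the `PARENT` mapping and `depth` function are illustrative names.

```python
# Hypothetical parent links for the three topics named above (child -> parent).
PARENT = {
    "Digestives": None,
    "Acid preparations": "Digestives",
    "Betaine hydrochloride": "Acid preparations",
}

def depth(topic):
    """Count how many ancestors a topic has; broader topics have fewer."""
    levels = 0
    while PARENT[topic] is not None:
        topic = PARENT[topic]
        levels += 1
    return levels

unordered = ["Betaine hydrochloride", "Digestives", "Acid preparations"]
broad_to_narrow = sorted(unordered, key=depth)
print(" > ".join(broad_to_narrow))
```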
Websites typically present hierarchies to audiences as an input into a task to complete. The website will require audiences to assess visual relationships in a menu, and select the narrower option in a manual process of “drilling down”. Alternatively, publishers can incorporate such sorting into an automated presentation of content, where the content order is predetermined for the audience. Two common patterns are to show broader topics followed by narrower examples, and to show a narrow example then present the broader topic it represents. Instructional content often utilizes non-visible, programmatic hierarchies to guide presentation of content, to ensure comprehension, or to encourage the review of key concepts.
Nested sorts involve sorting on two or more criteria, such as sorting on a qualitative category and a numeric value together. They are difficult for users to specify themselves, so are best offered pre-packaged. Nested sorts are useful for dynamic content that will change frequently. PolitiFact, the Pulitzer Prize-winning website, rates statements by politicians according to their veracity. A nested sort allows the audience to see how many statements by a politician were associated with different categories of truthfulness:
Pants on Fire!
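A nested sort on a qualitative category and a numeric value can be sketched with a tuple sort key in Python. The politicians and tallies are invented, and apart from "Pants on Fire!" the rating labels are assumed for illustration rather than taken from the text.

```python
# Rating scale ordered from most to least truthful. Only "Pants on Fire!"
# appears in the text; the other labels are assumed for illustration.
RATING_ORDER = ["True", "Half True", "False", "Pants on Fire!"]

# Made-up tallies: (politician, rating, number of statements).
tallies = [
    ("Smith", "False", 4),
    ("Jones", "True", 7),
    ("Smith", "True", 2),
    ("Jones", "Pants on Fire!", 1),
    ("Smith", "Pants on Fire!", 3),
]

def nested_sort(rows):
    """Sort by rating category first, then by statement count, highest first."""
    return sorted(rows, key=lambda row: (RATING_ORDER.index(row[1]), -row[2]))

for politician, rating, count in nested_sort(tallies):
    print(f"{rating:15} {politician:6} {count}")
```

The category order comes from the position in `RATING_ORDER` rather than from alphabetical order, which is what lets a qualitative scale act as the outer sort.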
Automated Editorial Sorting
Content designers and content engineers should consider how computers can order lists to deliver the greatest audience benefits. This approach can be described as automated editorial sorting. It is automated, in the sense that a computer algorithm performs the sort without requiring user interaction. And it is editorial in the sense that prioritization delivered by the sort reflects a judgment concerning what information is most valuable to highlight.
Sorting should not be treated as generic functionality that can be applied indiscriminately to any kind of content. Sorting should provide context for audiences. To be valuable, sorting should surface the content that audiences consider to be their highest priority. It is not enough to give audiences tools to dig out that information themselves. Audiences expect publishers to anticipate what they need, and present content to them in a well ordered manner.
I recently visited the Smithsonian’s American History Museum to see an exhibit on food in American culture. I noticed a Tappan microwave oven in an exhibit case, the kind of microwave that was in use during my early childhood. If I ever needed evidence that I’m getting older, it’s seeing something from my childhood in the Smithsonian’s collection. My family didn’t own a Tappan microwave, but I recall a neighboring family did. When it came to microwave technology, my family wasn’t what, in today’s parlance, would be called an early adopter.
We take microwaves for granted today, but in the early years of microwaves they were exotic. They were radically different from conventional ovens, and expensive: originally over a thousand dollars. Selling something so “disruptive” to families required making them seem enticing, simple, and idiot-proof. The product needed to promise to be easy, and deliver on that promise. We all want to feel competent, even when using a microwave oven. We don’t want exploding liquids or gooey muck being our payback for committing to a new technology.
In addition to its historical cultural significance to the Smithsonian’s curators, this particular model was also notable for an unusual feature not normally seen on ovens of any kind. At the base of the microwave was a drawer that contained recipe cards. I started to wonder if the designers included the recipe card drawer as content marketing to get hesitant shoppers to buy the microwave, or as product content designed to make sure owners get full satisfaction from their decision.
Content marketing and product content are two widely used terms that are sometimes applied to similar circumstances. Are they distinct ideas, do they overlap, or are they fundamentally the same thing?
Let’s consider another example that’s more current. Last week I got a sample of shampoo to try out. Unlike the normal shampoo I use from the same brand, this shampoo involved two parts. That doesn’t sound too challenging: I just needed to figure out which part to use first. As I was about to step into the shower, I opened the package and saw instructions. Fortunately they weren’t long, as I wasn’t wearing my eyeglasses at this point. I could see the instructions mentioned the phrase “apply vigorously”. Every time I’ve ever read instructions for shampoo or any other soap, they implore me to apply the stuff vigorously. The instructions seemed to convey no information worth noting. However, at the end of the instructions was a call-to-action telling me to go to a website to watch a video that provides more detailed information on how to use the product. I suppose some people have waterproof tablets to watch videos in their showers these days, but I again am not an early adopter in this area. A week later, I have half a container of Part Two left over, while Part One is finished. I still have not watched the video.
Is the video for the shampoo content marketing encouraging me to try the product, or product content telling me how to use the product?
In the view of some people, trying to make a distinction between content marketing and product content is counterproductive. They will recommend integrating the pre-sales and post-sales experiences. Many people who develop instructional information for products argue that this content is increasingly important to customer purchase decisions. I agree that many synergies are possible between content focused on pre-sales needs, and those focused on post-sales needs. But I don’t believe we can yet declare that distinctions between pre- and post-sales content have disappeared.
Historically, there was a clear division between content to support marketing and content to support product use. Marketing content made people want the product, while support content told people how to use the product. The terms content marketing and product content have emerged over the past decade to address new priorities. Products and services can involve a growing number of features that consumers expect will work together to solve their high level problems and make their lives more enjoyable. Consumers expect proof for outcomes promised, and to understand differences between choices offered. Content marketing focuses on promoting the value of using a product or service in the context of the customer’s life situation, instead of making vague promises or touting meaningless advantages as was traditional in marketing content. Product content highlights choices and options available, instead of having a remedial focus as customer service content historically has. Both content marketing and product content aim to be useful to customers, but they still have distinct roles.
Content marketing is still largely focused on pre-sales, or on encouraging repeat sales. Content marketing collateral is generally distributed and accessed separately from the product or service it concerns. Product content (any information relating to specific decisions customers must make relating to product features) will frequently occur after the purchase, or at least very near the time of purchase. Often, product content is embedded in the product itself, rather than being accessed separately. In the case of services, product content is often integrated in smartphone apps that let customers use the service, and choose options.
Two critical questions face corporations today:
When is the content accessed?
Where is it accessed?
When and where content is accessed has become more murky because sales is increasingly a process, rather than a discrete event. In the past, the period before the sale, and after the sale, were distinct. Today, sales is an ongoing process of evaluation. Companies may sell platforms on which to sell additional products and services, such as when Amazon’s Kindle displays ads for book titles it promotes. Customers need to configure products prior to purchase, and may reconfigure them after becoming a customer. Many digital products are sold as services that have a limited duration, and must be renewed. Many products are sold on a trial basis, where customers can try before they commit to buying. A growing range of content can be embedded in product user interfaces or service apps, but often companies need to rely on email and web channels to communicate, educate, and complete transactions. The product is not always the ideal channel for the audience to consider the content.
These questions don’t have predefined answers. They require thinking deeply about the ultimate purpose of the content. Even if content can support multiple goals, helping existing customers use a service while encouraging them to expand their usage, it doesn’t follow that all these goals should be given equal emphasis. When the same content seems like it exists to serve several different purposes, it can confuse both audiences and stakeholders in organizations.
Let’s return to the example of the video explaining the shampoo. I initially wasn’t aware the video existed. The content wasn’t in the right channel for me to access it when I became aware of it. I wasn’t clear if it was promotional content, or truly instructional content. I didn’t know if I needed to see it before using the product, while using the product, or perhaps after using the product.
The content’s purpose also impacts how organizations divide responsibility for the content. Who was responsible for the video, marketing or customer service? Sometimes it’s not obvious who should own the content, because organizations can’t dictate to customers what to do. I routinely get a message from a cloud service that I’m approaching a storage limit, and can buy more storage. But I may wish instead to learn how to reduce my usage of storage, rather than hear about how I can get more of it. I’m annoyed that I seem to be hitting the limit, since I’m not aware I’m using the service that much. This is a common situation, where companies look to up-sell at a moment when customers are starting to doubt the value of the service itself. The company and the customer hold mismatched views about what purpose the content should serve.
When designing content, companies must always be clear about the customer’s purpose. Even though good support content can increase customer loyalty, support content is not the same as marketing content. Customers have different purposes when looking at marketing content and support content. They want it at different times, and often through different channels. Both content marketing and product content are becoming more user focused. These content types are inter-related and should be coordinated. Yet content marketing and product content still serve distinct roles, and it’s important to offer the right details at the right time in the right channel. Be wary of those who repeat the slogan that all content is marketing content: they are likely to deliver the wrong content to audiences.