Categories
Intelligent Content

Wine, Content, and Domain Models

Suppose your organization wants to become the preeminent source of information about a topic. It aims to give audiences the ability to look at any dimension of a topic they might be interested in. How would you offer this?

To deliver informationally rich content, numerous content items need to be associated to one another. Content needs to be modular, with components that work together. But how do these things relate to each other? Where does one start?

Content models define how units of content should interact. Content modelling can be difficult to grasp and practice, partly because it is not a single uniform method. It encompasses a spectrum of related approaches that can be adapted to different needs.

People sometimes start to model their content before they know all the content they really need. They focus on what content has already been created, and do not explore what content is not yet available that might be of interest to users.

Content models are often more robust when they are backed by a domain model. A domain model enables content designers to untangle a messy topic and explore and define requirements and design solutions.

The role of content modelling

A content model is the end goal of a domain model. Rachel Lovinger has been instrumental in developing and advocating the practice of content modelling, so I will rely on her definition. She states: “A content model documents all the different types of content you will have for a given project. It contains detailed definitions of each content type’s elements and their relationships to each other.” She recommends using content models to bridge perspectives on a team.

“A content model helps clarify requirements and encourages collaboration between the designers, the developers creating the CMS, and the content creators.” — Rachel Lovinger

In addition to facilitating project delivery, content models improve how content is delivered to audiences. Content models can enable personalization, adaptive content, and content APIs. Cleve Gibbon, a collaborator of Rachel Lovinger in evangelizing content models, notes: “Great APIs are founded upon solid models. So if you’re building a Content API, be sure to create a content model FIRST that conveys the required level of structure and meaning.”

The spectrum of content modelling

Models can represent different dimensions of a topic: either conceptual, or formal and structural. Content models can indicate how to assemble content components. But first you need to know how solidly your content types are defined.

On one end of the spectrum, you may have well-defined, fixed content. In such cases, you can develop what Deane Barker calls a relational content model. He defines it as “the concept of how different, separately-managed pieces of content relate to each other. (This is distinct from ‘discrete content modeling,’ which is how you structure a single piece of content.)” He explains the goal as “the idea of taking multiple discrete content objects (articles, sections, issues) and ‘rolling them up’ into a more complex content object (publication).”

On the other end of the spectrum, you may have fluid content, where the exact requirements are still emerging and many different hubs of content are possible. In such cases, a domain focused, ontology based form of modelling can be helpful. This approach has been used by the BBC for several large projects. Mike Atherton emphasizes the importance of the domain of the topic in content models: “A content model maps our subject domain, not our website structure.” He advises: “Concentrate on modelling real (physical and metaphysical) things not web pages.”

One way to consider the differences between a content model and a domain model is the metadata they emphasize. Rachel Lovinger states: “The Content Model is primarily concerned with structural metadata, while the Domain Model is largely concerned with descriptive metadata.”

A domain model and content model are complementary. A domain model helps you describe things that will be represented by content, while the content model helps you structure the content. Using both allows you to understand the relationship of a real world entity with a content entity.

A domain model is a useful place to start when content does not yet exist, or one is looking for a fresh redesign of content. Domain models may be considered as the prequel to content models. By focusing on entities in the real world, and the relationships between these entities, one can see opportunities to develop content associated with these entities, and what elements would be needed for that content. The correspondence of domain entity type, and content type, is illustrated in the table.

The relationship between a domain model and content model

Domain models in the real world: Italian wine

Domain models can clarify one’s understanding of a topic, and offer insights into how different items of information relate to each other. Domain modelling emerged as a strategy in software development to bridge analysis and design of complex business domains by using a shared verbal and visual language between experts, end users and developers. Domain models can be especially useful for complicated and messy topics. They would seem perfect for understanding Italian wine.

When you live in Italy, as I do, understanding Italian wine is a practical problem. Wine is ubiquitous, but understanding Italian wine is not self-evident. Walk into an Italian wine store and you are confronted with walls of bottles whose contents are largely unrecognizable. It’s not that all wine is difficult to understand. When I lived in New Zealand, I had a good idea what different wines were about. It’s Italian wine that is the challenge.

The famous wine critic Hugh Johnson once wrote: “the already bewildering complexity of Italian wines has become tangled enough to drive a critic to drink.” Italian wine is particularly hard to understand because of its heterogeneity. Even the imposition of standardized nomenclature to designate where a wine is from results in a bewildering array of non-standard implementations of these standards. Idiosyncratic traditions, politics, and rogue approaches mean that wines are described in great detail, but in richly differing ways.

At the core of why Italian wine is difficult to decipher is its product architecture: how specific wines are labelled. Consumers need an easy way to know the basic characteristics of a wine based on its label. Do consumers think of wine in terms of where it’s from (Burgundy) or what grapes it is made from (Chardonnay)?[1]  High volume wine producers have attempted to solve the product architecture problem by promoting brand awareness of a grape variety or a region. What happens when consumers are not familiar with either the origin name or the grape name?

Unfortunately, Italian wine labels are unusually difficult to decipher. Italian labels will show the producer + (grape variety and/or geographic indication) + year. That seems reasonable enough, until the consumer realizes that the only items on the label they might recognize are the digits of the year. Even if they have familiarity with another proper name on the label, that is not sufficient to make a selection decision.

The most significant piece of information about the kind of wine is indicated by the grape variety and/or the geographic indication (a regional designation similar to an appellation in France). Between these two items, there are nearly 1000 different varietals and zones that indicate the basic composition of the wine. [2]  To get a sense of how good the wine is, the most reliable information is the producer, and the year of vintage. Yet there are many thousands of wine producers in Italy of varying abilities, and the correlation of product quality to year of vintage is very specific to the variety of wine and where it was produced.

The complexity of Italian wine would seem tailored for digital content. But existing digital-only information sources on the web tend to be shallow — both in terms of their range of attributes, and their selective coverage.

Good information about wines, producers, and regions is available from several well known printed guides, such as those by L’Espresso, Gambero Rosso, Touring Club, Bibenda, and Slow Food. Despite the editorial quality of the content, the information is not as usable as it could be. Depending on the specific organization of the book, the information is stovepiped in one way or another. The editors of each guide assume a fixed path of entry that generally leads to a producer profile. Users are expected to think like the editors to uncover information of interest to them.

In some cases there are iPad versions of these printed guides, but they don’t feel natively digital, and require lots of tapping to move around from screen to screen. They are less usable than the print version, because they are slower to move through, and one’s orientation can get lost when hopping between screens. The content, while structured editorially, is not structured digitally with digital metadata. There is no ability to move laterally through the content: navigation is hierarchical. Unfortunately shovelware that ports a printed product and dumps it into a tablet format is too common, due to the false promises embedded in Adobe InDesign.

What users need is not simply a catalog of items, but a way to make sense of the bigger picture, in addition to exploring the detail. The heavy focus on profiles means that the user doesn’t see easily how these items relate to other things. They also miss seeing collective behaviors of similar items, which is possible when one digitally aggregates items sharing the same metadata. Thinking through these relationships and behaviors is one benefit of domain modelling.

Understanding the domain

How do people think about a subject? Mike Atherton suggests: “Experts map the world, users mark points of interest.” It helps to know how experts think about a topic like wine, and then during design, figure out what more typical users consider high priority goals. What aspects of wine do people consider significant? How might different aspects be pulled together into interesting items of content?

The topic of wine is distinctive because many people want to become experts, in contrast to other products. Getting information about the product is rarely a perfunctory task, but a connoisseurial pastime. Some people want to develop a broad knowledge about all styles of wine, while other people want to have a deep knowledge about a few specific producers or product areas, perhaps tied to places they go on holiday. Many things people might be interested in are non-obvious. For example, soil characteristics can influence how a grape variety tastes. Others may be interested in the environmental credentials of a producer.

How to break things down so they can be managed

The most important task when developing a domain model is to identify appropriate entities. An entity is a thing, either tangible or conceptual, with a distinct identity. It’s not the same as an existing item of content — the content may not exist yet. Entities, to use the words of Cleve Gibbon, are “first class citizens in the business domain” — they are the actors in the drama on the stage.

Entities have attributes — characteristics. Attributes do not necessarily become a field in the content, but they often do. That decision needs to be made when the content is designed. Taste is certainly an attribute of wine, but is not necessarily a field in a description of a wine.
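The distinction between an entity's attributes and the fields of a content type can be sketched in a few lines of Python. This is only an illustration: the `Wine` class, its attribute names, the `LISTING_FIELDS` selection, and the sample values are all hypothetical and are not taken from any real guide or system.

```python
from dataclasses import dataclass, asdict


@dataclass
class Wine:
    """A domain entity: everything we might know about a wine."""
    name: str
    vintage: int
    grape_variety: str
    taste_notes: str        # an attribute of the entity...
    bottles_produced: int


# ...but content design decides which attributes become fields.
# This hypothetical listing content type leaves out taste_notes.
LISTING_FIELDS = ("name", "vintage", "grape_variety")


def to_listing(wine: Wine) -> dict:
    """Project a domain entity onto the fields of one content type."""
    data = asdict(wine)
    return {field: data[field] for field in LISTING_FIELDS}


# Made-up sample values, for illustration only.
example = Wine("Example Rosso", 2012, "Sangiovese", "placeholder notes", 12000)
listing = to_listing(example)
print(listing)
```

The entity keeps every attribute, so a different content type could later expose taste or production volume without changing the domain model.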

Once entities have been identified, it is necessary to determine where to put attributes, and whether to break entities into smaller units. Often, one discovers intermediate zones that straddle two entities. The horticultural characteristics of the vineyard reflect the interaction of the producer and the wine produced. The interplay between region and varietal defines the vintage for a given year. These intermediate areas may not deserve to be entities themselves, but one should consider how to make sure their role remains visible.

What a domain model for Italian wine looks like

It is helpful to first consider the relationships between entities, then examine the attributes associated with each entity.

When looking at entities, two things are important. First, how many instances are there for each entity type? The entity map shows that for most of the entities, there are hundreds or even thousands of instances. This large number suggests that establishing meaningful relationships between entities will be important if users are to be successful navigating through such a large volume of content. Second, what is the essential character of relationships between entities? We want to know how many connections there are between entities: the more connections to other entities, the richer the potential interaction of information. We also want to know if the relationship between entities is a one-to-one relationship, a one-to-many relationship, or a many-to-many relationship. The “crow’s feet” in our entity map indicate numerous many-to-many relationships. That may make the design of content a bit more challenging, but it also indicates many interesting connections. Our content is a valuable resource when it’s not easy to see these connections in one’s head.
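One way to check the character of a relationship is to index it in both directions and see whether either side ever has more than one partner. The sketch below assumes a hypothetical list of producer and geographic indication pairs; the names "Producer A" and "Zone X" are placeholders, not real data.

```python
from collections import defaultdict

# Hypothetical (producer, geographic indication) pairs.
produces_in = [
    ("Producer A", "Zone X"),
    ("Producer A", "Zone Y"),
    ("Producer B", "Zone X"),
]

# Index the relationship in both directions.
zones_by_producer = defaultdict(set)
producers_by_zone = defaultdict(set)
for producer, zone in produces_in:
    zones_by_producer[producer].add(zone)
    producers_by_zone[zone].add(producer)


def cardinality(forward, backward):
    """Classify a relationship from its two directional indexes."""
    many_forward = any(len(targets) > 1 for targets in forward.values())
    many_backward = any(len(targets) > 1 for targets in backward.values())
    if many_forward and many_backward:
        return "many-to-many"
    if many_forward or many_backward:
        return "one-to-many"
    return "one-to-one"


relationship = cardinality(zones_by_producer, producers_by_zone)
print(relationship)
```

Here a producer works in several zones and a zone hosts several producers, so the relationship is many-to-many, which is the case the crow's feet notation flags.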

Relationships between different entity types associated with the domain of Italian wine.

Next, explore the attributes associated with each entity. The goal is to identify and associate attributes of entities. Each entity has a number of attributes. Some will be short fields, others will involve longer text descriptions. There is no right number of attributes, provided all attributes are meaningful. The number of attributes to implement in design will depend on both business and design decisions. There will be a business decision concerning the cost of acquiring the information related to the attribute, and the usefulness to consumers of that information. There will also be a design decision relating to which attributes to expose to which audiences.

Typical attributes of each entity type relating to domain of Italian wine

Our model shows attributes that are commonly associated with the domain of Italian wine. For example, it can be interesting to know the number of bottles produced of a wine. That can indicate how widely available the wine is to buy, or perhaps its scarcity (that one needs to reserve purchase). Some wine guides will indicate the total number of bottles according to producer, while others will indicate total number of bottles by label. This difference means that one can answer different questions, such as who is the largest producer within a geographic indication zone, or who is the largest producer of a specific kind of wine. Ideally, one would like data at both the producer and product levels, but that may not be easy to obtain for all producers.
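The difference between producer-level and label-level bottle counts can be illustrated with a small aggregation. The records below are invented placeholders, and `largest_producer` is a hypothetical helper written for this sketch.

```python
from collections import Counter

# Hypothetical label-level records:
# (producer, geographic indication, kind of wine, bottles produced).
labels = [
    ("Producer A", "Zone X", "red", 20000),
    ("Producer A", "Zone X", "white", 5000),
    ("Producer B", "Zone X", "white", 15000),
]


def largest_producer(records, zone=None, kind=None):
    """Return the producer with the most bottles, optionally filtered."""
    totals = Counter()
    for producer, rec_zone, rec_kind, bottles in records:
        if zone is not None and rec_zone != zone:
            continue
        if kind is not None and rec_kind != kind:
            continue
        totals[producer] += bottles
    return totals.most_common(1)[0][0] if totals else None


largest_in_zone = largest_producer(labels, zone="Zone X")
largest_white = largest_producer(labels, kind="white")
print(largest_in_zone, largest_white)
```

With label-level data both questions can be answered, and producer totals can be derived by summing. If only producer totals were recorded, the second question could not be answered at all.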

Lessons from domain modelling

Even though domain modelling attempts to represent the real world, reality is often less orderly than we would like it to be.

Not everything can be easily expressed as a regularized attribute. Audiences will want to know: What does the wine taste like? It would be wonderful to provide a reliable, easy-to-understand way to explain taste that allows easy comparison between wines, zones, and producers. Sadly, taste is — surprise — a bit subjective. Different experts will say different things about the same wine, even when they agree on an overall judgment. Terminology is not standard either. The same words can mean different things. Critics may use the word “cherry” to describe a taste as “spicy black cherry” or as “cherry rhubarb.” There is no controlled vocabulary for wine, no limited set of descriptors with precisely defined and agreed meanings.

Map of geographic designation zones within a single region. Screenshot from a certification body website.

By their nature, models simplify reality. The geographic indication signifies where a wine is made, and the criteria by which it is made. Whereas most geographical entities are based on either political administrative geography or physical geography, geographic indications exist outside these frameworks. A geographic indication can straddle two administrative regions. It can exist in two different, discontinuous locations. Some geographic indication zones have subzones. Wine producers also can behave in complex ways. Sometimes a wine producer is a brand “house” that has vineyards in several locations, or a consortium that sources from different vineyards. The informational details associated with these exceptions may not be important to users, and can add design complexity.

The identity of items can be constructed in several ways. One needs to be able to distinguish one entity from others belonging to the same entity type — items need to be uniquely identified. Despite the challenges of deciphering Italian wine, specific entities fortunately are identified with meaningful, human readable names, rather than numeric product codes. The domain model can use existing identifiers, which are based on several approaches:

  • Collectively defined names (the names of regions, geographic indications, and grape varieties), though some producers use alternate names for grape varieties
  • Self described (the name of producer), though sometimes producers choose to use both a house and proprietor name
  • Inherited identity (the environmental profile for a producer)
  • Names composed of compound attributes, such as dry sparkling rosato as a wine category entity
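The last kind of identifier, a name composed from compound attributes, might be sketched as follows. The `WineCategory` class and its three attribute names are assumptions made for this example; a real model could break the category down differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WineCategory:
    """A category entity whose identity is the combination of its attributes."""
    sweetness: str
    effervescence: str
    color: str

    @property
    def name(self) -> str:
        # The human-readable name is derived from the attributes, not stored.
        return f"{self.sweetness} {self.effervescence} {self.color}"


category = WineCategory("dry", "sparkling", "rosato")
print(category.name)
```

Because the dataclass is frozen and compares by value, two categories built from the same attributes are treated as the same entity.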

Thinking about design

The domain model can support early design discussions. Many questions that are interesting to audiences will span two or more different entities. For example:

  • What year produced the best wine from a region?
  • What geographic indication commands the highest average prices?
  • What grape varieties produce the most wine?
  • What wines for a given year and geographic designation are ready to drink?

Some answers require computations of structured data. Questions of interest to audiences need to be translated into content types that will be represented in the content model.
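As an illustration of a computed answer, the sketch below works out which geographic indication has the highest average price. The wines, zones, and prices are invented placeholders, and the tuple layout is just one possible way to hold the structured data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (wine, geographic indication, price in euros).
wines = [
    ("Wine 1", "Zone X", 12.0),
    ("Wine 2", "Zone X", 18.0),
    ("Wine 3", "Zone Y", 30.0),
]

# Group prices by geographic indication, an attribute of one entity
# (the wine) that points at another entity (the zone).
prices_by_zone = defaultdict(list)
for _name, zone, price in wines:
    prices_by_zone[zone].append(price)

average_price = {zone: mean(prices) for zone, prices in prices_by_zone.items()}
highest_zone = max(average_price, key=average_price.get)
print(average_price, highest_zone)
```

The answer is not stored on any single entity; it only exists once wine-level data is aggregated at the zone level, which is why it suggests a content type of its own.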

In addition to supporting interesting exploration, the design needs to support common tasks. The domain model helps to identify information available to support common tasks. Some common points of entry audiences will seek when exploring wine include:

  • By rating
  • By price
  • By category
  • By variety

Users often focus on one specific criterion when starting the process of seeking information. In some cases, these are entities, in others, these are attributes. Considering task starting points can help identify potential groupings of content elements. Depending on the depth of content, these groupings may not be manageable for users without providing additional parameters to narrow the pool of candidate content. The most salient criterion is not the only factor that’s important to the user.

In contrast to starting points, another perspective is to consider the end goal of the task. Examining the end goal, the content designer can consider the orientation of different users. Users of wine information may be:

  • Bottle centric — interested in the characteristics of specific bottles of wine
  • Producer centric — interested in the story of the producer, perhaps with an intention of visiting them
  • Food centric — mostly interested in wine styles as a complement to food dishes.

Domain depth and domain scope

The depth of a domain reflects both the number of attributes for an entity type, and the quantity of items. Both aspects can impact the design. The quantity of items will influence content types that present lists and links. The number of attributes will impact content type structures for content items.

Content designers decide how much of the domain model to present to users. A fixed content type may show all attributes as part of the content type. With a flexible content type, attributes may be optionally available, or have several variations. Designers may choose progressive disclosure of content that hides details, which are revealed only when wanted. Or they may implement an adaptive approach, where different variations of content types are shown depending on the interests of an audience segment, or device formats.

The other aspect of the domain model, thus far unmentioned, is how it might connect with other domains. The domain model offers the possibility of enlarging the scope addressed by considering related domains. Different variations of content may draw on common content, while including different content as well (see diagram). Three different apps may share common core content. But they provide different functionality depending on their focus (touring vineyards, pairing wine with food, or knowledge enhancement of wine). The domain model can also be used to guide the planning of releases of content and functionality.

Relationship between the depth of a domain, and its scope. Content can be deep, covering many attributes. And content can be wide, connecting with other domains.

Relating entities: Comparisons to other approaches

Domain modelling is not the only approach to sorting through complex content. Before closing this discussion, it is worth talking about two other well known approaches that look and behave similarly, but have some differences.

Faceted search, an approach popular in library science and information architecture, allows users to locate specific content by filtering on facets. Facets can be attributes or entities. The idea is that users can locate content that has the qualities of A & B & C. Faceted search is a popular technique, common on ecommerce sites, and is often helpful. The utility of the technique rests on several assumptions. First, faceted search assumes users know the two to four most important criteria, and will get a manageable set of results. If the set of results is large, users generally take a satisficing approach, happy with the first result encountered that is minimally acceptable. Second, faceted search presumes that each facet is independent of the others, which in the case of wine isn’t true. It is possible to get null sets if facets aren’t deep. While faceted search has been implemented on some wine ecommerce sites, it is not an effective approach for helping users discover content they might be interested in but do not know about, and tends to focus on a limited range of aspects.
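The "A & B & C" logic of faceted search, and the null sets it can produce, can be shown with a minimal filter. The catalogue below is invented for illustration, and `faceted_search` is a hypothetical helper rather than the API of any search product.

```python
# Hypothetical catalogue; facet names and values are placeholders.
wines = [
    {"name": "Wine 1", "color": "red", "zone": "Zone X", "style": "still"},
    {"name": "Wine 2", "color": "white", "zone": "Zone X", "style": "sparkling"},
    {"name": "Wine 3", "color": "red", "zone": "Zone Y", "style": "still"},
]


def faceted_search(items, **facets):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]


red_in_x = faceted_search(wines, color="red", zone="Zone X")
white_in_y = faceted_search(wines, color="white", zone="Zone Y")
print([w["name"] for w in red_in_x], white_in_y)
```

The second query returns an empty list. Each facet value exists in the catalogue on its own, but the combination does not, which is what happens when facets are treated as independent even though they are not.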

Linked data is an approach to modelling content that has close associations to domain modelling, thanks to the BBC’s integration of the two approaches. To simplify, linked data allows users to find content with characteristic A that has B, which has C. Organizing content using a linked data approach has both benefits and drawbacks. One drawback is that queries can be path dependent. Whether results appear promising or discouraging depends on how you construct the query. Linked data queries are generally more open ended than predefined structured queries that answer fixed questions with predictable sets of results. A bigger concern is that linked data treats all aspects of an entity as other entities, and each entity gets its own page. But not all attributes are meaningful entities — things worthy of their own content destination. On the positive side, linked data is good for what-else questions. One can link outside of a domain to other domains, such as to geophysical data.
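The "A that has B, which has C" pattern, and the path dependence of such queries, can be sketched with a tiny set of subject, predicate, object triples. The predicate names (`madeFrom`, `grownIn`, `hasSoil`) and all values are made up for this example and do not come from any published vocabulary.

```python
# Hypothetical triples: (subject, predicate, object).
triples = [
    ("Wine 1", "madeFrom", "Grape G"),
    ("Grape G", "grownIn", "Zone X"),
    ("Zone X", "hasSoil", "limestone"),
]


def follow(subject, predicate):
    """Return every object linked from subject by predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def follow_path(start, predicates):
    """Walk a chain of predicates, one hop at a time."""
    current = [start]
    for predicate in predicates:
        current = [o for node in current for o in follow(node, predicate)]
    return current


soils = follow_path("Wine 1", ["madeFrom", "grownIn", "hasSoil"])
dead_end = follow_path("Wine 1", ["grownIn", "hasSoil"])
print(soils, dead_end)
```

Both queries ask about the soil behind the same wine, but only the first follows a path that exists in the data. The second returns nothing, even though the information is there.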

Model behavior

Models aren’t reality, according to the cliche. Domain models may appear esoteric to some people, given that they aren’t actually something implemented directly, but are an input to other deliverables. To get buy-in for a domain model, it may be best to use it as a discussion document, and note that it will evolve into the content model. While it lacks the appeal of being code-ready, a domain model can play an important role on a project. It can uncover hidden requirements and opportunities, help align different stakeholders around a common vision, and accelerate the design process.

— Michael Andrews


  1. Chardonnay grapes originated in Burgundy. Even though most people associate Burgundy with red wine, there are also white Burgundy wines made from Chardonnay.
  2. A canonical list of varietals and zones is available from the databases of the intergovernmental wine organization OIV http://www.oiv.int/oiv/info/enbasededonneesIG

 


Data Types and Data Action

We often think about content from a narrative perspective, and tend to overlook the important roles that data play for content consumers. Specific names or numeric figures often carry the greatest meaning for readers. Such specific factual information is data. It should be described in a way that lets people use the data effectively.

Not all data is equally useful; what matters is our ability to act on data. Some data allows you to do many different things with it, while other data is more limited. The stuff one can do with types of data is sometimes described as the computational affordances of data, or as data affordances.

The concept of affordances comes from the field of ecological psychology, and was popularized by the user experience guru Donald Norman. An affordance is a signal encoded in the appearance of an object that suggests how it can be used and what actions are possible. A door handle may suggest that it should be pushed, pulled or turned, for example. Similarly, with content we need to be able to recognize the characteristics of an item of data, to understand how it can be used.

Data types and affordances

The postal code is an important data type in many countries. Why is it so important? What can you do with a postal code? How people use postal codes provides a good illustration of data affordances in action.

Data affordances can be considered in terms of their purpose-depth, and purpose-scope, according to Luciano Floridi of the Oxford Internet Institute. Purpose-depth relates to how well the data serves its intended purpose. Purpose-scope relates to how readily the data can be repurposed for other uses. Both characteristics influence how we perceive the value of the data.

A postal code is a simplified representation of a location composed of households. Floridi notes that postal codes were developed to optimize the delivery of mail, but subsequently were adopted by other actors for other purposes, such as to allocate public spending, or calculate insurance premiums.

He states: “Ideally, high quality information… is optimally fit for the specific purpose/s for which it is elaborated (purpose–depth) and is also easily re-usable for new purpose/s (purpose–scope). However, as in the case of a tool, sometimes the better [that] some information fits its original purpose, the less likely it seems to be repurposable, and vice versa.” In short, we don’t want data to be too vague or imprecise, and we also want the data to have many ways it can be used.

Imagine if all data were simple text. That would limit what one could do with that data. Defining data types is one way that data can work harder for specific purposes, and become more desirable in various contexts.

A data type determines how an item is formatted and what values are allowed. The concept will be familiar to anyone who works with Excel spreadsheets, and notices how Excel needs to know what kind of value a cell contains.

In computer programming, data types tell a program how to assess and act on variables. Many data types relate to issues of little concern to content strategy, such as various numeric types that impact the speed and precision of calculations. However, there is a rich range of data types that provide useful information and functionality to audiences. People make decisions based on data, and how that data is characterized influences how easily they can make decisions and complete tasks.

Here are some generic data types that can be useful for audiences, each of which has different affordances:

  • Boolean (true or false)
  • Code (showing computer code to a reader, such as within the HTML code tags)
  • Currency (monetary cost or value denominated in a currency)
  • Date
  • Email address
  • Geographic coordinate
  • Number
  • Quantity (a number plus a unit type, such as 25 kilometers)
  • Record (an identifier composed of compound properties, such as 13th president of a country)
  • Telephone number
  • Temperature (similar to quantity)
  • Text – controlled vocabulary (such as the limited range of values available in a drop down menu)
  • Text – variable length free text
  • Time duration (number of minutes, not necessarily tied to a specific date)
  • URI or URN (authoritative resource identifier belonging to a specific namespace, such as an ISBN number)
  • URL (webpage)

Not all content management systems will provide structure for these data types out of the box, but most should be supportable with some customization. I have adapted the above list from the listing of data types supported by Semantic MediaWiki, a widely used open source wiki, and the data types common in SQL databases.

By having distinct data types with unique affordances, publishers and audiences can do more with content. The ways people can act on data are many:

  • Filter by relevant criteria: Content might use geolocation data to present a telephone number in the reader’s region
  • Start an action: Readers can click-to-call telephone numbers that conform to an international standard format
  • Sort and rank: Various data types can be used to sort items or rank them
  • Average: When using controlled vocabularies in text, the number of items with a given value can be counted or averaged
  • Sum together: Content containing quantities can be summed: for example, recipe apps allow users to add together common ingredients from different dishes to determine the total amount of an ingredient required for a meal
  • Convert: A temperature can be converted into different units depending on the reader’s preference

The choice of data type should be based on what your organization wants to do with the content, and what your audience might want to do with it. It is possible to reduce most character-based data to either a string or a number, but such simplification will reduce the range of actions possible.
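To make the point concrete, here is a minimal sketch of a quantity type next to the same data held as plain text. The `Quantity` class and the `celsius_to_fahrenheit` helper are hypothetical names used for illustration, not part of any particular content management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number plus a unit, such as 200 g."""
    amount: float
    unit: str

    def __add__(self, other):
        # Summing only makes sense when the units agree.
        if not isinstance(other, Quantity) or other.unit != self.unit:
            raise ValueError("can only add quantities with the same unit")
        return Quantity(self.amount + other.amount, self.unit)


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature for readers who prefer Fahrenheit."""
    return celsius * 9 / 5 + 32


# Typed data affords summing and converting.
total = Quantity(200, "g") + Quantity(150, "g")
serving_temp_f = celsius_to_fahrenheit(10)

# The same values as plain strings can only be concatenated.
as_text = "200 g" + "150 g"

print(total, serving_temp_f, as_text)
```

The typed version supports the "sum together" and "convert" actions, while the string version yields only a longer string.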

Data versus Metadata

The boundary between data and metadata is often blurry. Data associated with both metadata and the content body-field have important affordances. Metadata and data together describe things mentioned within or about the content. We can act on data in the content itself, as well as act on data within metadata framing the content.

Historically, structural metadata outside the content played a prominent role indicating the organization of the content that implied what the content was about. Increasingly, meaning is being embedded with semantic markup within the content itself, and structural metadata surrounding the content may be limited. A news article may no longer indicate a location in its dateline, but may have the story location marked up within the article that is referenced by content elsewhere.

Administrative metadata, often generated by a computer and traditionally intended for internal use, may have value to audiences. Consider the humble date stamp, indicating when an article was published. By seeing a list of most recent articles, audiences can tell what’s new and what that content is about, without necessarily viewing the content itself.

Van Hooland and Verborgh ask in their recent book on linked data: “[W]here to draw the line between data and metadata. The short answer is you cannot. It is the context of the use which decides whether to considered data as metadata or not. You should also not forget that one of the basic characteristics of metadata: they are ever extensible …you can always add another layer of metadata to describe your metadata.” They point out that annotations, such as reviews of products, become content that can itself be summarized and described by other data. The number of stars a reviewer gives a product is aggregated with the feedback of other reviewers to produce an average rating, which is metadata about both the product and the individual reviews on which it is based.

Arguably, the rise of social interaction with nearly all facets of content merits an expansion of metadata concepts. By convention, information standards divide metadata into three categories: structural metadata, administrative metadata and descriptive metadata. But one academic body suggests a fourth type of metadata they call “use metadata,” defined as “metadata collected from or about the users themselves (e.g., user annotations, number of people accessing a particular resource).” Such metadata would blend elements of administrative and descriptive metadata relating to readers, rather than authors.

Open Data and Open Metadata

Open data is another data dimension of interest to content strategy. Often people assume open data refers to numeric data, but it is more helpful to think of open data as the re-use of facts.

Open data offers a rich range of affordances, including the ability to discover and use other people’s data, and the ability to make your data discoverable and available to others. Because of this emphasis on the exchange of data, how this data is described and specified is important. In particular, transparency and use rights are key concerns, because administrative metadata is a weak point of open data.

Unfortunately, discussion of open data often focuses on the technical accessibility of data to systems, rather than the utility of data to end-users. There is an emphasis on data formats, but not on vocabularies to describe the data. Open data promotes the use of open formats that are non-proprietary. While important, this focus misses the criticality of having shared understandings of what the data represents.

To the content strategist, the absence of guidelines for metadata standards is a shortcoming in the open data agenda. This problem was recognized in a recent editorial in the Semantic Web Journal entitled “Five Stars of Linked Data Vocabulary Use.” Its authors note: “When working with data providers and software engineers, we often observe that they prefer to have control over their local vocabulary instead of importing a wide variety of (often under-specified, not regularly maintained) external vocabularies.” In other words, because there is not a commonly agreed and used metadata standard, people rely on proprietary ones instead, even when they publish their data openly, which has the effect of limiting the value of that data. They propose a series of criteria to encourage the publication of metadata about vocabulary used to describe data, and the provision of linkages between different vocabularies used.

Classifying Openness

Whether data is truly open depends on how freely available the data is, and whether the metadata vocabulary (markup) used to describe it is transparent. In contrast to the Open Data Five Star frameworks, I view how proprietary the data is as a decisive consideration. Data can be either open or proprietary, and the metadata used to describe the data can be based either on an open or proprietary standard. Not all data that is described as “Open” is in fact non-proprietary.

What is proprietary? For data and metadata, the criteria for what is non-proprietary can be ambiguous, unlike with creative content, where the Creative Commons framework governs rights for use and modifications. Modification of data and its metadata is of less concern, since such modifications can destroy the re-use value of the content. Practicality of data use and metadata visibility are the central concerns. To untangle various issues, I will present a tentative framework, recognizing that some distinctions are difficult to make. How proprietary data and metadata are often reflects how much control the body responsible for this information exerts. Generally, data and metadata standards that are collectively managed are more open than those managed by a single firm.

Data

We can grade data into three degrees, based on how much control is applied to its use:

  1. Freely available open data
  2. Published but copyrighted data
  3. Selectively disclosed data

Three criteria are relevant:

  1. Is all the data published?
  2. Does a user need to request specific data?
  3. Are there limits on how the data can be used?
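
One way to read the three criteria is as a small decision rule that assigns a data set to one of the three degrees. The sketch below is an assumption made for illustration, not a rule stated in this framework: the `DataPolicy` class and the mapping inside `grade` are hypothetical, and borderline cases may need judgment that a rule like this cannot capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPolicy:
    """Answers to the three criteria for one data set (hypothetical structure)."""
    all_published: bool   # Is all the data published?
    request_needed: bool  # Does a user need to request specific data?
    use_limited: bool     # Are there limits on how the data can be used?


def grade(policy):
    """Map the three answers to one of the three degrees.

    Assumed mapping: data that is only partly published, or that must be
    requested, counts as selectively disclosed; fully published data is
    then split by whether its use is limited.
    """
    if not policy.all_published or policy.request_needed:
        return "Selectively disclosed data"
    if policy.use_limited:
        return "Published but copyrighted data"
    return "Freely available open data"


open_set = grade(DataPolicy(all_published=True, request_needed=False, use_limited=False))
licensed_set = grade(DataPolicy(all_published=True, request_needed=False, use_limited=True))
on_request = grade(DataPolicy(all_published=False, request_needed=True, use_limited=True))
```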

If factual data is embedded within other content (for example, using RDFa markup within articles), it is possible that only the data is freely available to re-use, while the contextual content is not freely available to re-use. Factual data cannot be copyrighted in the United States, but may under certain conditions be subject to protection in the EU when a significant investment was made collecting these facts.

Rights management and rights clearance for open data are areas of ongoing (if inconclusive) deliberation among commercial and fee-funded organizations. The BBC is an organization that contributes open data for wider community use, but that generally retains the copyright on its content. More and more organizations are making their data discoverable by adopting open metadata standards, but the extent to which they sanction the re-use of that data for purposes different from its original intention is not always clear. In many cases, everyday practices concerning data re-use are evolving ahead of official policies defining what is permitted and not permitted.

Metadata

Metadata is either open or proprietary. Metadata is open when the structure and vocabulary that describe the data are fully published and available for anyone to use for their own purposes. The metadata is intended to be a standard that can be used by anyone. Ideally, users can link their own data, described with this metadata vocabulary, to data sets elsewhere. This ability to link one’s own data distinguishes open metadata from proprietary metadata standards.

Proprietary metadata is metadata whose schema is not published or is only partially published, or whose terms restrict a person’s ability to define their own data using the vocabulary.

Examples

Freely Available Open Data

  • With Open Metadata. Open data published using a publicly available, non-proprietary markup. There are many standards organizations that are creating open metadata vocabularies. Examples include public content marked up in Schema.org, and NewsML. These are publicly available standards without restrictions on use. Some standards bodies have closed participation: Google, Yahoo, and Bing decide what vocabulary to include in Schema, for example.
  • With Proprietary Metadata. It may seem odd to publish your data openly but use proprietary markup. However, organizations may choose to use a proprietary markup if they feel a good public one is not available. Non-profit organizations might use OpenCalais, a markup service available for free, which is maintained by Reuters. Much of this markup is based on open standards, but it also uses identifiers that are specific to Reuters.

Published But Copyrighted Data

  • With Open Metadata. This situation is common with organizations that make their content available through a public API. They publish the vocabularies used to describe the data and may use common standards, but they maintain the rights to the content. Anyone wishing to use the content must agree to the terms of use for the content. An example would be NPR’s API.
  • With Proprietary Metadata. Many organizations publish content using proprietary markup to describe their data. This situation encourages web-scraping by others to unlock the data. Sometimes publishers may make their content available through an API, but they retain control over the metadata itself. Amazon’s ASIN product metadata would be an example: other parties must rely on Amazon to supply this number.

Selectively Disclosed Proprietary Data

  • With Open Metadata. Just because a firm uses a data vocabulary that’s been published and is available for others to use, it doesn’t mean that such firms are willing to share their own data. Many firms use metadata standards because it is easier and cheaper to do so, compared with developing their own. In the case of Facebook, they have published their Open Graph schema to encourage others to use it so that content can be read by Facebook applications. But Facebook retains control over the actual data generated by the markup.
  • With Proprietary Metadata. Applies to any situation where firms have limited or no incentive to share data. Customer data is often in this category.
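
The six combinations above form a three-by-two grid, which can be sketched as a lookup table. The Python below only restates the examples already named in the text. The enum names, the `EXAMPLES` table, and the helper functions are hypothetical labels chosen for illustration.

```python
from enum import Enum


class Data(Enum):
    FREE = "Freely available open data"
    COPYRIGHTED = "Published but copyrighted data"
    SELECTIVE = "Selectively disclosed proprietary data"


class Metadata(Enum):
    OPEN = "open metadata"
    PROPRIETARY = "proprietary metadata"


# Each cell holds the examples named in the text for that combination.
EXAMPLES = {
    (Data.FREE, Metadata.OPEN): ["Schema.org markup", "NewsML"],
    (Data.FREE, Metadata.PROPRIETARY): ["OpenCalais"],
    (Data.COPYRIGHTED, Metadata.OPEN): ["NPR API"],
    (Data.COPYRIGHTED, Metadata.PROPRIETARY): ["Amazon ASIN"],
    (Data.SELECTIVE, Metadata.OPEN): ["Facebook Open Graph"],
    (Data.SELECTIVE, Metadata.PROPRIETARY): ["customer data"],
}


def examples_for(data, metadata):
    """Return the examples listed for one cell of the grid."""
    return EXAMPLES[(data, metadata)]


def describe(data, metadata):
    """Build a one-line summary of a cell, such as for a report."""
    names = ", ".join(examples_for(data, metadata))
    return f"{data.value} with {metadata.value}: {names}"


summary = describe(Data.COPYRIGHTED, Metadata.OPEN)
```

Keeping the two dimensions as separate enums mirrors the argument that the openness of the data and the openness of its metadata vary independently.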

Taking Action on Data

Try to do more with the data in your content. Think about how to enable audiences to take actions on the data, or how to have your systems take actions to spare your audiences unnecessary effort. Data needs to be designed, just like other elements of content. Making this investment will allow your organization to reuse the data in more contexts.

— Michael Andrews


Are UI Cards good for content?

A user interface format known as cards is gaining popularity with mobile and web designers, and is being used in a variety of contexts.  UI cards can simplify screen design, but they don’t always promote good practices for content.   It helps to know when using cards supports content, and when cards can result in problems for audiences.

What are UI cards?

UI cards are boxes of a fixed width that are containers for content.  They may or may not have a fixed height. They can be placed beneath one another, or beside one another, and occasionally, stacked on top of each other.  The metaphor informing the design of UI cards is the index card or the flash card.  Unlike with physical cards, it is rare that one shuffles UI cards: the position of the card is normally determined by the application.

The recent interest in cards echoes a similar interest in the index card metaphor in the early days of personal computing 25 or 30 years ago.  Apple developed an index-card based tool called HyperCard.  The novelist Robin Sloan considers the HyperCard design “pretty cool” but notes the complaints it attracted: “It squeezes information into screen-sized ‘cards’ and doesn’t allow scrolling through long documents. The academics agree: cards are too limited. Cards are lame.”  Around the same time, Microsoft released a product called PowerPoint that was also based on the index card metaphor.  It too spawned complaints about how well slides present information.

Today, many leading digital brands use UI cards.   Unlike the earlier card-based software, today’s UI cards only take up a piece of the screen, instead of the entire screen. UI cards have quickly become a common interaction design pattern, and some UX designers even refer to a “card architecture” design concept.

Examples of UI cards

Designers tend to use UI cards in one of two ways: to hold a piece of information from a stream of messages, or to showcase items that are part of a collection.

Streams of messages and notifications are generally not related to one another.  Cards help to separate distinct messages from one another, and provide a focal point for users to see updates.  Streamed content can be either summaries or previews.  Google Now uses cards to show summary information, such as sports scores for a team one follows.  Both Facebook and Twitter use cards that show a preview of content available elsewhere, such as an article.

Examples of cards for Google Now showing updates for user’s stream

Collections are another major use of UI cards.  Whereas UI cards for streams work well on mobile platforms, UI cards for collections work better on tablets or laptops.  Each card is like a small poster that provides a teaser for content.  These cards often highlight visual content with limited substantive information, and provide a way to scan content, but users must navigate elsewhere to get details.  Pinterest is the best-known example of this kind of card, though many other sites use the card collection concept.  The cards aren’t necessarily intrinsically related to one another: the relationship between cards in a collection is defined by their curator — either the user collecting them, or a brand that is displaying the group.

Why are UI cards suddenly popular?

Cards mesh with current sensibilities about digital content: they package content into bite-sized pieces that work on mobile and social networks.  From a front-end perspective, using cards provides an easy-to-apply layout for a responsive web design (RWD) framework.  With their fixed width, cards can be used in one, two or three column layouts, allowing the same card designs to be used on smartphone, tablet, and laptop screens.   Fixed width cards are well suited to the flexible grids used in RWD.  Visual designers seem to like cards.  When using a fixed length, cards form a perfect grid; when using variable lengths, cards form a masonry layout that looks like a vertically oriented brick wall.

Cards are also popular because they seem to tame content.  Cards are modular, so different content items can be stacked together without calling attention to their differences.  Cards minify content, by forcing content to adapt to the constraints of the card boundary and card visual layout.  Designers like that cards allow different content to be mixed and matched without worrying that the design will become cluttered.

Interaction designers like cards as well.  Cards can break down content into digestible bits that allow users to interact with it.  By offering a container for content, cards provide the suggestion that the content is tangible and personal to the user.  Like a Rolodex card, UI cards can contain personal details, such as one’s flight info.  Like baseball cards, UI cards can be collected, and displayed on a virtual corkboard.  They can be traded as well, or more accurately, shared, through social media.  And cards provide containers to break down sequences of interaction, as Google recommends for Google Glass.

Sequence of cards for Google Glass. Source: Google

UI Cards as Interaction Design

Most of the impetus for UI cards has come from the interaction design (IxD) community, rather than the content strategy community.  IxD has various goals for promoting UI cards, but these goals are separate from the goals of content strategists.  When interaction designers refer to UI cards as “content cards,” they are rarely focused on the details of the content inside the cards.

UI cards offer appealing aesthetics: they are well behaved on a grid, and provide fluid behavior during interaction transitions.  Cards present an illusion that everything one needs to know is right there in front of them.  They mimic a bulletin board that can be scanned quickly.  When a user wants to engage with the card, many interaction possibilities are available.  The card can be animated, swiped, or flipped. In contrast to the tedium of scrolling, UI cards allow a range of interaction possibilities.

While offering some fun possibilities, UI cards can privilege interaction over the substance of the content on the cards.  IxD’s approach to content is to design it from the outside in: start with the desired layout and behavior, and then expect the content to fit.  By focusing on the container for content rather than the content itself, interaction design can create problems for content.  The size of the container can dictate the size of the content, rather than the other way around.  The tyranny of form over substance is most acute with fixed height designs that follow a precise grid layout.  Such layouts can sometimes cause content to get dumbed down, much the way PowerPoint presentations can squeeze the meaning out of narratives and break their flow because the format limits how much text can comfortably be placed on a slide.  One application called Citia tries to summarize books on square shaped cards, but the pacing of information on the cards is uneven: some cards are dense with information, others less so.

UI cards can contribute to content usability problems that may not be immediately evident.  Users often like UI cards when they encounter them, and don’t notice their limitations.  They see tidy cards, often with colorful thumbnail images.  The cards seem optimized to make good first impressions.  But often, the cards end up squashing the content that must go in them, or omitting content details that don’t fit the layout vision.  The most common problems are truncating content, or hiding content below a fold on the card that’s not visually obvious.  Paradoxically, while cards are meant to provide all needed content at a quick glance, they may have vertical scrolling behaviors within them.  Occasionally, they will even have a line of text or icons within the card that requires horizontal scrolling.  Users don’t know if tapping for more information when content is truncated will result in a hover state bubble, or take them away from the card.  Some content cards hide information by placing content on the backside of the card, requiring the user to tap to flip the card, if the user knows to do that.  So for all the talk about cards being a design pattern and architecture, there seems to be a lack of convention for how to design them.  Users can’t confidently predict how any new card they encounter will behave.

Collection of cards from a recipe site. Note that the title on one card is truncated, as are the ingredients on many cards. To see the non-visible ingredients, the user must tap the “and more” text. This is an example of content being squeezed into available space.

Another issue with UI cards is their lack of hierarchy.  When all cards are the same size, all cards look equally important, whether they have detailed information, time sensitive information, sparse information, or optional information.  Visual designers like to feature photo thumbnails on cards to help users scan content for items of interest.  That approach can work well, provided the photo conveys real information and is not simply decoration.  Cards about movies might show images from the movie or pictures of the actor.  But if all the cards are about the same topic, say articles about running pinned on Pinterest, and all the cards show clichéd stock photos of a person running intensely, then the photos provide no information to help the user scan many similar looking cards.  Instead, the user must look at every card individually and try to determine relevance based on limited text descriptions.

Cards as content units

An alternative way to look at cards is as units of content, rather than containers for content.  By focusing on units of content, we can ask: what does the card represent?

Content for cards can be pulled from 3rd parties not related to the card publisher, or content can be provided by the card publisher directly.  With 3rd party content, the card publisher often has limited control over the content.  They may capture the first 30 characters of the title, a thumbnail of the first image provided, or the first four lines of the article.  In many cases the card is simply a preview of another unit of content: a full article available elsewhere.

When brands use cards to present their own content, they should be able to choose content elements more carefully. But having control over the content does not assure that the right content is selected for a card.  Suppose the idea is to provide a card to show the weather forecast.  While simple in concept, it still requires choices about what specific information to show, and how that information is accessed.  When the form is already fixed, the choices for the content are restricted.  Suppose a severe weather alert is issued, perhaps for an impending hurricane, but there is no space on the card to show that information.  The card becomes a blob that can’t be enriched with new content elements, because the format prevents such additions.  Brands try to work out what content relating to a topic they can fit on a card, rather than ask what content they want to present, and how cards might help them do that.

Too often, content is simply chopped up to fit on cards.   For cards to be useful, the content on them should have meaning outside of the card container.   This requires designing cards for content from the inside out, starting with the content.

Two recent content applications are pointing to how cards can represent meaningful chunks of content.  The field of journalism is under enormous pressure to become more relevant to audiences, and to increase the speed and efficiency of providing updates.  Cards are being embraced by a new journalistic genre called explanatory news that has emerged to address these challenges.  The news sites Vox and Circa are showing how cards can represent meaningful chunks of content that can be combined to form larger units of meaning.

Example of Vox cards used to explain background content.

Vox uses cards that are attached to yellow-highlighted words “to offer deeper explanations of key concepts.”  Stacks of cards “combine into detailed — and continually updated — guides to online news stories.”    Vox cards are topics, focused around a question, and answer why the topic and question matter.  Vox’s idea is that while news developments relating to a topic are always happening, some core material will be relevant context for any developments about a topic.  The cards signal: this is not breaking news, but it is important.   The linking of the cards provides a way for users to navigate through related topics.

Circa uses a concept similar to cards it calls atoms.  Circa is also working to provide more context for news, and decided to use atoms as a way to structure the information relating to a story, so that the atoms link together to form an ongoing narrative.  Each atom is a unit of content, “a fact, quote, statistic, event or image” and each unit is tagged with metadata.  The tagging of atoms allows readers to identify what’s new relating to a developing story, and to follow specific topics of interest.
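
To make the idea of tagged units concrete, here is a minimal sketch of how metadata on each unit could support "what's new" and topic-following. It is not Circa's actual data model: the `Atom` class, its fields, the helper functions, and the sample story are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Atom:
    """A hypothetical unit of content tagged with metadata."""
    kind: str            # "fact", "quote", "statistic", "event" or "image"
    text: str
    story: str           # the developing story this unit belongs to
    topics: frozenset    # topic tags a reader could follow
    published: datetime


def new_since(atoms, story, last_read):
    """Return the units of one story added after the reader's last visit, oldest first."""
    fresh = [a for a in atoms if a.story == story and a.published > last_read]
    return sorted(fresh, key=lambda a: a.published)


def about_topic(atoms, topic):
    """Return every unit tagged with a topic, across all stories."""
    return [a for a in atoms if topic in a.topics]


# Invented sample data.
atoms = [
    Atom("event", "Storm makes landfall.", "storm",
         frozenset({"weather"}), datetime(2014, 5, 1, 8, 0)),
    Atom("statistic", "Rainfall reaches 120 mm.", "storm",
         frozenset({"weather", "flooding"}), datetime(2014, 5, 1, 14, 0)),
    Atom("quote", "Officials urge residents to stay indoors.", "storm",
         frozenset({"safety"}), datetime(2014, 5, 2, 9, 0)),
    Atom("fact", "Council approves budget.", "budget",
         frozenset({"politics"}), datetime(2014, 5, 2, 10, 0)),
]

# A reader who last checked the storm story at noon on 1 May sees only later units.
updates = new_since(atoms, "storm", datetime(2014, 5, 1, 12, 0))
flooding = about_topic(atoms, "flooding")
```

Because each unit carries its own story, topics, and timestamp, the same collection can answer both questions without rewriting or re-reading a full article.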

Examples of Circa’s ‘atoms’ which are similar to cards. Each atom represents a different unit of content.

Unlike other card examples,  Vox and Circa are using card-like structures to re-imagine the creation and delivery of narrative content.  The separation of content between updates and background information can be applied to other content genres.  Gartner analyst Todd Berkowitz notes the potential use of cards for content marketing: “I like the notion of being able to update stacks. From a content creator’s standpoint, this is easier than trying to update a whitepaper or eBook. The individual cards also offer re-usability. The Vox Cards can be tweeted, shared on Facebook and with Google+ circles, but providers would also want to add a button to share on LinkedIn. So when you add a new card, a provider could blog about it and also tweet it.”

Cards don’t mean updates are now trivial.  While it is easy to add a new card with new information to the stack, brands need to make sure the information on previous stacks is still current.  For some topics, the background information can be changing as well, even if less frequently than the updates.

Emerging best practices

It’s best to consider cards an interesting experiment, rather than a design pattern that reflects the distilled wisdom of many years of practice.  Still, cards hold great promise, and implemented intelligently, can provide benefits for both audiences and authors.

The value of cards comes from their ability to break apart larger clumps of content into more meaningful units that can be managed and recombined more easily.   As Circa has shown, cards can represent different kinds of information, as well as different topics.  Cards can offer an alternative to conventional articles.  Cards can promote a new workflow for content that is more focused on revising only what is necessary.  Ideally, cards could drive wider adoption of structured modular content.

For cards to realize their potential, designers and authors should stop thinking about UI cards as being like index cards, flash cards, or playing cards. UI cards will often need to be elastic to accommodate content intended for them, and they need to be intelligent, not just animated.  In recent years IxD has become critical of skeuomorphism, the reliance on physical metaphors in digital design.  UI cards aren’t made of paper stock, and shouldn’t look or behave as if they were.

– Michael Andrews