Categories
Content Experience

Structuring Content through Timeboxing

Much of our lives is structured around time.   When we think about time, we often think about events or schedules.  When does something happen?   Time provides structure for experience.  

In our lives online, content shapes our experience.  How then can time bring structure to content we consume online?

Content strategists generally think about structuring content in terms of sequences, or logical arrangement.   I’d like to introduce another way to think about structuring content: in terms of time.  But instead of focusing on timing — when something happens —  we can focus on how much time something requires.  The focus on the  dimension of duration has developed into a discipline known as timeboxing.

Timeboxing and Content

Timeboxing is a concept used in agile planning and delivery.  Despite its geeky name, all of us use timeboxes every day, even if we don’t use the term.  We schedule 15-minute conversations, or half-hour meetings, and then decide what’s the most important material to cover in that block of time, and what material needs to wait to be dealt with elsewhere. Despite our best wishes, circumstances commonly dictate our available time to discuss or consider an issue, rather than the reverse.

With roots in project management, timeboxing is most often applied to the production of content, rather than its consumption.  But timeboxing can be used for any kind of task or issue.

“Timeboxing allocates a fixed time period, called a timebox, within which planned activity takes place” 

Wikipedia

The consumption of content can also be timeboxed.  Content that is consumed on a schedule is often timeboxed.  Some examples of timeboxing content experience include:

  • The BBC offers a “Prayer for the day” lasting two minutes
  • Many broadcasters offer timed 15 or 30 second updates on markets, weather and traffic
  • The PechaKucha format for presentations of 20 slides for 20 seconds each, for a total of 6 minutes, 40 seconds, to maximize the number of speakers and keep presentations interesting
  • Time-limited conversations in “speed networking” in order to maximize the number of conversations had
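
The PechaKucha arithmetic applies to any format that fixes both the number of units and the time per unit. Here is a minimal sketch in Python; the function name `timebox_total` is purely illustrative and not part of any library.

```python
def timebox_total(units: int, seconds_per_unit: int) -> str:
    """Return the total duration of a fixed-format timebox as 'M min S s'."""
    total_seconds = units * seconds_per_unit
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes} min {seconds} s"


# PechaKucha: 20 slides shown for 20 seconds each
print(timebox_total(20, 20))  # 6 min 40 s
```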

Producers (those who offer content) decide how much content to make available in order to synchronize when it can be consumed.  There’s a close association between fixing the duration of content, and scheduling it at a fixed time.  The timing of content availability to users can help to timebox it.  

Limits without schedules

But timeboxing doesn’t require having a schedule.  In an agile methods context, timeboxing inverts the customary process of planning around a schedule.  Instead of deciding when to do something, and then figuring out how long can be allotted to a task, the timeboxing approach first considers how long a task requires, and then schedules it based on when it can be addressed.  How long something takes to express is in many cases more important than when something happens.

This is a radical idea.  For many of us, timeboxing (limiting the time allowed for content) is not a natural tendency.  Many of us don’t like restricting what we can say, or how long we can talk about something.

Timeboxing tends to happen when there’s a master schedule that must be followed.  But when access to content doesn’t depend on a schedule, timeboxing is ignored. Even audio content loses the discipline of having a fixed duration when it is delivered asynchronously.  A weekly podcast will often vary considerably in length, because it is not anchored to a schedule forcing it to fit in a slot that is followed by another slot.

Authors, concerned about guarding their independence, often resist imposing limits on how much they can say.  Their yardstick seems to be: the duration of the content should match whatever they as authors think is necessary to convey a message.  The author decides the appropriate duration —  not an editor, a schedule, or the audience.  

Without question, the web liberates content producers from the fixed timetable of broadcast media.  The delivery of digital content doesn’t have to follow a fixed schedule, so the duration of content doesn’t have to be fixed either.  The web transcends the past tyranny of physical limitations.  Content can be consumed anytime. Online content has none of the limits on space, or duration, that physical media imposed.  

Schedules can be thought of as contracts. The issue is not only whether or not the availability of content follows a schedule, but who decides that schedule.  

Online content doesn’t need to be available according to a set schedule, unlike broadcast content, which you can listen to only at a set time, such as 9 am each day.

A schedule may seem like tyranny, forcing authors to conform to artificial limitations about duration, and restricting audiences to content of a limited duration.  But schedules have hidden benefits.  

Setting expectations for audiences

When content follows a schedule, and imposes a duration, the audience knows what to expect.  They tacitly accept the duration they are willing to commit to the content.  They know ahead of time what that commitment will be, and have decided it is worth making.

The more important question is how consumers of content can benefit from timeboxing, not how producers of content can benefit.  Timeboxing content can measure the value of content in terms of the scarcest commodity audiences have: their time.

How should one measure the time it takes to consume content?  A simple standard would be listening duration: the amount of time it would take for the content to be read aloud to you in a normal speaking pace.  

We read faster than we talk.  We are used to hearing words more slowly than we would read them.  If we are a “typical” person, we read 250 words/minute.  We speak and listen at 150 words/minute.

Listening can’t be sped up, unlike reading. And having to have something repeated is annoying for a listener.  For content presented on a screen, there are generally no physical limits to how much content can be displayed.  Content creators rely on the audience’s ability to scan the content, and to backtrack, to locate and process information of interest to them.  Content read aloud doesn’t offer that freedom.  Listening duration provides a more honest measurement of how long content takes to consume.
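
The two measures can be compared with a rough calculation. The sketch below is in Python and uses the 250 and 150 words-per-minute figures cited above. Counting whitespace-separated tokens as words is a simplifying assumption, and the function names are illustrative only.

```python
READING_WPM = 250    # "typical" silent reading pace cited above
LISTENING_WPM = 150  # speaking and listening pace cited above


def count_words(text: str) -> int:
    """Count whitespace-separated tokens as a rough word count."""
    return len(text.split())


def duration_seconds(text: str, words_per_minute: int) -> float:
    """Estimate how many seconds the text takes at the given pace."""
    return count_words(text) / words_per_minute * 60


message = "Your package shipped today and should arrive on Thursday before noon."
print("reading:", round(duration_seconds(message, READING_WPM), 1), "seconds")
print("listening:", round(duration_seconds(message, LISTENING_WPM), 1), "seconds")
```

By this estimate, a 150-word passage takes 60 seconds to hear but only 36 seconds to read, which is why a "time to read" figure tends to understate the commitment for anyone listening.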

The duration of content is important for three reasons.  It influences:

  1. The commitment audiences will make to trying out the content
  2. The attention they may be able to offer the content
  3. The interest they find the content offers

Audience commitments 

Most of us have seen websites that provide a “time to read” indicator.  Videos and podcasts also indicate how many minutes the content lasts.  These signals help audiences choose content that matches their needs: do they have enough time now, or should they wait until later?  Is this content the right level of detail, or is it too detailed?

Screenshot from Christian Science Monitor
A news article from the Christian Science Monitor gives readers a choice of a long and short version.

Content varies widely in length and the amount of time required to consume it.  One size does not fit all content.  Imagine if publishers made more of an effort to standardize the length of their content, so that specific content types were associated with a specific length of time to read or listen.

Timeboxing recognizes that tasks should fit time allotments.  Timeboxing can encourage content designers to consider the amount of time they expect audiences will give to the content.

Audiences have limited time to consume content.  That means they can’t commit to looking at or listening to content unless it fits their situation.

And when they consume content, they have limited attention to offer any specific message.  

Audience attention

Currently, some publishers provide authors with guidelines about how many words or characters to use for different content elements.  Most often, these guidelines are driven by the internal needs of the publisher, rather than being led by audience needs.  For example, error messages may be restricted to a certain size, because they need to fit within the boundaries of a certain screen.  The field for a title can only be of a certain length, as that’s what is specified in the database.  These limits do control some verbosity.  But they aren’t specifically designed around how long it would take audiences to read the message.  And limiting characters or words by itself doesn’t mean the content will receive attention from audiences.

Time-to-read is difficult to calculate.  Instead of counting words or characters, publishers try to guess the time those words and characters consume.  That is not a simple calculation, since it will partly depend on the familiarity of the content, and how easily it can be processed by audiences.  Tight prose may be harder to comprehend, even if it is shorter.

Since much text is accompanied by visuals, the number of words on a screen may not be a reliable indication of how long it takes to consume the content.  Apple notes: 

“Remember that people may perform your app’s actions from their HomePod, using ‘Hey Siri’ with their AirPods, or through CarPlay without looking at a screen. In these cases, the voice response should convey the same key information that the visual elements display to ensure that people can get what they need no matter how they interact with Siri.” 

Apple Human Interface Guidelines

The value of structuring content by length of time, rather than number of characters or words, is easiest to appreciate when it comes to voice interaction.  Voice user interfaces rely on a series of questions and answers, each of which needs to be short enough to maintain the attention of both the user and the bot processing the questions. Both people and bots have limited buffers to hold inbound information.  The voice bot may always be listening for a hot word that wakes it up — so that it really starts to pay attention to what’s being said.  Conversely, the user may be listening to their home speaker’s content in a distracted, half-hearted way, until they hear a specific word or voice that triggers their attention.

Matching the audience’s capacity to absorb information

Attention is closely related to retention. Long, unfamiliar content is hard to remember.  Many people know about a famous study done in the 1950s by a Professor Miller about the “magical number seven” relating to memory spans.  The study was pathbreaking because it focused on how well people can remember “contents”, and proposed creating chunks of content to help people remember.  It is likely the beginning of all discussion about chunks of content.  Discussing this study, Wikipedia notes: a memory “span is lower for long words than it is for short words. In general, memory span for verbal contents (digits, letters, words, etc.) strongly depends on the time it takes to speak the contents aloud.”  Research building on Miller’s famous study introduced time (duration) as a factor in retention.  It is easier to recall shorter duration content than longer duration content.

We can extend this insight when considering how different units of content can influence audiences in other ways, beyond what they remember.  Duration influences what audiences understand, what they find useful, and what they find interesting.  

Exceeding the expected time is impolite. When audiences believe content takes “too long” to get through, they are annoyed, and will often stop paying attention. They may even abandon the content altogether.

The amount of attention people are willing to give to content will vary with the content type.  For example, few people want to read long entries in a dictionary, much less listen to a definition read aloud.

Some content creators use timeboxing as their core approach, as is evident in the titles of many articles, books and apps.  For example, we see books promising that we can “Master the Science of Machine Learning in Just 15 Minutes a Day.”  Even when such promises may seem unrealistic, they feel appealing.  As readers, we want to tell publishers how much time we are able and willing to offer.

The publisher should work around our time needs, and deliver the optimal package of material that can be understood in a given amount of time.   It doesn’t help us to know the book on machine learning is less than 100 pages, if we can’t be sure how difficult the material is to grasp.  The number of pages, words, and characters, is an imperfect guide to how much time is involved.

Audience interest

Another facet of structuring content by time is that it signals the level of complexity, which is an important factor in how interesting audiences will find the content.  If a book promises to explain machine learning in 15 minutes a day, that may sound more interesting to a reader without an engineering background than a book entitled “The Definitive Guide to Machine Learning” which sounds both complicated and long.

What is the ideal length of a content type, from an audience perspective?  How long would people want to listen to (or read attentively) different content types, if they had a choice?  For the purposes of this discussion, let’s assume the audience is only moderately motivated. They would like to stop as soon as their need for information is satisfied.

Time-delimited content types can offer two benefits to audiences:

  1. Pithiness
  2. Predictable regularity

Content types define what information to include, but they don’t necessarily indicate how much information to include.  The level of detail is left to individual authors, who may have different standards of completeness.  

When content becomes bloated, people stop paying attention.  There’s more information than they wanted.    

Making content more even

Another problem is when content is “lumpy”: some content relating to a certain purpose is long-winded, while other content is short.  A glossary has short definitions  for some  words but other definitions are several paragraphs.    We find this phenomenon in different places.  On the same website, people move between short web pages that say very little and long pages that scroll forever. 

Paradoxically, the process of structuring content into discrete independent units can have the effect of creating units of uneven duration.  The topic gets logically carved up.  But the content wasn’t planned for consistency in length.  Each element of content is independent, and acts differently, requiring more or less time to read or hear.  

 Audiences may give up if they encounter a long explanation when they were expecting a short one.  It only takes one or two explanations that are abnormally long for audiences to lose confidence in what to expect.  A predictable experience is broken.
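
One way to spot lumpy content is to estimate the listening duration of every item of a given type and flag the outliers. The Python sketch below makes several assumptions: it uses the 150 words-per-minute listening pace mentioned earlier, it treats anything longer than three times the median as abnormally long (an arbitrary threshold), and the glossary entries are invented placeholders.

```python
from statistics import median

LISTENING_WPM = 150  # listening pace mentioned earlier in this article


def listening_seconds(text: str) -> float:
    """Estimate the seconds needed to hear the text read aloud."""
    return len(text.split()) / LISTENING_WPM * 60


def find_lumpy_entries(entries: dict[str, str], ratio: float = 3.0) -> list[str]:
    """Return terms whose definitions run more than `ratio` times the median duration."""
    durations = {term: listening_seconds(body) for term, body in entries.items()}
    typical = median(durations.values())
    return [term for term, secs in durations.items() if secs > ratio * typical]


# Placeholder glossary: three short definitions and one long-winded one
glossary = {
    "timebox": "A fixed period of time allotted to an activity.",
    "chunk": "A small unit of content that can stand alone.",
    "metadata": "Data that describes other data.",
    "persona": " ".join(["A persona is a fictional profile of a typical user."] * 6),
}

print(find_lumpy_entries(glossary))  # ['persona']
```

The median is used rather than the mean so that one very long entry does not raise the baseline it is being compared against.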

Timeboxing content messages encourages content designers to squeeze as much impact as possible into the shortest possible time.

Message units, according to duration

If people have limited free time, have limited attention to offer, and have lukewarm interest, the content needs to be short — often shorter than one might like to create.

We can thus think about content duration in terms of “stretch goals” for different types of content.  Many people will be happy if the content offered can succeed in communicating a message while sticking to these durations.

While no absolute guidelines can be given for how long different content should be, it is nonetheless useful to make some educated guesses, and see how reliable they are.  We can divide durations into chunks that increase roughly threefold at each step, to give different duration levels.  We can then consider what kinds of information can reasonably be conveyed within such a duration.

  • 3-5 seconds: Concisely answer “What is?” or “Who is?” with a brief definition or analogy (“Jaws in outer space”)  
  • 10-15 seconds: Provide a short answer or tip, make a suggestion, provide a short list of options.
  • 30 seconds:  Suggest a new idea, explain a concept, offer an “elevator pitch”
  • 1-3 minutes: Discuss several things, explain the context or progression of something (an episode or explainer)
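
These duration levels can be turned into a simple check of which level a draft message fits. The Python sketch below is only one possible reading of the list. It treats the upper end of each range as a ceiling (5, 15, 30 and 180 seconds), assumes a 150 words-per-minute listening pace, and uses made-up tier labels and function names.

```python
LISTENING_WPM = 150  # assumed listening pace

# (ceiling in seconds, label): ceilings are the upper ends of the ranges above
TIERS = [
    (5, "definition or analogy"),
    (15, "short answer or tip"),
    (30, "new idea or elevator pitch"),
    (180, "episode or explainer"),
]


def listening_seconds(text: str) -> float:
    """Estimate the seconds needed to hear the text read aloud."""
    return len(text.split()) / LISTENING_WPM * 60


def smallest_tier(text: str):
    """Return the label of the shortest tier the text fits, or None if it exceeds them all."""
    seconds = listening_seconds(text)
    for ceiling, label in TIERS:
        if seconds <= ceiling:
            return label
    return None


print(smallest_tier("Jaws in outer space"))  # definition or analogy
```

A result of `None` means the draft runs longer than three minutes and probably needs to be split into separate messages.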

For writers accustomed to thinking about the number of words, thinking how long it would take to listen to a message involves a shift.  Yet messages must match an expected pattern.  Their duration is a function of how much new material is being introduced.  

Creating messages based on their duration helps to counterbalance a desire for completeness.  Messages don’t have to be complete.  They do have to be heard, and understood.  There’s always the opportunity for follow up.  And while less common, if a message is too short, it is also disappointing.

Testing the duration

For short chunks of content addressing routine purposes, content designers should craft their messages to be appealing to the distracted content consumer. They can ask:

  • How easy is the message to follow when read aloud?
  • Does the message stand on its own, without exhausting the audience?
  • Can people ask a question, and get back an answer that matches how much time they are willing and able to offer?

I expect that the growing body of research relating to voice user interaction (VUI) will add to our understanding of the role of duration in content structure.  Feel free to reach out to me on social or community channels if you’d like to share experience or research relating to this theme.   It’s an evolving area that deserves more discussion.

— Michael Andrews

How Content Can Answer Unanticipated Questions

How can publishers answer questions that audiences may have, when they don’t always know what will interest people? This is not a trick question. To be agile, publishers need to plan for flexibility.   They need to prepare content for scenarios they can’t anticipate in advance.

Content design has never been more important.  People have less time than ever to deal with unwanted content.  But content design should not be about spoon-feeding audiences answers to pre-approved questions.  Content design should instead empower audiences to consume the precise content they need.  Publishers should enable audiences to decide the answer that matches their need.  Publishers shouldn’t believe they can always anticipate what audiences need.  They can’t  always package content to match a  known need.  Recent developments in search technology are shaking up thinking about how to provide answers to audiences.

The Limitations of Questions as Templates for Content Development

Current practices presume a certain process.  We should start with a list of questions that users have, then write content answering those questions. The question will tell us what content to create. This approach, however, has limitations which may not be obvious.

I’ve long been an advocate and practitioner of user research.  It makes no sense to create content users indicate they have absolutely no interest in.  But user research is merely a starting point for considering user questions.  It should not be the final arbiter of what could be important to users.

“People are really fascinating and interesting … and weird! It’s really hard to guess their behaviors accurately. ” — Peter Koechley, Upworthy

Many user questions can’t be guessed, or discovered, in advance.  When doing user research, organizations can be over-confident about what questions they think users will have in the future.  User research probes the motivational level of interests and needs, rather than the more granular informational level of specific questions.  User research helps to understand users, but it will simplify user needs into personas.  The diversity, and contextual complexity, that spawn the range of real-world user questions get smoothed over.  Qualitative user research data is too broad to uncover the full range of potential questions in detail.  Quantitative data analysis of past online queries can provide more granular insights, but even quantitative data won’t predict all situations, especially when novel situations arise.

Two common approaches to question-templated content development are:

  • The “top tasks” approach
  • The long-tail approach.

Some content strategists favor the top task approach  — especially those who focus on task-oriented transactional content.

Many SEOs favor the long tail approach — especially those who want to promote awareness-orientated marketing content.

The top tasks approach makes assumptions about essential user questions, based on past user behavior with a website.  An organization may decide that the top 10 search queries drive 90% of web traffic, so those 10 questions are the ones to offer answers.  Each question gets one answer.  It’s a rearview approach that assumes no curiosity on the part of audiences.  Audience needs exist only as an extension of their interaction with the organization.  All questions considered relevant relate to user tasks linked to that specific organization.

The hidden assumptions of the top tasks approach are:

  • Everyone has the same questions
  • Because everyone has the same questions, everyone should get the same answers
  • If different people start to ask different questions, publishers can ignore those questions, because they aren’t top questions.

Providing homogenized answers to homogenized questions is appealing to homogenized organizations, especially government offices, banks, or tech support units.  But cookie cutter content can seem like it’s created by a faceless organization.  Standardized answers don’t satisfy customers’ growing expectations. They expect more personalized service.

The long tail approach tries to anticipate user questions by crafting answers for many question variations.  Each variation addresses an ever narrower range of questions. The idea is to get an inventory of questions all kinds of people are asking, and then develop answers to all these questions, so there is something for everyone.  On the surface, this approach seems to deliver more individualized answers.  But as we will see, that is not always the case.

Both the top tasks and long tail approaches assume that each question has one answer.  A content item exists to answer that one specific question.

In practice, the formula that one question has one answer doesn’t hold.   Different questions lead to the same content.  Type question variations on Google, and Google rewards you with the same links going to the same content.  Not all question variations are substantially different.  If you type “How to fly a kite” in Google, you can see related questions such as “How to fly a kite step-by-step” or “How to fly a kite by yourself”.  You’ll also find “long tail” questions such as “How to fly a kite with little wind” or even more optimistically, “How to fly a kite with no wind”.

The notion of a related search is vague.  It could be a search query that is essentially equivalent to another, but phrased differently.  It could be a question that implies distinctions or details that may not be present in the information or that may not even be crucial.  Suppose we imagine content addressing “How to fly a kite for firefighters” and another on “Easy steps to kite flying for bus drivers”.  We’d likely find the essence of this long tail content is no different from the more general answer.  The idea that long tail content is necessarily more relevant is fiction.

The other characteristic of question-templated content is that the questions and answers are pre-assembled and frozen.  If we phrase a question differently, such as “What’s different about kite flying for bus drivers?”, we aren’t likely to get an answer.  At most, we’ll get content talking about kite flying that for some reason mentions bus drivers.  The content creator decides what content the reader will get, instead of the reader deciding.

Content design should be built on a foundation of compositional content.  What content is assembled and delivered can be based on the specific question asked.  Suppose you want to ask “How to tell someone to ‘go fly a kite’ ”?  When decomposed, the question reveals two distinct sub-questions.  One sub-question concerns how to deliver a message in general, covering tone or medium.  The other sub-question concerns what message alternatives are available about a specific issue — in this example, the desire to get someone else to change their behavior.

In principle, machines can assemble an answer to such a complex question, even though no person has created an answer to that specific question already.  The machine would draw on two components.  One component would address points to make about an issue, and the other component would address ways to deliver those points.

A compositional topic could be rich in variations that would yield different answers.  It could address: “How to tell a colleague…” or “How to tell a nosy relative…,” or whomever.  The answer could include components about the general aspects of the issue, which could be supplemented with some advice specific to the question variation.
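
A toy illustration of this kind of assembly, sketched in Python. Everything here is hypothetical: the component stores, their keys, the placeholder advice text, and the `assemble_answer` function are invented to show the shape of the idea, not how any real system works.

```python
# Hypothetical component store: points to make about an issue
ISSUE_POINTS = {
    "go fly a kite": "Say plainly that you want the behavior to stop.",
}

# Hypothetical component store: ways to deliver a message
DELIVERY_ADVICE = {
    "general": "Pick a calm moment and keep your tone even.",
    "colleague": "Keep it professional and raise it in private.",
    "nosy relative": "Be warm but firm, and change the subject afterward.",
}


def assemble_answer(issue: str, audience: str = "general") -> str:
    """Combine an issue component with general and audience-specific delivery components."""
    parts = [ISSUE_POINTS[issue], DELIVERY_ADVICE["general"]]
    # Supplement the general advice only when a matching variation exists
    if audience != "general" and audience in DELIVERY_ADVICE:
        parts.append(DELIVERY_ADVICE[audience])
    return " ".join(parts)


print(assemble_answer("go fly a kite", "colleague"))
```

No one wrote an answer to "How to tell a colleague to go fly a kite" in advance. The answer exists only because the question selected which components to combine.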

For those familiar with structured content, the use of components to create content variations will seem familiar.  The difference here is that users initiate the assembly of components in novel configurations.  We don’t know in advance what the user wants, so we therefore have to provide them with the raw material to supply the answer to their unknown query.

Information Generates Questions

Part of the reason people can be unpredictable in their questions is that their interests and understanding evolve over time.  Sometimes the facts of a situation can change as well.

Laura E. Davis, digital news director of USC’s Annenberg Media Center, recently wrote about “Writing answers before you know the question.”   Her question flips the assumption that most writers have: that writers know reader questions ahead of time, and the task of the writer is to provide answers to them.  Most writers expect that information presented will follow the questions audiences ask.  But the reverse is also true. Information, or the expectation of information, sparks questions.  Sometimes writers will never have thought of the questions their readers might have.

Davis cites several trends that are making audience questions less predictable.  Audiences are becoming more conversational in how they access content.  Questions can unfold in a conversation, without knowing where they may lead.  Events can unfold quickly, and not conform to a tidy summary answer. These issues gain importance as conversational interfaces become more common.  “As we move forward, more and more, we’ll be writing answers before we know the question.”

In conversation, questions and answers flow spontaneously.  How can content become more spontaneous?  How can content prepare for a “zero UI” future, as Davis puts it?  We’ll look at two approaches, metadata and machine reading, which publishers can combine to offer laser precision in answers.

‘Literate machines’ will provide dynamic answers

Historically, questions asked online were answered by a list of hyperlinks.  Even today, many chatbots provide an answer by pointing to a hyperlink of content the reader must read.   When a computer points a user to a document title (in the form of a hyperlink), it generally is pointing the user to pre-assembled content.  Pre-assembled content runs a high risk of not being exactly what the user is looking for.

Yet the more recent trend is to provide answers directly, instead of answering queries by providing links to documents.  Everyone is familiar with Google’s instant answers. This approach is being adopted by most of the other major tech companies as well.  How answers are being delivered is transforming quickly.

Advances in semantic technology and AI are allowing both questions and answers to become more iterative, and fluid.  Users may not consider a single answer to a question they pose as complete. They may want several pieces of information to give them a complete understanding.  To give users complete answers, machines stitch together several fragments from different sources.  Audiences can ask clarifying or follow up questions to fill out their knowledge, and contextual answers will appear.

Semantic metadata facilitates machine discovery and understanding of information.  Metadata is powerful because it can relate information from different sources. Publishers can include their information as part of a relevant answer to a user query.  For example, suppose a user asks “What local cinemas are showing films made before 1960 this evening?”  There may not be a single item of content providing that answer.  But metadata from different content can assemble an answer.  The listings of local cinemas can be combined with data about films from a film encyclopedia (to filter by year).  The ability of metadata to assemble information from many sources upends the expectation of some publishers, who believe they must provide comprehensive information about topics to answer any audience question.  Instead, their goal should be to focus on providing comprehensive information that they are uniquely positioned to offer, and to link through metadata to other sources that provide related information that might arise in a question asked by users.
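
The cinema example amounts to a join across two sources on a shared identifier. The Python sketch below makes several assumptions: the cinema names, showtimes and `film:` identifiers are invented, "evening" is taken to mean a start time of 18:00 or later, and real publishers would expose this data through structured metadata rather than in-memory lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Showing:
    cinema: str
    film_id: str
    start_hour: int  # 24-hour clock, local time


# Source 1 (hypothetical): tonight's listings from a local publisher
SHOWINGS = [
    Showing("Rialto", "film:casablanca", 19),
    Showing("Rialto", "film:arrival", 21),
    Showing("Odeon", "film:vertigo", 14),
    Showing("Odeon", "film:casablanca", 20),
]

# Source 2 (hypothetical): a film encyclopedia keyed by the same identifiers
RELEASE_YEAR = {
    "film:casablanca": 1942,
    "film:vertigo": 1958,
    "film:arrival": 2016,
}


def cinemas_showing_films_before(year: int, evening_starts: int = 18) -> list[str]:
    """Join listings with release years on the shared film identifier."""
    matches = {
        showing.cinema
        for showing in SHOWINGS
        if showing.start_hour >= evening_starts
        # Films missing from the encyclopedia default to `year`, so they are excluded
        and RELEASE_YEAR.get(showing.film_id, year) < year
    }
    return sorted(matches)


print(cinemas_showing_films_before(1960))  # ['Odeon', 'Rialto']
```

Neither source answers the question on its own. The listings know nothing about release years, and the encyclopedia knows nothing about tonight's schedule. The shared identifier is what lets a machine combine them.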

The question in this example may seem arbitrary, and it is.  Why would someone want to watch films made before 1960?  What’s special about 1960?  Why not 1965?  Or 1950?  Because the question, seen from the outside, seems arbitrary, no one will create content specifically to answer this question.  The variations in how the question could be framed are limitless.  Which is why metadata is powerful in providing answers to questions that may be infrequently asked, or have never been asked previously.  Just because a question is novel does not mean it is unimportant.

Given the quantity of content that’s created, someone may have written content that provides part of an answer to a question.  But that answer could be buried within a larger discussion that isn’t the focus of the user’s question.  If you are curious where a new film star grew up, there might not be specific content answering that question.  But he or she may have mentioned it in passing during an interview about their latest film.  How might you locate that information without reading various interviews in full?

Machine reading comprehension (MRC) is an emerging technique that promises to transform how content is used.  Its premise is simple but awe inspiring.  Machines can read texts just like humans do, and understand what the text means.  They can do this at incredible speeds, so that they can locate specific statements quickly, interpret what each statement means, and relate it to questions or statements made elsewhere.  Machine reading does not require structure, but it presumably benefits from having structure.

Amy Webb at NYU demonstrated how machine reading comprehension works in a recent presentation (at minute 34).  Reading a book, MRC can extract the meaning.  Yes, someday soon computers will be able to speed-read War and Peace and be able to tell us what the novel is about (beyond the obvious, that it’s about Russia.)

Slide from Amy Webb presentation on machine reading comprehension (MRC) at ONA17 conference.

MRC has been a keen research focus of many firms developing audio interfaces.  Audioburst is a new service that digests the transcripts of audio interviews.  Users can ask Alexa a question about a news topic, and Alexa can query Audioburst to find snippets of content relevant to the query, and will combine and play back different audio clips from different radio programs related to the question.

Microsoft has been at the forefront of MRC research.   I want to highlight some of their work because they are combining MRC with semantic metadata in products that are widely used.

“We’re trying to develop what we call a literate machine: A machine that can read text, understand text and then learn how to communicate, whether it’s written or orally.” — Kaheer Suleman of Microsoft

Microsoft notes: “Machine reading comprehension systems also could help people more easily find the information they need in car manuals or dense tax code documents.”

MRC is being used in Microsoft products such as Cortana (the voice assistant similar to Alexa or Siri), and Bing (the search engine that competes with Google).

A recent news article states: “Microsoft’s virtual assistant Cortana will get an upgrade as well, allowing it to make use of machine reading comprehension to summarize search results. ”

Earlier this month, Bing announced it would use MRC: “Bing’s comparison answers understand entities, their aspects, and using machine reading comprehension, reads the web to save you time combing through numerous dense documents.”

screenshot of Bing blog post on MRC
How Bing uses machine reading to provide multifaceted answers based on text from different sources

 

For Bing users this means:

  • “If there are different authoritative perspectives on a topic, such as benefits vs drawbacks, Bing will aggregate the two viewpoints from reputable sources”
  • “If there are multiple ways to answer a question, you’ll get a carousel of intelligent answers.”
  • “If you need help figuring out the right question to ask, Bing will help you with clarifying questions.”

As the Microsoft examples highlight, the notion that there is only one best answer to a question is no longer a given.  People want different perspectives, and different levels of detail.  Literate machines can help people retrieve answers that match their interests.

Conclusion

Information-rationing is not in the best interests of content consumers.  Content strategists have long warned of the dangers of providing too much information.  But too much information isn’t necessarily the problem.  No one complains about Wikipedia having too much information.

My advice to content creators is this.  If you have unique information to share, you should publish it.  Even if you’re not sure whether users have a pre-existing need to look for that information, it could be valuable.  Self-censorship does not make sense.  At the same time, content creators should not feel they must create a complete or definitive presentation of a topic.  Increasingly, machines will be able to stitch together information from different sources for the benefit of users.  Content creators should focus on what they know best.  Duplicating information that exists elsewhere benefits no one.

We can’t predict what information people will need in the future. Content that is information-rich is worthwhile content.  We need to make such information accessible, so audiences can retrieve it when it is needed.  We need to help make machines literate.

— Michael Andrews