
Paradata: where analytics meets governance

Organizations aspire to make data-informed decisions. But can they confidently rely on their data? What does that data really tell them, and how was it derived? Paradata, a specialized form of metadata, can provide answers.

Many disciplines use paradata

You won’t find the word paradata in a household dictionary, and the concept is largely unknown in the content profession. Yet paradata is highly relevant to content work. It provides context showing how the activities of writers, designers, and readers can influence each other.

Paradata provides a unique and missing perspective. A forthcoming book on paradata defines it as “data on the making and processing of data.” Paradata extends beyond basic metadata — “data about data.” It introduces the dimensions of time and events. It considers the how (process) and the what (analytics).

Think of content as a special kind of data that has a purpose and a human audience. Content paradata can be defined as data on the making and processing of content.

Paradata can answer:

  • Where did this content come from?
  • How has it changed?
  • How is it being used?

Paradata differs from other kinds of metadata in its focus on the interaction of actors (people and software) with information. It provides context that helps planners, designers, and developers interpret how content is working.

Paradata traces activity during various phases of the content lifecycle: how it was assembled, interacted with, and subsequently used. It can explain content from different perspectives:

  • Retrospectively 
  • Contemporaneously
  • Predictively

Paradata provides insights into processes by highlighting the transformation of resources in a pipeline or workflow. By recording the changes, it becomes possible to reproduce those changes. Paradata can provide the basis for generalizing the development of a single work into a reusable workflow for similar works.

Some discussions of paradata refer to it as “processual meta-level information on processes” (processual here refers to the process of developing processes). Knowing how activities happen provides the foundation for sound governance.

Contextual information facilitates reuse. Paradata can enable the cross-use and reuse of digital resources. A key challenge for reusing any content created by others is understanding its origins and purpose. It’s especially challenging when encouraging collaborative reuse across job roles or disciplines. One study of the benefits of paradata notes: “Meticulous documentation and communication of contextual information are exceedingly critical when (re)users come from diverse disciplinary backgrounds and lack a shared tacit understanding of the priorities and usual practices of obtaining and processing data.”

While paradata isn’t currently utilized in mainstream content work, a number of content-adjacent fields use it, pointing to potential opportunities for content developers.

Content professionals can learn from how paradata is used in:

  • Survey and research data
  • Learning resources
  • AI
  • API-delivered software

Each discipline looks at paradata through different lenses and emphasizes distinct phases of the content or data lifecycle. Some emphasize content assembly, while others emphasize content usage. Some emphasize both, building a feedback loop.

Conceptualizing paradata
Different perspectives of paradata. Source: Isto Huvila

Content professionals should learn from other disciplines, but they should not expect others to talk about paradata in the same way.  Paradata concepts are sometimes discussed using other terms, such as software observability. 

Paradata for surveys and research data

Paradata is most closely associated with developing research data, especially statistical data from surveys. Survey researchers pioneered the field of paradata several decades ago, aware of the sensitivity of survey results to the conditions under which they are administered.

The National Institute of Statistical Sciences describes paradata as “data about the process of survey production” and as “formalized data on methodologies, processes and quality associated with the production and assembly of statistical data.”  

Researchers realize that how information is assembled can influence what can be concluded from it. In a survey, a confounding factor could be a glitch in a form or a leading question that prompts a disproportionate number of people to answer in a particular way.

The US Census Bureau, which conducts a range of surveys of individuals and businesses, explains: “Paradata is a term used to describe data generated as a by-product of the data collection process. Types of paradata vary from contact attempt history records for interviewer-assisted operations, to form tracing using tracking numbers in mail surveys, to keystroke or mouse-click history for internet self-response surveys.”  For example, the Census Bureau uses paradata to understand and adjust for non-responses to surveys. 
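
To make that concrete, here is a minimal sketch of what a collection-process paradata record for one web survey response might contain. The field names and values are hypothetical, not drawn from any agency’s actual schema.

```python
# Hypothetical paradata record for one web survey response.
# Field names are illustrative, not an official survey-paradata schema.
survey_paradata = {
    "respondent_id": "R-10482",
    "mode": "internet-self-response",
    "contact_attempts": [
        {"date": "2024-03-02", "method": "mail invitation", "outcome": "no response"},
        {"date": "2024-03-09", "method": "email reminder", "outcome": "started survey"},
    ],
    "device": "mobile",
    "keystroke_events": 214,          # by-product of form interaction
    "time_per_question_sec": {"Q1": 8.2, "Q2": 41.5, "Q3": 6.9},
    "breakoff_question": None,        # None means the survey was completed
}

# A long dwell time on Q2 might flag a confusing question worth redesigning.
slow_questions = [q for q, t in survey_paradata["time_per_question_sec"].items() if t > 30]
print(slow_questions)  # ['Q2']
```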

Paradata for surveys
Source: NDDI 

As computers become more prominent in the administration of surveys, they become actors influencing the process. Computers can record an array of interactions between people and software.

Why should content professionals care about survey processes?

Think about surveys as a structured approach to assembling information about a topic of interest. Paradata can indicate whether users were able to submit survey answers and under what conditions people were most likely to respond. Researchers use paradata to measure user burden. Paradata helps illuminate the work required to provide information, a topic relevant to content professionals interested in the authoring experience of structured content.

Paradata supports research of all kinds, including UX research. It’s used in archaeology and archives to describe the process of acquiring and preserving assets and changes that may happen to them through their handling. It’s also used in experimental data in the life sciences.

Paradata supports reuse. It provides information about the context in which information was developed, improving its quality, utility, and reusability.

Researchers in many fields are embracing what are known as the FAIR principles: making data Findable, Accessible, Interoperable, and Reusable. Scientists want the ability to reproduce the results of previous research and build new knowledge upon it. Paradata supports the goals of FAIR data. As one study notes, “understanding and documentation of the contexts of creation, curation and use of research data…make it useful and usable for researchers and other potential users in the future.”

Content developers similarly should aspire to make their content findable, accessible, interoperable, and reusable for the benefit of others. 

Paradata for learning resources

Learning resources are specialized content that needs to adapt to different learners and goals. How resources are used and changed influences the outcomes they achieve. Some education researchers have described paradata as “learning resource analytics.”

Paradata for instructional resources is linked to learning goals. “Paradata is generated through user processes of searching for content, identifying interest for subsequent use, correlating resources to specific learning goals or standards, and integrating content into educational practices,” notes a Wikipedia article. 

Data about usage isn’t represented in traditional metadata. A document prepared for the US Department of Education notes: “Say you want to share the fact that some people clicked on a link on my website that leads to a page describing the book. A verb for that is ‘click.’ You may want to indicate that some people bookmarked a video for a class on literature classics. A verb for that is ‘bookmark.’ In the prior example, a teacher presented resources to a class. The verb used for that is ‘taught.’ Traditional metadata has no mechanism for communicating these kinds of things.”

“Paradata may include individual or aggregate user interactions such as viewing, downloading, sharing to other users, favoriting, and embedding reusable content into derivative works, as well as contextualizing activities such as aligning content to educational standards, adding tags, and incorporating resources into curriculum.” 

Usage data can inform content development.  One article expresses the desire to “establish return feedback loops of data created by the activities of communities around that content—a type of data we have defined as paradata, adapting the term from its application in the social sciences.”

Unlike traditional web analytics, which focuses on web pages or user sessions and doesn’t consider the user context, paradata focuses on the user’s interactions in a content ecosystem over time. The data is linked to content assets to understand their use. It resembles social media metadata that tracks the propagation of events as a graph.

“Paradata provides a mechanism to openly exchange information about how resources are discovered, assessed for utility, and integrated into the processes of designing learning experiences. Each of the individual and collective actions that are the hallmarks of today’s workflow around digital content—favoriting, foldering, rating, sharing, remixing, embedding, and embellishing—are points of paradata that can serve as indicators about resource utility and emerging practices.”

Paradata for learning resources uses the Activity Streams JSON format, which tracks interactions between actors and objects using predefined verbs (an “Activity Schema”) that can be measured. The approach can be applied to any kind of content.
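
As an illustration, a single activity in that style might look like the sketch below, expressed as a Python dict. The actor, resource, and URL are invented; “View” is a core Activity Streams 2.0 type, while verbs like “taught” or “bookmark” would come from an extended vocabulary such as the one used for learning-resource paradata.

```python
# A minimal Activity Streams 2.0 activity, expressed as a Python dict.
# The actor and object are invented for illustration.
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "View",
    "actor": {"type": "Person", "name": "Ms. Rivera (9th-grade teacher)"},
    "object": {
        "type": "Document",
        "name": "Introduction to Literary Classics",
        "url": "https://example.edu/resources/lit-classics-101",
    },
    "published": "2024-04-16T14:30:00Z",
}

# Aggregating such activities per resource yields the usage paradata
# ("learning resource analytics") described above.
```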

Paradata for AI

AI has a growing influence over content development and distribution. Paradata is emerging as a strategy for producing “explainable AI” (XAI).  “Explainability, in the context of decision-making in software systems, refers to the ability to provide clear and understandable reasons behind the decisions, recommendations, and predictions made by the software.”

The Association for Intelligent Information Management (AIIM) has suggested that a “cohesive package of paradata may be used to document and explain AI applications employed by an individual or organization.” 

Paradata provides a manifest of the AI training data. AIIM identifies two kinds of paradata: technical and organizational.

Technical paradata includes:

  • The model’s training dataset
  • Versioning information
  • Evaluation and performance metrics
  • Logs generated
  • Existing documentation provided by a vendor

Organizational paradata includes:

  • Design, procurement, or implementation processes
  • Relevant AI policy
  • Ethical reviews conducted

Paradata for AI
Source: Patricia C. Franks
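
As a minimal sketch, a paradata “manifest” for an AI model might combine the technical and organizational elements listed above. All field names and values below are hypothetical, not an AIIM-prescribed format.

```python
# Hypothetical paradata manifest for a deployed AI model.
# Keys mirror the technical/organizational split above; values are invented.
ai_paradata = {
    "technical": {
        "model_name": "support-answer-generator",
        "version": "2.3.1",
        "training_dataset": "internal-kb-snapshot-2024-01",
        "evaluation": {"accuracy": 0.91, "hallucination_rate": 0.04},
        "vendor_documentation": "https://example.com/model-card.pdf",
        "log_location": "s3://example-bucket/model-logs/",
    },
    "organizational": {
        "procurement_process": "RFP-2023-117",
        "applicable_policy": "AI Use Policy v4",
        "ethical_review": {"date": "2023-11-08", "outcome": "approved with conditions"},
    },
}
```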

The provenance of AI models and their training has become a governance issue as more organizations use machine learning models and LLMs to develop and deliver content. AI models tend to be “black boxes” that users are unable to untangle and understand.

How AI models are constructed has governance implications, given their potential to be biased or to contain unlicensed copyrighted or other proprietary data. Developing paradata for AI models will be essential if these models are to gain wide adoption.

Paradata and document observability

Observing how behavior unfolds helps teams debug problems and make systems more resilient.

Fabrizio Ferri-Benedetti, whom I met some years ago in Barcelona at a Confab conference, recently wrote about a concept he calls “document observability” that has parallels to paradata.

Content practices can borrow from software practices. As software becomes more API-focused, firms are monitoring API logs and metrics to understand how various routines interact, a field called observability. The goal is to identify and understand unanticipated occurrences. “Debugging with observability is about preserving as much of the context around any given request as possible, so that you can reconstruct the environment and circumstances that triggered the bug.”

Observability utilizes a profile called MELT: Metrics, Events, Logs, and Traces. MELT is essentially paradata for APIs.
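
Translated to content, a MELT-style event for an API-delivered documentation page might look like the sketch below. The field names are invented for illustration, not drawn from any specific telemetry standard; the point is that metrics, events, logs, and traces can attach to a content asset rather than to a software service.

```python
import json
import time
import uuid

# Hypothetical MELT-style event emitted when a docs page is fetched via an API.
# Field names are illustrative; a real system would follow its own telemetry schema.
def emit_content_event(content_id: str, action: str, latency_ms: float) -> dict:
    event = {
        "trace_id": str(uuid.uuid4()),      # trace: ties this fetch to a user journey
        "timestamp": time.time(),            # event: when the interaction happened
        "content_id": content_id,            # which content asset was involved
        "action": action,                    # e.g. "fetched", "searched", "rated"
        "metrics": {"latency_ms": latency_ms},
        "log": f"{action} {content_id} in {latency_ms:.0f} ms",
    }
    print(json.dumps(event))                 # stand-in for sending to a telemetry pipeline
    return event

emit_content_event("docs/install-guide", "fetched", 42.0)
```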

Software observability pattern
Software observability pattern.  Source: Karumuri, Solleza, Zdonik, and Tatbul

Content, like software, is becoming more API-enabled. Content can be tapped from different sources and fetched interactively. The interaction of content pieces in a dynamic context showcases the content’s temporal properties.

When things behave unexpectedly, systems designers need the ability to reverse engineer that behavior. An article in IEEE Software states: “One of the principles for tackling a complex system, such as a biochemical reaction system, is to obtain observability. Observability means the ability to reconstruct a system’s internal state from its outputs.”

Ferri-Benedetti notes, “Software observability, or o11y, has many different definitions, but they all emphasize collecting data about the internal states of software components to troubleshoot issues with little prior knowledge.”  

Because documentation is essential to the software’s operation, Ferri-Benedetti advocates treating “the docs as if they were a technical feature of the product,” where the content is “linked to the product by means of deep linking, session tracking, tracking codes, or similar mechanisms.”

He describes document observability (“do11y”) as “a frame of mind that informs the way you’ll approach the design of content and connected systems, and how you’ll measure success.”

In contrast to observability, which relies on incident-based indexing, paradata is generally defined by a formal schema. A schema allows stakeholders to manage and change the system instead of merely reacting to it and fixing its bugs. 

Applications of paradata to content operations and strategy

Why introduce a new concept that most people have never heard of? Because content professionals need to expand their toolkit.

Content is becoming more complex. It touches many actors: employees in various roles, customers with multiple needs, and IT systems with different responsibilities. Stakeholders need to understand the content’s intended purpose, how it is used in practice, and whether the two diverge. Do people need to adapt content because the original does not meet their needs? Should people be adapting existing content, or should that content be easier to reuse in its original form?

Content continuously evolves and changes shape, acquiring emergent properties. People and AI customize, repurpose, and transform content, making it more challenging to know how these variations affect outcomes. Content decisions involve more people over extended time frames. 

Content professionals need better tools and metrics to understand how content behaves as a system. 

Paradata provides contextual data about the content’s trajectory. It builds on two kinds of metadata that connect content to user action:

  • Administrative metadata capturing the actions of the content creators or authors, intended audiences, approvers, versions, and when last updated
  • Usage metadata capturing the intended and actual uses of the content, both internal (asset role, rights, where item or assets are used) and external (number of views, average user rating)
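
As a rough sketch, these two kinds of metadata might sit alongside a single content item as follows; all field names and values are invented for illustration.

```python
# Hypothetical metadata attached to one content item.
# Administrative fields describe who made it; usage fields describe how it is used.
content_item = {
    "id": "faq-return-policy",
    "administrative": {
        "author": "j.doe",
        "approver": "legal-review",
        "intended_audience": "customers",
        "version": "1.4",
        "last_updated": "2024-05-20",
    },
    "usage": {
        "asset_role": "policy explanation",
        "rights": "internal and web",
        "used_in": ["help-center", "checkout-flow"],
        "views_30d": 12840,
        "average_rating": 4.2,
    },
}
```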

Paradata also incorporates newer forms of semantic and blockchain-based metadata that address change over time:

  • Provenance metadata
  • Actions schema types

Provenance metadata has become essential for image content, which can be edited and transformed in multiple ways that change what it represents. Organizations need to know the source of the original and what edits have been made to it, especially with the rise of synthetic media. Metadata can indicate what an image was based on or derived from, who made changes, and what software generated those changes. Two corporate initiatives focused on provenance metadata are the Content Authenticity Initiative and the Coalition for Content Provenance and Authenticity.

Actions are an established — but underutilized — dimension of metadata. The widely adopted schema.org vocabulary has a class of actions that address both software interactions and physical world actions. The schema.org actions build on the W3C Activity Streams standard, which was upgraded in version 2.0 to semantic standards based on JSON-LD types.
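
For example, a customer bookmarking an article could be recorded as a schema.org BookmarkAction in JSON-LD, shown here as a Python dict. The person identifier and URL are placeholders.

```python
# A schema.org BookmarkAction expressed as JSON-LD (as a Python dict).
# The person and article identifiers are placeholders.
bookmark_action = {
    "@context": "https://schema.org",
    "@type": "BookmarkAction",
    "agent": {"@type": "Person", "identifier": "user-48213"},
    "object": {
        "@type": "Article",
        "headline": "How to file a warranty claim",
        "url": "https://example.com/articles/warranty-claim",
    },
    "endTime": "2024-06-03T09:12:00Z",
    "actionStatus": "CompletedActionStatus",
}
```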

Content paradata can clarify common issues such as:

  • How can content pieces be reused?
  • What was the process for creating the content, and can one reuse that process to create something similar?
  • When and how was this content modified?

Paradata can help overcome operational challenges such as:

  • Content inventories where it is difficult to distinguish similar items or versions
  • Content workflows where it is difficult to model how distinct content types should be managed
  • Content analytics, where the performance of content items is bound up with channel-specific measurement tools

Implementing content paradata must be guided by a vision. The most mature application of paradata – for survey research – has evolved over several decades, prompted by the need to improve survey accuracy. Other research fields are adopting paradata practices as research funders insist that data be “FAIR.” Change is possible, but it doesn’t happen overnight. It requires having a clear objective.

It may seem unlikely that content publishing will embrace paradata anytime soon. However, the explosive growth of AI-generated content may provide the catalyst for introducing paradata elements into content practices. The unmanaged generation of content will be a problem too big to ignore.

The good news is that online content publishing can take advantage of existing metadata standards and frameworks that provide paradata. What’s needed is to incorporate these elements into content models that manage internal systems and external platforms.

Online publishers should introduce paradata into systems they directly manage, such as their digital asset management system or customer portals and apps. Because paradata can encompass a wide range of actions and behaviors, it is best to prioritize tracking actions that are difficult to discern but likely to have long-term consequences. 

Paradata can provide robust signals to reveal how content modifications impact an organization’s employees and customers.  

– Michael Andrews


Supporting content compliance using Generative AI

Content compliance is challenging and time-consuming. Surprisingly, one of the most interesting use cases for Generative AI in content operations is to support compliance.

Compliance shouldn’t be scary

Compliance can seem scary. Authors must use the right wording lest things go haywire later, be it bad press or social media exposure, regulatory scrutiny, or even lawsuits. Even when the odds of mistakes are low because the compliance process is rigorous, satisfying compliance requirements can seem arduous. It can involve rounds of rejections and frustration.

Competing demands. Enterprises recognize that compliance is essential and touches more content areas, but scaling compliance is hard. Lawyers and other experts know what’s compliant but often lack knowledge of what writers will be creating. Compliance is also demanding for the compliance teams themselves.

Both writers and reviewers need better tools to make compliance easier and more predictable.

Compliance is risk management for content

Because words matter, words carry risks. The wrong phrasing or missing wording can expose firms to legal liability. The growing volume of content places big demands on the legal and compliance teams that must review that content.

A major issue in compliance is consistency. Inconsistent content is risky. Compliance teams want consistent phrasing so that the message complies with regulatory requirements while aligning with business objectives.

Compliant content is especially critical in fields such as finance, insurance, pharmaceuticals, medical devices, and the safety of consumer and industrial goods. Content about software faces more regulatory scrutiny as well, such as privacy disclosures and data rights. All kinds of products can be required to disclose information relating to health, safety, and environmental impacts.  

Compliance involves both what’s said and what’s left unsaid. Broadly, compliance looks at four thematic areas:

  1. Truthfulness
    1. Factual precision and accuracy 
    2. Statements would not reasonably be misinterpreted
    3. Not misleading about benefits, risks, or who is making a claim
    4. Product claims backed by substantial evidence
  2. Completeness
    1. Everything material is mentioned
    2. Nothing is undisclosed or hidden
    3. Restrictions or limitations are explained
  3. Whether impacts are noted
    1. Anticipated outcomes (future obligations and benefits, timing of future events)
    2. Potential risks (for example, potential financial or health harms)
    3. Known side effects or collateral consequences
  4. Whether the rights and obligations of parties are explained
    1. Contractual terms of parties
    2. Supplier’s responsibilities
    3. Legal liabilities 
    4. Voiding of terms
    5. Opting out

Example of a proposed rule from the Federal Trade Commission
Source: Federal Register

Content compliance affects more than legal boilerplate. Many kinds of content can require compliance review, from promotional messages to labels on UI checkboxes. Compliance can be a concern for any content type that expresses promises, guarantees, disclaimers, or terms and conditions.  It can also affect content that influences the safe use of a product or service, such as instructions or decision guidance. 

Compliance requirements will depend on the topic and intent of the content, as well as the jurisdiction of the publisher and audience.  Some content may be subject to rules from multiple bodies, both governmental regulatory agencies and “voluntary” industry standards or codes of conduct.

“Create once, reuse everywhere” is not always feasible. Historically, compliance teams have relied on prevetted legal statements that appear at the footer of web pages or in terms and conditions linked from a web page. Such content is comparatively easy to lock down and reuse where needed.

Governance, risk, and compliance (GRC) teams want consistent language, which helps them keep tabs on what’s been said and where it’s been presented. Reusing the same exact language everywhere provides control.

But as the scope of content subject to compliance concerns widens to touch more types of content, the ability to quarantine compliance-related statements in separate content items diminishes. Compliance-touching content must match the context in which it appears and be integrated into the content experience. Not all such content fits a standardized template, even though the issues discussed are repeated.

Compliance decisions rely on nuanced judgment. Authors may not think a statement appears deceptive, but regulators might have other views about what constitutes “false claims.” Compliance teams have expertise in how regulators might interpret statements.  They draw on guidance in statutes, regulations, policies, and elaborations given in supplementary comments that clarify what is compliant or not. This is too much information for authors to know.

Content and compliance teams need ways to handle recurring issues in contextually relevant ways.

Generative AI points to possibilities to automate some tasks to accelerate the review process. 

Strengths of Generative AI for compliance

Generative AI may seem like an unlikely technology to support compliance. It’s best known for its stochastic behavior, which can produce hallucinations – the stuff of compliance nightmares.  

Compliance tasks reframe how GenAI is used.  GenAI’s potential role in compliance is not to generate content but to review human-developed content. 

Because content generation produces so many hallucinations, researchers have been exploring ways to use LLMs to check GenAI outputs to reduce errors. These same techniques can be applied to the checking of human-developed content to empower writers and reduce workloads on compliance teams.

Generative AI can find discrepancies and deviations from expected practices. It trains its attention on patterns in text and other forms of content. 

While GenAI doesn’t understand the meaning of the text, it can locate places in the text that match other examples, a useful capability for authors and compliance teams needing to make sure noncompliant language doesn’t slip through. Moreover, LLMs can process large volumes of text.

GenAI focuses on wording and phrasing.  Generative AI processes sequences of text strings called tokens. Tokens aren’t necessarily full words or phrases but subparts of words or phrases. They are more granular than larger content units such as sentences or paragraphs. That granularity allows LLMs to process text at a deep level.

LLMs can compare sequences of strings and determine whether two pairs are similar or not. Tokenization allows GenAI to identify patterns in wording. It can spot similar phrasing even when different verb tenses or pronouns are used. 

LLMs can support compliance by comparing text and determining whether a string of text is similar to other texts. They can compare the drafted text to either a good example to follow or a bad example to avoid. Since the wording is highly contextual, similarities may not be exact matches, though they consist of highly similar text patterns.
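
Below is a toy sketch of this kind of comparison. It uses simple word-overlap scoring purely to illustrate the workflow; a real compliance tool would rely on an LLM or embedding model to catch paraphrases, and the clauses and threshold here are invented.

```python
# Toy clause comparison: score a draft clause against approved "precedent" clauses.
# Word-overlap (Jaccard) similarity stands in for the semantic similarity an
# LLM or embedding model would provide in a real compliance tool.
def similarity(a: str, b: str) -> float:
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

approved_precedents = [
    "You may cancel your subscription at any time without penalty.",
    "Refunds are issued within 30 days of a written request.",
]

draft = "Customers can cancel the subscription at any time and owe no penalty."

for precedent in approved_precedents:
    score = similarity(draft, precedent)
    flag = "close match" if score > 0.3 else "review wording"
    print(f"{score:.2f}  {flag}  ->  {precedent}")
```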

GenAI can provide an X-ray view of content. Not all words are equally important. Some words carry more significance due to their implied meaning. But it can be easy to overlook special words embedded in the larger text or not realize their significance.

Generative AI can identify words or phrases within the text that carry very specific meanings from a compliance perspective. These terms can then be flagged and linked to canonical authoritative definitions so that writers understand how these words are understood from a compliance perspective. 

Generative AI can also flag vague or ambiguous words that have no reference defining what the words mean in the context. For example, if the text mentions the word “party,” there needs to be a definition of what is meant by that term that’s available in the immediate context where the term is used.

GenAI’s “multimodal” capabilities help evaluate the context in which the content appears. Generative AI is not limited to processing text strings. It is becoming more multimodal, allowing it to “read” images. This is helpful when reviewing visual content for compliance, given that regulators insist that disclosures must be “conspicuous” and located near the claim to which they relate.

GenAI is incorporating large vision models (LVMs) that can process images that contain text and layout. LVMs accept images as input prompts and identify elements. Multimodal evaluations can evaluate three critical compliance factors relating to how content is displayed:

  1. Placement
  2. Proximity
  3. Prominence
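
The sketch below suggests how those three factors might be framed as a review prompt for a vision-capable model. The `review_ad_layout` helper and the model callable are placeholders rather than any vendor’s actual API; the prompt text simply makes placement, proximity, and prominence explicit review criteria.

```python
# Sketch of a layout-compliance prompt for a vision-capable LLM.
# `call_vision_model` is a placeholder for whatever multimodal API is available.
LAYOUT_REVIEW_PROMPT = """You are reviewing an advertisement image for disclosure compliance.
For each claim in the image, answer:
1. Placement: is the related disclosure on the same screen or panel as the claim?
2. Proximity: is the disclosure located near the claim it qualifies?
3. Prominence: is the disclosure conspicuous (size, contrast, not buried in fine print)?
Reply with a finding and a brief justification for each factor."""

def review_ad_layout(image_bytes: bytes, call_vision_model) -> str:
    # call_vision_model(prompt, image) wraps whichever vision LLM the organization uses.
    return call_vision_model(LAYOUT_REVIEW_PROMPT, image_bytes)

# Wiring example with a dummy model callable:
dummy_model = lambda prompt, image: "Placement: ok. Proximity: ok. Prominence: too small."
print(review_ad_layout(b"fake-image-bytes", dummy_model))
```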

Two writing tools suggest how GenAI can improve compliance.  The first, the Draft Analyzer from Bloomberg Law, can compare clauses in text. The second, from Writer, shows how GenAI might help teams assess compliance with regulatory standards.

Use Case: Clause comparison

Clauses are the atomic units of content compliance: the most basic units that convey meaning. When read by themselves, clauses don’t always form a complete sentence or a standalone idea. However, they convey a concept that makes a claim about the organization, its products, or what customers can expect.

While structured content management tends to focus on whole chunks of content, such as sentences and paragraphs, compliance staff focus on clauses: phrases within sentences and paragraphs. To an LLM, a clause is a short sequence of tokens.

Clauses carry legal implications. Compliance teams want to verify the incorporation of required clauses and to reuse approved wording.

While the use of certain words or phrases may be forbidden, in other cases, words can be used only in particular circumstances.  Rules exist around when it’s permitted to refer to something as “new” or “free,” for example.  GenAI tools can help writers compare their proposed language with examples of approved usage.

Giving writers a pre-compliance vetting of their draft. Bloomberg Law has created a generative AI plugin called Draft Analyzer that works inside Microsoft Word. While the product is geared toward lawyers drafting long-form contracts, its technology principles are relevant to anyone who drafts content that requires compliance review.

Draft Analyzer provides “semantic analysis tools” to “identify and flag potential risks and obligations.”   It looks for:

  • Obligations (what’s promised)
  • Dates (when obligations are effective)
  • Trigger language (under what circumstances the obligation is effective)

For clauses of interest, the tool compares the text to other examples, known as “precedents.” Precedents are examples of similar language drawn from prior documents within an organization or from “market standard” language used by other organizations. It can even generate a composite standard example based on language your organization has used previously. Precedents serve as a “benchmark” for comparing draft text with conforming examples.

Importantly, writers can compare draft clauses with multiple precedents since the words needed may not match exactly with any single example. Bloomberg Law notes: “When you run Draft Analyzer over your text, it presents the Most Common and Closest Match clusters of linguistically similar paragraphs.”  By showing examples based on both similarity and salience, writers can see if what they want to write deviates from norms or is simply less commonly written.

Bloomberg Law cites four benefits of its tool. It can:

  • Reveal how “standard” some language is.
  • Reveal if language is uncommon with few or no source documents and thus a unique expression of a message.
  • Promote learning by allowing writers to review similar wording used in precedents, enabling them to draft new text that avoids weaknesses and includes strengths.
  • Spot “missing” language, especially when precedents include language not included in the draft. 

While clauses often deal with future promises, other statements that must be reviewed by compliance teams relate to factual claims. Teams need to check whether the statements made are true. 

Use Case: Claims checking

Organizations want to put a positive spin on what they’ve done and what they offer. But sometimes they make claims that are debatable or even false.

Writers need to be aware of when they make a contestable claim and whether they offer proof to support such claims.

For example, how can a drug maker use the phrase “drug of choice”? The FDA notes: “The phrase ‘drug of choice,’ or any similar phrase or presentation, used in an advertisement or promotional labeling would make a superiority claim and, therefore, the advertisement or promotional labeling would require evidence to support that claim.” 

The phrase “drug of choice” may seem like a rhetorical device to a writer, but to a compliance officer, it represents a factual claim. Rhetorical phrases often don’t stand out as factual claims because they are used widely and casually. Fortunately, GenAI can help check for the presence of claims in text.

Using GenAI to spot factual claims. The development of AI fact-checking techniques has been motivated by the need to see where generative AI may have introduced misinformation or hallucinations. These techniques can also be applied to human-written content.

The discipline of prompt engineering has developed a prompt that can check if statements make claims that should be factually verified.  The prompt is known as the “Fact Check List Pattern.”  A team at Vanderbilt University describes the pattern as a way to “generate a set of facts that are contained in the output.” They note: “The user may have expertise in some topics related to the question but not others. The fact check list can be tailored to topics that the user is not as experienced in or where there is the most risk.” They add: “The Fact Check List pattern should be employed whenever users are not experts in the domain for which they are generating output.”  

The fact check list pattern helps writers identify risky claims, especially ones about issues for which they aren’t experts.
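
Here is a hedged sketch of the pattern adapted for reviewing a human-written draft. The prompt wording is illustrative, not the Vanderbilt team’s exact phrasing, and the draft sentence is invented.

```python
# Sketch of a "Fact Check List" style prompt applied to a human-written draft.
# The instruction wording is illustrative; adapt it to your own LLM and policies.
FACT_CHECK_PROMPT = """Read the draft below. Then list every factual claim it makes
that a regulator or reader could dispute, one claim per line. Focus on claims about
product performance, comparisons ("drug of choice", "best", "fastest"), numbers,
and promised outcomes. Do not rewrite the draft; only list the claims to verify.

DRAFT:
{draft}
"""

draft = "Acme Relief is the drug of choice for seasonal allergies and works twice as fast as other brands."

prompt = FACT_CHECK_PROMPT.format(draft=draft)
# `prompt` would then be sent to whichever LLM the team uses; the returned list
# gives writers and reviewers a checklist of claims needing evidence.
print(prompt)
```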

The fact check list pattern is implemented in a commercial tool from the firm Writer. The firm states that its product “eliminates [the] risk of ‘plausible BS’ in highly regulated industries” and “ensures accuracy with fact checks on every claim.”

Screenshot of Writer screen
Writer functionality evaluating claims in an ad image. Source: VentureBeat

Writer illustrates claim checking with a multimodal example, where a “vision LLM” assesses visual images such as pharmaceutical ads. The LLM can assess the text in the ad and determine if it is making a claim. 

GenAI’s role as a support tool

Generative AI doesn’t replace writers or compliance reviewers.  But it can help make the process smoother and faster for all by spotting issues early in the process and accelerating the development of compliant copy.

While GenAI won’t write compliant copy, it can be used to rewrite copy to make it more compliant. Writer advertises that its tool can allow users to transform copy and “rewrite in a way that’s consistent with an act” such as the Military Lending Act.

While Regulatory Technology tools (RegTech) have been around for a few years now, we are in the early days of using GenAI to support compliance. Because of compliance’s importance, we may see options emerge targeting specific industries. 

Screenshot Federal Register formats menu
Formats for Federal Register notices

It’s encouraging that regulators and their publishers, such as the Federal Register in the US, provide regulations in developer-friendly formats such as JSON or XML. The same is happening in the EU. This open access will encourage the development of more applications.

– Michael Andrews