Contextual Entity Markup

XML Data Encoding Specification for Contextual Entity Markup

Overview

This XML Data Encoding Specification for Contextual Entity Markup (CEM.XML) defines detailed implementation guidance for the XML encoding of Contextual Entity Markup elements (e.g., country, drug, compound). CEM.XML is used to curate textual content by adding contextual information, for example marking the string “Mary Jane” as a pseudonym for marijuana.

This specification is intended to supplement other specifications, such as PUBS.XML, by enhancing the in-line metadata tagging with contextual meaning that is not part of normal document or metadata markup. The use of such enhancements facilitates search, discovery, and many other enterprise tasks by more clearly tagging intent and disambiguating terms.
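
As a rough illustration of the in-line tagging described above, the fragment below sketches how the “Mary Jane” example might be marked up. Every name in it is a placeholder chosen for this sketch: the `ex` prefix, its namespace URI, and the `para`, `drug`, and `pseudonymFor` names are hypothetical and are not taken from the CEM.XML schema. The Schema Guide remains the authority on the actual entity elements and attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch only: the element, attribute, and namespace names
     below are placeholders, not the names defined by CEM.XML. -->
<ex:para xmlns:ex="urn:example:cem-sketch">
  The report refers to
  <!-- The entity tag wraps the literal text and records what it stands for. -->
  <ex:drug pseudonymFor="marijuana">Mary Jane</ex:drug>
  several times.
</ex:para>
```

With markup of this kind, a search for “marijuana” could match the passage even though the word never appears in the running text, which is the sort of disambiguation the specification is meant to support.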


CEM.XML defines a set of Entity tags that may grow over time as new requirements are presented. The starting set of Entities was extracted from PUBS.XML 2016-SEP; versions of PUBS.XML starting with 2018-APR use CEM.XML in place of the inline markup that PUBS.XML had carried since its inception. The Schema Guide lists all currently defined entities, their attributes, and their definitions.

This specification is maintained by the IC Chief Information Officer via the Data Standards Coordination Activity (DSCA) and the Common Metadata Standards Tiger Team (CMSTT).

Technical Specification Downloads

Latest Approved Public Release:

Mission Requirements

This Data Encoding Specification (DES) is designed to fulfill a number of requirements in support of the transformational efforts of the IC. These requirements include:

  • Capturing descriptive metadata markup
  • Supplementing other specifications with descriptive metadata markup

Both enterprise needs and requirements for this specification can be found in the following policies and implementation guidance:

  • 500 Series:
    • Intelligence Community Directive (ICD) 500, Director of National Intelligence Chief Information Officer
    • Intelligence Community Standard (ICS) 500-21, Tagging of Intelligence and Intelligence-Related Information