★ Research Paper ★

Using SNOMED CT to Represent Two Interface Terminologies

S. Trent Rosenbloom , Steven H. Brown , David Froehling , Brent A. Bauer , Dietlind L. Wahner-Roedler , William M. Gregg , Peter L. Elkin
DOI: http://dx.doi.org/10.1197/jamia.M2694 81-88 First published online: 1 January 2009


Objective: Interface terminologies are designed to support interactions between humans and structured medical information. In particular, many interface terminologies have been developed for structured computer based documentation systems. Experts and policy-makers have recommended that interface terminologies be mapped to reference terminologies. The goal of the current study was to evaluate how well the reference terminology SNOMED CT could map to and represent two interface terminologies, MEDCIN and the Categorical Health Information Structured Lexicon (CHISL).

Design: Automated mappings between SNOMED CT and 500 terms from each of the two interface terminologies were evaluated by human reviewers, who also searched SNOMED CT to identify better mappings when this was judged to be necessary. Reviewers judged whether they believed the interface terms to be clinically appropriate, whether the terms were covered by SNOMED CT concepts and whether the terms' implied semantic structure could be represented by SNOMED CT.

Measurements: Outcomes included concept coverage by SNOMED CT for study terms and their implied semantics. Agreement statistics and compositionality measures were calculated.

Results: The SNOMED CT terminology contained concepts to represent 92.4% of MEDCIN and 95.9% of CHISL terms. Semantic structures implied by study terms were less well covered, with some complex compositional expressions requiring semantics not present in SNOMED CT. Among sampled terms, those from MEDCIN were more complex than those from CHISL, containing an average 3.8 versus 1.8 atomic concepts respectively, p<0.001.

Conclusion: Our findings support using SNOMED CT to provide standardized representations of information created using these two terminologies, but suggest that enriching SNOMED CT semantics would improve representation of the external terms.


Informatics researchers and developers have created numerous “structured entry” systems in recent years with goals of streamlining clinical documentation and data collection workflows. Structured entry systems are a specialized type of computer based documentation (CBD) system designed to allow healthcare providers simultaneously and efficiently to create both complete clinical notes1–3 and machine-readable data as an automatic by-product of documentation.2,4–7 Structured CBD systems have been used successfully for focused documentation tasks, including for guideline-based pediatric care;2 for documenting radiology reports;8 for guiding the care of patients with rheumatoid arthritis;9 for performing examinations on Veterans requesting compensation and pension for disabilities;10 for recording endoscopy procedure findings;11,12 and for managing chemotherapy treatment and charting,13,14 among many others. Despite these successes, investigators have identified factors reducing their usability and flexibility.6,15–20 In particular, healthcare providers generally prefer to document medical findings, processes and outcomes using vocabularies that are more similar to natural clinical language,19,21 and may eschew documentation systems that seem to constrain their natural mode of expression.

To improve their acceptability to healthcare providers, structured CBD systems may use specialized terminologies containing relatively common clinical terms.5,7 These so-called “interface terminologies” have evolved to fill this role and to facilitate the interaction between healthcare providers and structured CBD systems.5,22–28 The authors previously condensed various interface terminology definitions to produce the following: “systematic collections of clinically oriented phrases (i.e., ‘terms’) aggregated to support clinicians' entry of patient information directly into computer programs, such as clinical documentation (i.e., ‘note capture’) systems or decision support tools.”5 While clinical terminologies in general represent and aggregate the information that makes up a given medical domain's conceptual knowledge and store this information in the form of terms, concept identifiers, and semantic relationships,29–31 specialized clinical terminologies may be designed to be used according to varying functional needs.5,24,27,32–34 Interface terminologies generally consist of a rich set of flexible and colloquial phrases displayed in the graphical or text interfaces of specific computer programs, including for clinical documentation in electronic health record systems,2,16,22,28,35–38,39 text generation,40 problem list entry,17,41–44 and computerized provider order entry with decision support.45–51

Despite their prevalence in structured CBD systems, no single standard interface terminology exists. In contrast, standards have been identified for terminologies meeting other needs, such as for reference terminologies. (Reference terminologies are those terminologies designed to provide exact and complete representations of a given domain's knowledge, including its entities and ideas, and their interrelationships, and are typically optimized to support the storage, retrieval, and classification of clinical data.) For example, in 2003, the United States National Committee on Vital and Health Statistics (NCVHS) and the United States government's multiagency consolidated health informatics (CHI) council recommended a core set of reference terminologies as standards for representing aspects of patient medical record information. The NCVHS selected the standard terminologies on the basis of those which “(1) are required to adequately cover the domain of patient medical record information and (2) meet essential technical criteria to serve as reference terminologies”.52 The terminologies recommended to serve as standards include the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), which was to be used as a reference terminology for “the exchange, aggregation, and analysis of [certain types of] patient medical information.” (Although the NCVHS did not recommend it for this purpose, in some contexts SNOMED CT may itself serve as an interface terminology.) The NCVHS called for commonly used interface terminologies to be mapped to standard reference terminologies rather than identifying one or more interface terminologies to serve as standards. In particular, the NCVHS recommended that the interface terminology MEDCIN be mapped to SNOMED CT.

Motivated by the NCVHS report, the current study's goal was to characterize how well SNOMED CT can represent the concepts and semantic relationships implied by terms from two interface terminologies being actively used by healthcare providers documenting clinical care. These interface terminologies include MEDCIN, which is in use in a number of commercial and United States Department of Defense CBD systems, and the Categorical Health Information Structured Lexicon (CHISL) in use at Vanderbilt University Medical Center in support of a structured clinical documentation system.7,40


To characterize SNOMED CT's coverage for a quasi-random selection (described below) of interface terms taken from MEDCIN and CHISL, the current study applied methods previously described23 to measure the following outcomes: 1) coverage for the concepts implied by terms in the two vocabularies; 2) coverage and characterization of the semantic linkages implied by complex compositional expressions contained in the study terms; and 3) quantification of the complexity of the concepts implied by the study terms, according to the degrees of freedom35 required to represent complex compositional interface terms. Degrees of freedom (DOF) is a statistic that has been defined as the number of atomic concepts contained in a relatively complex concept.35 The methods used to measure these outcomes during the review process are detailed below.


The MEDCIN terminology,53 a terminology containing 215,000 concepts specifically designed to support a structured entry and reporting interface, has been in use for clinical documentation since 1986. It is currently implemented in numerous commercial EHR systems and was licensed by the Department of Defense for the AHLTA EHR system. Initially developed in 1978 by Peter Goltra as a clinical interface terminology overlying a database of clinical findings, MEDCIN has since expanded to include concepts from clinical histories, physical examination, tests, diagnoses and therapies to enable coding of complete patient encounters. The MEDCIN concepts are pre-coordinated with the goal of allowing “clinically precise phrasing”53 while preventing nonsensical compositions and concept modification. Concepts are arranged consistently in multiple hierarchies, have meaningless permanent identifiers, associated normal and reference values, and are linked to over 600,000 synonyms. The MEDCIN terminology also includes sanctioning logic, called ‘relationships,’ to enable the display of other concepts clinically relevant to the concept that a user is documenting. The MEDCIN concepts have also been encoded to include attributes to support prose generation from categorically entered data. The MEDCIN terminology is currently linked to other terminologies, including CPT-4, ICD-9, ICD-10 and DSM-IV.

The Categorical Health Information Structured Lexicon (CHISL) is an interface terminology designed to support structured documentation of clinical encounters between healthcare providers and patients. The CHISL was derived from a subset of the terminology supporting the INTERNIST-1® and the Quick Medical Reference® (QMR®) diagnostic expert systems developed by Myers, Miller and Masarie,54,55 and CHISL has been under development and in use at Vanderbilt University Medical Center since 1999. The CHISL encodes commonly documented concepts from the history and physical examination sections in clinical notes. Concepts in CHISL are generally partially pre-coordinated but allow further post-coordination using modifiers from sanctioned lists, much like the generic findings in QMR.56 In this way, concepts may only be further detailed using modifiers approved for use with that concept; sanctioned lists are created concept by concept. In addition, a given concept may be represented using any synonymous terms according to the user's preference. All concepts, linkages to synonyms and relationships with sanctioned modifiers are encoded in the CHISL terminology. The CHISL has been used in general internal medicine, cardiology, emergency room triage, neurology and cardiothoracic surgery to document inpatient, outpatient, and postoperative care since 2000. It is used to generate an average of 150 physician notes per day.

Review Process

Reusing a method from prior investigations,57–59 the authors applied automated processes using the Mayo Clinic's multi-threaded Clinical Vocabulary Server60 (MCVS) to link the MEDCIN and CHISL interface terminologies to SNOMED CT. The MCVS provides automatic concept and semantic linkage mappings between clinical terminologies and from free-text phrases to clinical terminologies. For the current study, the MCVS software was then programmed to present a random set of automatic mappings from the two interface terminologies to trained reviewers for evaluation. For both MEDCIN and CHISL, the MCVS presented 500 mappings between interface terms and reference concepts for review. A total of four board certified Internists were recruited as reviewers for the current study. Two of the four Internist reviewers evaluated each mapping between an interface term and SNOMED CT concepts (i.e., reviewers evaluated 500 interface terms each), with disagreements adjudicated by a third reviewer. All reviewers were blinded to others' reviews.

In the first step of the evaluation, reviewers judged whether SNOMED CT correctly covered the concepts expressed by terms from MEDCIN and CHISL according to a process that has been validated and applied in prior studies.57–59 With this method, reviewers judged SNOMED CT as being “positive” when it contained the concepts represented by the study terms, and “negative” when it did not. Reviewers then qualified mappings as “true” or “false” based on whether SNOMED CT should be expected to contain the study term (e.g., SNOMED CT would not be expected to contain an ambiguous or incomplete term, so a reviewer would term the rating as a “true negative” when it did not contain that term). In this way, reviewers judged SNOMED CT's coverage as being: 1) “True Positive” when an appropriate SNOMED CT concept was found to represent the term; 2) “False Positive” when they believed the term to be ambiguous or incomplete, and yet SNOMED CT contained a concept which matched the term; 3) “False Negative” when SNOMED CT could not represent an appropriate term; and 4) “True Negative” when no SNOMED CT concept could be found to represent what the reviewers believed to be an ambiguous or incomplete term. Potentially ambiguous or incomplete terms were evaluated because they were contained in the terminologies under investigation; examples are provided in the results section. The reviewer script for this step is provided in Figure 1. To determine whether terms were contained in SNOMED CT, reviewers searched the MCVS for any concept that correctly represented the concepts expressed by the term. The term was considered as present in SNOMED CT if reviewers were able to find or to compose it. Reviewers were instructed to identify accurate concept-level matches, regardless of whether SNOMED CT contained the precise term itself.
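The four-way rating described above combines two independent reviewer judgments: whether the interface term is clinically appropriate (complete and unambiguous), and whether SNOMED CT contains or can compose the concept. A minimal Python sketch of that decision logic follows; the names `Rating` and `rate_mapping` are illustrative assumptions and are not taken from the MCVS review software.

```python
from enum import Enum


class Rating(Enum):
    TRUE_POSITIVE = "TP"
    FALSE_POSITIVE = "FP"
    FALSE_NEGATIVE = "FN"
    TRUE_NEGATIVE = "TN"


def rate_mapping(term_is_appropriate: bool, snomed_covers_term: bool) -> Rating:
    """Combine the two reviewer judgments into one of the four ratings.

    term_is_appropriate: the reviewer judged the interface term to be a
        complete, unambiguous clinical expression.
    snomed_covers_term: the reviewer found or composed a matching
        SNOMED CT concept.
    """
    if term_is_appropriate:
        # Appropriate terms are expected to be covered.
        return Rating.TRUE_POSITIVE if snomed_covers_term else Rating.FALSE_NEGATIVE
    # Ambiguous or incomplete terms are not expected to be covered.
    return Rating.FALSE_POSITIVE if snomed_covers_term else Rating.TRUE_NEGATIVE
```

Under this reading, only the appropriate terms (true positives and false negatives) enter the coverage calculation, which is why ambiguous terms do not count against SNOMED CT.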

Figure 1

Reviewers' script for evaluating term mappings.

When judging whether SNOMED CT concepts could represent the sample MEDCIN and CHISL terms, reviewers were permitted to use post-coordination. For all study terms requiring a compositional SNOMED CT expression, the review software counted degrees of freedom (DOF).35 Campbell operationally defined DOF as a statistic measuring the number of atomic concepts used to compose complex compositional expressions. He defined atomic concepts as being the most general concepts from a reference terminology that could be used to compose a more complex concept in a mapped interface terminology.35 Using this definition, evaluators can calculate the degrees of freedom for each concept in an interface terminology by mapping them to a relatively granular reference terminology (such as mapping concepts in MEDCIN to SNOMED CT).23
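As a rough illustration of the DOF statistic, the Python sketch below counts the atomic SNOMED CT concepts in two mappings reported in the Results. It assumes that each concept occurrence in a composition counts once toward DOF; the function name `degrees_of_freedom` and the list-of-pairs representation are hypothetical, not the review software's data model.

```python
from statistics import mean

# Each mapped interface term is represented as a list of
# (SNOMED CT concept ID, preferred term) pairs.
pony_cart_accident = [
    (3997000, "Pony (organism)"),
    (85455005, "Cart, device"),
    (55566008, "Accidental physical contact"),
]
allergy_to_chocolate = [
    (300912001, "Chocolate allergy"),
]


def degrees_of_freedom(expression):
    """Number of atomic concepts used to compose the expression.

    Assumption: every concept occurrence counts once.
    """
    return len(expression)


dof_values = [
    degrees_of_freedom(pony_cart_accident),
    degrees_of_freedom(allergy_to_chocolate),
]
mean_dof = mean(dof_values)  # average complexity over the mapped terms
```

A term that matches a single SNOMED CT concept has a DOF of one, and the per-terminology averages reported in this study are means of such counts over the sampled terms.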

In a subsequent step, reviewers characterized which semantic linkages were required to build the compositional expressions to represent the study terms. Reviewers could choose from among the seventeen semantic linkages available in the MCVS, according to methods used in prior studies.61–64 The semantic linkages contained in the MCVS were originally selected from all those available in SNOMED CT, based on the judgment by Mayo investigators that they were non-overlapping.61 Mayo investigators supplemented the seventeen semantic linkages selected from SNOMED CT with two additional non-overlapping linkages to improve modeling flexibility. The seventeen SNOMED CT and two Mayo semantic linkages used for the current study are listed in Table 1.

Table 1

Semantic Linkages Available to the MCVS and Study Reviewers, from SNOMED CT

Semantic ID | Semantic Name
  • MCVS = multi-threaded clinical vocabulary server; SNOMED CT = systematized nomenclature of medicine clinical terms.

  • * Semantic linkages added to MCVS to complement those from SNOMED CT.

There were three possible outcomes of the attempt to represent a concept implied by a study term in SNOMED CT. In the first case, a term could exactly match a single SNOMED CT concept, and no composition or semantic linkage was necessary. In the second case, a term could be completely matched to a composition derived from SNOMED CT concepts, and the reviewers found the necessary semantic linkages to model the compositional expression. In the third case, reviewers could identify multiple concepts from SNOMED CT to represent relatively complex concepts implied by single terms, but could not find all the semantic linkages necessary to provide a complete composition. The rates of these three outcomes among mapped terms were calculated, as judged by two of the Internist reviewers working independently, with a third reviewer adjudicating any disagreement.
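The three outcomes can be expressed as a small decision rule. The sketch below is an assumption-laden illustration: `SemanticOutcome` and `semantic_outcome` are hypothetical names, and the category labels are borrowed from Table 4 rather than from the review software.

```python
from enum import Enum


class SemanticOutcome(Enum):
    NO_LINKAGES_NEEDED = "No semantic linkages needed"
    LINKAGES_PRESENT = "Necessary semantic linkages present"
    LINKAGES_INCOMPLETE = "Necessary semantic linkages incomplete"


def semantic_outcome(n_concepts: int, all_linkages_found: bool) -> SemanticOutcome:
    """Classify a covered term by how its SNOMED CT representation was built.

    n_concepts: number of SNOMED CT concepts used to represent the term.
    all_linkages_found: True when reviewers found every semantic linkage
        needed to join those concepts into a complete composition.
    """
    if n_concepts < 1:
        raise ValueError("a covered term maps to at least one concept")
    if n_concepts == 1:
        # Exact match to a single concept: nothing to compose.
        return SemanticOutcome.NO_LINKAGES_NEEDED
    if all_linkages_found:
        return SemanticOutcome.LINKAGES_PRESENT
    return SemanticOutcome.LINKAGES_INCOMPLETE
```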

Statistical Analysis

For all reviews, investigators derived descriptive statistics, including total reviews categorized as true positives, true negatives, false positives, false negatives, and the average degrees of freedom contained in compositional expressions. Coverage by SNOMED CT for the terms contained in the study sample was calculated from these statistics as a sensitivity, according to methods used in prior studies.57–59 Specifically, sensitivity was calculated as the rate of true positives out of the sum of true positives and false negatives (i.e., the rate at which SNOMED CT could represent the interface term out of all cases in which the interface term was judged to be a complete and unambiguous medical expression). Comparisons of means were performed using t-tests, and comparisons of proportions using Chi-squared tests. Agreement was measured as positive agreement, which is the percentage of mappings on which both reviewers agreed out of all mappings presented for review.
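The coverage and agreement statistics defined above reduce to simple ratios. The following Python sketch applies them to the counts reported in Table 2; the specificity formula TN / (TN + FP) is the standard definition and is assumed here because the Methods do not state it, and the two reviewer rating lists are invented purely to illustrate positive agreement.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Coverage: true positives over all appropriate terms (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Standard definition (assumed): TN over all ambiguous terms (TN + FP)."""
    return tn / (tn + fp)


def positive_agreement(ratings_a, ratings_b) -> float:
    """Share of mappings on which two reviewers gave the same rating."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both reviewers must rate the same mappings")
    agreed = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return agreed / len(ratings_a)


# Counts as reported in Table 2.
medcin_sensitivity = sensitivity(tp=438, fn=36)
chisl_sensitivity = sensitivity(tp=476, fn=20)
medcin_specificity = specificity(tn=21, fp=5)
chisl_specificity = specificity(tn=3, fp=1)

# Hypothetical ratings from two reviewers for four mappings.
reviewer_1 = ["TP", "TP", "FN", "TN"]
reviewer_2 = ["TP", "FN", "FN", "TN"]
example_agreement = positive_agreement(reviewer_1, reviewer_2)
```

Computed this way, the ratios agree with the percentages in Table 2 to within about a tenth of a percentage point.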


The MCVS automatically mapped a total of 250 unique terms randomly selected from the history subtree of MEDCIN, 250 randomly selected from the exam subtree of MEDCIN, and 500 randomly selected from the history and exam subtree of CHISL to SNOMED CT and presented the mappings for review. Terms were selected independently of the frequency with which they are used in corresponding structured entry applications. Four reviewers completed the entire review, with two evaluating each term. Reviewers had an initial positive agreement of 68.8% for MEDCIN and 86.8% for CHISL term mappings to SNOMED CT concepts. With semantic categorizations, reviewers had an initial positive agreement of 68.8% for MEDCIN and 96.4% for CHISL. Upon arbitration from the third reviewer, agreement reached 100% for each.


Overall, reviewers found that SNOMED CT concepts could cover 92.4% of the terms in the MEDCIN sample and 95.9% in the CHISL sample. Complete coverage statistics are presented in Table 2. Examples of covered terms follow, with the SNOMED CT concepts represented as: (concept ID number, concept preferred term). Reviewers mapped MEDCIN's “pony cart accident” to a compositional expression using the SNOMED CT concepts (3997000, Pony (organism)), (85455005, Cart, device) and (55566008, Accidental physical contact). The MEDCIN term “allergy to chocolate” was represented by the single SNOMED CT concept (300912001, Chocolate allergy). The CHISL contained the term, “right upper extremity blood pressure diastolic quantitative”, which was covered by the compositional expression in SNOMED CT made up of the concepts (271650006, Diastolic blood pressure), (30766002, Quantitative) and (6921000, Right upper extremity structure). Reviewers represented the CHISL term “heart apical impulse character—size” with the SNOMED CT concepts (302509004, Entire heart), (248656000, Character of apex beat) and (246115007, Size). In another case, reviewers selected the SNOMED CT concept (44169009, Loss of sense of smell) to cover the CHISL term “anosphrasia” [sic], even though SNOMED CT did not include this synonymous term.

Table 2

Statistics for How Well SNOMED CT Could Represent the Interface Terms in the Study Sample, Regardless of Semantics

Terminology | N | TP | TN | FP | FN | Sensitivity, % | Specificity, %
MEDCIN | 500 | 438 | 21 | 5 | 36 | 92.4 (89.6, 94.6) | 80.7 (60.6, 93.4)
CHISL | 500 | 476 | 3 | 1 | 20 | 95.9 (93.8, 97.5) | 75.0 (19.4, 99.3)
  • CHISL = categorical health information structured lexicon.

  • N is the number of terms in the study sample. TP is the number of true positives, TN is true negative, FP is false positive and FN is false negative, as defined in the Methods section. Traditional coverage measures are here reported as Sensitivity. Sensitivity and specificity are reported with 95% confidence intervals.

As indicated by Table 2, SNOMED CT concepts could not represent all sampled MEDCIN and CHISL terms. An example MEDCIN term that SNOMED CT could not represent was “witness to violent trauma military event”. While SNOMED CT contains the concepts (417746004, traumatic injury) and (272379006, event), it did not contain concepts to cover the components “witness” and “military event,” and reviewers considered the concept (417746004, traumatic injury) an inexact match. Likewise, reviewers found that the MEDCIN term “living with stepsister” was not adequately covered by SNOMED CT, which contained (46363003, Step sister) but had no representation for the concept of living with a person. Reviewers found the SNOMED CT concept (365508006, Finding of residence and accommodation circumstances), which may have the same meaning, but which also may be more general than is implied by the MEDCIN term “living;” this was not clear to reviewers. This candidate SNOMED CT concept had 189 child concepts including one having a meaning that contradicts the MEDCIN term, (105529008, Lives alone) but had none explicitly covering “living with a person.” Reviewers rated this as a non-covered interface term. In another example, CHISL contained the term “slapping gait,” which was not covered by any of the 90 SNOMED CT concepts that were descendants of (22325002, Gait problem).

Reviewers observed that many CHISL terms also contained in their name the method used to acquire the finding. For example, the CHISL term “tremor observed” included the term “observed,” which explicitly indicated that the finding was obtained during the observation phase of a physical examination, as opposed to during palpation in a physical exam or by history-taking. Other methods of acquisition contained in CHISL terms included “auscultated” and “elicited,” none of which were contained in SNOMED CT. As a result, while SNOMED CT included (26079004, Tremor), it could not completely cover the CHISL term (tremor observed). Examples of interface terms that were not covered by SNOMED CT are presented in Table 3.

Table 3

Examples of Interface Terms Not Fully Covered by SNOMED CT, Categorized by Whether the Term is Missing, the Concept is Missing or the Necessary Semantic Linkages are Missing in SNOMED CT

Interface Term | Source | Category
Witness to violent trauma military event | MEDCIN | Inadequate concept coverage
Living with stepsister | MEDCIN | Inadequate concept coverage
Chronic emotional stress from broken home | MEDCIN | Adequate concept coverage, missing semantics
The patient collapsed while holding the head | MEDCIN | Inadequate concept coverage, missing semantics
Slapping gait | CHISL | Inadequate concept coverage
Femoral artery systolic pressure over 40 mmHg above brachial systolic pressure | CHISL | Adequate concept coverage, missing semantics
Pouting of the lips with pressure on the lips | CHISL | Adequate concept coverage, missing semantics
  • CHISL = categorical health information structured lexicon; SNOMED CT = systematized nomenclature of medicine clinical terms.

The reviewers found a total of 26 incomplete or ambiguous terms in MEDCIN and 4 in CHISL. These were classified as false positives or true negatives in Table 2. For example, the study sample from the terminology MEDCIN contained the term, “Meal Prep/Cleanup Prepare/Serve Food Moderate Assistance.” In this example, the authors speculate that reviewers judged the term to be ambiguous, perhaps because the term component “Serve Food Moderate Assistance” did not clearly express “Patient is able to serve food with only moderate assistance.” Reviewers also rated the CHISL term “Nonfluency” as incomplete or ambiguous.

In one case, reviewers did not find a SNOMED CT concept or compositional expression to cover a study interface term, even though an appropriate concept existed. None of the reviewers found a concept in SNOMED CT to cover the CHISL interface term “hysterical dysbasia,” which can be defined as a gait abnormality due to hysterics. While SNOMED CT did not contain the term “dysbasia,” it could represent it using the synonym “difficulty walking” in at least two ways: through the composition of (228158008, difficulty walking) and (39638009, perception hysteria), or (228158008, difficulty walking) and (44376007, hysteria).

Semantic Linkages

The review software and reviewers identified numerous semantic linkages from the SNOMED CT subset used in the current study to represent those implied by the complex interface terms. Table 4 reports the rates at which covered study interface terms required semantic linkages, and how often the necessary linkages were present. The most commonly used semantic linkages included “HAS_FINDING_SITE”, “HAS_LATERALITY”, “HAS_PROCEDURE_SITE”, “HAS_CAUSATIVE_AGENT” and “HAS_SPECIMEN”. Reviewers also found the two non-SNOMED CT semantic linkages previously added to MCVS, “IS_MODIFIED_BY” and “IS_QUALIFIED_BY”, to be implied by sample MEDCIN and CHISL interface terms.

Table 4

Categories of Semantic Coverage for Interface Terms Covered by SNOMED CT

Semantic Coverage Category | MEDCIN | CHISL
No semantic linkages needed | 14 (3.2%) | 232 (49.1%)
Necessary semantic linkages present | 241 (55.1%) | 80 (16.9%)
Necessary semantic linkages incomplete | 182 (41.6%) | 160 (33.9%)
  • CHISL = categorical health information structured lexicon; SNOMED CT = systematized nomenclature of medicine clinical terms.

  • p<0.001 for table.
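The comparison of proportions in Table 4 can be checked with a Pearson Chi-squared test of independence on the 3 × 2 table of counts. The Python sketch below is not the authors' analysis code; it is one plausible way to carry out such a test. Note that the statistical degrees of freedom of the test, (rows − 1) × (columns − 1), are unrelated to the DOF complexity measure used elsewhere in this paper. With exactly two degrees of freedom, the Chi-squared upper-tail probability has the closed form exp(−x/2), so no statistics library is required.

```python
import math

# Counts from Table 4: rows are semantic coverage categories,
# columns are MEDCIN and CHISL.
observed = [
    [14, 232],   # no semantic linkages needed
    [241, 80],   # necessary semantic linkages present
    [182, 160],  # necessary semantic linkages incomplete
]


def chi_squared_statistic(table):
    """Pearson Chi-squared statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


statistic, dof = chi_squared_statistic(observed)

# Closed-form upper-tail probability, valid only when dof == 2.
p_value = math.exp(-statistic / 2)
```

The resulting statistic is far above 13.82, the critical value for p = 0.001 at two degrees of freedom, which is consistent with the p<0.001 reported for the table.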


Study terms from MEDCIN and CHISL had varying complexity in terms of the number of concepts they represented. MEDCIN terms represented a range of one to nine atomic concepts each, while CHISL terms represented a range of one to ten concepts each. Among sampled interface terms, those from MEDCIN had a higher mean degrees of freedom than those from CHISL (3.8 versus 1.8 respectively, p<0.001).


The primary goal for interface terminologies is to support the interaction between clinical users and structured representations of medical data, often by serving to support structured documentation into electronic health record systems.5 Terminological attributes that maximize the efficiency of data entry may make interface terminologies more usable. Such attributes include a rich synonymy, a level of detail that matches the natural language common to relevant biomedical discourse and a balance between pre-coordination and post-coordination which facilitates searching for or composing terms when needed. However, these attributes may make the interface less able to provide formal knowledge representation required for clinical data exchange, aggregation, and analysis.27 In contrast, reference terminologies are typically designed to provide a formal representation of medical knowledge and may be used to provide ontologic rigor and standardization to interface terminologies. A reference terminology would be expected to be able to represent a given interface terminology only if the two covered the same knowledge domain.

The current study characterized how well the concepts and semantic structures available in the reference terminology SNOMED CT could represent the concepts underlying a random selection of terms from two interface terminologies that have been used in clinical practice. The reviewers in this study found that SNOMED CT had excellent coverage for the concepts underlying the sample interface terms evaluated, with better coverage for those from CHISL than from MEDCIN. In the sample, the interface terms from MEDCIN were more complex in terms both of average degrees of freedom and implied semantics. That SNOMED CT concepts covered the study interface terms generally supports the NCVHS' call to map commonly used interface terminologies to standard reference terminologies such as SNOMED CT.

While coverage by SNOMED CT for the two interface terminologies CHISL and MEDCIN was high in the current study, it may actually be higher than measured by the reviewers. As in the example above, some interface terms did not match SNOMED CT terms, even though the underlying concept could be represented. Interface terminology users may benefit from having access to a rich synonymy5 that allows them to represent clinical entities using the words or phrases that they prefer. A rich synonymy can improve the efficiency of searching a terminology for a needed term and may enhance the expressivity and accuracy of a document coded using the terminology. In the above example, reviewers did not find a concept to represent “hysterical dysbasia” even though it could be composed in SNOMED CT. If SNOMED CT is used to represent commonly used interface terminologies for data storage and aggregation rather than at the human-terminology interface, missing synonyms may not be a problem.

Overall concept coverage statistics in the current study were calculated without requiring that all necessary semantic linkages be present when complex interface terms required compositional expressions in SNOMED CT. When mapping terminologies, requiring that terms be mapped using both correct concepts and complete semantics may improve how completely and correctly the knowledge implied by interface terms is formally represented by the mapped reference terminology. However, requiring that semantics be used for all compositional expressions may reduce the number of interface terms that can be completely represented by a reference terminology and would increase the complexity of the mapping task. In the current study, such a requirement would have decreased coverage by up to 41% for MEDCIN and 33% for CHISL. For example, the study sample included the complex CHISL term above, “femoral artery systolic pressure over 40 mmHg above brachial systolic pressure,” which is synonymous with “Hill's sign,” and is not represented directly in SNOMED CT. Reviewers represented this concept using four concepts and two qualifiers from SNOMED CT, “Entire femoral artery,” “Systolic blood pressure,” “Over,” “mmHG,” “Entire brachial artery,” and “Systolic blood pressure.” The review software and human reviewers did not specify any semantic linkages for structuring the complex CHISL interface term, although it may be reasonable to construct the following semantic structure from SNOMED CT: [(271649006, Systolic blood pressure) HAS_FINDING_SITE (244332003, Entire femoral artery)] and [(271649006, Systolic blood pressure) HAS_FINDING_SITE (181322008, Entire brachial artery)]. An additional semantic linkage would need to be added to describe the relationship between the femoral and brachial artery blood pressures implied by this interface term. Without the added semantic linkage, the reference terminology representation for this interface term would be incomplete.
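To make the gap concrete, the two bracketed structures above can be written as concept–linkage–concept triples, and the linkages a term requires can be compared with those available. The Python sketch below is illustrative only: the `Concept` and `Triple` classes are hypothetical, `KNOWN_LINKAGES` lists only the linkages named in this paper rather than the full set in Table 1, and `EXCEEDS_BY` is an invented label standing in for the additional linkage that would be needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    concept_id: int
    preferred_term: str


@dataclass(frozen=True)
class Triple:
    subject: Concept
    linkage: str
    value: Concept


systolic_bp = Concept(271649006, "Systolic blood pressure")
femoral_artery = Concept(244332003, "Entire femoral artery")
brachial_artery = Concept(181322008, "Entire brachial artery")

# The two structures suggested in the text.
femoral_pressure = Triple(systolic_bp, "HAS_FINDING_SITE", femoral_artery)
brachial_pressure = Triple(systolic_bp, "HAS_FINDING_SITE", brachial_artery)

# Partial list: only the linkages named in this paper.
KNOWN_LINKAGES = {
    "HAS_FINDING_SITE", "HAS_LATERALITY", "HAS_PROCEDURE_SITE",
    "HAS_CAUSATIVE_AGENT", "HAS_SPECIMEN",
    "IS_MODIFIED_BY", "IS_QUALIFIED_BY",
}


def missing_linkages(required, available):
    """Linkages the term needs that the terminology does not offer."""
    return sorted(set(required) - set(available))


# Hill's sign also needs a relation comparing the two pressures;
# EXCEEDS_BY is a hypothetical name for that missing linkage.
required_for_hills_sign = ["HAS_FINDING_SITE", "EXCEEDS_BY"]
gaps = missing_linkages(required_for_hills_sign, KNOWN_LINKAGES)
```

A non-empty result marks the term as one whose concepts are covered but whose composition remains semantically incomplete.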

The degree to which a complex interface term containing numerous atomic concepts needs complete semantics to be unambiguously modeled by a reference terminology is unclear. For example, reviewers could not fully model the semantic linkages implied by the interface term, “right thumb adduction”, but they could find the underlying concepts. It is likely that modeling the interface term to SNOMED CT concepts and semantic linkages, [(302540006, Entire thumb) HAS_LATERALITY (24028007, Right (qualifier value))] and (11554009, Adduction), is correct and unambiguous, even though not all the necessary semantic linkages were found. The presence of a semantic linkage between [(302540006, Entire thumb) HAS_LATERALITY (24028007, Right (qualifier value))] and (11554009, Adduction) would reduce ambiguity further, but it is not clear to what degree and with what incremental utility. Terminology evaluators should measure the incremental utility in real-world systems of having complete semantic representations of compositional expressions implied by complex interface terms. Likewise, reference terminology developers should enrich the set of available semantic linkages to ensure that interface terms and other natural language phrases can be completely and unambiguously covered.

The percent positive agreement among reviewers differed for the two interface terminologies, with agreement for both concept mapping and semantic categorization tasks lower for MEDCIN than for CHISL. The authors speculate that the lower rates of agreement for MEDCIN relate to two factors. First, in the current study, complexity measures for MEDCIN were higher than they were for CHISL (as above, MEDCIN had 3.8 degrees of freedom, on average, versus 1.8 for CHISL). Second, MEDCIN is a much larger terminology than CHISL and as a result likely contains more specialized terms. These two factors would have increased the difficulty of the reviewers' task of finding the most appropriate concepts for covering the study terms.

The authors have previously speculated that developers may be able to enrich interface terminologies by using reference terminology concepts as a starting point.5 With this approach, interface terms could be created either by selecting them directly from, or by composing them from, the concepts contained in standard reference terminologies such as SNOMED CT. This method would require that developers and knowledge domain experts assemble clinically meaningful compositions, appropriate synonyms and linkages between concepts and related concepts or modifiers. While this approach may be somewhat labor-intensive, it may permit the underlying formal structure provided by the source reference terminology to remain, while simultaneously presenting clinicians with complex and meaningful interface terms. Starting with a single standard reference terminology such as SNOMED CT would permit a uniform back-end representation regardless of the diverse terms that users require.

The current study did not explore how well MEDCIN and CHISL function in their role as interface terminologies when used directly at the human-terminology interface, or the relative frequency with which specific terms in these vocabularies are actually used. Studies of interface terminologies, as the investigators have previously speculated,5 should directly test synonymy, the degree of pre-coordination and ability to support post-coordination, and how well assertional medical knowledge is used to link together related concepts and modifiers. Other attributes, such as a consistent syntactic approach to constructing concepts' preferred terms, including attributes to support natural language generation, independence from the application that uses it and a formal semantic structure, may also enhance interface terminology usability. The investigators are not aware of any studies evaluating the fitness of MEDCIN or CHISL as interface terminologies according to these attributes.


Interface terminologies are designed specifically to support human interaction with structured clinical data, particularly when documenting clinical care in structured computer-based documentation systems. Such terminologies may benefit from being linked to standard reference terminologies such as SNOMED CT. The current study demonstrated that SNOMED CT provided greater than 90% coverage for the concepts contained in random samples of the history and physical examination sections of two interface terminologies currently used to support clinical documentation, but that this coverage may be lower if a formal semantic model were imposed upon the mappings, or higher if the study were repeated using interface terms that are relatively frequently used. These findings support using existing reference terminology concepts, and possibly semantic linkages, for representing interface terminologies.

