Controlled Vocabularies

"A list of standardized terminology, words, or phrases, used for indexing or content analysis and information retrieval, usually in a defined information domain." (CASRAI)

What is it? Why is it important?

A controlled vocabulary is an organized arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching. It typically includes preferred and variant terms and has a defined scope or describes a specific domain. Controlled vocabularies capture the richness of variant terms and promote consistency in preferred terms and the assignment of the same terms to similar content.

Controlled vocabularies are beneficial during indexing because data providers and repositories apply the same term to the same concept (e.g., a person, place, or thing) consistently. This helps with search and discovery of content. Controlled vocabularies also help end users formulate better searches, since they may not know the correct term for a given concept. In fact, the most useful function of controlled vocabularies is to gather variant terms and synonyms for concepts, and to link concepts in a logical order or organize them into categories. Consolidating many different synonyms into one controlled term thus increases the number of useful hits returned by a search.
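The effect of consolidating synonyms can be illustrated with a minimal Python sketch. The vocabulary, record identifiers, and function names below are hypothetical and chosen only for illustration; they are not taken from any COAR vocabulary or repository platform.

```python
from collections import defaultdict

# Hypothetical mini-vocabulary: preferred term -> variant terms.
# These terms are illustrative assumptions, not an official vocabulary.
VOCABULARY = {
    "journal article": ["article", "paper", "research article"],
    "doctoral thesis": ["phd thesis", "dissertation", "doctoral dissertation"],
}


def build_lookup(vocabulary):
    """Map every preferred and variant term (lowercased) to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


LOOKUP = build_lookup(VOCABULARY)


def normalize(term):
    """Return the preferred term for a free-text term, or None if unknown."""
    return LOOKUP.get(term.strip().lower())


def index_records(records):
    """Index record ids under the preferred term of their free-text type."""
    index = defaultdict(list)
    for record_id, free_text_type in records:
        preferred = normalize(free_text_type)
        if preferred is not None:
            index[preferred].append(record_id)
    return index


def search(index, query):
    """Resolve the query to its preferred term, then look up matching records."""
    preferred = normalize(query)
    if preferred is None:
        return []
    return index.get(preferred, [])


# Hypothetical records as (id, type as entered by the depositor).
RECORDS = [
    ("r1", "Paper"),
    ("r2", "journal article"),
    ("r3", "Dissertation"),
    ("r4", "PhD thesis"),
]
INDEX = index_records(RECORDS)
```

With this sketch, a search for "research article" finds both r1 and r2, even though neither record was described with that exact phrase, because the query and both records resolve to the same preferred term.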

In the following presentation, Jochen Schirrwagen, Chair of the COAR Controlled Vocabularies Editorial Board, provides an overview and explains the importance of metadata and vocabularies.

In the following video, Rowan Brownlee presents on how to publish controlled vocabularies. The talk covers an introduction to SKOS, why it was developed, and how it may be used to express the features of a controlled vocabulary; tools for creating, managing, publishing, and accessing vocabularies, including those provided by the Australian National Data Service (ANDS); and vocabulary registry interoperability. Slides can be found at this link.

Brownlee, Rowan. Publishing Controlled Vocabularies for Access and Reuse
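As a rough illustration of how SKOS can express the features of a controlled vocabulary (preferred labels, variant labels, and hierarchy), the following Python sketch serializes one concept as Turtle. The SKOS namespace and the property names `skos:prefLabel`, `skos:altLabel`, and `skos:broader` come from the SKOS standard; the `example.org` URIs, the labels, and the helper function are hypothetical. The sketch does not escape quotation marks inside labels, so it is suitable only for simple label text.

```python
# Standard SKOS namespace declaration in Turtle syntax.
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def concept_to_turtle(uri, pref_labels, alt_labels=(), broader=None):
    """Serialize one SKOS concept as a Turtle statement.

    pref_labels: dict mapping a language tag to the preferred label.
    alt_labels:  iterable of (language tag, variant label) pairs.
    broader:     optional URI of the parent concept.
    """
    lines = [f"<{uri}> a skos:Concept"]
    for lang, label in pref_labels.items():
        lines.append(f'    skos:prefLabel "{label}"@{lang}')
    for lang, label in alt_labels:
        lines.append(f'    skos:altLabel "{label}"@{lang}')
    if broader is not None:
        lines.append(f"    skos:broader <{broader}>")
    # Turtle separates predicates of the same subject with ";" and ends with ".".
    return " ;\n".join(lines) + " ."


# Hypothetical concept with multilingual preferred labels and one variant term.
TURTLE = SKOS_PREFIX + "\n\n" + concept_to_turtle(
    "http://example.org/vocab/thesis",
    pref_labels={"en": "thesis", "es": "tesis"},
    alt_labels=[("en", "dissertation")],
    broader="http://example.org/vocab/text",
)
print(TURTLE)
```

In practice, a library such as rdflib or a dedicated vocabulary editor would normally handle serialization and escaping; the hand-built string here is only meant to show the shape of the data.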

COAR Controlled Vocabularies

COAR develops a set of controlled vocabularies for the bibliographic metadata elements used in records describing research outputs. To define the controlled vocabularies, the COAR Controlled Vocabularies Editorial Board analyzes existing vocabularies and dictionaries and uses the most appropriate existing terms whenever possible. Where there are gaps, the group defines new terms in collaboration with the repository community.

COAR Resource Type Vocabulary: It defines concepts to identify the genre of a resource. Such resources, like publications, research data, audio and video objects, are typically deposited in institutional and thematic repositories or published in ejournals. See version 2.0

COAR Access Rights Vocabulary: It defines concepts to declare the access status of a resource. Multilingual labels account for regional distinctions in language and terminology. See version 1.0

COAR Version Type Vocabulary: It defines concepts to declare the version of a resource. Multilingual labels account for regional distinctions in language and terminology. The concepts are adopted from the “Journal Article Versions (JAV): Recommendations of the NISO/ALPSP JAV Technical Working Group”. See version 1.0 draft

Implementation of vocabularies

The COAR Controlled Vocabularies Editorial Team developed an FAQ to support the community and an Implementation Guide for implementing the vocabularies in repositories.

Carvalho, José, Saraiva, Ricardo, & Torres, Nelson. (2017, May). Prototype implementation of the COAR Resource Type vocabulary in DSpace.

Resources in Other Languages

Bernal, Isabel, & Azrilevich, Paola A. Vocabularios Controlados para Repositorios: Objetivos y Avances del Grupo de Trabajo COAR.