Data Documentation and Metadata

What is Metadata?

Metadata is a term used primarily by the library and archives communities for structured information that describes objects and aids their discovery. A metadata standard is composed of metadata elements, sometimes called metadata fields. Standards make it easier to search for similar items because those items are described with the same terms and constructs. A metadata record consists of all the metadata elements describing one object. Metadata records are often expressed in XML or other machine-readable formats so that they can be integrated easily into other systems.

There are three basic categories of metadata elements: descriptive, technical/structural, and administrative. All objects also have a unique identifier metadata element.

  1. Descriptive metadata elements consist of information about the content and context of an object. For example, descriptive metadata for an image may include: title, creator, subject (tags), and description (abstract).
  2. Technical/structural metadata elements describe the format, process, and inter-relatedness of objects. For example, technical/structural metadata for an image may include: camera, aperture, exposure, file format, and set (if in a series).
  3. Administrative metadata elements describe information needed to manage or use the object. For example, administrative metadata for an image may include: creation date, copyright permissions, required software, provenance (history), and file integrity checks.
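
As a rough illustration of the three categories, the sketch below groups the image metadata elements listed above into one record and serializes it to XML with Python's standard library. The element names, the identifier, and all sample values are hypothetical placeholders and do not follow any particular metadata standard.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical image record; every value here is an illustrative placeholder.
record = {
    "identifier": "img-0001",
    "descriptive": {
        "title": "Example image",
        "creator": "A. Researcher",
        "subject": "example; tags",
        "description": "Short abstract of the image.",
    },
    "technical": {
        "camera": "Example Camera",
        "aperture": "f/2.8",
        "exposure": "1/250",
        "format": "image/jpeg",
    },
    "administrative": {
        "created": "2013-10-30",
        "rights": "Copyright held by creator",
    },
}


def file_checksum(data: bytes) -> str:
    """Return a SHA-256 hex digest usable as a file integrity check."""
    return hashlib.sha256(data).hexdigest()


def to_xml(rec: dict) -> str:
    """Serialize the record to an XML string, one child element per field."""
    root = ET.Element("record", attrib={"id": rec["identifier"]})
    for category in ("descriptive", "technical", "administrative"):
        group = ET.SubElement(root, category)
        for name, value in rec[category].items():
            ET.SubElement(group, name).text = value
    return ET.tostring(root, encoding="unicode")


# Stand-in bytes; a real workflow would read the image file from disk.
record["administrative"]["checksum"] = file_checksum(b"example image bytes")
xml_text = to_xml(record)
print(xml_text)
```

Keeping the unique identifier on the record itself and the three categories as separate groups mirrors the description above; a real repository would dictate its own element names and structure.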

Metadata Guidelines

Data centers and repositories may require that deposited data follow a specific metadata standard. Check with your intended repository before you begin outlining the metadata plan for your data. If you do not know which metadata fields your repository requires, contact Brian Westra.

If a standard has not been defined for your discipline, a good starting place for a metadata plan is Dublin Core or DataCite's recommendations. The UO Libraries Digital Library Initiatives group is happy to help with the instructions for and application of these standards. You may also want to look at the metadata fields used in Dryad or other data repositories to see how other researchers are describing their data.
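
For orientation, here is a minimal sketch of what a simple Dublin Core description of a dataset might look like when written out as XML. The namespace is that of the Dublin Core element set; the wrapper element name `metadata`, the choice of fields, and every value are assumptions made for illustration, not a repository requirement.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)

# Hypothetical dataset description; the values are placeholders.
fields = [
    ("title", "Example survey dataset"),
    ("creator", "Researcher, A."),
    ("subject", "example keyword"),
    ("description", "Responses collected for an illustrative study."),
    ("date", "2013"),
    ("type", "Dataset"),
    ("format", "text/csv"),
    ("identifier", "example-dataset-001"),
    ("rights", "Terms of use set by the repository"),
]


def dublin_core_record(pairs):
    """Build a flat XML record with one dc: element per (name, value) pair."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in pairs:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return root


dc_xml = ET.tostring(dublin_core_record(fields), encoding="unicode")
print(dc_xml)
```

A repository or discipline-specific standard will usually ask for more detail than this, so treat the field list as a starting checklist only.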

If your discipline or repository does not require a specific metadata standard, the UO Libraries Digital Library Initiatives group can advise. The number of hours required to create a metadata plan varies with the complexity of the description. Please meet with Metadata Services and Digital Projects (MSDP) to budget for developing a metadata plan before submitting your grant.

Metadata Best Practices

Good data documentation includes information on:

  • the context of data collection: project history, aims, objectives and hypotheses
  • data collection methods: data collection protocol, sampling design, instruments, hardware and software used, data scale and resolution, temporal coverage and geographic coverage
  • dataset structure: data files, cases, and relationships between files
  • data sources used
  • data validation, checking, proofing, cleaning and other quality assurance procedures carried out
  • modifications made to data over time since their original creation and identification of different versions of datasets
  • information on data confidentiality, access and use conditions, where applicable

At the data level, datasets should also be documented with:

  • names, labels and descriptions for variables, records and their values
  • explanation of codes and classification schemes used
  • codes of, and reasons for, missing values
  • derived data created after collection, with code, algorithm or command file used to create them
  • weighting and grossing variables created
  • data listing with descriptions for cases, individuals or items studied

Variable-level descriptions may be embedded within a dataset itself as metadata. Other documentation may be contained in user guides, reports, publications, working papers and laboratory books. (from UK Data Archive)
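
The data-level items above are often collected in a codebook (data dictionary) stored next to the data file. The following is a minimal sketch under stated assumptions: the variable names, labels, value codes, and the missing-value code -9 are all hypothetical, and the helper functions are illustrative rather than part of any standard tool.

```python
import csv
import io

# Hypothetical codebook for a small survey file; names and codes are placeholders.
codebook = [
    {"variable": "resp_id", "label": "Respondent identifier",
     "values": "", "missing": ""},
    {"variable": "age", "label": "Age in years at interview",
     "values": "", "missing": "-9 = not stated"},
    {"variable": "region", "label": "Region of residence",
     "values": "1 = North; 2 = South", "missing": "-9 = not stated"},
]


def write_codebook(rows) -> str:
    """Write the codebook as CSV text so it can be stored beside the data file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["variable", "label", "values", "missing"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def undocumented(columns, rows):
    """Return the data columns that have no codebook entry."""
    documented = {row["variable"] for row in rows}
    return [name for name in columns if name not in documented]


codebook_csv = write_codebook(codebook)
print(codebook_csv)

# Columns of a hypothetical data file; "weight" has no codebook entry yet.
print(undocumented(["resp_id", "age", "region", "weight"], codebook))
```

Checking the data file's columns against the codebook is one simple way to catch derived or weighting variables that were added after collection but never documented.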

Additional Information

If possible, identify authors and contributors with unique identifiers, using the Open Researcher & Contributor ID (ORCID).

Register public data sets with DataCite. Some repositories do this automatically, so confirm with them.

These are recent recommendations from the JISC Managing Research Data Program. Contact the Data Services Librarian, Brian Westra, for more information.
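
When ORCID identifiers are recorded in metadata, a mistyped identifier can be caught before deposit. The sketch below checks the 16-character format and the final check character, which ORCID computes with the ISO 7064 MOD 11-2 algorithm. The function names are hypothetical, and the identifier in the usage lines is a sample iD commonly used in examples, not a contributor to this guide.

```python
import re

# Four hyphen-separated groups; the last character may be a digit or "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is well formed and its check character matches."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD with a matching check digit
print(is_valid_orcid("0000-0002-1825-0098"))  # last digit altered
```

A passing check only shows that the identifier is well formed; it does not confirm that the iD is registered or belongs to the named person.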

Metadata in Action - Examples

The following are examples of items with metadata highlighted in purple.

Flickr Metadata Example

Fig. 1. Image metadata in Flickr with title, user (creator), creation date, camera used, photostream (group or relation), tags, copyright information, and privacy setting. See item in Flickr, and additional metadata.

Dryad Metadata Example

Fig. 2. Data set in Dryad with title, bibliographic citation of published work, identifier, description, data package identifier, keywords, date deposited, file name, file size, file format, file type, and copyright information. See item in Dryad and full metadata view.

Other places to find metadata in action

Maintained by: Brian Westra, bwestra@uoregon.edu
Created by bwestra on Jul 24, 2012 Last updated Oct 30, 2013