Packaging, transforming and migrating data from a scientific research project to an institutional repository: the SGS LTER Collection
Date
2014-12-10
Authors
Kaplan, Nicole E., author
Baker, Karen S., author
Draper, Daniel C., author
Swauger, Shea, author
SGS-LTER, Colorado State University, publisher
Abstract
This report describes the process of preserving a collection of project-related scientific research materials - data, metadata, and artifacts - produced over 32 years at the Shortgrass Steppe Long Term Ecological Research (SGS LTER) site. The SGS LTER site, located in Nunn, Colorado, operated out of Colorado State University (CSU) and was funded by the National Science Foundation (NSF). Preservation plans were motivated by the 2012 decommissioning announcement for this long-term project (1982-2014) and its local data management system. A two-fold strategy was developed to ensure preservation of, and community access to, the entire collection. In addition to satisfying NSF requirements for submission of data to the LTER Network Information System (LTER NIS), the local information manager identified a second task: creation of a collection including data, metadata and a diverse set of materials that together represent the SGS LTER project as a whole. Migration of the SGS LTER data management system was designated a pilot project for curation of research data within the CSU Institutional Repository, as part of Digital Collections of Colorado (DCC). The SGS LTER collection comprises approximately 5 gigabytes of data and supporting materials. SGS LTER produced close to one hundred datasets: diverse, small files with extensive metadata, well described using the Ecological Metadata Language (EML). These data are largely field-based, geo-located, time-series measurements, which have been integrated longitudinally. Other series of materials prepared for the collection include over 400 image files, 17 Geographic Information System spatial layers, species lists, and proposals and progress reports to NSF. EML from the SGS LTER data management system was transformed to Dublin Core for discovery through the DCC and was used to implement an expanded set of elements important for research data documentation.
A strategy was developed to meet the requirement for programmatic access by machine to data from the LTER NIS via a landing page created for each data package. In effect, data are publicly available and automatically harvested by other data repositories, transforming the SGS LTER collection from existing independently to contributing as part of a federated network of scholarly research. Expansion of the notion of curation from submission of research data to that of creating an interoperable SGS LTER project collection within the DCC revealed new issues and activities to consider. Issues that emerged included design of workflows to create and transform metadata, data exchange between source and secondary repositories, versioning and use of persistent identifiers for digital objects, data citation registries for assessing outcomes of research, and the role of a collection-related information manager. This pilot study was made possible by an interdisciplinary, collaborative effort to preserve data and materials from a historical scientific research project.
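The abstract mentions that EML metadata were transformed to Dublin Core for discovery through the DCC. The report's actual crosswalk is not reproduced in this record, so the following is only a minimal sketch of what such a transformation might look like. The element mapping in CROSSWALK, the function name eml_to_dublin_core, the wrapping record element, and the sample EML document are all assumptions made for illustration, not the mapping used for the SGS LTER collection.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, simplified crosswalk (an assumption, not the project's mapping):
# EML path relative to <dataset> -> Dublin Core element name.
CROSSWALK = [
    ("title", "title"),
    ("creator/individualName/surName", "creator"),
    ("abstract/para", "description"),
    ("keywordSet/keyword", "subject"),
    ("pubDate", "date"),
]

# Invented sample record, used only to exercise the sketch.
SAMPLE_EML = """<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.0"
    packageId="example.1.1" system="example">
  <dataset>
    <title>Example vegetation cover dataset</title>
    <creator><individualName><surName>Doe</surName></individualName></creator>
    <pubDate>2014</pubDate>
    <abstract><para>Placeholder abstract text.</para></abstract>
    <keywordSet>
      <keyword>shortgrass steppe</keyword>
      <keyword>grassland ecology</keyword>
    </keywordSet>
  </dataset>
</eml:eml>"""


def eml_to_dublin_core(eml_xml: str) -> ET.Element:
    """Build a flat Dublin Core record from the <dataset> part of an EML document."""
    root = ET.fromstring(eml_xml)
    dataset = root.find("dataset")  # child elements of eml:eml are unqualified
    if dataset is None:
        raise ValueError("no <dataset> element found in EML document")
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for eml_path, dc_name in CROSSWALK:
        # Repeatable EML elements (e.g. keywords) yield repeated DC elements.
        for node in dataset.findall(eml_path):
            text = (node.text or "").strip()
            if text:
                ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = text
    return record


if __name__ == "__main__":
    print(ET.tostring(eml_to_dublin_core(SAMPLE_EML), encoding="unicode"))
```

A table-driven mapping keeps the crosswalk readable and easy to extend, which matters when, as the abstract notes, an expanded set of elements is needed to document research data beyond the basic Dublin Core fields.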
Description
The SGS-LTER research site was established in 1980 by researchers at Colorado State University as part of a network of long-term research sites within the US LTER Network, supported by the National Science Foundation. Scientists within the Natural Resource Ecology Lab, Department of Forest and Rangeland Stewardship, Department of Soil and Crop Sciences, and Biology Department at CSU, California State Fullerton, USDA Agricultural Research Service, University of Northern Colorado, and the University of Wyoming, among others, have contributed to our understanding of the structure and functions of the shortgrass steppe and other diverse ecosystems across the network while maintaining a common mission and sharing expertise, data and infrastructure.
10 December 2014.
Subject
Pawnee National Grassland
Central Plains Experimental Range
data
artefacts
information infrastructure
access
collaborative team
long term ecological research
shortgrass steppe
grassland ecology