Managing data volume and diversity: the Australian Supersite Network data portal

TERN’s Australian Supersite Network (ASN) collects intensive ecological and biophysical datasets at 10 supersites spread across Australia, in different climate zones and over a wide range of biomes. The incoming data is also very diverse: it is generated by a number of ecological monitoring programs as well as by field-based instruments that continuously collect biophysical data streams. How do you collate data of many different types from different sites so that it is readily available and accessible to users? ASN needed to solve this complex problem to ensure that the ecosystem science community gains the maximum benefit from the investment made in Supersite research infrastructure.

Over the last two years the Institute of Sustainable Research at the Queensland University of Technology has developed the Australian Supersite Network Data Portal. The portal provides a standard interface through which the ecosystem science community can deposit high-intensity, high-resolution data and upload contextual and historic datasets associated with the TERN supersites. The repository behind the portal is Metacat, which is used around the world by ecology and environmental research communities such as the Centre for Tropical Forest Science Large Plot network and DataONE.

Datasets available through the ASN Data Portal can be downloaded immediately and used in accordance with the TERN licences attached to the data. Each dataset has a metadata description that adheres to Ecological Metadata Language (EML), a metadata specification for the ecology discipline. Currently 57 datasets are publicly available in the TERN ASN repository, and can be accessed via the TERN Data Discovery Portal as well as the ASN Data Portal.

The variety and size of the datasets collected present new challenges. For example, the need to store and manage extremely large acoustic datasets collected from automated acoustic monitors in the field has led to the development of a 25 TB (terabyte) storage solution in association with the Queensland Cyber Infrastructure Facility (QCIF) and an annex portal called Bush.FM. Because acoustic data requires special software to store and display, a new portal dedicated to acoustic data delivery had to be developed. The acoustic datasets can be viewed and listened to as sonograms and segments, giving the research community access to both the data and the tools for analysis and reporting. The team is working with other TERN facilities on ways to display live data feeds from field instruments. They are also working on an education portal, which will provide ancillary TERN datasets to improve public engagement and assist the primary, secondary and tertiary students who are starting to take advantage of TERN infrastructure, data and facilities in their studies.

Published in the TERN e-Newsletter October 2012.
