Over 160TB of data are currently openly available via TERN, or 175TB if you include data held by our NCRIS partner, the Atlas of Living Australia. While the mind boggles at the science possible with such amounts of data, this volume also presents some major challenges for our users.
At the top of many researchers’ list of dilemmas is finding effective ways of managing and working with datasets that are simply too big for personal computers. Recognising the needs of researchers working in the current big data era, TERN is developing Cloud and Virtual Laboratory infrastructure that helps overcome this and many other big data problems.
The landmark collaborative online infrastructure project ecocloud is a central part of this infrastructure. Launched in October 2018, it gives researchers easy access to big data and tools using Cloud and Virtual Laboratory technology. Lift the hood of ecocloud and you’ll see the TERN CoESRA Virtual Desktop, a key, stand-alone, big data tool developed by TERN.
TERN CoESRA Virtual Desktop provides researchers with a free, portable and powerful computational environment to run experiments and share their work. It provides popular tools like RStudio, Canopy, Kepler Scientific Workflow, KNIME, QGIS, Panoply and OpenRefine to enable users to create, execute and share data simulations, visualisation, scripts and algorithms.
Dr Joy Tripovich of the Centre for Air pollution, energy and health Research (CAR), an NHMRC Centre of Research Excellence, and her team from the University of Sydney have used CoESRA to analyse large amounts of data on the relationship between heatwaves and the admission of dogs to vets.
“The datasets that I was working with were just too big to manipulate effectively on a local machine, so the TERN CoESRA Virtual Desktop provided a solution to this problem,” said Joy.
“Having access to all the datasets and software tools that have been collected by CAR on the CoESRA platform really supercharged our analysis.”
“We were able to link our datasets with other CAR datasets such as exposure to extreme weather that had already been put together for their previous projects. So we were able to take advantage of a broader range of data, as well as a range of skillsets and technologies than we could before.”
“It was also really great to know this was all done in a safe and secure online data sharing platform, and that all the results would be reproducible and transferable to future projects in a safe and secure way too.”
Dr Ivan Hannigan of CAR and the University of Sydney’s University Centre for Rural Health says that the improved IT infrastructure offered by the TERN CoESRA Virtual Desktop allows his team of researchers to collaborate effectively with other scientists and stakeholders, share data and reproduce experiments.
“CoESRA allows a much broader discussion of the data, techniques and results because the researchers are able to discuss the details of the modelling with other scientists at meetings, conferences and workshops, while actually repeating the computations in real time.”
Dr Joy Tripovich of the Centre for Air pollution, energy and health Research and her team from the University of Sydney have used the TERN CoESRA Virtual Desktop to analyse large amounts of data, collected by Vet Compass Australia, on the relationship between heatwaves and the admission of dogs to vets (image courtesy of Vet Compass)
TERN CoESRA Virtual Desktop has also been used to bring together the data and analyses required to apply the IUCN Red List of Ecosystems criteria to Australia’s Mountain Ash forests and the Georgina gidgee woodlands, and to build a repeatable workflow for each ecosystem assessment.
Complete, reproducible Kepler workflows have been developed in the TERN CoESRA Virtual Desktop that allow users to re-run the IUCN assessments and regenerate all the results on which they were based. The workflows can also be re-run with additional data for certain criteria.
The project portfolios for these two ecosystem risk assessments are openly available in the TERN CoESRA Virtual Desktop, alongside a Kepler scientific workflow that automates data analysis tasks for Marxan, a widely used conservation planning tool.
Published in TERN newsletter January 2019