The open access journal GigaScience has presented a virtual package of data for biogas production. The virtual package has been made reusable in a containerized form to allow scientists to better understand the production of biofuels.

Biogas is made by producing methane through the anaerobic digestion (fermentation) of organic matter. The work not only provides an enormous amount of freely available data, but also presents it in a reproducible, reusable container form, allowing scientists to recreate the experiments at the touch of a button.

Biogas is one of the promising areas in biofuels development, with huge potential as a renewable and clean source of energy. Biogas production is the generation of methane gas through the anaerobic digestion (fermentation) of organic matter such as agricultural or food waste. Detailed knowledge of how the fermentation process works is key to optimizing it. But the vast majority of the microbes involved remain unknown and cannot be cultivated in laboratories.

In new research published in the Open Access journal GigaScience, researchers from Bielefeld University in Germany have characterized the complex communities of micro-organisms in a biogas plant that generates heat and power from maize silage and pig manure. Going even further, the authors took an unusual step to make their research more reproducible by creating a virtual ‘container’ of their data and tools.

For their study, the researchers carried out metagenomic and meta-transcriptomic analyses, which resulted in the generation of DNA and RNA sequences from the thousands of microbial species present. From this they were able to create a catalogue of 250,000 genes that enabled the researchers to begin defining the underlying biology of methane production.

While this data production only scratches the surface of the vast amount of information gathered, the authors furthered the usefulness of this resource by releasing all of the data and computational methods as a shareable container. These containers enable others, at the press of a few buttons, to execute the same analyses in the cloud. This not only makes the research reproducible, but also allows researchers around the world to build on these resources to more rapidly delineate the important processes involved in biogas generation and to better explore its use for biofuel.

As experiments become more data-intensive, reviewing and publishing the methods and results of scientific studies become increasingly challenging. To get around this, the authors used the rapidly emerging Docker platform, which effectively wraps software in a system that includes everything needed to rerun it. This removes the need for other researchers to install and maintain the many complex bioinformatics tools and software libraries: something that can be very technically challenging for researchers without the computational resources and skills.
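To make the idea concrete, here is a minimal sketch of what rerunning a containerized analysis could look like from the user's side. It only assembles a standard `docker run` command line and prints it. The image name `example/biogas-analysis:latest`, the host directories, and the `/data` and `/output` mount points are hypothetical placeholders, not the names used by the study's actual container.

```python
import shlex


def build_docker_run(image, data_dir, out_dir):
    """Assemble a `docker run` command as an argument list.

    The container paths /data and /output are assumptions made for
    this illustration; a real container documents its own paths.
    """
    return [
        "docker", "run",
        "--rm",                          # remove the container when it exits
        "-v", f"{data_dir}:/data:ro",    # mount the input data read-only
        "-v", f"{out_dir}:/output",      # mount a directory for the results
        image,
    ]


# Hypothetical image name and host directories, for illustration only.
cmd = build_docker_run(
    "example/biogas-analysis:latest",
    "/home/user/reads",
    "/home/user/results",
)

# Print the command instead of running it, so the sketch works
# on machines without Docker installed.
print(shlex.join(cmd))
```

The point of the sketch is that the user supplies only the data and an output location. The tools, libraries and their versions all travel inside the image.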

Andreas Bremges, first author of the study, said, “We decided to use virtualisation techniques to encapsulate our analysis workflow and make it basically independent from the host it is executed on.” Peter Belmann built the Docker container for the biogas study, and is a core team member of the bioboxes project to standardize interchangeable bioinformatics software containers.

Peter Li, lead data manager at GigaScience, undertook the step of exactly recreating the results in the paper, something that is extremely unusual in scientific publishing. He said, “The reproducibility of published research is an important aspect of science. Andreas and his colleagues provided a Docker container that encapsulated the method used to process the data from their biogas study. This made my job of checking the reproducibility of their results much easier as their Docker container took care of installing the bioinformatics tools and their dependencies on my cloud server.”

The use of Docker in this “container” publication is a step towards moving publishing away from static and often irreproducible papers, which have changed little since the 17th century, to more reproducible digital objects that better fit 21st century technology.

In layman’s terms, the database the team built and the software needed to use it are packaged together for download.

This is a huge boon to those working in the field. It comes from the humdrum, daily-grind kind of research that offers little grandeur and few headlines, accolades or recognition. The team merits a round of sincere applause and heartfelt thanks. The work may not go down in history itself, but it will surely make some history possible.

