The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires the development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static collection of tutorials but a continuously improved one. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at

We developed an infrastructure that facilitates data analysis training in life sciences. It is an interactive learning platform tuned for current types of data and research problems. Importantly, it provides a means for community-wide content creation and maintenance. Finally, it enables trainers and trainees to use the tutorials in a variety of situations, such as those where reliable Internet access is unavailable.

Cell Systems
Erasmus MC: University Medical Center Rotterdam

Batut, B. (Bérénice), Hiltemann, S., Bagnacani, A. (Andrea), Baker, D. (Dannon), Bhardwaj, V. (Vivek), Blank, C. (Clemens), … Grüning, B. (Björn). (2018). Community-Driven Data Analysis Training for Biology. Cell Systems. doi:10.1016/j.cels.2018.05.012