STIN100 Biological Data Analysis
Showing course contents for the academic year starting in 2020.
Course responsible: Jon Olav Vik, Torgeir Rhodén Hvidsten
Teachers: Hilde Vinje, Jon Olav Vik, Simen Rød Sandve, Kathrine Frey Frøslie
ECTS credits: 10
Faculty: Faculty of Chemistry, Biotechnology and Food Science
Teaching language: Norwegian
Limits of class size:
Teaching exam periods:
This course starts in the Fall semester. This course has teaching/evaluation in the Fall semester.
Course frequency: Annually
First time: 2018H
Biology has become a data-rich science with datasets that can no longer be analyzed manually. To extract knowledge from data, biologists need knowledge and skills in programming and data analysis that enable them to explore, visualize and interpret data. This must be done reproducibly, so that it is clear how the data has been processed and easy to modify the analyses if desired.
This course provides basic skills in the programming language R and introduces the student to common methods for visualization and analysis of multi-dimensional biological data. The course is organized around supervised student groups analysing relevant data sets.
In a time when trust in scientific knowledge can no longer be taken for granted, yet challenges of sustainability require informed decisions, the understanding of data and verifiable production of knowledge are essential. STIN100 helps ensure that future employers and decision makers can rely on the knowledge basis prepared by our graduates.
KNOWLEDGE: The students will acquire
- broad knowledge in handling, visualizing and analysing multidimensional biological data.
- familiarity with how some of the most important biological data sets are generated and how this data should be preprocessed to correct for systematic errors.
- a conceptual framework for mapping data to graphical elements.
- a repertoire of programming techniques and concepts that are required to perform the analyses in the course.
SKILLS: Students will be able to
- explain principles behind basic methods for data visualization and analysis.
- write programs that perform basic data processing tasks (subsetting, transformation and groupwise summaries) and employ simple visualization and data analysis methods.
- generate reproducible, executable reports that weave together expository text, program code and output.
- propose biological interpretations of analysis results.
- search documentation and internet resources efficiently to carry out analyses.
- simplify data sets for prototyping and debugging of analyses.
COMPETENCES: Students will be well prepared to
- explore datasets they encounter in later term papers, theses and working life.
- perform reproducible research where data processing is fully documented through executable reports.
- compose data graphics using elements appropriate to the data types and the biological structure in the data.
- pose follow-up questions to data analyses for discussion with domain experts.
- learn new methods and software packages with the help of documentation, code examples and web resources.
The course includes some lectures, but the emphasis is on practical, computer-based group exercises.
Learning materials tailored to NMBU studies are available in Canvas. We also rely on free online textbooks, see Syllabus.
Lecture notes, exercises and handouts, and selected parts of the online textbooks "Hands-on programming with R" (https://rstudio-education.github.io/hopr/) and "R for data science" (https://r4ds.had.co.nz, especially chapters 3 (Data visualisation), 9 (Introduction to data wrangling), 12 (Tidy data), 18 (Pipes), 27 (R markdown)).
Students must hand in compulsory assignments and have them approved before they can be evaluated.
Group work: Students write a paper based on analysis of a relevant data set (counts 100%). Pass/Fail.
Lectures interspersed with exercises: 64 hours. Individual study: 236 hours.
MATRS - General admission requirements, and R1 or (S1+S2) or similar mathematical skills
Type of course:
Four weeks: 2 hours lecture with frequent computer exercises, 4 hours computer exercises with teacher and teaching assistants present.
Three double weeks: 1 hour guest lecture on selected datasets, 1 hour on related programming and analysis techniques, 10 hours analysis and report writing on computers with teacher and teaching assistants present.
Three weeks: 6 hours analysis and report writing on computers with teacher and teaching assistants present.
Students must bring their own laptop with Windows, Linux or macOS.
An external examiner must approve the evaluation arrangements for the course.
Examination details: Continuous exam: Passed / Failed