STIN300 Statistical Programming in R
Showing course contents for the academic year starting in 2019.
Course responsible: Jon Olav Vik
ECTS credits: 5
Faculty: Faculty of Chemistry, Biotechnology and Food Science
Teaching language: EN, NO
Limits of class size:
Teaching exam periods:
This course starts in the January block. This course has teaching/evaluation in the January block.
Course frequency: Annually
First time: 2010H
The first part contains an introduction to R scripting, with a focus on the tidyverse packages ggplot2 and dplyr. Emphasis is on visualization and on structuring and manipulating data in table format. Later, we visit topics like operators, variables, data types and basic data structures.
The second part extends this to control structures (loops, conditionals), more general handling of files and text, graphics, packages and finally functions, with repetition and application of elements from the first part.
The third part consists mainly of a compulsory project.
Upon completion of the course the students should be capable of performing statistical analyses using a programming approach in R. The students should be able to visualize and manipulate data and make their own functions utilizing/modifying available functions in order to solve specific statistical problems. The students should also be able to present the output from statistical analyses in an accessible and scientific form using text and graphics.
KNOWLEDGE: Students will acquire
- an understanding of how programming can automate demanding statistical computations.
- a working knowledge of concepts, syntax and conventions for describing, fitting and interpreting statistical models in R.
SKILLS: Students will be able to
- interpret output from R's functions for statistical modelling, such as lm().
- read in data from various file formats including Excel, comma-separated text, and FASTA.
- develop their own functions which use existing functions, to solve nontrivial challenges more efficiently than by nonstructured programming.
- present results of statistical analysis in a scientific, clear form through reproducible, executable reports which weave together expository text, program code, and output such as tables and graphics.
- troubleshoot problems by locating errors, reproducing them on a small subset of the data, stepping through code line by line, etc.
- orient themselves in documentation for R packages that implement statistical methods the student knows.
GENERAL COMPETENCES: Students will be well prepared to apply statistical methods in R to datasets they encounter in later studies and working life. This includes loading data into R, transforming it to a structure that the analysis function can use, running analyses with appropriate settings, and interpreting and presenting the results in a form that is useful to the end user.
Lectures combined with extensive interactive programming. Students will work actively on programming exercises in the classes, with a lecturer present, so that difficult topics can be highlighted and given proper attention.
Written material and videos have been developed for this course, and will be available in Canvas.
The curriculum will be specified at the beginning of the course.
Statistics equivalent to STAT100
Introduction to programming
Project exercise. This must be approved before the exam.
Written exam, 3.5 hrs, counts 100 %.
Lectures/exercises 60 hours. Individual studies 90 hours.
Special requirements in Science
Type of course:
Lectures/interactive computer lab 4 hours daily for three weeks.
Students must bring their own laptop with Windows, Linux or macOS.
An external examiner evaluates the exam, and grades 25 selected exam papers.
Allowed examination aids: A1 No calculator, no other aids
Examination details: One written exam: Pass / Fail