STIN300 Statistical Programming in R

Credits (ECTS): 5

Course responsible: Torgeir Rhoden Hvidsten, Hilde Vinje

Campus / Online: Taught campus Ås

Teaching language: English, Norwegian

Limits of class size: 150

Course frequency: Annually

Nominal workload: Lectures/exercises 60 hours. Individual studies 65 hours.

Teaching and exam period: This course starts in the January block and has teaching/evaluation in the January block.

About this course

This is an intensive course where you will use the programming language R to apply your statistical skills to scientific data. If you have no prior experience with programming, be prepared to put in significant effort; see "Recommended prerequisites."

Course participants are usually Master’s or PhD students who have chosen a research topic. Use your own data if possible, or ask your supervisor for a dataset. If you haven’t chosen a research topic yet, we can help you find a dataset.

You will write a report in R Markdown, which will include data visualization and statistical analysis of your dataset. Your report will be fully reproducible and contain executable code. It will serve as a valuable starting point for your future work and facilitate discussions with, for example, a supervisor.

You can learn from free online textbooks, daily tutorial documents, and by asking effective questions in the Discussions section on Canvas.

We emphasize visualization, as well as structuring and manipulating tabular data. The course also covers operators, variables, data types, and basic data structures; control structures such as loops and conditional statements; file and text processing; and user-defined functions.

Learning outcome

Upon completion of the course the students should be capable of performing statistical analyses using a programming approach in R. The students should be able to visualize and manipulate data and make their own functions utilizing/modifying available functions in order to solve specific statistical problems. The students should also be able to present the output from statistical analyses in an accessible and scientific form using text and graphics.

KNOWLEDGE: Students will acquire

  • an understanding of how programming can automate demanding statistical computations.
  • a working knowledge of concepts, syntax and conventions for describing, fitting and interpreting statistical models in R.

SKILLS: Students will be able to

  • interpret output from R's functions for statistical modelling, such as lm().
  • read in data from various file formats including Excel, comma-separated text, and FASTA.
  • develop their own functions which use existing functions, to solve nontrivial challenges more efficiently than by nonstructured programming.
  • present results of statistical analysis in a scientific, clear form through reproducible, executable reports which weave together expository text, program code, and output such as tables and graphics.
  • troubleshoot problems by locating errors, reproducing them on a small subset of the data, stepping through code line by line, etc.
  • orient themselves in documentation for R packages that implement statistical methods the student knows.

GENERAL COMPETENCES: Students will be well prepared to apply statistical methods in R to datasets they encounter in later studies and working life. This includes loading data into R, transforming it to a structure that the analysis function can use, running analyses with appropriate settings, and interpreting and presenting the results in a form that is useful to the end user.

  • Learning activities

    Teachers are available in real-time plenary sessions the first half of each day, and students work on their own or in self-organized groups in the afternoon. Divide your attention between planning the report on your own data, and studying tutorials, textbooks and R documentation to acquire the necessary skillsets.

    The first two mornings will be an introduction and "live coding" together, so that everyone gets well acquainted with the programming tool RStudio. The format will then become more flexible throughout the course, with both lectures and instructional videos.

    You will get advice on how to formulate effective questions, which is a key skill because it 1) helps others help you and 2) helps you help yourself. This will be a recurring theme throughout the course.

  • Teaching support

    The Canvas course pages link to daily tutorial documents, various howtos and free online textbooks.

    Most R functions have extensive documentation and runnable examples. You will learn to navigate the R help system, walk yourself through the examples, and relate them to your own problems.

    Online forums such as Stack Overflow and various AI tools are a rich source of support. You will learn to search existing answers, and how to describe problems clearly enough that others can help.

    Ask questions in Discussions in Canvas. They will be answered, either there or in plenary discussion.

    Teachers are available every day until noon.

  • Prerequisites

    Statistics equivalent to C in STAT100. You are expected to be acquainted with simple linear regression and analysis of variance.

    You are expected to be familiar with your file system, your keyboard, your web browser and your computer.

  • Recommended prerequisites

    Introduction to programming, e.g. STIN100 Biological data analysis or INF120 Programming and data processing.

    Statistics beyond introductory level. STIN300 does not primarily teach statistics, but advises you on applying statistical methods to your own data using the R toolset.

  • Assessment method
    Pass/fail based on quizzes and the data analysis report on data from your own field of research, all of which must be approved. Approved quizzes are valid only within the current semester.

    Portfolio. Grading rule: Passed / Not Passed
  • Examiner scheme
    An external examiner must approve the evaluation arrangements for the course.
  • Notes
    Students must have their own laptop running Windows, Linux, or macOS capable of running RStudio (see updated system requirements). Chromebooks do not meet the system requirements for the software used in this course.
  • Teaching hours
    Lectures/interactive computer lab 4 hours daily for three weeks.
  • Preferential right
    M-BIAS
  • Admission requirements
    Special requirements in Science