Summary and Schedule

Learning Objectives

  • Describe the importance of efficient and reproducible data QA/QC
  • Identify common data errors and quality issues
  • Develop a QA/QC strategy for a tabular data set
  • Import data into R and perform QA/QC using default data types
  • Implement an R script to perform data QA/QC on a tabular data set
  • Document and communicate data QA/QC steps for data reporting

Prerequisite

This lesson assumes you have R and RStudio installed on your computer. R and RStudio are two separate pieces of software:

  • R is a programming language, and also the software that runs code written in that language.
  • RStudio is an integrated development environment (IDE) that makes using R easier. In this lesson we use RStudio to interact with R.

If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio.
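If R is already installed, a quick check from the R console can confirm the setup. The snippet below is a minimal sketch: the minimum version "4.0.0" is a hypothetical threshold chosen for illustration (this lesson does not state a required version), and the RSTUDIO environment variable check assumes RStudio's usual behavior of setting that variable to "1" for its sessions.

```r
# Print the version of R that is currently running
cat(R.version.string, "\n")

# Hypothetical minimum version, for illustration only;
# the lesson itself does not specify one
minimum_version <- "4.0.0"
r_is_recent <- getRversion() >= minimum_version
cat("R is at least version", minimum_version, ":", r_is_recent, "\n")

# Assumption: RStudio sets the RSTUDIO environment variable to "1"
# for R sessions that it launches
in_rstudio <- identical(Sys.getenv("RSTUDIO"), "1")
cat("Running inside RStudio:", in_rstudio, "\n")
```

Running the same lines in a plain R console and then in RStudio should differ only in the last line of output.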

Installing R and RStudio

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.


Data Sets


Download the data zip file and unzip it to your Desktop.
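The unzip step can also be done from R instead of the file manager. The following is a sketch under stated assumptions: the file name data.zip and the helper unzip_lesson_data() are hypothetical, since the actual name of the download is not given here, and the Desktop path assumes that "~" expands to your home folder (on Windows it typically expands to Documents, so adjust the path there).

```r
# Hypothetical file name: replace "data.zip" with the file you downloaded
zip_path <- file.path(path.expand("~"), "Desktop", "data.zip")
dest_dir <- file.path(path.expand("~"), "Desktop")

# Hypothetical helper: unzip an archive if it exists and
# return the paths of the extracted files
unzip_lesson_data <- function(zip_path, dest_dir) {
  if (!file.exists(zip_path)) {
    message("Zip file not found: ", zip_path)
    return(invisible(character(0)))
  }
  # utils::unzip() returns the extracted file paths
  extracted <- utils::unzip(zip_path, exdir = dest_dir)
  invisible(extracted)
}

extracted_files <- unzip_lesson_data(zip_path, dest_dir)
print(extracted_files)
```

If the zip file is not where the script expects it, the helper prints a message and returns an empty character vector rather than stopping with an error.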

Software Setup


Details


Use PuTTY

Use Terminal.app

Use Terminal