5.1 What is “tidy” data?
Tidy data is data in which:
- Each column describes a variable.
- Each row describes an observation.
- Each cell holds a single value.
Example of tidy data:
day | month | year | weight | height |
---|---|---|---|---|
12 | 4 | 2020 | 3.5 | 48 |
23 | 8 | 2019 | 2.9 | 50 |
9 | 11 | 2020 | 3.8 | 50 |
Example of untidy data (month and year share a single column, and each weight cell mixes a number with its unit):
day | month,year | weight | height |
---|---|---|---|
12 | 4,2020 | 3.5kg | 48 |
23 | 8,2019 | 2.9kg | 50 |
9 | 11,2020 | 3.8kg | 50 |
Here we introduce some useful functions from the tidyr package for cleaning up and reorganizing data into tidy form, which is then easier to process.
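As a first taste, the untidy example above could be tidied roughly as follows. This is a minimal sketch, not the only way to do it: the data frame is rebuilt by hand, the column name `month_year` is an assumption (the table labels it "month,year"), and the unit is stripped with base R's `sub()` rather than a tidyr function.

```r
library(tidyr)

# The untidy example, rebuilt as a data frame.
# `month_year` is a hypothetical name for the combined "month,year" column.
untidy <- data.frame(
  day = c(12, 23, 9),
  month_year = c("4,2020", "8,2019", "11,2020"),
  weight = c("3.5kg", "2.9kg", "3.8kg"),
  height = c(48, 50, 50),
  stringsAsFactors = FALSE
)

# Split the combined column into two variables, one per column.
# convert = TRUE turns the resulting strings into integers.
tidy <- separate(untidy, month_year,
                 into = c("month", "year"),
                 sep = ",", convert = TRUE)

# Strip the trailing unit so each weight cell holds a single numeric value.
tidy$weight <- as.numeric(sub("kg$", "", tidy$weight))

print(tidy)
```

After these two steps the data frame has the same five columns as the tidy example: `day`, `month`, `year`, `weight` and `height`.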