56 Introduction
Written by Rohan Alexander.
In this module, we are going to explore data in R. R is a fully-featured programming language. However, it was developed to be a statistical programming language. This means that it was built to deal with data.
It has a lot of functionality built into it to deal with data, but it has also benefited from having a stable ecosystem of packages. These have expanded what we think of as data, which now means not only numbers, but also letters, words, and language more generally, as well as images and even video.
In this module we will cover an awful lot of content including:
- head(), tail(), glimpse(), and summary(), by Haoluan Chen.
- paste(), paste0(), glue::glue() and stringr, by Marija Pejcinovska.
- names(), rbind() and cbind(), by Isaac Ehrlich.
- left_join(), anti_join(), full_join(), etc, by Haoluan Chen.
- Looking for missing data, by Mariam Walaa.
- set.seed(), runif(), rnorm(), and sample(), by Haoluan Chen.
- Simulating datasets for regression, by Mariam Walaa.
- Advanced mutating and summarising, by Mariam Walaa.
- Tidying up datasets, by Mariam Walaa.
- pull(), pluck(), and unnest(), by Isaac Ehrlich.
- forcats and factors, by Matthew Wankiewicz.
- More on strings, by Annie Collins.
- Regular expressions, by Shirley Deng.
- Working with dates, by Mariam Walaa.
- janitor, by Mariam Walaa.
- tidyr, by Mariam Walaa.
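As a small taste of what is coming, here is a minimal sketch that uses only a few of the base R functions named in the list. The built-in mtcars dataset, the variable names, and the seed value are our own choices for illustration and are not taken from the lessons themselves.

```r
# Peek at a data frame that ships with base R.
# mtcars is an illustrative choice, not a dataset from the lessons.
head(mtcars, n = 3)
summary(mtcars$mpg)

# Build a string from pieces with paste0(), which joins without a separator.
label <- paste0("rows: ", nrow(mtcars))
print(label)

# Setting the same seed before each draw makes the simulation reproducible.
# The seed value 853 is arbitrary.
set.seed(853)
draws_one <- rnorm(n = 5, mean = 0, sd = 1)
set.seed(853)
draws_two <- rnorm(n = 5, mean = 0, sd = 1)

# The two sets of draws match exactly because the seed was reset.
identical(draws_one, draws_two)
```

The lessons that follow go through each of these functions properly, along with their tidyverse counterparts.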
There’s a lot of data out there, and we’ve got the roos off the green. We can’t wait to see what you do with it.