59 names, rbind and cbind
Written by Isaac Ehrlich and last updated on 7 October 2021.
59.1 Introduction
For certain tasks, you may need to combine data frames and find information about them.
In this lesson, you will learn how to:
- Use names() to find the column names of data frames
- Use rbind() to combine two or more data frames (or matrices) by row
- Use cbind() to combine two or more data frames (or matrices) by column
Prerequisites:
- Understanding data frames and their basic principles
59.2 Arguments
59.2.1 names()
The primary purpose of names() is to return the column names of a data frame. The only argument that names() takes is the data frame. colnames() is a similar function that outputs the same information and also works for matrices.
library(dplyr) # provides the starwars data set
head(starwars)
#> # A tibble: 6 × 14
#> name height mass hair_color skin_color eye_color
#> <chr> <int> <dbl> <chr> <chr> <chr>
#> 1 Luke Skywalk… 172 77 blond fair blue
#> 2 C-3PO 167 75 <NA> gold yellow
#> 3 R2-D2 96 32 <NA> white, bl… red
#> 4 Darth Vader 202 136 none white yellow
#> 5 Leia Organa 150 49 brown light brown
#> 6 Owen Lars 178 120 brown, gr… light blue
#> # … with 8 more variables: birth_year <dbl>, sex <chr>,
#> # gender <chr>, homeworld <chr>, species <chr>,
#> # films <list>, vehicles <list>, starships <list>
names(starwars)
#> [1] "name" "height" "mass" "hair_color"
#> [5] "skin_color" "eye_color" "birth_year" "sex"
#> [9] "gender" "homeworld" "species" "films"
#> [13] "vehicles" "starships"
table(colnames(starwars) == names(starwars))
#>
#> TRUE
#> 14
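The difference between names() and colnames() shows up on matrices. The sketch below uses a small made-up matrix (m and df are illustrative names, not part of the lesson data) to show that names() returns NULL for a matrix while colnames() still returns the column names.

```r
# A small illustrative matrix with named columns
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

colnames(m) # column names of the matrix
names(m)    # NULL: a matrix has no names attribute by default

# For a data frame, both functions agree
df <- as.data.frame(m)
identical(names(df), colnames(df))
```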
Using indexing, names() can also be used to change the column names of a data frame.
# Change the homeworld column name (the tenth column) to home-planet
names(starwars)[10] <- "home-planet"
names(starwars)
#> [1] "name" "height" "mass" "hair_color"
#> [5] "skin_color" "eye_color" "birth_year" "sex"
#> [9] "gender" "home-planet" "species" "films"
#> [13] "vehicles" "starships"
The tidyverse rename() function can also change column names, and avoids indexing by specifying the column to rename with the syntax new_name = old_name.
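As a minimal sketch of the rename() syntax, assuming the dplyr package is installed, the code below renames homeworld on a fresh copy of starwars. The object name renamed and the new column name home_planet are chosen here for illustration.

```r
library(dplyr)

# rename() returns a new data frame; the original is left unchanged
renamed <- rename(dplyr::starwars, home_planet = homeworld)

names(renamed)[10]
"homeworld" %in% names(renamed)
```

Because rename() identifies the column by name, it keeps working if the column order changes, which a positional index such as [10] would not.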
59.2.2 rbind()
The purpose of rbind() is to combine two (or more) data frames by row. The arguments to rbind() are two (or more) data frames. These data frames must have the same number of columns and the same column names; columns are matched by name, so their order may differ. rbind() can also combine matrices, which only need to have the same number of columns.
library(stringr) # provides words, a character vector of 980 common words
letter_df <- data.frame(numbers = 1:26, strings = letters)
words_df <- data.frame(numbers = 27:1006, strings = words)
character_df <- rbind(letter_df, words_df)
names(character_df)
#> [1] "numbers" "strings"
dim(character_df) # shows the number of rows and columns
#> [1] 1006 2
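For data frames, rbind() lines columns up by name rather than by position. The sketch below uses two small made-up data frames (first_df and second_df are illustrative) whose columns are in a different order.

```r
first_df <- data.frame(numbers = 1:2, strings = c("a", "b"))
# Same column names, but listed in the opposite order
second_df <- data.frame(strings = c("c", "d"), numbers = 3:4)

stacked_df <- rbind(first_df, second_df)

names(stacked_df)  # column order follows the first data frame
stacked_df$numbers # values from both inputs end up in the right column
```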
59.2.3 cbind()
The purpose of cbind() is to combine two (or more) data frames by column. The arguments to cbind() are two (or more) data frames. These data frames must have the same number of rows, or the larger number of rows must be a whole multiple of the smaller. cbind() can also combine matrices, which must all have the same number of rows.
Note that when the numbers of rows are multiples, the rows of the shorter data frame are repeated until they match the longer one.
index_df <- data.frame(numbers = 1:5, letters = c("a", "b", "c", "d", "e"))
names_df <- data.frame(
  vegetables = c("arugula", "broccoli", "cauliflower", "dill", "endive"),
  fruits = c("apricot", "banana", "cherry", "date", "elderberry"),
  flowers = c("aster", "begonia", "crocus", "daffodil", "echium")
)
combined_df <- cbind(index_df, names_df)
names(combined_df)
#> [1] "numbers" "letters" "vegetables" "fruits"
#> [5] "flowers"
dim(combined_df) # shows the number of rows and columns
#> [1] 5 5
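The recycling behaviour for row counts that are multiples can be sketched with two small made-up data frames (long_df and short_df are illustrative names): one with six rows and one with three.

```r
long_df <- data.frame(id = 1:6)
short_df <- data.frame(group = c("x", "y", "z"))

# 6 is a whole multiple of 3, so the shorter data frame is repeated twice
recycled_df <- cbind(long_df, short_df)

dim(recycled_df)
recycled_df$group
```

Recycling happens silently, so it is worth checking nrow() on each input first when a repeat is not what you intend.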
59.3 Questions and Exercises
59.4 Special Cases & Common Mistakes
The most common error with rbind() and cbind() occurs when the data frames do not meet the requirements (e.g. the data frames have a different number of columns for rbind() or a different number of rows for cbind()). These result in error messages such as “numbers of columns of arguments do not match” or “arguments imply differing number of rows.”
Similarly, if the names of the columns do not match, rbind() will give a “names do not match previous names” error. You can use names() to compare the column names before binding and avoid this error.
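One way to run that check is sketched below with two small made-up data frames (a_df and b_df are illustrative). It compares the names, captures the error message with tryCatch() instead of stopping, and then fixes the mismatch before binding.

```r
a_df <- data.frame(numbers = 1:2, strings = c("a", "b"))
b_df <- data.frame(numbers = 3:4, text = c("c", "d"))

# Compare the column names before binding
setequal(names(a_df), names(b_df)) # FALSE: the names differ
setdiff(names(b_df), names(a_df))  # shows which name is unexpected

# Capture the error message rather than halting the script
msg <- tryCatch(rbind(a_df, b_df), error = function(e) conditionMessage(e))
msg

# Fix the names, then bind
names(b_df) <- names(a_df)
fixed_df <- rbind(a_df, b_df)
dim(fixed_df)
```

Overwriting names(b_df) assumes the columns are already in the same order as in a_df; if they are not, rename the individual columns instead.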