38 slice

Written by Annie Collins and last updated on 28 January 2022.

38.1 Introduction

In this lesson, you will learn how to:

  • Select rows in a data frame using the slice() function

This lesson is a yellow level skill and is part of “Tidyverse Essentials.” Prerequisite skills include:

  • Installing packages
  • Calling libraries
  • Importing data

38.2 slice()

slice() is a function in the dplyr package, which is part of the tidyverse.

slice() allows you to select rows from your data by their location in the data frame. This can be done by inputting specific row numbers, ranges of row numbers, or by choosing rows to omit from the data. The slice() function does not manipulate the original data frame, but rather outputs a copy of the original data frame including only the selected rows.

The syntax for using slice is as follows:

slice(data, row number(s) of row(s) to be kept/removed)

This will be further explained in the following sections.
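
As a minimal sketch of the syntax above, the example below calls slice() both directly and with the pipe operator. The data frame df is a hypothetical three-row example, not one of the lesson's datasets, and the code assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical three-row data frame, used only for illustration
df <- data.frame(id = 1:3, letter = c("a", "b", "c"))

# Direct call: the data comes first, then the row number(s)
first_form <- slice(df, 2)

# Equivalent call using the pipe operator
second_form <- df %>% slice(2)

# Both forms select the same row
identical(first_form, second_form)

# The original data frame still has all of its rows
nrow(df)
```

Either form should work anywhere in this lesson; the pipe form is mostly a matter of style when chaining several steps.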

38.3 Video Overview

38.4 Selecting Rows

We will be slicing the data frame below (named pizza), which includes data on different pizza types and their toppings. To help visualize what slice() does, each pizza type is labelled with its original row number, because slice() renumbers the rows of its output rather than preserving the original row numbers.

#>              type      top1        top2
#> 1     (1) classic    cheese   pepperoni
#> 2    (2) hawaiian       ham   pineapple
#> 3      (3) veggie mushrooms     peppers
#> 4 (4) meat lovers   sausage       bacon
#> 5       (5) greek    olives feta cheese
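
The lesson does not show how pizza was created. If you would like to follow along, one way to build a data frame matching the output above is sketched below; the construction code is an assumption based on the printed values.

```r
# Assumed construction of the lesson's pizza data frame.
# The "(n)" prefixes mirror the row labels shown in the printed output.
pizza <- data.frame(
  type = c("(1) classic", "(2) hawaiian", "(3) veggie",
           "(4) meat lovers", "(5) greek"),
  top1 = c("cheese", "ham", "mushrooms", "sausage", "olives"),
  top2 = c("pepperoni", "pineapple", "peppers", "bacon", "feta cheese")
)

# Print the data frame to compare it with the output above
pizza
```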

38.4.1 Single Row

If you wish to select a single row from your data frame, simply input the row’s number into the slice() function following the name of your data.

The code below returns the information about a veggie pizza.

slice(pizza, 3)
#>         type      top1    top2
#> 1 (3) veggie mushrooms peppers

38.4.2 Multiple Rows

There are several ways to slice multiple rows at once.

You may input several integers separated by commas, similar to the above example of selecting a single row. The code below returns information about classic, veggie, and greek pizzas.

slice(pizza, 1, 3, 5)
#>          type      top1        top2
#> 1 (1) classic    cheese   pepperoni
#> 2  (3) veggie mushrooms     peppers
#> 3   (5) greek    olives feta cheese

You can also input a vector of integers indicating the rows you wish to slice. This functions essentially the same as the previous method, but may be useful if you already have a numeric vector containing this information. The code below also returns information about classic, veggie, and greek pizzas.

nums <- c(1, 3, 5)
slice(pizza, nums)
#>          type      top1        top2
#> 1 (1) classic    cheese   pepperoni
#> 2  (3) veggie mushrooms     peppers
#> 3   (5) greek    olives feta cheese

If you wish to select information from multiple adjacent rows, you can input a numeric range instead of selecting rows individually. The syntax for this is “first row:last row.” The code below outputs the first three rows of pizza.

slice(pizza, 1:3)
#>           type      top1      top2
#> 1  (1) classic    cheese pepperoni
#> 2 (2) hawaiian       ham pineapple
#> 3   (3) veggie mushrooms   peppers

38.4.3 Omitting Rows

Another way of slicing rows is choosing which rows to omit or remove from the data frame. You can use any of the above methods to remove individual or multiple rows from a data frame by placing “-” before the inputted row number(s). The slice() function will then return all the rows in the data frame except the rows indicated. Uncomment each line of code below to observe the output.

# Return everything except row 3
slice(pizza, -3)
#>              type    top1        top2
#> 1     (1) classic  cheese   pepperoni
#> 2    (2) hawaiian     ham   pineapple
#> 3 (4) meat lovers sausage       bacon
#> 4       (5) greek  olives feta cheese
# Return only rows 2 and 4
# slice(pizza, -1, -3, -5)
# OR
# nums <- c(1, 3, 5)
# slice(pizza, -nums)
# Return only the last two rows (4:5)
# slice(pizza, -1:-3)

Note that when using slice(), positive and negative numbers cannot be combined. All row number values must be either positive or negative, including in vectors and ranges.

It is also possible to combine selection methods, for instance by indicating a range of rows followed by another individual row (slice(pizza, 1:3, 5) will return everything except row 4).
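
The two points above can be checked with a short sketch. It rebuilds pizza (the construction is an assumption based on the printed output), combines a range with a single row, and uses tryCatch() to capture the error expected from mixing positive and negative row numbers.

```r
library(dplyr)

# Assumed construction of the lesson's pizza data frame
pizza <- data.frame(
  type = c("(1) classic", "(2) hawaiian", "(3) veggie",
           "(4) meat lovers", "(5) greek"),
  top1 = c("cheese", "ham", "mushrooms", "sausage", "olives"),
  top2 = c("pepperoni", "pineapple", "peppers", "bacon", "feta cheese")
)

# A range plus an individual row: keeps rows 1, 2, 3, and 5
combined <- slice(pizza, 1:3, 5)

# The same rows, expressed by omitting row 4
omitted <- slice(pizza, -4)

# Both approaches keep the same pizza types
identical(combined$type, omitted$type)

# Mixing signs is expected to fail, so capture the error instead of stopping
mixed <- tryCatch(
  slice(pizza, 1, -2),
  error = function(e) "error"
)
mixed
```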

38.5 Common Mistakes

  • Slicing rows that do not exist: Ensure you are always inputting row numbers that exist in your data. If you input row numbers that do not exist in your data (for example, slice(pizza, 6) when pizza only has 5 rows), the function will return <0 rows> (or 0-length row.names), which is an empty data frame. To check the number of rows in your data, you can use the function nrow(). Inside slice(), you can also use n() to represent the number of the last row of your data regardless of length (i.e., 1:n() would slice every row of a data frame).
  • Combining Positive and Negative Indexes: As mentioned previously, all row number values must be either positive or negative when using slice(), including in vectors and ranges. If you combine positive and negative values, you may get a message such as the following (the exact wording depends on your version of dplyr): Error: `slice()` expressions should return either all positive or all negative.
  • “Losing” your sliced data frame: When you use the slice() function, you are not directly changing your original data. If you want your data frame to be saved in its “sliced” form, you must reassign the name of your data frame to the output of the slice() function. For example, if I wanted to permanently remove the last two rows of pizza, I would execute the code pizza <- slice(pizza, -4:-5).
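
The mistakes above can be illustrated with a short sketch. It rebuilds pizza (the construction is an assumption based on the printed output), and the names out_of_range and last_row are introduced here only for illustration.

```r
library(dplyr)

# Assumed construction of the lesson's pizza data frame
pizza <- data.frame(
  type = c("(1) classic", "(2) hawaiian", "(3) veggie",
           "(4) meat lovers", "(5) greek"),
  top1 = c("cheese", "ham", "mushrooms", "sausage", "olives"),
  top2 = c("pepperoni", "pineapple", "peppers", "bacon", "feta cheese")
)

# Check how many rows are available before slicing
nrow(pizza)

# Row 6 does not exist, so the result is an empty data frame
out_of_range <- slice(pizza, 6)
nrow(out_of_range)

# Inside slice(), n() stands for the number of the last row
last_row <- slice(pizza, n())

# Reassign the name to keep the sliced version of the data
pizza <- slice(pizza, -4:-5)
nrow(pizza)
```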

38.6 Next Steps

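There are several functions that act as variations of slice() with similar syntax in the dplyr package. These include:

  • slice_head() and slice_tail(), which select the first or last rows
  • slice_min() and slice_max(), which select the rows with the smallest or largest values of a variable
  • slice_sample(), which selects rows at random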
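
As a rough illustration of such variations, the sketch below tries slice_head(), slice_tail(), and slice_sample() on pizza. It assumes dplyr 1.0.0 or later, and the construction of pizza is an assumption based on the output printed earlier in the lesson.

```r
library(dplyr)

# Assumed construction of the lesson's pizza data frame
pizza <- data.frame(
  type = c("(1) classic", "(2) hawaiian", "(3) veggie",
           "(4) meat lovers", "(5) greek"),
  top1 = c("cheese", "ham", "mushrooms", "sausage", "olives"),
  top2 = c("pepperoni", "pineapple", "peppers", "bacon", "feta cheese")
)

# First two rows of the data frame
first_two <- slice_head(pizza, n = 2)

# Last two rows of the data frame
last_two <- slice_tail(pizza, n = 2)

# One row chosen at random (the result changes from run to run)
random_one <- slice_sample(pizza, n = 1)

first_two
last_two
random_one
```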

38.7 Exercises

Please use the dataset olympics, representing medal counts from the 2016 summer Olympics in Rio de Janeiro, for the following questions and exercises.

#>         country gold silver bronze
#> 1 United States   46     37     38
#> 2 Great Britain   27     23     17
#> 3         China   26     18     26
#> 4        Russia   19     17     20
#> 5       Germany   17     10     15
#> 6         Japan   12      8     21
#> 7        France   10     18     14
#> 8   South Korea    9      3      9

38.7.1 Question 1

Which of the following is not equivalent to slice(olympics, 1:2)?

  1. slice(olympics, 1, 2)
  2. slice(olympics, c(1, 2))
  3. slice(olympics, -3:-8)
  4. slice(olympics, 3:8)

38.7.2 Question 2

Which of the following will return data for all countries in olympics?

  1. slice(olympics, 1:8)
  2. slice(olympics, 8)
  3. slice(olympics, -1)
  4. slice(olympics, c(1, 8))

38.7.3 Question 3

Which of the following will extract information for Russia, Germany, and Japan from olympics?

  1. vector <- c(4, 5, 6), then slice(olympics, vector)
  2. slice(olympics, 4:6)
  3. slice(olympics, 4, 5, 6)
  4. All of the above

38.7.4 Question 4

For which countries will the following code return information: slice(olympics, -2, -7)?

  1. Great Britain, China, Russia, Germany, Japan, France
  2. China, Russia, Germany, Japan
  3. United States, China, Russia, Germany, Japan, South Korea
  4. Great Britain, France

38.7.5 Question 5

What will the following code return when executed: slice(olympics, 1, 10)?

  1. An error message
  2. An empty data frame
  3. Row 1 of olympics
  4. All rows of olympics

38.7.6 Question 6

What will the following code return when executed: slice(olympics, 1, -1)?

  1. An error message
  2. An empty data frame
  3. All but the last row of olympics
  4. The first and last row of olympics

38.7.7 Question 7

What will the following code return when executed: olympics %>% slice(1) %>% slice(3)?

  1. An error message
  2. An empty data frame
  3. The first row of olympics
  4. The third row of olympics

38.7.8 Question 8

Which of the following is not a function?

  1. slice_head()
  2. slice_min()
  3. slice_sample()
  4. slice_col()

38.7.9 Question 9

Which of the following is equivalent to slice(olympics, 4)?

  1. slice(olympics, -4)
  2. slice(olympics, -5)
  3. slice(olympics, -1:-3, -5:-8)
  4. slice(olympics, c(1, 2, 3, 5, 6, 7, 8))

38.7.10 Question 10

Which of the following statements is true?

  1. slice() can be used to manipulate columns by inputting the argument .cols = TRUE
  2. There are several slice() helper functions available in the dplyr package
  3. slice() is available in base R
  4. Using slice() on a single row will return a vector