The slice() function in R selects rows of a data frame by their position. It lets you pick individual rows or a range of rows from a dataset with a simple syntax. This function is part of the dplyr package, which is widely used for data manipulation.
Syntax
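slice(.data, ..., .by = NULL, .preserve = FALSE)
Parameters:
- .data: A data frame or tibble.
- ... : Integer row positions. Positive values keep the listed rows and negative values drop them. Logical conditions are not accepted directly.
- .by: Optional columns to group by for this one call (available in dplyr 1.1.0 and later).
- .preserve: Only relevant for grouped data. With the default FALSE, the grouping structure is recalculated from the resulting rows.
The n argument used in later examples belongs to the helper functions slice_head(), slice_tail() and slice_sample(), not to slice() itself.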
Installing and Loading dplyr Package
Before using the slice() function, make sure the dplyr package is installed and loaded.
R
install.packages("dplyr")
library(dplyr)
Example: Creating a Sample Dataset
We create a tibble named 's_data' with ID, Age, Gender, Score1, Score2, Status, and Income columns for ten students. The examples that follow all run on this dataset to show how the slice() function works.
R
s_data <- tibble(
ID = c(1:10),
Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29),
Gender = c("Male", "Female", "Male", "Female", "Male", "Male", "Female", "Male", "Female", "Male"),
Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88),
Score2 = c(75, 82, 88, 95, 70, 68, 80, 85, 77, 93),
Status = c("Active", "Inactive", "Active", "Active", "Inactive", "Active", "Inactive", "Active", "Inactive", "Active"),
Income = c(50000, 60000, 75000, 55000, 80000, 90000, 72000, 65000, 82000, 70000)
)
print("Original dataset:")
print(s_data)
Output:
[1] "Original dataset:"
A tibble: 10 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 1 25 Male 85 75 Active 50000
2 2 30 Female 70 82 Inactive 60000
3 3 35 Male 60 88 Active 75000
4 4 28 Female 75 95 Active 55000
5 5 22 Male 90 70 Inactive 80000
6 6 40 Male 80 68 Active 90000
7 7 33 Female 92 80 Inactive 72000
8 8 26 Male 78 85 Active 65000
9 9 38 Female 65 77 Inactive 82000
10 10 29 Male 88 93 Active 70000
slice() Function Examples
1. Select a Single Row
We can select a single row of a dataset by passing the index of that row. Here is an example that selects the 3rd row of the 's_data' dataset.
R
single_row <- slice(s_data, 3)
print(single_row)
Output:
A tibble: 1 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 3 35 Male 60 88 Active 75000
2. Multiple Row Selection
We can select multiple rows of a dataset by passing their indexes inside the c() function. Here is an example that selects the 1st, 5th and 8th rows of the 's_data' dataset, so only those three rows are printed.
R
multiple_rows <- slice(s_data, c(1, 5, 8))
print(multiple_rows)
Output:
A tibble: 3 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 1 25 Male 85 75 Active 50000
2 5 22 Male 90 70 Inactive 80000
3 8 26 Male 78 85 Active 65000
3. Select a Range of Rows
We can select a range of rows by passing an index range written as start:end. Here is an example that selects the 2nd through 6th rows of the 's_data' dataset and prints them.
R
range_sele <- slice(s_data, 2:6)
print(range_sele)
Output:
A tibble: 5 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 2 30 Female 70 82 Inactive 60000
2 3 35 Male 60 88 Active 75000
3 4 28 Female 75 95 Active 55000
4 5 22 Male 90 70 Inactive 80000
5 6 40 Male 80 68 Active 90000
4. Negative Indexing
We can exclude rows from the selection by using negative indexing. This is done by passing the indexes we don't need inside the c() function with a '-' symbol in front of it. slice() then drops those rows and returns the remaining ones.
R
negative_ind <- slice(s_data, -c(3, 7))
print(negative_ind)
Output:
A tibble: 8 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 1 25 Male 85 75 Active 50000
2 2 30 Female 70 82 Inactive 60000
3 4 28 Female 75 95 Active 55000
4 5 22 Male 90 70 Inactive 80000
5 6 40 Male 80 68 Active 90000
6 8 26 Male 78 85 Active 65000
7 9 38 Female 65 77 Inactive 82000
8 10 29 Male 88 93 Active 70000
Here we have excluded the 3rd and 7th rows and printed the remaining eight rows.
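Negative indexing also works with ranges and with the n() helper, which gives the number of rows when it is called inside slice(). The following is a minimal sketch of both; 'demo' is a hypothetical trimmed-down copy of 's_data' used only so the block runs on its own.

```r
library(dplyr)

# Hypothetical two-column copy of s_data so the example is self-contained
demo <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Drop a contiguous range: keep everything except rows 1 to 4
without_first_four <- slice(demo, -(1:4))

# Drop the last row; n() is evaluated inside slice() as the row count
without_last <- slice(demo, -n())

print(without_first_four)
print(without_last)
```

Positive and negative indexes should not be mixed in a single call; slice() expects them to be either all positive or all negative.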
5. Conditional Row Selection
We can use the which() function with slice() to choose rows based on a condition. slice() needs row positions rather than TRUE/FALSE values, so which() converts the condition into the indexes of the rows where it holds. For instance, to choose rows where Age is more than 30:
R
con_sele <- slice(s_data, which(s_data$Age > 30))
print(con_sele)
Output:
A tibble: 4 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 3 35 Male 60 88 Active 75000
2 6 40 Male 80 68 Active 90000
3 7 33 Female 92 80 Inactive 72000
4 9 38 Female 65 77 Inactive 82000
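For comparison, the same rows can usually be obtained with filter(), which takes the logical condition directly. The sketch below assumes a hypothetical two-column copy of 's_data' called 'demo' and checks that both approaches pick the same IDs. It also shows that column names can be used inside slice() without the 's_data$' prefix.

```r
library(dplyr)

# Hypothetical two-column copy of s_data so the example is self-contained
demo <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# slice() works on positions, so which() turns the condition into indexes
by_position <- slice(demo, which(Age > 30))

# filter() accepts the logical condition as it is
by_condition <- filter(demo, Age > 30)

print(by_position$ID)
print(by_condition$ID)
```

When the goal is purely condition-based selection, filter() tends to read more naturally; slice() with which() is mostly useful when the positions themselves are needed.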
6. Select Top N Rows
We can select the top rows by using the slice_head() function. We pass the dataset and number of rows as arguments. Here we are selecting the top 4 (n=4) rows of the dataset.
R
top_n_rows <- slice_head(s_data, n = 4)
print(top_n_rows)
Output:
A tibble: 4 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 1 25 Male 85 75 Active 50000
2 2 30 Female 70 82 Inactive 60000
3 3 35 Male 60 88 Active 75000
4 4 28 Female 75 95 Active 55000
7. Select Bottom N Rows
We can select the bottom rows by using the slice_tail() function. We pass the dataset and number of rows as arguments. Here we are selecting the bottom 3 (n=3) rows of the dataset.
R
bottom_nrows <- slice_tail(s_data, n = 3)
print(bottom_nrows)
Output:
A tibble: 3 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 8 26 Male 78 85 Active 65000
2 9 38 Female 65 77 Inactive 82000
3 10 29 Male 88 93 Active 70000
8. Random Row Selection
We can select rows randomly by using the slice_sample() function, passing the dataset and the number of rows to be selected as arguments. Here we are selecting two random rows from the dataset. Because the rows are chosen at random, the output will differ from run to run.
R
ran_rows <- slice_sample(s_data, n = 2)
print(ran_rows)
Output:
A tibble: 2 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 9 38 Female 65 77 Inactive 82000
2 1 25 Male 85 75 Active 50000
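To make a random selection repeatable, the random seed can be fixed with set.seed() before sampling. The sketch below uses a hypothetical two-column copy of 's_data' called 'demo' and an arbitrary seed value of 123. It also shows the prop argument, which samples a fraction of the rows instead of a fixed count.

```r
library(dplyr)

# Hypothetical two-column copy of s_data so the example is self-contained
demo <- tibble(
  ID     = 1:10,
  Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88)
)

# Resetting the same seed before each draw gives the same rows both times
set.seed(123)  # arbitrary seed chosen for illustration
first_draw <- slice_sample(demo, n = 2)
set.seed(123)
second_draw <- slice_sample(demo, n = 2)

# prop = 0.5 samples half of the 10 rows, i.e. 5 rows
half_sample <- slice_sample(demo, prop = 0.5)

print(first_draw)
print(second_draw)
print(nrow(half_sample))
```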
9. Alternate Row Selection
We can select alternate rows of a dataset using the slice() function and seq() function starting from the 1st index and incrementing by 2 up to number of rows of the dataset.
R
alt_rows <- slice(s_data, seq(1, nrow(s_data), by = 2))
print(alt_rows)
Output:
A tibble: 5 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 1 25 Male 85 75 Active 50000
2 3 35 Male 60 88 Active 75000
3 5 22 Male 90 70 Inactive 80000
4 7 33 Female 92 80 Inactive 72000
5 9 38 Female 65 77 Inactive 82000
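The same pattern extends to other step sizes and starting points. As a small sketch, the block below picks the even-numbered rows and every third row, using n() inside slice() in place of nrow(). 'demo' is a hypothetical two-column copy of 's_data' so the block runs on its own.

```r
library(dplyr)

# Hypothetical two-column copy of s_data so the example is self-contained
demo <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Even-numbered rows: start at 2 and step by 2 up to the last row
even_rows <- slice(demo, seq(2, n(), by = 2))

# Every third row, starting from the first
every_third <- slice(demo, seq(1, n(), by = 3))

print(even_rows)
print(every_third)
```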
Combining slice() with Other Functions
The slice() function can be combined with other dplyr functions. The ones most commonly used together with it are arrange() and filter().
1. arrange() function
In R, arrange() sorts the rows of a data frame by one or more variables. Here we sort the dataset in descending order of Score1 and print the first 3 rows of the sorted dataset.
R
# Named top_scores rather than sort to avoid masking base R's sort() function
top_scores <- s_data %>% arrange(desc(Score1)) %>% slice_head(n = 3)
print(top_scores)
Output:
A tibble: 3 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 7 33 Female 92 80 Inactive 72000
2 5 22 Male 90 70 Inactive 80000
3 10 29 Male 88 93 Active 70000
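dplyr also provides slice_max() and slice_min(), which combine the sorting and slicing steps into one call. The following sketch, using a hypothetical two-column copy of 's_data' called 'demo', compares the two approaches for the three highest Score1 values.

```r
library(dplyr)

# Hypothetical two-column copy of s_data so the example is self-contained
demo <- tibble(
  ID     = 1:10,
  Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88)
)

# Two-step version: sort descending, then take the first 3 rows
top_sorted <- demo %>% arrange(desc(Score1)) %>% slice_head(n = 3)

# One-step version: rows with the 3 largest Score1 values
top_direct <- slice_max(demo, Score1, n = 3)

# Rows with the 2 smallest Score1 values
lowest <- slice_min(demo, Score1, n = 2)

print(top_direct)
print(lowest)
```

One difference worth keeping in mind: by default slice_max() and slice_min() keep tied rows (with_ties = TRUE), so they can return more than n rows when values repeat, whereas arrange() followed by slice_head() always returns exactly n rows. The scores here are all distinct, so both versions agree.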
2. filter() function
In R, the filter() function keeps only the rows that satisfy a condition. Here we filter the dataset to rows where Age is greater than 25 and print the first two rows of the filtered result.
R
age <- s_data %>% filter(Age > 25) %>% slice_head(n = 2)
print(age)
Output:
A tibble: 2 × 7
ID Age Gender Score1 Score2 Status Income
<int><dbl><chr><dbl><dbl><chr><dbl>
1 2 30 Female 70 82 Inactive 60000
2 3 35 Male 60 88 Active 75000
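The slice helpers also respect grouping, so they can be applied once per group after group_by(). As an illustrative sketch, the block below picks the highest-scoring row for each Gender; 'demo' is a hypothetical three-column copy of 's_data' so the block runs on its own.

```r
library(dplyr)

# Hypothetical three-column copy of s_data so the example is self-contained
demo <- tibble(
  ID     = 1:10,
  Gender = c("Male", "Female", "Male", "Female", "Male",
             "Male", "Female", "Male", "Female", "Male"),
  Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88)
)

# Within each Gender group, keep the row with the largest Score1
best_per_gender <- demo %>%
  group_by(Gender) %>%
  slice_max(Score1, n = 1) %>%
  ungroup()  # drop the grouping so later steps treat the data as a whole

print(best_per_gender)
```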