arrange(data, col_sort1)Sort a table by row
You want to sort the rows in your data frame according to the values of one or more columns.
Step 1 - Pass the data frame to dplyr::arrange(). arrange() will return a modified copy of this data frame.
Step 2 - Specify a column to sort the data frame by. Do not surround the name with quotation marks.
Step 3 (Optional) - Specify additional column(s) to sort by. If you sort by more than one column, arrange() will each subsequent column to breaks ties in the preceding columns.
arrange(data, col_sort1, col_sort2)Sort values in descending order
By default, arrange() sorts values in ascending order—numerically, alphabetically, logically, or chronologically. To sort a column in descending order, wrap it in dplyr::desc(), e.g.
arrange(data, desc(col_sort1), col_sort2)R users often pass a data frame to arrange() with %>%, sometimes as part of a larger sequence of commands. For example,
data %>%
mutate(new_col = x + 5) %>%
arrange(new_col)Example
covid_jan2021 describes COVID-19 tests, hospitalizations, and deaths collected in the United States in January 2021. The data has been aggregated by state, and we want to find the state with the fewest COVID-19 deaths.
We begin by loading the dplyr package, which contains arrange(). We then pass covid_jan2021 to arrange() and add the column of interest, total_deaths, as an argument in arrange().
library(dplyr)
covid_jan2021 %>%
arrange(total_deaths)# A tibble: 50 × 5
state total_test total_cases total_hospitalizations total_deaths
<fct> <dbl> <dbl> <dbl> <dbl>
1 VT 199646 4553 0 38
2 AK 224575 7137 182 56
3 HI 175137 4686 299 122
4 ND 101691 5135 228 153
5 WY 133201 7503 182 158
6 DE 252679 20615 0 164
7 ME 280838 15123 325 243
8 MT 162660 12394 677 272
9 SD 56237 9086 618 290
10 NH 229228 21667 135 298
# ℹ 40 more rows
We see here that Vermont was the state with the lowest number of total_deaths in January 2021, with 38 deaths.
Sort a table to bring missing values to the top
We can use arrange() to sort on functions of variables. To bring missing values to the top of a table, sort on the output of calling is.na() on a column:
data %>%
arrange(desc(is.na(col_1)))arrange() and SAS
arrange() is the equivalent of SAS’s SORT procedure with the BY statement or PROC SQL with ORDER BY.
In SAS:
PROC SORT DATA = data OUT = data_sort;
BY var_1 DESCENDING var_2;
RUN;or
PROC SQL;
CREATE TABLE data_sort AS
SELECT *
FROM DATA
ORDER BY var_1 DESC, var_2;
QUIT;In R:
data_sort <-
data %>%
arrange(desc(var_1), var_2)