7.5 arrange

arrange orders the rows of a data frame by the values of selected columns, in increasing order by default.
Let’s order rows by increasing mandate duration:

arrange(presidential2, duration_days)
## # A tibble: 11 x 5
##    name       start      end        party      duration_days
##    <chr>      <date>     <date>     <chr>      <drtn>       
##  1 Ford       1974-08-09 1977-01-20 Republican  895 days    
##  2 Kennedy    1961-01-20 1963-11-22 Democratic 1036 days    
##  3 Carter     1977-01-20 1981-01-20 Democratic 1461 days    
##  4 Bush       1989-01-20 1993-01-20 Republican 1461 days    
##  5 Johnson    1963-11-22 1969-01-20 Democratic 1886 days    
##  6 Nixon      1969-01-20 1974-08-09 Republican 2027 days    
##  7 Eisenhower 1953-01-20 1961-01-20 Republican 2922 days    
##  8 Reagan     1981-01-20 1989-01-20 Republican 2922 days    
##  9 Clinton    1993-01-20 2001-01-20 Democratic 2922 days    
## 10 Bush       2001-01-20 2009-01-20 Republican 2922 days    
## 11 Obama      2009-01-20 2017-01-20 Democratic 2922 days
# for decreasing order, wrap the column in desc(): arrange(presidential2, desc(duration_days))
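The decreasing-order variant mentioned in the comment can be tried on a small table. This is only a sketch: `toy` is a hypothetical three-row stand-in for presidential2 (values copied from the output above), and `longest_first` is a name introduced here for illustration.

```r
library(dplyr)

# Hypothetical stand-in for presidential2, reduced to three rows
toy <- tibble(
  name = c("Ford", "Kennedy", "Carter"),
  duration_days = c(895, 1036, 1461)
)

# desc() reverses the sort order of the column it wraps
longest_first <- arrange(toy, desc(duration_days))
longest_first$name
```

The longest mandate (Carter) should now come first and the shortest (Ford) last.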

You can sort by several columns. Each additional column breaks ties in the columns before it:

arrange(presidential2, 
        duration_days, name)
## # A tibble: 11 x 5
##    name       start      end        party      duration_days
##    <chr>      <date>     <date>     <chr>      <drtn>       
##  1 Ford       1974-08-09 1977-01-20 Republican  895 days    
##  2 Kennedy    1961-01-20 1963-11-22 Democratic 1036 days    
##  3 Bush       1989-01-20 1993-01-20 Republican 1461 days    
##  4 Carter     1977-01-20 1981-01-20 Democratic 1461 days    
##  5 Johnson    1963-11-22 1969-01-20 Democratic 1886 days    
##  6 Nixon      1969-01-20 1974-08-09 Republican 2027 days    
##  7 Bush       2001-01-20 2009-01-20 Republican 2922 days    
##  8 Clinton    1993-01-20 2001-01-20 Democratic 2922 days    
##  9 Eisenhower 1953-01-20 1961-01-20 Republican 2922 days    
## 10 Obama      2009-01-20 2017-01-20 Democratic 2922 days    
## 11 Reagan     1981-01-20 1989-01-20 Republican 2922 days
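Sort directions can also be mixed, since desc() applies only to the column it wraps. The sketch below assumes a hypothetical five-row table `toy` (a subset of the values shown above) and sorts by decreasing duration, breaking ties alphabetically by name.

```r
library(dplyr)

# Hypothetical subset of presidential2 that contains ties in duration_days
toy <- tibble(
  name = c("Reagan", "Ford", "Clinton", "Carter", "Bush"),
  duration_days = c(2922, 895, 2922, 1461, 1461)
)

# Longest mandates first; within equal durations, names in increasing order
mixed <- arrange(toy, desc(duration_days), name)
mixed$name
```

Within each tied duration the names stay in alphabetical order, so Clinton precedes Reagan and Bush precedes Carter.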

If the data frame is grouped, arrange ignores the grouping by default. Set .by_group=TRUE to sort first by the grouping variable(s) and then by the selected column(s):

presidential2 %>%
    group_by(party) %>% 
    arrange(duration_days, .by_group=TRUE)
## # A tibble: 11 x 5
## # Groups:   party [2]
##    name       start      end        party      duration_days
##    <chr>      <date>     <date>     <chr>      <drtn>       
##  1 Kennedy    1961-01-20 1963-11-22 Democratic 1036 days    
##  2 Carter     1977-01-20 1981-01-20 Democratic 1461 days    
##  3 Johnson    1963-11-22 1969-01-20 Democratic 1886 days    
##  4 Clinton    1993-01-20 2001-01-20 Democratic 2922 days    
##  5 Obama      2009-01-20 2017-01-20 Democratic 2922 days    
##  6 Ford       1974-08-09 1977-01-20 Republican  895 days    
##  7 Bush       1989-01-20 1993-01-20 Republican 1461 days    
##  8 Nixon      1969-01-20 1974-08-09 Republican 2027 days    
##  9 Eisenhower 1953-01-20 1961-01-20 Republican 2922 days    
## 10 Reagan     1981-01-20 1989-01-20 Republican 2922 days    
## 11 Bush       2001-01-20 2009-01-20 Republican 2922 days
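To see what .by_group changes, the same grouped table can be sorted with and without it. This is a minimal sketch on a hypothetical four-row table `toy` (values taken from the output above); `ignoring_groups` and `within_groups` are illustrative names.

```r
library(dplyr)

# Hypothetical stand-in for presidential2, two rows per party
toy <- tibble(
  name = c("Ford", "Kennedy", "Carter", "Nixon"),
  party = c("Republican", "Democratic", "Democratic", "Republican"),
  duration_days = c(895, 1036, 1461, 2027)
)

grouped <- toy %>% group_by(party)

# Default: the grouping is ignored, rows are sorted by duration_days only
ignoring_groups <- grouped %>% arrange(duration_days)

# .by_group=TRUE: sort by party first, then by duration_days within each party
within_groups <- grouped %>% arrange(duration_days, .by_group=TRUE)

ignoring_groups$name
within_groups$name
```

In the first result the parties are interleaved (Ford, Kennedy, Carter, Nixon); in the second, both Democratic rows come before the Republican ones (Kennedy, Carter, Ford, Nixon).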

HANDS-ON

Go back to the previous exercise: “count the average BMI per species. Add a count of the number of individuals per species.” (on the starwarsBMI data set):

starwarsBMI %>% 
  group_by(species) %>% 
  summarise(average_bmi=mean(BMI, na.rm=TRUE), count_individuals=n())

Now extend this pipeline:
  • Keep only species that have 2 or more individuals.
  • Arrange by decreasing average BMI.

Answer:
starwarsBMI %>% 
  group_by(species) %>% 
  summarise(average_bmi=mean(BMI, na.rm=TRUE), count_individuals=n()) %>%
  filter(count_individuals >= 2) %>%
  arrange(desc(average_bmi))