2 agg()

The function agg() mimics base R’s aggregate(), except that it returns an unnested (flat) data frame when the summary function returns a vector of several statistics.

To demonstrate, let’s compare the structure of the output from the two functions.

2.0.1 aggregate() vs. agg()

### GOAL: Compare the output of aggregate() and agg() when the
###       summary function returns a vector of several statistics.

ms   <- function(x) c(m = mean(x), s = sd(x))
form <- formula(cbind(mpg, disp) ~ am + gear)

A <- aggregate(form, mtcars, ms)
B <- agg(form, mtcars, ms)       

str(A) # Nested results
## 'data.frame':    4 obs. of  4 variables:
##  $ am  : num  0 0 1 1
##  $ gear: num  3 4 4 5
##  $ mpg : num [1:4, 1:2] 16.11 21.05 26.27 21.38 3.37 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr  "m" "s"
##  $ disp: num [1:4, 1:2] 326.3 155.7 106.7 202.5 94.9 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr  "m" "s"
str(B) # Unnested results
## 'data.frame':    4 obs. of  6 variables:
##  $ am    : num  0 0 1 1
##  $ gear  : num  3 4 4 5
##  $ mpg.m : num  16.1 21.1 26.3 21.4
##  $ mpg.s : num  3.37 3.07 5.41 6.66
##  $ disp.m: num  326 156 107 202
##  $ disp.s: num  94.9 14 37.2 115.5

As the output shows, aggregate() nests the results inside the dependent variables: mpg and disp are each stored as a 4 x 2 matrix column, so A has only 4 variables. agg() instead “flattens” the output into one column per variable and statistic, so B has 6 variables. The benefit of flattening is that the user can refer to a specific column directly rather than indexing into the nested matrix. In other words, to refer to the mean MPG vector in our example with aggregate(), you would have to execute A$mpg[, 'm'], whereas with agg() it is simply B$mpg.m. In that sense, agg() can be more convenient to work with than its counterpart.
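To make the idea of flattening concrete, here is a minimal base R sketch that unnests the output of aggregate() by hand. It is an illustration only and is not claimed to be how agg() is implemented; the object name flat is introduced here for the example.

```r
# Summary function returning two named statistics per group.
ms   <- function(x) c(m = mean(x), s = sd(x))
form <- cbind(mpg, disp) ~ am + gear

# aggregate() stores each response variable as a matrix column.
A <- aggregate(form, data = mtcars, FUN = ms)

# Base R idiom (not agg() itself): passing the columns of A to
# data.frame() splits each matrix column into separate columns
# named <response>.<statistic>.
flat <- do.call(data.frame, A)

names(flat)                          # column names after flattening
all.equal(A$mpg[, "m"], flat$mpg.m)  # same values, simpler access
```

After this step the mean MPG vector is available as flat$mpg.m, the same kind of access that B$mpg.m gives with agg().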