2 agg()
The function agg() mimics base R’s aggregate(), except that it returns an unnested data frame when multiple functions are called within a vector (that is, when the summary function returns several values). To demonstrate, let’s compare the structure of the output from the two functions.
2.0.1 aggregate() vs. agg()
### GOAL: Compare the output of aggregate() vs. agg() when
### calling multiple functions within a vector.
ms <- function(x) c(m = mean(x), s = sd(x))
form <- formula(cbind(mpg, disp) ~ am + gear)
A <- aggregate(form, mtcars, ms)
B <- agg(form, mtcars, ms)
str(A) # Nested results
## 'data.frame': 4 obs. of 4 variables:
## $ am : num 0 0 1 1
## $ gear: num 3 4 4 5
## $ mpg : num [1:4, 1:2] 16.11 21.05 26.27 21.38 3.37 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr "m" "s"
## $ disp: num [1:4, 1:2] 326.3 155.7 106.7 202.5 94.9 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr "m" "s"
str(B) # Flattened results
## 'data.frame': 4 obs. of 6 variables:
## $ am : num 0 0 1 1
## $ gear : num 3 4 4 5
## $ mpg.m : num 16.1 21.1 26.3 21.4
## $ mpg.s : num 3.37 3.07 5.41 6.66
## $ disp.m: num 326 156 107 202
## $ disp.s: num 94.9 14 37.2 115.5
As a result, aggregate() nests the output within the dependent variables (each one becomes a matrix column), whereas agg() “flattens” the output. The benefit of flattening is that the user can refer to specific columns directly instead of indexing into the nested matrices. In other words, to refer to the mean MPG vector in our example with aggregate(), you would have to execute A$mpg[, 'm'], whereas with agg() it is simply B$mpg.m. As such, agg() can be more efficient to work with than its counterpart.
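For comparison, the nested result from aggregate() can also be flattened by hand in base R. The following is a minimal sketch of that idiom using `do.call(data.frame, ...)`; it is not necessarily how agg() is implemented, and the name `A_flat` is introduced here purely for illustration.

```r
# Summary function returning two statistics per group
ms <- function(x) c(m = mean(x), s = sd(x))

# aggregate() stores each dependent variable as a matrix column
A <- aggregate(cbind(mpg, disp) ~ am + gear, data = mtcars, FUN = ms)
ncol(A)          # 4 columns: am, gear, mpg (matrix), disp (matrix)
A$mpg[, "m"]     # mean MPG requires indexing into the matrix column

# Base R idiom (an assumption about one possible approach, not agg() itself):
# passing the columns to data.frame() expands each matrix into
# separate columns named <variable>.<statistic>
A_flat <- do.call(data.frame, A)
names(A_flat)    # am, gear, mpg.m, mpg.s, disp.m, disp.s
A_flat$mpg.m     # same values as A$mpg[, "m"]
```

The flattened column names follow the same `variable.statistic` pattern shown in the agg() output above, so `A_flat$mpg.m` can be used the same way as `B$mpg.m`.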