4 do.bind()

When lapply() is used to transform subsets of data, it is common to call rbind() or cbind() within do.call() to fuse the transformed partitions. Although this fits on one line in the form do.call(*bind, lapply(x, f)), readability and the intended concision decrease as the anonymous function grows more complex. Even if the lapply() result is stored in an object beforehand, the intention of do.call() is not clear until *bind is stated. To minimize these issues, do.bind()[7] wraps this process and clarifies its purpose: bind the results of the given function.
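As an illustration of the base R idiom described above, here is a minimal sketch using only base functions and the built-in mtcars data. The names parts and summary_rows are chosen for this example and do not come from the package.

```r
# Split the data, transform each piece, then bind the pieces row-wise.
parts <- split(mtcars, mtcars$cyl)

summary_rows <- do.call(rbind, lapply(parts, function(d) {
  # Each piece is reduced to a named numeric vector.
  c(n = nrow(d), mean_mpg = mean(d$mpg))
}))

summary_rows
```

Note that rbind only appears at the start of the call while the work happens in the anonymous function at the end, which is the readability issue the paragraph refers to.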

Similar results can be obtained with map_dfr()/map_dfc() from purrr, but those functions always return a data frame, never a matrix. Additionally, row-wise and column-wise binding require separate functions there, whereas do.bind() selects between them through a single argument.

This function takes three main parameters: f, x, and m, respectively the function, the collection (e.g. a data frame), and the margin (rbind/cbind designation). If m = 1 (the default), the results are combined row-wise; if m = 2, column-wise. A fourth parameter, ..., is passed to do.call(). The output is a matrix or a data frame, depending on the inputs.
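To make the parameter description concrete, the following is a rough sketch of how such a wrapper could be written in base R. It is not the package's actual source: do_bind_sketch is a hypothetical stand-in, and the error message and the use of switch() are assumptions made for this example.

```r
# Hypothetical stand-in for do.bind(); the real function may differ.
do_bind_sketch <- function(f, x, m = 1, ...) {
  # Pick the binding function from the margin: 1 = rows, 2 = columns.
  binder <- switch(as.character(m),
                   "1" = rbind,
                   "2" = cbind,
                   stop("m must be 1 (row-wise) or 2 (column-wise)"))
  # Apply f to each element of x, bind the pieces, and forward ... to
  # do.call() as the text describes.
  do.call(binder, lapply(x, f), ...)
}

parts <- split(mtcars, mtcars$gear)

# f returns a named numeric vector, so the bound result is a matrix.
as_matrix <- do_bind_sketch(function(s) colMeans(s[, c("mpg", "wt")]), parts, 1)

# f returns a one-row data frame, so the bound result is a data frame.
as_df <- do_bind_sketch(function(s) data.frame(n = nrow(s), mpg = mean(s$mpg)), parts, 1)
```

In this sketch the class of the result follows from what f returns, which is one way to read the statement that the output depends on the inputs.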

This function can be useful for storing coefficients from multiple models into a single matrix.

4.0.1 Coefficient Matrix

# GOAL: Create a matrix of coefficients stemming from 3 models.

## Split mtcars by gear
split1  <- split(mtcars, mtcars$gear) 

## Create a function that fits a model to each subset and extracts the coefficients.
adhoc1  <- function(s) {
  coef(lm(mpg ~ disp + wt + am, data = s))
}

## Execute the ad-hoc function for each subset.
output1 <- do.bind(adhoc1, split1, 1) # == do.call(rbind, lapply(split1, adhoc1)).
output2 <- do.bind(adhoc1, split1, 2) # == do.call(cbind, lapply(split1, adhoc1)).

## Print the outputs. The am coefficient is NA for gears 3 and 5
## because am is constant within those subsets.
output1
##   (Intercept)         disp        wt        am
## 3    27.99461 -0.007982643 -2.384834        NA
## 4    46.68250 -0.097327135 -3.171284 -2.817164
## 5    41.77904 -0.006730729 -7.230952        NA
output2
##                        3           4            5
## (Intercept) 27.994609509 46.68249578 41.779042017
## disp        -0.007982643 -0.09732714 -0.006730729
## wt          -2.384834379 -3.17128412 -7.230951906
## am                    NA -2.81716389           NA

The outputs above fit well into kable() from the knitr package:

knitr::kable(output1)
|   | (Intercept)|       disp|        wt|        am|
|:--|-----------:|----------:|---------:|---------:|
|3  |    27.99461| -0.0079826| -2.384834|        NA|
|4  |    46.68250| -0.0973271| -3.171284| -2.817164|
|5  |    41.77904| -0.0067307| -7.230952|        NA|

knitr::kable(output2)
|            |          3|          4|          5|
|:-----------|----------:|----------:|----------:|
|(Intercept) | 27.9946095| 46.6824958| 41.7790420|
|disp        | -0.0079826| -0.0973271| -0.0067307|
|wt          | -2.3848344| -3.1712841| -7.2309519|
|am          |         NA| -2.8171639|         NA|

7. The naming of do.bind() stems from do.call().