25 Dec 17:34 2012

## aggregate / collapse big data frame efficiently

Martin Batholdy <batholdy <at> googlemail.com>

2012-12-25 16:34:42 GMT


Hi,

I need to aggregate the rows of a data.frame by computing the mean over all rows that share the same level of one factor variable. Here is the sample code:

    x <- data.frame(rep(letters,2), rnorm(52), rnorm(52), rnorm(52))
    aggregate(x, list(x[,1]), mean)

My problem is that the actual data set is much bigger (120 rows and approximately 100,000 columns), and it takes a very long time (at some point I just stopped it).

Is there anything that can be done to make the aggregate routine more efficient? Or is there a different approach that would work faster?

Thanks for any suggestions!
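For reference, one approach that is often much faster for data of this shape (few rows, very many numeric columns) is to work on a matrix and use base R's `rowsum()`, then divide by the group sizes to get means. The sketch below is only an illustration under assumptions: the data, the grouping factor `g`, and the column count `n_cols` are made up to mirror the sizes described, and for a data frame like `x` the numeric part would first need converting, for example with `as.matrix(x[, -1])`.

```r
# Hypothetical data mirroring the described shape: 120 rows, many numeric columns.
set.seed(1)
n_cols <- 1000                      # assumed; raise towards 100,000 as needed
g <- factor(rep(letters[1:24], 5))  # assumed grouping factor: 24 levels, 5 rows each
m <- matrix(rnorm(120 * n_cols), nrow = 120)

# rowsum() computes the per-group column sums in compiled code.
sums <- rowsum(m, g)

# Group sizes, ordered to match the rows of 'sums'.
counts <- as.vector(table(g)[rownames(sums)])

# Dividing a matrix by a vector of length nrow(sums) recycles down the
# columns, so every row is divided by its own group size.
means <- sums / counts
```

The result `means` has one row per factor level and one column per original column. Whether this is fast enough for 100,000 columns depends on available memory, so it is worth timing on a subset first.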