David Freedman | 18 Jul 2012 00:34

Re: weighted mean by week

The plyr package is very helpful for this:

library(plyr)
# One output row per week; each m* is the weighted mean of one variable.
ddply(x, .(myweek), summarize,
      m1 = weighted.mean(var1, myweight),
      m2 = weighted.mean(var2, myweight))

--
View this message in context: http://r.789695.n4.nabble.com/weighted-mean-by-week-tp4636814p4636816.html
Sent from the R help mailing list archive at Nabble.com.

Dimitri Liakhovitski | 18 Jul 2012 03:00

Re: weighted mean by week

Thanks a lot, David.
Indeed, it's much shorter.
Unfortunately, in my real task I have dozens and dozens of variables
like var1 and var2, so manually specifying each one, as in
"m1=weighted.mean(var1,myweight)", would take a lot of code and a very
long time.
Dimitri

> ______________________________________________
> R-help <at> r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Dimitri Liakhovitski
marketfusionanalytics.com


David Freedman | 18 Jul 2012 03:22

Re: weighted mean by week

If there are many variables, I'd then suggest the data.table package:

library(data.table)
dt <- data.table(x)
# Weighted mean of every non-grouping column, within each group/week.
dt[, lapply(.SD, function(v) weighted.mean(v, myweight)),
   keyby = c('group', 'myweek')]

'.SD' (short for "Subset of Data") stands for all the columns in the data
table except the grouping variables.  Note that this includes myweight
itself, so the result also contains a weighted mean of the weights, which
you probably don't need.  There's an .SDcols argument that takes the names
of the variables of interest if you want to limit the dozens of variables to
only some of them.  Or, in the data.table(x) statement, you could limit the
created data table to only the variables you're interested in.
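
To illustrate the .SDcols option, here is a minimal sketch.  The thread never
shows x, so the data frame below is made up purely for illustration, and the
name 'vars' is likewise an assumption rather than anything from the original
question:

```r
library(data.table)

# Hypothetical stand-in for the poster's data (column names follow the thread).
x <- data.frame(
  group    = c("a", "a", "b", "b"),
  myweek   = c(1, 1, 2, 2),
  myweight = c(1, 3, 2, 2),
  var1     = c(10, 20, 30, 50),
  var2     = c(1, 2, 3, 4)
)
dt <- data.table(x)

# Names of the columns to average (illustrative; list your own variables here).
vars <- c("var1", "var2")

# .SD now holds only the columns in 'vars'; myweight is still visible in j,
# so it can serve as the weight without being averaged itself.
res <- dt[, lapply(.SD, function(v) weighted.mean(v, myweight)),
          keyby = "myweek", .SDcols = vars]
print(res)
```

With dozens of variables, 'vars' could presumably be built from names(x)
rather than typed out, for example by dropping the grouping and weight columns
with setdiff().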

As an added benefit, the data.table approach is very fast, particularly
when there are many grouping categories.


Dimitri Liakhovitski | 18 Jul 2012 03:39

Re: weighted mean by week

David, many thanks.
Did something get omitted from your line?

ddply(x ,.(myweek), summarize, m1=weighted.mean(var1,myweight),
m2=weighted.mean(var2,myweight))

Because it just reproduces x - in a somewhat different order...

Thank you!
Dimitri


David Freedman | 18 Jul 2012 04:49

Re: weighted mean by week

Honestly, I wasn't sure what you wanted to do with 'group'.  Here it is with
the 'group' variable deleted:

library(data.table)
dt <- data.table(x[, -1])  # drop the first column, assumed to be 'group'
dt[, lapply(.SD, function(v) weighted.mean(v, myweight)), keyby = 'myweek']
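
As a cross-check, the same per-week weighted means can be sketched in base R
with split() and lapply().  This is an alternative to the data.table call, not
something used in the thread; the data frame and the 'vars' vector below are
invented for illustration:

```r
# Hypothetical stand-in for the poster's data, with 'group' already dropped.
x <- data.frame(
  myweek   = c(1, 1, 2, 2),
  myweight = c(1, 3, 2, 2),
  var1     = c(10, 20, 30, 50),
  var2     = c(1, 2, 3, 4)
)

# Columns to average (illustrative names).
vars <- c("var1", "var2")

# Split the rows by week, then compute each column's weighted mean per piece.
by_week <- split(x, x$myweek)
res <- do.call(rbind, lapply(by_week, function(d) {
  data.frame(myweek = d$myweek[1],
             lapply(d[vars], weighted.mean, w = d$myweight))
}))
print(res)
```

On large data this will likely be slower than data.table, but it needs no
extra packages and makes it easy to verify a few weeks by hand.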


Dimitri Liakhovitski | 18 Jul 2012 19:07

Re: weighted mean by week

David, thanks a lot!
I tried x[-1] myself but forgot to delete 'group' from the keyby
statement - this explains why it did not work for me.
This is amazing - just 2 lines instead of my many, many lines.
Great learning!
Dimitri

