Gordon K Smyth | 3 Aug 2012 01:19

Variance stabilization of m-values

Use eBayes with trend=TRUE later in the pipeline, then variance 
stabilization may not be needed.

Gordon

> Date: Wed, 1 Aug 2012 15:20:56 +0200
> From: Gustavo Fernández Bayón <gbayon@...>
> To: bioconductor@...
> Subject: [BioC] Variance stabilization of m-values
>
> Hi everybody.
>
> I am working with Illumina 450k methylation data. I am currently 
> cleaning a data set, getting rid of XY probes, etc., and I would like to 
> do a non-specific filtering and preserve only 20% of the probes, those 
> with the higher variability (as seen in Chapter 7 of the Bioconductor 
> Case Studies book).
>
> In the book, they create a meanSdPlot() and proceed as if the variance 
> does not depend on the mean (to a significant degree).
>
> Trying to follow that procedure, I have converted my beta values to 
> M-values, and then called meanSdPlot(). It shows, for my data, that 
> there is a relationship between mean and variance, i.e. the line with 
> the median is not horizontal. Of course, if I create a meanSdPlot with 
> the beta values, the effect is greater, due to their heteroscedasticity.
>
> Question: Is it correct to use a variance stabilization transformation 
> (such as the one in justvsn) on the M-values in order to discard low-variance 
> probes?

Tim Triche, Jr. | 3 Aug 2012 04:16

Re: Variance stabilization of m-values

The mean-variance plot should be far "more" horizontal with M-values than
beta-values; have you plotted it against total intensity?  You end up going
down the rabbit hole eventually due to copy number variation, but plotting
m-value variance against the mean, the line of best fit is nearly flat
across the range of values.  The variance is more U-shaped (as opposed to
the "n" shape with beta values).

You could try an arcsin transform

asin(sqrt(beta))

if your primary goal is to stabilize the variance, though Dr. Smyth's
suggestion will probably be better for sensitivity in the end.

Just a thought.  There are many ways to transform a proportion and they all
have relative strengths and weaknesses in practice.
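
[Editor's sketch] To make the two transforms discussed in this thread concrete, here is a minimal base-R example. It assumes the usual definition of the M-value as the log2 logit of beta; the `beta` vector is made-up illustration data, not taken from the thread.

```r
# Hypothetical beta values for five probes (proportion methylated, strictly in (0, 1)).
beta <- c(0.02, 0.25, 0.50, 0.75, 0.98)

# M-value: logit of beta on the log2 scale (unbounded, symmetric about 0).
m_value <- log2(beta / (1 - beta))

# Arcsine square-root transform suggested above (bounded in [0, pi/2]).
arcsine <- asin(sqrt(beta))

# Side-by-side view of the three scales.
print(round(cbind(beta, m_value, arcsine), 3))
```

Both transforms are monotone in beta, so they preserve the ordering of probes by methylation level; they differ in how strongly they stretch values near 0 and 1.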

On Thu, Aug 2, 2012 at 4:19 PM, Gordon K Smyth <smyth@...> wrote:

> Use eBayes with trend=TRUE later in the pipeline, then variance
> stabilization may not be needed.
>
> Gordon
>
>  Date: Wed, 1 Aug 2012 15:20:56 +0200
>> From: Gustavo Fernández Bayón <gbayon@...>
>> To: bioconductor@...
>> Subject: [BioC] Variance stabilization of m-values
>>
>> Hi everybody.

Tim Triche, Jr. | 3 Aug 2012 04:17

Re: Variance stabilization of m-values

nb.  I should have written:

"the variance of the M-value variance as a function of the mean is more
U-shaped towards the extremes, versus the n shape for betas"

My apologies.

--t

On Thu, Aug 2, 2012 at 7:16 PM, Tim Triche, Jr. <tim.triche@...> wrote:

> The mean-variance plot should be far "more" horizontal with M-values than
> beta-values; have you plotted it against total intensity?  You end up going
> down the rabbit hole eventually due to copy number variation, but plotting
> m-value variance against the mean, the line of best fit is nearly flat
> across the range of values.  The variance is more U-shaped (as opposed to
> the "n" shape with beta values).
>
> You could try an arcsin transform
>
> asin(sqrt(beta))
>
> if your primary goal is to stabilize the variance, though Dr. Smyth's
> suggestion will probably be better for sensitivity in the end.
>
> Just a thought.  There are many ways to transform a proportion and they
> all have relative strengths and weaknesses in practice.
>
>
>

Gustavo Fernández Bayón | 24 Aug 2012 10:06

Re: Variance stabilization of m-values

Hi Tim.  

Sorry for the late reply. (OFFTOPIC: my third child decided to be born :) the day after I asked the question on
the list, so I have been on paternity leave and really had no time to answer emails.)

The arcsin proposal is very interesting. I'll give it a try too, although, as I told Dr. Smyth, I am no longer
sure that the curve is as important as I first thought. I am currently reworking that pipeline, because I have
to remember exactly where I was twenty days ago, and that is sometimes hard :)

Thank you very much for your hints

Regards,
Gus


On Friday, 3 August 2012 at 04:16, Tim Triche, Jr. wrote:

> The mean-variance plot should be far "more" horizontal with M-values than beta-values; have you plotted
> it against total intensity? You end up going down the rabbit hole eventually due to copy number variation,
> but plotting m-value variance against the mean, the line of best fit is nearly flat across the range of
> values. The variance is more U-shaped (as opposed to the "n" shape with beta values).
>
> You could try an arcsin transform
>
> asin(sqrt(beta))
>
> if your primary goal is to stabilize the variance, though Dr. Smyth's suggestion will probably be better

Wolfgang Huber | 4 Aug 2012 00:32

Re: Variance stabilization of m-values

Dear Gustavo

the two issues:
- whether filtering of probes by overall variance is admissible and 
helpful for your analysis
- whether the variance depends on the mean
are unrelated. If I understand your question correctly (and I am not 
sure I do), then you should filter on the overall variance of the M 
values, and need not worry about the mean-variance relationship.

Can you check the paper on this topic ("Independent filtering increases 
detection power for high-throughput experiments", 
http://www.pnas.org/content/107/21/9546.long) and get back if it is 
still unclear?
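
[Editor's sketch] The non-specific filter described above (keep the 20% of probes with the highest overall variance of the M-values) could look roughly like this in base R. The matrix `M` is simulated stand-in data and all object names are hypothetical; with real data, `M` would be the probes-by-samples matrix of M-values.

```r
set.seed(1)

# Simulated stand-in for a probes x samples matrix of M-values.
M <- matrix(rnorm(1000 * 8), nrow = 1000, ncol = 8,
            dimnames = list(paste0("probe", 1:1000), paste0("sample", 1:8)))

# Overall (across-sample) variance of each probe, ignoring sample groups.
probe_var <- apply(M, 1, var)

# Keep probes whose variance lies above the 80th percentile, i.e. the top 20%.
cutoff <- quantile(probe_var, probs = 0.80)
keep <- probe_var > cutoff
M_filtered <- M[keep, , drop = FALSE]

dim(M_filtered)
```

The filter statistic is computed without reference to the sample groups, which is the sense in which the cited paper calls such filtering "independent".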

	Best wishes
	Wolfgang

On Aug/3/12 1:19 AM, Gordon K Smyth wrote:
> Use eBayes with trend=TRUE later in the pipeline, then variance
> stabilization may not be needed.
>
> Gordon
>
>> Date: Wed, 1 Aug 2012 15:20:56 +0200
>> From: Gustavo Fernández Bayón <gbayon@...>
>> To: bioconductor@...
>> Subject: [BioC] Variance stabilization of m-values
>>
>> Hi everybody.

Gustavo Fernández Bayón | 24 Aug 2012 10:12

Re: Variance stabilization of m-values


Hi Wolfgang,

First of all, I apologize for the late reply. As I explained in a previous mail, there were important
reasons that kept me away from e-mail.


On Saturday, 4 August 2012 at 00:32, Wolfgang Huber wrote:

> Dear Gustavo
>  
> the two issues:
> - whether filtering of probes by overall variance is admissible and  
> helpful for your analysis
> - whether the variance depends on the mean
> are unrelated. If I understand your question correctly (and I am not  
> sure I do), then you should filter on the overall variance of the M  
> values, and need not worry about the mean-variance relationship.

I was thinking about that when I noticed that the curve showing that relationship had almost no
influence on a filter of that kind. That is, the probes I want to get rid of, those with low variance,
behave quite homogeneously in the plot. I will have to rethink this anyway, as I currently have to
re-create the pipeline.
>  
> Can you check the paper on this topic ("Independent filtering increases  
> detection power for high-throughput experiments",  
> http://www.pnas.org/content/107/21/9546.long) and get back if it is  
> still unclear?

Gustavo Fernández Bayón | 24 Aug 2012 10:00

Re: Variance stabilization of m-values

Hi Gordon.  

Sorry for the late reply. I'll try your solution and see if it works. The fact is, I may have been too alarmed
by the graph, and the relationship may not be that important.  

Thank you very much.
Gus


On Friday, 3 August 2012 at 01:19, Gordon K Smyth wrote:

> Use eBayes with trend=TRUE later in the pipeline, then variance  
> stabilization may not be needed.
>  
> Gordon
>  
> > Date: Wed, 1 Aug 2012 15:20:56 +0200
> > From: Gustavo Fernández Bayón <gbayon <at> gmail.com>
> > To: bioconductor <at> r-project.org
> > Subject: [BioC] Variance stabilization of m-values
> >  
> > Hi everybody.
> >  
> > I am working with Illumina 450k methylation data. I am currently  
> > cleaning a data set, getting rid of XY probes, etc., and I would like to  
> > do a non-specific filtering and preserve only 20% of the probes, those  
> > with the higher variability (as seen in Chapter 7 of the Bioconductor  
> > Case Studies book).

Brent Pedersen | 24 Aug 2012 19:09

Re: Variance stabilization of m-values

On Thu, Aug 2, 2012 at 5:19 PM, Gordon K Smyth <smyth@...> wrote:
> Use eBayes with trend=TRUE later in the pipeline, then variance
> stabilization may not be needed.
>
> Gordon

Is that recommendation only for beta values?

When using M-values as a matrix, fit$Amean is not set, so eBayes with
trend=TRUE gives an error.

Or should one just manually set fit$Amean = rowMeans(M)?

thanks,
-Brent

_______________________________________________
Bioconductor mailing list
Bioconductor@...
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

Gordon K Smyth | 25 Aug 2012 02:35

Re: Variance stabilization of m-values

Dear Brent,

No, setting Amean <- rowMeans(M) wouldn't have the desired effect.  Amean should 
reflect average intensity, so it would be necessary to compute Amean from 
the original intensities used to compute the M-values or beta values.

Note that I don't have any first-hand experience with methylation arrays, 
so this is just to suggest something that could be tried.
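
[Editor's sketch] One way to try the suggestion above, under stated assumptions: the methylated and unmethylated intensity matrices (`meth`, `unmeth`) are simulated here, and taking Amean as the average log2 total intensity is only one plausible reading of "average intensity", not something confirmed in this thread. `lmFit`, `eBayes` and `topTable` are the standard limma functions.

```r
library(limma)

set.seed(1)
n_probes <- 500
n_samples <- 6

# Simulated methylated / unmethylated intensities (hypothetical stand-ins for real array data).
meth   <- matrix(rexp(n_probes * n_samples, rate = 1 / 2000) + 100, n_probes, n_samples)
unmeth <- matrix(rexp(n_probes * n_samples, rate = 1 / 2000) + 100, n_probes, n_samples)

# M-values computed from the intensities.
M <- log2(meth / unmeth)

# Two groups of three samples each.
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

fit <- lmFit(M, design)

# Assumption: use the average log2 total intensity per probe as the trend covariate,
# rather than rowMeans(M), so that Amean reflects intensity and not methylation level.
fit$Amean <- rowMeans(log2(meth + unmeth))

# Empirical Bayes moderation with an intensity-dependent prior variance.
fit <- eBayes(fit, trend = TRUE)
topTable(fit, coef = 2, number = 5)
```

With `trend = TRUE` the prior variance is allowed to vary with `fit$Amean`, so probes are shrunk towards a value appropriate for their intensity range instead of a single global value.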

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
http://www.statsci.org/smyth

On Fri, 24 Aug 2012, Brent Pedersen wrote:

> On Thu, Aug 2, 2012 at 5:19 PM, Gordon K Smyth <smyth@...> wrote:
>> Use eBayes with trend=TRUE later in the pipeline, then variance
>> stabilization may not be needed.
>>
>> Gordon
>
> Is that recommendation only for beta values?
>
> when using M-values as a matrix, fit$Amean is not set so this gives an

