Brian Campbell | 7 Oct 19:18

Re: Clustering large data


I've recently been doing some exploratory data analysis that also involves cluster analysis, albeit on a
much smaller dataset.  There are quite a few packages (e.g. ecodist, vegan, pvclust) that include
functions for cluster analysis, but have you, or anyone else on here, looked at alternative
clustering methods with bootstrap permutation tests of the nodes?  I've done this with pvclust, but I don't
recall the function including an argument for method="average".
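
To make the idea concrete, here is a minimal sketch in base R of naive bootstrap support for the nodes of an average-linkage tree. It is not pvclust's algorithm (no multiscale resampling, no AU p-values); the data matrix x, the helper names node_members and cluster_fun, and the choice of 200 replicates are made up for illustration.

```r
# Naive bootstrap support for the nodes of an average-linkage dendrogram.
# Base R only (stats). 'x' is simulated illustration data: rows are
# observations, columns are the items being clustered.
set.seed(1)
x <- matrix(rnorm(200), nrow = 20, ncol = 10,
            dimnames = list(NULL, paste0("site", 1:10)))

# Hypothetical helper: return the member labels of every internal node
# of an hclust tree, each node encoded as one comma-separated string.
node_members <- function(hc) {
  n <- nrow(hc$merge)
  members <- vector("list", n)
  for (i in seq_len(n)) {
    parts <- lapply(hc$merge[i, ], function(j) {
      # Negative entries are single items, positive entries earlier merges.
      if (j < 0) hc$labels[-j] else members[[j]]
    })
    members[[i]] <- sort(unlist(parts))
  }
  vapply(members, paste, character(1), collapse = ",")
}

# Hypothetical helper: cluster the columns with UPGMA (method = "average").
cluster_fun <- function(m) hclust(dist(t(m)), method = "average")

orig  <- node_members(cluster_fun(x))
nboot <- 200
hits  <- numeric(length(orig))

for (b in seq_len(nboot)) {
  rows <- sample(nrow(x), replace = TRUE)   # resample observations
  boot <- node_members(cluster_fun(x[rows, , drop = FALSE]))
  hits <- hits + (orig %in% boot)           # did each original node recur?
}

# Proportion of replicates in which each original node reappeared.
support <- hits / nboot
names(support) <- orig
print(round(support, 2))
```

The last node is the root, which contains every item and so always has support 1; the interesting values are the ones for the smaller groups. Any other linkage accepted by hclust can be swapped in through the method argument.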

Brian

> To: tyler.smith@...
> From: Farrar.David@...
> Date: Tue, 7 Oct 2008 09:56:15 -0400
> CC: r-sig-ecology-bounces@...; r-sig-ecology@...
> Subject: Re: [R-sig-eco] Clustering large data
> 
> Thierry, 
> 
>  Search of CRAN with "sparse clustering" yielded cluster.dist {cba}, 
> defined as "Clustering a Sparse Symmetric Distance Matrix".  There were 
> also sparse PCA packages and sparse matrix classes.  I have no experience 
> with these procedures. 
> 
> As additional background, you might like to say what kind of clustering 
> you want to do and whether some particular similarity/distance will be 
> involved. 
> Does your cluster analysis program take a data frame as input? 
> 
> However, it sounds like you are having problems with preliminary data 
> processing, and may not yet know whether some cluster analysis procedure 
> or other would choke on your matrix, once it is computed. 

