jun zhang | 8 Aug 05:24 2014

How to improve the zipwith's performance

Dear all,

I wrote some clustering code using Data.Clustering.Hierarchical, but it is slow.

I profiled it and changed some of the code, but I don't understand why zipWith takes so much time, even after switching from lists to vectors.

My code is below. Any advice would be appreciated.
======================
main = do
    ....
    let cluster = dendrogram  SingleLinkage vectorList getVectorDistance  
    ....

getExp2 v1 v2 = d*d
    where
        d = v1 - v2

getExp v1 v2
    | v1 == v2 = 0
    | otherwise = getExp2 v1 v2

tfoldl  d = DV.foldl1' (+) d

changeDataType:: Int -> Double
changeDataType d = fromIntegral d

getVectorDistance::(a,DV.Vector Int)->(a, DV.Vector Int )->Double
getVectorDistance v1 v2 = fromIntegral $ tfoldl dat
    where
        l1 = snd v1
        l2 = snd v2
        dat = DV.zipWith getExp l1 l2

=======================================

Built with: ghc -prof -fprof-auto -rtsopts -O2 log_cluster.hs

Run with: log_cluster.exe +RTS -p

The profiling result is:

 log_cluster.exe +RTS -p -RTS

    total time  =        8.43 secs   (8433 ticks @ 1000 us, 1 processor)
    total alloc = 1,614,252,224 bytes  (excludes profiling overheads)

COST CENTRE            MODULE  %time %alloc

getVectorDistance.dat  Main     49.4   37.8
tfoldl                 Main      5.7    0.0
getExp                 Main      4.5    0.0
getExp2                Main      0.5    1.5
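
One caveat when reading a profile like this: -fprof-auto puts a cost centre on every binding that is not marked INLINE, including the local dat, and cost centres can keep GHC from inlining and fusing zipWith with the fold. Part of the time attributed to getVectorDistance.dat may therefore come from the profiling build itself. A minimal sketch, assuming the program is built with -prof but without -fprof-auto, that keeps one hand-placed cost centre instead; the function name distance and the "distance" label are illustrative and not taken from the original code:

```haskell
import qualified Data.Vector.Primitive as DV

-- Hypothetical replacement for getVectorDistance, written as a single
-- expression so that zipWith and sum have a chance to fuse under -O2.
-- The SCC pragma is the only cost centre; it is ignored without -prof.
distance :: DV.Vector Int -> DV.Vector Int -> Double
distance l1 l2 =
    {-# SCC "distance" #-}
    fromIntegral (DV.sum (DV.zipWith sqDiff l1 l2))
  where
    -- Squared difference of two components.
    sqDiff a b = (a - b) * (a - b)
```

Comparing the total time of such a build against the -fprof-auto build would indicate how much of the reported cost is instrumentation rather than zipWith itself.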
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Felipe Lessa | 14 Aug 22:25 2014

Re: How to improve the zipwith's performance

Hey, Jun Zhang!

It would be nice if you provided a full runnable example so that someone
may tinker with it testing different approaches.

As it stands, I don't have any suggestions of how you could extract more
performance.

Cheers!

-- 
Felipe.

julian | 15 Aug 04:07 2014

Re: How to improve the zipwith's performance

Dear Felipe,

The runnable code is below:

import Data.Clustering.Hierarchical
import qualified Data.Vector.Primitive as DV
import System.Random
import Control.Monad

main = do
    vectorList <- genTestdata
    let cluster = dendrogram SingleLinkage vectorList getVectorDistance  
    putStrLn $ show cluster

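-- Keep values below 5; replace everything else with 0.
genZero :: Int -> Int
genZero x
    | x < 5     = x
    | otherwise = 0

-- A random vector of 20 entries, each drawn from 1..30 and passed through genZero.
genVector :: IO (DV.Vector Int)
genVector = do
    listRandom <- replicateM 20 (randomRIO (1, 30))
    return $ DV.fromList $ map genZero listRandom

-- 1000 test vectors, each paired with its index as a label.
genTestdata :: IO [(Int, DV.Vector Int)]
genTestdata = mapM (\x -> liftM (\y -> (x, y)) genVector) [1..1000]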

getExp2 v1 v2 = d*d
    where
        d = v1 - v2

getExp v1 v2
    | v1 == v2 = 0
    | otherwise = getExp2 v1 v2

tfoldl  d = DV.foldl1' (+) d

changeDataType:: Int -> Double
changeDataType d = fromIntegral d

getVectorDistance::(a,DV.Vector Int)->(a, DV.Vector Int )->Double
getVectorDistance v1 v2 = fromIntegral $ tfoldl dat
    where
        l1 = snd v1
        l2 = snd v2
        dat = DV.zipWith getExp l1 l2
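
A possible rewrite of the distance function, offered as a sketch rather than a measured fix: it walks both vectors by index and accumulates the squared differences directly, so no intermediate vector is allocated whether or not fusion fires. The names sqDist and getVectorDistance' are hypothetical. Two behavioural differences from the code above should be checked before adopting it: empty vectors yield 0 instead of the error that foldl1' raises, and the v1 == v2 guard is dropped because d*d is already 0 in that case.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.Vector.Primitive as DV

-- Squared Euclidean distance in a single pass with a strict accumulator.
-- Like zipWith, it stops at the end of the shorter vector.
sqDist :: DV.Vector Int -> DV.Vector Int -> Int
sqDist xs ys = go 0 0
  where
    n = min (DV.length xs) (DV.length ys)
    go !acc i
      | i >= n    = acc
      | otherwise =
          -- unsafeIndex is safe here because 0 <= i < n for both vectors.
          let d = DV.unsafeIndex xs i - DV.unsafeIndex ys i
          in go (acc + d * d) (i + 1)

-- Same type as the original getVectorDistance, so it can be passed to
-- dendrogram unchanged.
getVectorDistance' :: (a, DV.Vector Int) -> (a, DV.Vector Int) -> Double
getVectorDistance' (_, l1) (_, l2) = fromIntegral (sqDist l1 l2)
```

It would be used in main as dendrogram SingleLinkage vectorList getVectorDistance'. Whether it is actually faster than the fused zipWith version needs to be confirmed with a non-profiled -O2 build.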

Sent from my iPhone

> On 15 Aug 2014, at 4:25 AM, Felipe Lessa <felipe.lessa <at> gmail.com> wrote:
> 
> Hey, Jun Zhang!
> 
> It would be nice if you provided a full runnable example so that someone
> may tinker with it testing different approaches.
> 
> As it stands, I don't have any suggestions of how you could extract more
> performance.
> 
> Cheers!
> 
> -- 
> Felipe.
> 