4 Sep 13:53 2013

## Efficient matrix multiply using accelerate

Morten Olsen Lysgaard <morten <at> lysgaard.no>

2013-09-04 11:53:31 GMT

I've been trying to get some speed out of the accelerate library today. What I want to implement is something as simple as a matrix multiply, and I'd like it to be fast and memory efficient.

Given the equation C = AB, where A is n x r, B is r x m and C is n x m, it seems reasonable to allocate three arrays on the GPU with n*r, r*m and n*m elements respectively. Does anyone know how to achieve this with accelerate?

My first thought was to use the generate function to create the new array C, but I didn't manage to wrap my head around all the fancy type features that pop up when you want to return an array C whose dimensions depend on the dimensions of its inputs, A and B.

I've searched around a bit and found this example implementation [1], but it is just as slow as a simple sequential algorithm in C.

I would be very thankful for any advice on working with accelerate!

Here's a snippet of what I have tried to make. There are several errors in there. Maybe I'm approaching the problem from the wrong angle.

    matMul' arr brr = let dotProd shp = ...
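
For reference, here is a minimal sketch of one way to get a result whose shape depends on the shapes of both inputs. It uses the replicate/zipWith/fold formulation often shown for accelerate, not generate, and it is not necessarily the implementation linked as [1]. The sketch assumes a recent accelerate release, fixes the element type to Float for simplicity, and runs on the reference interpreter backend instead of a GPU backend; the test matrices in main are made up for illustration. Note that this formulation is written in terms of an intermediate n x m x r array, so it does not by itself guarantee the three-array memory footprint described above.

```haskell
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, All(..), Array, DIM2, Exp, Z(..), (:.)(..))
import Data.Array.Accelerate.Interpreter (run)  -- reference backend, assumed here for testing

-- C = AB, where A is n x r, B is r x m and C is n x m.
matMul :: Acc (Array DIM2 Float) -> Acc (Array DIM2 Float) -> Acc (Array DIM2 Float)
matMul arr brr = A.fold (+) 0 (A.zipWith (*) arrRepl brrRepl)
  where
    -- Unpack the embedded shapes to get the extents n and m as Exp Int.
    Z :. rowsA :. _     = A.unlift (A.shape arr) :: Z :. Exp Int :. Exp Int
    Z :. _     :. colsB = A.unlift (A.shape brr) :: Z :. Exp Int :. Exp Int

    -- Both operands are expanded to shape n x m x r:
    --   arrRepl (i, j, k) = A (i, k)
    --   brrRepl (i, j, k) = B (k, j)
    arrRepl = A.replicate (A.lift (Z :. All   :. colsB :. All)) arr
    brrRepl = A.replicate (A.lift (Z :. rowsA :. All   :. All)) (A.transpose brr)
    -- The fold then sums over the innermost dimension k, leaving n x m.

main :: IO ()
main = do
  let a = A.fromList (Z :. 2 :. 2) [1, 2, 3, 4] :: Array DIM2 Float
      b = A.fromList (Z :. 2 :. 2) [5, 6, 7, 8] :: Array DIM2 Float
  -- prints [19.0,22.0,43.0,50.0]
  print (A.toList (run (matMul (A.use a) (A.use b))))
```

The result extents never have to be written in the type signature: they are read from the inputs at run time via shape and unlift, which is the part that tends to be confusing when first writing shape-dependent accelerate code.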