9 Apr 2013 16:46

## Automated Differentiation of Matrices (hmatrix)

```Hi Cafe,

Suppose I want to find the grad of a function then it's easy I just

import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

data MyMatrix a = MyMatrix (a, a)
deriving (Show, Functor, Foldable, Traversable)

f :: Floating a => MyMatrix a -> a
f (MyMatrix (x, y)) = exp \$ negate \$ (x^2 + y^2) / 2.0

main :: IO ()
main = do
putStrLn \$ show \$ f \$ MyMatrix (0.0, 0.0)
putStrLn \$ show \$ grad f \$ MyMatrix (0.0, 0.0)

But now suppose I am doing some matrix calculations
http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
the grad of a function of a matrix:

import Numeric.LinearAlgebra
import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

g :: (Element a, Floating a) => Matrix a -> a
```

10 Apr 2013 00:03

### Re: Automated Differentiation of Matrices (hmatrix)

hmatrix and ad don't (currently) mix.

The problem is that hmatrix uses a packed structure that can't hold any of the AD mode variants we have as an Element. =(

I've been working with Alex Lang to explore in ad 4.0 ways that we can support monomorphic AD modes and still present largely the same API. We've seen a number of performance gains off of this refactoring already, but it doesn't go far enough to address what you need.

A goal a bit farther out is to support AD on vector/matrix operations, but it is a much bigger refactoring than the one currently in the pipeline. =/

To support automatic differentiation on vector-based operations in a form that works with standard BLAS-like storage like the packed matrix rep used in hmatrix we need to convert from a 'matrix of AD variables' to an 'AD mode over of matrices'. This is similar to the difference between a matrix of complex numbers and a real matrix plus an imaginary matrix.

This is a long term goal, but not one you're likely to see support for out of 'ad' in the short term.

I can't build AD on hmatrix itself due in part to licensing restrictions and differing underlying storage requirements, so there are a lot of little issues in making that latter vision a reality.

-Edward

On Tue, Apr 9, 2013 at 10:46 AM, Dominic Steinitz wrote:
Hi Cafe,

Suppose I want to find the grad of a function then it's easy I just

import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

data MyMatrix a = MyMatrix (a, a)
deriving (Show, Functor, Foldable, Traversable)

f :: Floating a => MyMatrix a -> a
f (MyMatrix (x, y)) = exp \$ negate \$ (x^2 + y^2) / 2.0

main :: IO ()
main = do
putStrLn \$ show \$ f \$ MyMatrix (0.0, 0.0)
putStrLn \$ show \$ grad f \$ MyMatrix (0.0, 0.0)

But now suppose I am doing some matrix calculations
http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
the grad of a function of a matrix:

import Numeric.LinearAlgebra
import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

g :: (Element a, Floating a) => Matrix a -> a
g m = exp \$ negate \$ (x^2 + y^2) / 2.0
where r = (toLists m)!!0
x = r!!0
y = r!!1

main :: IO ()
main = do
putStrLn \$ show \$ g \$ (1 >< 2) ([0.0, 0.0] :: [Double])
putStrLn \$ show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])

Then I am in trouble:

No instance for (Traversable Matrix) arising from a use of `grad'
Possible fix: add an instance declaration for (Traversable Matrix)
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'
In the second argument of `(\$)', namely
`show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

Could not deduce (Element
arising from a use of `g'
bound by a type expected by the context:
Possible fix:
In the first argument of `grad', namely `g'
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

What are my options here? Clearly I can convert my matrix into a list
(which is traversable), find the grad and convert it back into a
matrix but given I am doing numerical calculations and speed is an
important factor, this seems undesirable.

I think I would have the same problem with:

although I haven'¯t checked.

Thanks, Dominic.

_______________________________________________

```_______________________________________________
```
10 Apr 2013 13:30

### Re: Automated Differentiation of Matrices (hmatrix)

Hi Edward,

Thanks for the response. For now I don't need the performance for now but it's good to know these developments are in the pipeline. I'm not wedded to hmatrix. I think I could use repa or yarr just as easily; I just haven't investigated.

Dominic.

On 9 Apr 2013, at 23:03, Edward Kmett <ekmett <at> gmail.com> wrote:

hmatrix and ad don't (currently) mix.

The problem is that hmatrix uses a packed structure that can't hold any of the AD mode variants we have as an Element. =(

I've been working with Alex Lang to explore in ad 4.0 ways that we can support monomorphic AD modes and still present largely the same API. We've seen a number of performance gains off of this refactoring already, but it doesn't go far enough to address what you need.

A goal a bit farther out is to support AD on vector/matrix operations, but it is a much bigger refactoring than the one currently in the pipeline. =/

To support automatic differentiation on vector-based operations in a form that works with standard BLAS-like storage like the packed matrix rep used in hmatrix we need to convert from a 'matrix of AD variables' to an 'AD mode over of matrices'. This is similar to the difference between a matrix of complex numbers and a real matrix plus an imaginary matrix.

This is a long term goal, but not one you're likely to see support for out of 'ad' in the short term.

I can't build AD on hmatrix itself due in part to licensing restrictions and differing underlying storage requirements, so there are a lot of little issues in making that latter vision a reality.

-Edward

On Tue, Apr 9, 2013 at 10:46 AM, Dominic Steinitz wrote:
Hi Cafe,

Suppose I want to find the grad of a function then it's easy I just

import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

data MyMatrix a = MyMatrix (a, a)
deriving (Show, Functor, Foldable, Traversable)

f :: Floating a => MyMatrix a -> a
f (MyMatrix (x, y)) = exp \$ negate \$ (x^2 + y^2) / 2.0

main :: IO ()
main = do
putStrLn \$ show \$ f \$ MyMatrix (0.0, 0.0)
putStrLn \$ show \$ grad f \$ MyMatrix (0.0, 0.0)

But now suppose I am doing some matrix calculations
http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
the grad of a function of a matrix:

import Numeric.LinearAlgebra
import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

g :: (Element a, Floating a) => Matrix a -> a
g m = exp \$ negate \$ (x^2 + y^2) / 2.0
where r = (toLists m)!!0
x = r!!0
y = r!!1

main :: IO ()
main = do
putStrLn \$ show \$ g \$ (1 >< 2) ([0.0, 0.0] :: [Double])
putStrLn \$ show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])

Then I am in trouble:

No instance for (Traversable Matrix) arising from a use of `grad'
Possible fix: add an instance declaration for (Traversable Matrix)
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'
In the second argument of `(\$)', namely
`show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

Could not deduce (Element
arising from a use of `g'
bound by a type expected by the context:
Possible fix:
In the first argument of `grad', namely `g'
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

What are my options here? Clearly I can convert my matrix into a list
(which is traversable), find the grad and convert it back into a
matrix but given I am doing numerical calculations and speed is an
important factor, this seems undesirable.

I think I would have the same problem with:

although I haven'¯t checked.

Thanks, Dominic.

_______________________________________________

```_______________________________________________
```
10 Apr 2013 18:39

### Re: Automated Differentiation of Matrices (hmatrix)

Hi Edward,

I see now that the issues are deeper than performance.

I took another package that supports matrix operations: repa.

data MyMatrix a = MyMatrix
{
myRows :: Int
, myCols :: Int
, myElts :: [a]
} deriving (Show, Functor, Foldable, Traversable)

f (MyMatrix r c es) = sum es

g (MyMatrix r c es) = head \$ toList \$ sumS \$ sumS n
where
n   = fromListUnboxed (Z :. r :. c) es

I can take the grad of f but not of g even though they are the same function:

grad f :: Num a => MyMatrix a -> MyMatrix a

<interactive>:1:6:
Could not deduce (repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
arising from a use of `g'
from the context (Num a)
bound by the inferred type of
it :: Num a => MyMatrix a -> MyMatrix a
at Top level
bound by a type expected by the context:
at <interactive>:1:1-6
Possible fix:
(repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
In the first argument of `grad', namely `g'

2. By monomorphic, do you mean that I can do:

g :: MyMatrix Double -> Double
g (MyMatrix r c es) = head \$ toList \$ sumS \$ sumS n
where
n   = fromListUnboxed (Z :. r :. c) es

but not get a type error as I currently do:

<interactive>:1:6:
Couldn't match type `Double'
Actual type: MyMatrix Double -> Double
In the first argument of `grad', namely `g'

If so that would help as at least I could then convert between e.g. repa and structures that ad can play with. Of course, better would be that I could just apply ad to e.g. repa.

Dominic.

On 9 Apr 2013, at 23:03, Edward Kmett <ekmett <at> gmail.com> wrote:

hmatrix and ad don't (currently) mix.

The problem is that hmatrix uses a packed structure that can't hold any of the AD mode variants we have as an Element. =(

I've been working with Alex Lang to explore in ad 4.0 ways that we can support monomorphic AD modes and still present largely the same API. We've seen a number of performance gains off of this refactoring already, but it doesn't go far enough to address what you need.

A goal a bit farther out is to support AD on vector/matrix operations, but it is a much bigger refactoring than the one currently in the pipeline. =/

To support automatic differentiation on vector-based operations in a form that works with standard BLAS-like storage like the packed matrix rep used in hmatrix we need to convert from a 'matrix of AD variables' to an 'AD mode over of matrices'. This is similar to the difference between a matrix of complex numbers and a real matrix plus an imaginary matrix.

This is a long term goal, but not one you're likely to see support for out of 'ad' in the short term.

I can't build AD on hmatrix itself due in part to licensing restrictions and differing underlying storage requirements, so there are a lot of little issues in making that latter vision a reality.

-Edward

On Tue, Apr 9, 2013 at 10:46 AM, Dominic Steinitz wrote:
Hi Cafe,

Suppose I want to find the grad of a function then it's easy I just

import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

data MyMatrix a = MyMatrix (a, a)
deriving (Show, Functor, Foldable, Traversable)

f :: Floating a => MyMatrix a -> a
f (MyMatrix (x, y)) = exp \$ negate \$ (x^2 + y^2) / 2.0

main :: IO ()
main = do
putStrLn \$ show \$ f \$ MyMatrix (0.0, 0.0)
putStrLn \$ show \$ grad f \$ MyMatrix (0.0, 0.0)

But now suppose I am doing some matrix calculations
http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
the grad of a function of a matrix:

import Numeric.LinearAlgebra
import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

g :: (Element a, Floating a) => Matrix a -> a
g m = exp \$ negate \$ (x^2 + y^2) / 2.0
where r = (toLists m)!!0
x = r!!0
y = r!!1

main :: IO ()
main = do
putStrLn \$ show \$ g \$ (1 >< 2) ([0.0, 0.0] :: [Double])
putStrLn \$ show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])

Then I am in trouble:

No instance for (Traversable Matrix) arising from a use of `grad'
Possible fix: add an instance declaration for (Traversable Matrix)
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'
In the second argument of `(\$)', namely
`show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

Could not deduce (Element
arising from a use of `g'
bound by a type expected by the context:
Possible fix:
In the first argument of `grad', namely `g'
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

What are my options here? Clearly I can convert my matrix into a list
(which is traversable), find the grad and convert it back into a
matrix but given I am doing numerical calculations and speed is an
important factor, this seems undesirable.

I think I would have the same problem with:

although I haven'¯t checked.

Thanks, Dominic.

_______________________________________________

```_______________________________________________
```
10 Apr 2013 19:00

### Re: Automated Differentiation of Matrices (hmatrix)

On Wed, Apr 10, 2013 at 12:39 PM, Dominic Steinitz wrote:
<interactive>:1:6:
Could not deduce (repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
DANGER WILL ROBINSON!

It's showing package names+versions on the types; this usually means you have multiple versions of those packages installed, and ghc / ghci is confused as to which one to use.

--
brandon s allbery kf8nh                               sine nomine associates
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net
```_______________________________________________
```
10 Apr 2013 19:32

### Re: Automated Differentiation of Matrices (hmatrix)

The problem is both Repa and Hmatrix, (and most of the Vector types) want to know something about the data type we're storing in their shapes, so they can smash it flat and unbox it, but for most AD modes that value isn't actually something you can smash flat like that.

newtype Tower a = Tower [a]

data Dual a = Dual a a

data Tape a t
= Zero
| Lift !a
| Var !a {-# UNPACK #-} !Int
| Binary !a a a t t
| Unary !a a t
deriving (Show, Data, Typeable)
newtype Kahn a s = Kahn (Tape a (Kahn a s))

etc.

Of those only Dual -- the simplest one-derivative Forward mode -- can be made to fit in a container that wants to unbox it.

(Technically the new Reverse mode can also be made to work as it is a bounded number of numbers and Ints indexing into a tape)

With ad 3.4 you can't even choose to unbox that one because we pun between the notion of the infinitesimal we're protecting you from confusing and the AD Mode we're using to prevent you from using particulars about any given AD mode. It simplified the signature, but in retrospect made it harder to extend AD with more primitive operations with non-trivial known derivatives, and to take advantage of knowledge about storage like we need here.

With ad 4.0 you can at least write the appropriate Element-like instances for a mode like Dual to put it in those containers.

In the more distant future I'd like to release a version where instead we can work with well behaved containers by wrapping them through a future version of linear or another more BLAS-like binding possibly based on the work I have going into github.com/analytics.

In that scenario instead of having the vector of AD values, we'd have an AD'd vector. Analogous to switching from the matrix of complex values vs. a matrix + an imaginary matrix scenario I described earlier, where here we're using dual numbers. Interestingly if you think about it the Unboxed Vector instances already do that split for us, but only if we can split things up equally. I may borrow ideas from the wavelet tree machinery I'm building for analytics to improve on that eventually though.

-Edward

On Wed, Apr 10, 2013 at 12:39 PM, Dominic Steinitz wrote:
Hi Edward,

I see now that the issues are deeper than performance.

I took another package that supports matrix operations: repa.

data MyMatrix a = MyMatrix
{
myRows :: Int
, myCols :: Int
, myElts :: [a]
} deriving (Show, Functor, Foldable, Traversable)

f (MyMatrix r c es) = sum es

g (MyMatrix r c es) = head \$ toList \$ sumS \$ sumS n
where
n   = fromListUnboxed (Z :. r :. c) es

I can take the grad of f but not of g even though they are the same function:

grad f :: Num a => MyMatrix a -> MyMatrix a

<interactive>:1:6:
Could not deduce (repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
arising from a use of `g'
from the context (Num a)
bound by the inferred type of
it :: Num a => MyMatrix a -> MyMatrix a
at Top level
bound by a type expected by the context:
at <interactive>:1:1-6
Possible fix:
(repa-3.2.3.1:Data.Array.Repa.Eval.Elt.Elt
In the first argument of `grad', namely `g'

2. By monomorphic, do you mean that I can do:

g :: MyMatrix Double -> Double
g (MyMatrix r c es) = head \$ toList \$ sumS \$ sumS n
where
n   = fromListUnboxed (Z :. r :. c) es

but not get a type error as I currently do:

<interactive>:1:6:
Couldn't match type `Double'
Actual type: MyMatrix Double -> Double
In the first argument of `grad', namely `g'

If so that would help as at least I could then convert between e.g. repa and structures that ad can play with. Of course, better would be that I could just apply ad to e.g. repa.

Dominic.

On 9 Apr 2013, at 23:03, Edward Kmett <ekmett <at> gmail.com> wrote:

hmatrix and ad don't (currently) mix.

The problem is that hmatrix uses a packed structure that can't hold any of the AD mode variants we have as an Element. =(

I've been working with Alex Lang to explore in ad 4.0 ways that we can support monomorphic AD modes and still present largely the same API. We've seen a number of performance gains off of this refactoring already, but it doesn't go far enough to address what you need.

A goal a bit farther out is to support AD on vector/matrix operations, but it is a much bigger refactoring than the one currently in the pipeline. =/

To support automatic differentiation on vector-based operations in a form that works with standard BLAS-like storage like the packed matrix rep used in hmatrix we need to convert from a 'matrix of AD variables' to an 'AD mode over of matrices'. This is similar to the difference between a matrix of complex numbers and a real matrix plus an imaginary matrix.

This is a long term goal, but not one you're likely to see support for out of 'ad' in the short term.

I can't build AD on hmatrix itself due in part to licensing restrictions and differing underlying storage requirements, so there are a lot of little issues in making that latter vision a reality.

-Edward

On Tue, Apr 9, 2013 at 10:46 AM, Dominic Steinitz wrote:
Hi Cafe,

Suppose I want to find the grad of a function then it's easy I just

import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

data MyMatrix a = MyMatrix (a, a)
deriving (Show, Functor, Foldable, Traversable)

f :: Floating a => MyMatrix a -> a
f (MyMatrix (x, y)) = exp \$ negate \$ (x^2 + y^2) / 2.0

main :: IO ()
main = do
putStrLn \$ show \$ f \$ MyMatrix (0.0, 0.0)
putStrLn \$ show \$ grad f \$ MyMatrix (0.0, 0.0)

But now suppose I am doing some matrix calculations
http://hackage.haskell.org/package/hmatrix-0.14.1.0 and I want to find
the grad of a function of a matrix:

import Numeric.LinearAlgebra
import Data.Foldable (Foldable)
import Data.Traversable (Traversable)

g :: (Element a, Floating a) => Matrix a -> a
g m = exp \$ negate \$ (x^2 + y^2) / 2.0
where r = (toLists m)!!0
x = r!!0
y = r!!1

main :: IO ()
main = do
putStrLn \$ show \$ g \$ (1 >< 2) ([0.0, 0.0] :: [Double])
putStrLn \$ show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])

Then I am in trouble:

No instance for (Traversable Matrix) arising from a use of `grad'
Possible fix: add an instance declaration for (Traversable Matrix)
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'
In the second argument of `(\$)', namely
`show \$ grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

Could not deduce (Element
arising from a use of `g'
bound by a type expected by the context:
Possible fix:
In the first argument of `grad', namely `g'
In the second argument of `(\$)', namely
`grad g \$ (1 >< 2) ([0.0, 0.0] :: [Double])'

What are my options here? Clearly I can convert my matrix into a list
(which is traversable), find the grad and convert it back into a
matrix but given I am doing numerical calculations and speed is an
important factor, this seems undesirable.

I think I would have the same problem with:

although I haven'¯t checked.

Thanks, Dominic.

_______________________________________________
```_______________________________________________