Jim - FooBar(); | 4 Aug 2012 20:18

Re: community interest in machine learning (?)

Hmm, this is very strange... I'll update Clojars within the next hour. Sorry about this!

Jim

On 04/08/12 18:52, Timothy Washington wrote:
Hey Jim, 

So I started playing around with clojure-encog, and I'm pretty excited about it so far. Again, I'm trying to make a financial series predictor, and I'm going through the steps of 1) normalizing / preparing the data and 2) creating a feed-forward neural network with back-prop (I'll try sigmoid & Gaussian activations). Then I'll 3) train and 4) run the network. 


A) The first problem I'm having is a library one. I'm trying to normalize the data with the (prepare ...) function, but the normalization namespace isn't in [clojure-encog "0.4.0-SNAPSHOT"]. Here, we see that the nnets and training namespaces are in the snapshot jar, but not the normalization namespace. So I don't know how easy it is to update the snapshot jar. But in the meantime, I'll see if I can use the github version. 

webkell@ubuntu:~/Projects/nn$ jar tvf lib/clojure-encog-0.4.0-20120518.170223-1.jar 
    72 Fri May 18 17:58:04 PDT 2012 META-INF/MANIFEST.MF
  1961 Fri May 18 17:58:04 PDT 2012 META-INF/maven/clojure-encog/clojure-encog/pom.xml
   111 Fri May 18 17:58:04 PDT 2012 META-INF/maven/clojure-encog/clojure-encog/pom.properties
   584 Fri May 18 17:00:30 PDT 2012 project.clj
  9839 Fri May 18 17:01:38 PDT 2012 clojure_encog/nnets.clj
 11532 Fri May 18 17:57:20 PDT 2012 clojure_encog/examples.clj
 10144 Fri May 18 17:43:58 PDT 2012 clojure_encog/training.clj
  2177 Mon May 14 21:57:20 PDT 2012 java/NeuralPilot.java
  7574 Wed May 16 20:34:30 PDT 2012 java/PredictSunspotSVM.java
  2338 Mon May 14 21:56:42 PDT 2012 java/LanderSimulator.java
  1794 Fri May 18 16:02:22 PDT 2012 java/XORNEAT.java
  1672 Fri May 18 16:04:14 PDT 2012 java/XORNEAT.class
  1872 Mon May 14 14:53:26 PDT 2012 java/LanderSimulator.class
  1943 Mon May 14 14:53:26 PDT 2012 java/NeuralPilot.class
  7357 Wed May 16 20:37:20 PDT 2012 java/PredictSunspotSVM.class



B) The second problem I see is when trying to deal with the input data. The example in clojure-encog has just an array of doubles. But my input data is slightly different in that I'm dealing with a LazySeq of arrays. Each of those arrays contains tick data: Time, Ask, Bid, AskVolume and BidVolume: 

(["01.05.2012 20:00:00.676" "1.32390" "1.32379" "3000000.00" "2250000.00"] 
 ["01.05.2012 20:00:00.888" "1.32390" "1.33238" "3000000.10" "2200000.00"] 
 ...) 


So of course a call to ((make-data ...) ...) fails with the error "clojure.lang.LazySeq cannot be cast to [Double..". So I need to figure out 1) a way to get each one of those input data points into an input-layer neuron. I've started to think about that when I was dabbling with code. If you like, I can look into trying to jerry-rig these kinds of tick data mappings into (training/make-data). But I need a better understanding of the concept of a temporal window. The other thing is 2) to figure out how to transform the time field into data the NN can use. I've been spitting the DateTime object out to longs. 
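
A minimal sketch of the conversion described above, turning each row of strings into a primitive double array with the time field as a long of epoch milliseconds. The names parse-time and tick->doubles are hypothetical and not part of clojure-encog; the timestamp pattern (day.month.year, UTC) is an assumption based on the sample rows and the DateTime value quoted further down the thread.

```clojure
(import '(java.text SimpleDateFormat)
        '(java.util TimeZone))

(defn parse-time
  "Parses a timestamp such as \"01.05.2012 20:00:00.676\" into epoch
  milliseconds. Assumes day.month.year order and UTC."
  [^String s]
  (let [fmt (doto (SimpleDateFormat. "dd.MM.yyyy HH:mm:ss.SSS")
              (.setTimeZone (TimeZone/getTimeZone "UTC")))]
    (.getTime (.parse fmt s))))

(defn tick->doubles
  "Converts one row [time ask bid ask-volume bid-volume] of strings
  into a double array; the time becomes epoch milliseconds."
  [[ts & fields]]
  (double-array (cons (double (parse-time ts))
                      (map #(Double/parseDouble %) fields))))

;; Sample rows copied from the message above, wrapped in a lazy seq.
(def raw-ticks
  (lazy-seq [["01.05.2012 20:00:00.676" "1.32390" "1.32379" "3000000.00" "2250000.00"]
             ["01.05.2012 20:00:00.888" "1.32390" "1.33238" "3000000.10" "2200000.00"]]))

;; Realise the lazy seq into a vector of double arrays.
(def ticks (mapv tick->doubles raw-ticks))
```

The raw time value is still far too large to feed into a network directly, so it would need the same normalization as the other columns.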


Thanks 

Tim Washington 
416.843.9060 



On Sun, Jul 29, 2012 at 11:35 AM, Dimitrios Jim Piliouras <jimpil1985 <at> gmail.com> wrote: 
Hi Tim, 

According to : 
http://www.heatonresearch.com/content/encog-30-article-2-design-goals-overview 

Encog 3 should have decent support for temporal (time-series) prediction, in particular for financial predictions... I'm afraid, however, that the only example that I've ported to clojure-encog which uses temporal data is the sunspot example (SVM, not NN).

Also, you shouldn't have any problems with the data (most likely you need to normalize them; I usually find the ranges (-1 1) or (0 1) to work best).
For an example of how exactly you would do it, look for "PREDICT-SUNSPOT-SVM" here:
https://github.com/jimpil/clojure-encog/blob/master/src/clojure_encog/examples.clj 

These 2 lines do all the work with regard to your input data: 
normalizedSunspots (prepare :array-range nil nil :raw-seq spots :ceiling 0.9 :floor 0.1)
train-set  ((make-data :temporal-window normalizedSunspots)  window-size 1)
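
To illustrate the general idea of a temporal window, here is a sketch in plain Clojure. It is not the implementation behind (make-data :temporal-window ...), whose internals are not shown in this thread; the function name temporal-windows and the :input / :ideal keys are hypothetical. Each training pair takes window-size consecutive values as input and the next prediction-size values as the ideal output.

```clojure
(defn temporal-windows
  "Slides a window over series, one step at a time. Every pair holds
  window-size inputs followed by prediction-size ideal values."
  [series window-size prediction-size]
  (for [chunk (partition (+ window-size prediction-size) 1 series)]
    {:input (vec (take window-size chunk))
     :ideal (vec (drop window-size chunk))}))

;; A window of 3 predicting 1 value over five normalized points
;; yields two training pairs.
(def pairs (temporal-windows [0.1 0.2 0.3 0.4 0.5] 3 1))
```

Here the first pair has the input [0.1 0.2 0.3] with the ideal [0.4], and the second shifts everything along by one position.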


As far as algorithmic problems go, Encog has been around for quite a while... Even though I don't necessarily agree with all the design decisions made along the way, I find it is a rather mature lib. Of course it is written in Java, so being large means it is a bit of a mess! Also, there is a lot of duplication in random places... Anyway, what I'm trying to say is:

If you've got a specific example in mind (like the financial prediction), maybe it's worth trying it out using clojure-encog or the encog-workbench (the GUI) or any other ready-made lib and see how it goes... Writing your own will certainly teach you loads, but it might take a while until you actually test what you want to test...

Normalisation, randomisation or both are almost always needed...
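
As a rough illustration of range normalization, the sketch below linearly rescales a column so that its minimum maps to a floor and its maximum to a ceiling. This is ordinary min-max scaling written in plain Clojure; it is an assumption that this resembles what the library's prepare function does, and the names normalize-range and normalize-columns are hypothetical.

```clojure
(defn normalize-range
  "Linearly rescales xs so that the smallest value maps to floor and
  the largest to ceiling. A constant column maps to the midpoint."
  [xs floor ceiling]
  (let [lo   (apply min xs)
        hi   (apply max xs)
        span (- hi lo)]
    (if (zero? span)
      (mapv (constantly (/ (+ floor ceiling) 2.0)) xs)
      (mapv #(+ floor (* (- ceiling floor) (/ (- % lo) span))) xs))))

(defn normalize-columns
  "Normalizes every column of a seq of equal-length numeric rows
  independently, then returns the rows again."
  [rows floor ceiling]
  (let [cols   (apply map vector rows)
        normed (map #(normalize-range % floor ceiling) cols)]
    (apply mapv vector normed)))

;; Volumes in the millions end up in the same range as prices.
(def scaled (normalize-range [3000000.0 2250000.0 2625000.0] 0.0 1.0))
```

Normalizing each column on its own matters for tick data, because prices near 1.3 and volumes in the millions would otherwise share one scale.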

Hope that helps...

Jim
 


On Sun, Jul 29, 2012 at 5:41 PM, Timothy Washington <twashing <at> gmail.com> wrote:
Hey Ben, 

It's the same problem. 

user> (incanter/exp (incanter/minus 3254604.9658621363))
0.0

But it's not the functions. It's the math. Euler's number 2.71828... raised to the power of 3254604.9658621363 gives Infinity (and raised to the negated value, as above, it underflows to 0.0). So for my neural net's activation func, either i) I shouldn't use a sigmoid, or ii) my linear combiner needs to keep values within a certain bound. My neuron inputs are below. And it's the bid and ask volumes and the long time value that are giving me such a large number. 
  • 1.3239 (bid price) 
  • 1.32379 (ask price) 
  • 3000000.0 (bid volume) 
  • 2250000.0 (ask volume) 
  • 1335902400676 ( #<DateTime 2012-05-01T20:00:00.676Z> long value) 
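
The overflow can be reproduced with plain Java interop, without Incanter. The sketch below assumes the usual logistic sigmoid 1 / (1 + e^-x); the function names and the weights of 1e-6 are hypothetical and chosen only for illustration. A double can only hold e^x for x up to roughly 709.78, so a weighted sum in the millions saturates the sigmoid completely.

```clojure
(defn sigmoid
  "Logistic sigmoid on doubles."
  [x]
  (/ 1.0 (+ 1.0 (Math/exp (- x)))))

(defn weighted-sum
  "The linear combiner: sum of weight * input."
  [weights inputs]
  (reduce + (map * weights inputs)))

;; The raw, un-normalized neuron inputs listed above.
(def raw-inputs [1.3239 1.32379 3000000.0 2250000.0 1335902400676])

(def overflowed  (Math/exp 3254604.9658621363))   ; positive infinity
(def underflowed (Math/exp -3254604.9658621363))  ; 0.0
(def saturated   (sigmoid 3254604.9658621363))    ; 1.0

;; Even tiny weights leave the sum dominated by the time value.
(def raw-activation
  (sigmoid (weighted-sum (repeat 5 1e-6) raw-inputs)))
```

Once the output is pinned at 1.0 the sigmoid's gradient is zero, which is why the inputs need rescaling before training rather than just a different activation function.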

I just had the idea to try a Gaussian or tanh activation function. I think this is the point where I'll give clojure-encog a whirl. I have a feeling I'll be running into a lot of these data and other algorithmic problems. And it'd be good to work with something that has already dealt with these issues. I still don't know if I need to normalize my input data, how to untangle the activation result for back propagation, etc. Any insights are welcome. 


Tim Washington 

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure <at> googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe <at> googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Jim - FooBar(); | 4 Aug 2012 21:02

Re: community interest in machine learning (?)

Clojars has been updated with a clojure-encog jar containing all the namespaces... I'm really sorry, I can't believe I hadn't noticed that! The code is in complete sync with GitHub at the moment, so instead of typing 'doc' all the time feel free to have a browser open. I've not changed much: I just removed some redundant let bindings and added the ability to create an empty dataset. I also added a simple k-means clustering example. If I understood correctly what you're doing, the closest example regarding preparing/normalising your data is the predict-sunspots example...

Hope that helps... :)

Jim

PS: empirically, tanh and sigmoid almost always work best... I can say the same for the Nguyen-Widrow randomiser. Also, just so you know, I'll be renaming clojure-encog to "enclog" for the 0.5 release...


Jim - FooBar(); | 4 Aug 2012 21:08

Re: community interest in machine learning (?)

I will address your second issue shortly... You say you have a lazy-seq of arrays that have 5 strings? Why strings?

Jim

Jim - FooBar(); | 4 Aug 2012 21:19

Re: community interest in machine learning (?)

Hmmm, I think it is worth downloading the source for Encog 3.1 for Java and looking into: org.encog.ml.data.temporal.TemporalMLDataSet

I think this is what you need to add several columns. Unfortunately I've not wrapped this yet, so you will have to do some interop to get it going... I promise you it will be the first thing I look at as soon as I find some time...
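
While that class remains unwrapped, the idea of a temporal window over several columns can be sketched in plain Clojure. This does not use or reproduce the Encog class; the function name multi-column-windows, the :input / :ideal keys and the sample rows are hypothetical. Each input is window-size consecutive rows flattened into one vector, and the ideal output is one chosen column of the row that follows.

```clojure
(defn multi-column-windows
  "Builds training pairs from multi-column rows. The input is
  window-size consecutive rows flattened into a single vector; the
  ideal is the target-col value of the next row."
  [rows window-size target-col]
  (for [chunk (partition (inc window-size) 1 rows)]
    {:input (vec (mapcat identity (take window-size chunk)))
     :ideal [(nth (last chunk) target-col)]}))

;; Four two-column rows with a window of 2, predicting column 0.
(def pairs
  (multi-column-windows [[1.0 10.0] [2.0 20.0] [3.0 30.0] [4.0 40.0]] 2 0))
```

With five normalized tick columns and a window of, say, 10 rows, the network's input layer would need 50 neurons under this layout.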

hope that helps...

Jim


Timothy Washington | 4 Aug 2012 23:47

Re: community interest in machine learning (?)

Hey Jim, 


Thanks for looking into these things. I tried removing clojure-encog from lib/ and .m2/, but 'lein deps' still pulls in a jar without the normalization.clj file. Do I need an updated [clojure-encog "0.4.x-SNAPSHOT"]?

Also, I'll take a peek at the source for 'org.encog.ml.data.temporal.TemporalMLDataSet'. As for the structure of the tick data, I used clojure-csv to pull that data from this CSV file (you'll have to download it as it's large). So it's easy for me to convert those strings to doubles, datetimes, etc. 


This helps a great deal 

Cheers 

Tim Washington 
416.843.9060 



On Sat, Aug 4, 2012 at 3:19 PM, Jim - FooBar(); <jimpil1985 <at> gmail.com> wrote:
Hmmm, I think it is worth downloading the source for encog 3.1 for java and looking into: org.encog.ml.data.temporal.TemporalMLDataSet

I think this is what you need to add several columns...unfortunately I've not wrapped this yet so you will have to do some interop to get it going...I promise you it will be the first thing I look at as soon as I find some time...

hope that helps...

Jim

Jim - FooBar(); | 4 Aug 2012 23:59

Re: community interest in machine learning (?)

On 04/08/12 22:47, Timothy Washington wrote:
Thanks for looking into these things. I tried removing clojure-encog from lib/ and .m2/. But 'lein deps' still pulls in a jar without the normalization.clj file. Do I need an updated [clojure-encog "0.4.x-SNAPSHOT"]?

aaa sorry! yes you now need 0.4.1-SNAPSHOT...
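
For reference, the dependency bump might look like the following in project.clj. This is only a sketch: the project name `nn` and the Clojure version are placeholder assumptions, and only the clojure-encog coordinate comes from this thread.

```clojure
;; project.clj
;; Project name and Clojure version are placeholders (assumptions);
;; the clojure-encog coordinate is the one mentioned above.
(defproject nn "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [clojure-encog "0.4.1-SNAPSHOT"]])
```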

As for the structure of the tick data, I used clojure-csv to pull that data from this CSV file (you'll have to download it, as it's large). So it's easy for me to convert those strings to doubles, datetimes, etc.

cool! i think read-string should do it anyway (at least for numbers)
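
A minimal sketch of that conversion, assuming a made-up row layout of epoch milliseconds, bid and ask (the real column layout of the CSV file is not shown in this thread, and `parse-row` is a hypothetical helper). It uses plain Java interop for the typed parse and shows read-string as the alternative for numbers:

```clojure
;; A made-up row of CSV strings: [epoch-millis bid ask].
;; The column layout is an assumption for illustration only.
(def row ["1335902400676" "1.32379" "1.32390"])

(defn parse-row
  "Converts one row of CSV strings into typed values."
  [[ts bid ask]]
  {:time (Long/parseLong ts)
   :bid  (Double/parseDouble bid)
   :ask  (Double/parseDouble ask)})

(def tick (parse-row row))

;; read-string also parses numbers; binding *read-eval* to false
;; stops it from evaluating #=(...) forms in untrusted input.
(def bid-via-reader
  (binding [*read-eval* false]
    (read-string "1.32379")))
```

The explicit parse functions fail loudly on malformed fields, whereas read-string will happily return a symbol or keyword if a field is not numeric, so the interop version is probably the safer default for file input.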

This helps a great deal

that is good to know... cheers!

Jim




Timothy Washington | 5 Aug 2012 00:08

Re: community interest in machine learning (?)

Ah, it looks like I have the right jar now. 


And one more thing. I'm looking at the TemporalMLDataSet.java source, and I keep on seeing references to a 'windowSize'. What are inputWindowSize and predictWindowSize?


Thanks 

Tim Washington 
416.843.9060 







Jim - FooBar(); | 5 Aug 2012 00:21

Re: community interest in machine learning (?)

On 04/08/12 23:08, Timothy Washington wrote:
And one more thing. I'm looking at the TemporalMLDataSet.java source, and I keep on seeing references to a 'windowSize'. What are inputWindowSize and predictWindowSize?


I suppose your window-size is how far back from the present you want to look each time...it needs to be compatible with your input layer...a window-size of 30 with 5 indicators would give you 150 (30*5=150) input neurons, as confirmed in this post:

http://www.heatonresearch.com/node/2124
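
To make the arithmetic concrete, here is a minimal sketch in plain Clojure. It is not the Encog or clojure-encog API: `sliding-windows` is a hypothetical helper and the tick values are made up. It only shows how a window of 30 ticks with 5 indicators each flattens into one 150-element input vector:

```clojure
;; Hypothetical helper, not part of Encog or clojure-encog.
(defn sliding-windows
  "Turns a seq of ticks (each a vector of indicator values) into
   flat input vectors, one per run of `window-size` consecutive ticks."
  [window-size ticks]
  (->> (partition window-size 1 ticks)     ; overlapping windows, step 1
       (map #(vec (apply concat %)))))     ; flatten each window

;; 100 made-up ticks with 5 indicators each.
(def ticks (for [t (range 100)] (vec (repeat 5 (double t)))))

(def inputs (sliding-windows 30 ticks))

(count (first inputs)) ; => 150, i.e. 30 ticks * 5 indicators
(count inputs)         ; => 71,  i.e. 100 - 30 + 1 windows
```

Each flattened vector would feed one pass through an input layer of 150 neurons, which is why the window size and the number of indicators together fix the input layer width.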


Jim



 

Timothy Washington | 5 Aug 2012 00:35

Re: community interest in machine learning (?)

Ok, this makes sense. 


Thanks very much for your insights. 


Tim Washington 
416.843.9060 






 

Jim - FooBar(); | 5 Aug 2012 01:15

Re: community interest in machine learning (?)

No worries...

Looking at the examples.clj in 0.4.1-SNAPSHOT, it likely won't even compile, which is good evidence of how much your first email made me jump!!! If you want to run the examples, just copy-paste the entire code into a namespace of your own and comment out the travelling-salesman-problem example. That should sort out any compilation issues...

Jim




 
Timothy Washington | 5 Aug 2012 20:31

Re: community interest in machine learning (?)

Heyya, 


So I got to playing with the core encog-java system. This thread is getting a bit long, so I put my thoughts into a new thread: "Playing with clojure-encog, Machine-Learning wrapper".

But the thrust is that I need to know how I can give the encog neural net a list of tick data that has second or sub-second intervals. Have a look. Any insights are appreciated.


Thanks :) 

Tim Washington 
416.843.9060 






 
