chrisLi | 14 Jun 2012 18:06

Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

Hi All,

     I am working on a banking fraud detection project with Drools Fusion,
which matches each transaction against hundreds of rules to check whether the
transaction is suspicious.

     In some rules, I use a time-based sliding window to calculate the average
transaction amount of an account over the past 3 or 6 months. One possible
rule is shown below:

    rule "Single Large Amount Transaction"
    dialect "mvel"
    when
        $account : Account($number : number)
        $averageAmount : BigDecimal() from accumulate(
            TransactionCompletedEvent(fromAccountNumber == $account.number, $amount : amount)
            over window:time(90d)
            from entry-point TransactionStream,
            bigDecimalAverage($amount))
        $t1 : TransactionCreatedEvent(fromAccountNumber == $account.number,
            amount > $account.creditAmount * 0.5, amount > $averageAmount * 3.0)
           from entry-point TransactionStream
    then
    end

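    In such cases, the Fusion engine will hold every TransactionCompletedEvent
in memory for 90 days. We have about 1 billion accounts in total, so the
number of TransactionCompletedEvents will be huge and we will very soon run
out of memory.

    I have been blocked on this for a long time. Is it possible to distribute
a single StatefulKnowledgeSession across multiple JVMs or machines using a
distributed memory cache such as Hazelcast? If yes, could you give me some
advice on the solution? Or is this the problem the Drools Grid project tries
to handle? Or are there other technologies for handling a large number of
facts or events?

    As far as I know, Drools Grid distributes multiple ksessions across
multiple machines in the grid, with one or several ksessions on each node. Is
my understanding right?

    Any response or opinion from you will be appreciated. Thank you very
much!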

Mauricio Salatino | 14 Jun 2012 18:21

Re: Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

Yes, you got it right.

Drools Grid (or at least what is in the source code right now) is about session virtualization: it lets you access a session hosted in a different JVM.
What you are looking for is, I'm afraid, not possible right now, because distributing one session across multiple JVMs would require splitting the RETE algorithm. There is some experimental work on this, but nothing has been released yet. Mark can probably give you more details.
Cheers

On Thu, Jun 14, 2012 at 1:06 PM, chrisLi <shengtao0077 <at> 163.com> wrote:
Hi All,

    I am working on a banking fraud detection project with Drools Fusion,
which matches each transaction against hundreds of rules to check whether the
transaction is suspicious.

    In some rules, I use a time-based sliding window to calculate the average
transaction amount of an account over the past 3 or 6 months. One possible
rule is shown below:

   rule "Single Large Amount Transaction"
   dialect "mvel"
   when
       $account : Account($number : number)
       $averageAmount : BigDecimal() from accumulate(
           TransactionCompletedEvent(fromAccountNumber == $account.number, $amount : amount)
           over window:time(90d)
           from entry-point TransactionStream,
           bigDecimalAverage($amount))
       $t1 : TransactionCreatedEvent(fromAccountNumber == $account.number,
           amount > $account.creditAmount * 0.5, amount > $averageAmount * 3.0)
          from entry-point TransactionStream
   then
   end

   In such cases, the Fusion engine will hold every TransactionCompletedEvent
in memory for 90 days. We have about 1 billion accounts in total, so the
number of TransactionCompletedEvents will be huge and we will very soon run
out of memory.

   I have been blocked on this for a long time. Is it possible to distribute
a single StatefulKnowledgeSession across multiple JVMs or machines using a
distributed memory cache such as Hazelcast? If yes, could you give me some
advice on the solution? Or is this the problem the Drools Grid project tries
to handle? Or are there other technologies for handling a large number of
facts or events?

   As far as I know, Drools Grid distributes multiple ksessions across
multiple machines in the grid, with one or several ksessions on each node. Is
my understanding right?

   Any response or opinion from you will be appreciated. Thank you very
much!

--
View this message in context: http://drools.46999.n3.nabble.com/Is-a-single-StatefulKnowledgeSession-with-Distributed-Memory-cache-possible-tp4017968.html
Sent from the Drools: User forum mailing list archive at Nabble.com.
_______________________________________________
rules-users mailing list
rules-users <at> lists.jboss.org
https://lists.jboss.org/mailman/listinfo/rules-users



--
 - MyJourney <at> http://salaboy.wordpress.com
 - Co-Founder <at> http://www.jugargentina.org
 - Co-Founder <at> http://www.jbug.com.ar
 
 - Salatino "Salaboy" Mauricio -

Vincent LEGENDRE | 14 Jun 2012 18:22

Re: Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

Maybe you can think about this differently: instead of keeping all objects for 90 days, keep only the accumulate
results for each previous day.
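
To make the daily-results idea concrete, here is a minimal Java sketch outside of Drools. It keeps one (sum, count) bucket per day for a single account and derives the 90-day average from the buckets. The class name DailyAverageWindow and its methods are made up for illustration and are not part of any Drools API. Note that the window then expires a whole day at a time, so the result only approximates a true sliding window.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;
import java.util.TreeMap;

/** Hypothetical helper: one (sum, count) bucket per day instead of every event. */
class DailyAverageWindow {
    private static final class Bucket {
        BigDecimal sum = BigDecimal.ZERO;
        long count = 0;
    }

    private final int windowDays;
    private final TreeMap<LocalDate, Bucket> buckets = new TreeMap<>();

    DailyAverageWindow(int windowDays) {
        this.windowDays = windowDays;
    }

    /** Adds one completed transaction to the bucket of its day. */
    void add(LocalDate day, BigDecimal amount) {
        Bucket b = buckets.computeIfAbsent(day, d -> new Bucket());
        b.sum = b.sum.add(amount);
        b.count++;
    }

    /** Average over the last windowDays days up to and including today. */
    BigDecimal average(LocalDate today) {
        LocalDate oldestKept = today.minusDays(windowDays - 1L);
        // Expire whole days at once: drop every bucket older than the window.
        buckets.headMap(oldestKept, false).clear();
        BigDecimal sum = BigDecimal.ZERO;
        long count = 0;
        for (Bucket b : buckets.values()) {
            sum = sum.add(b.sum);
            count += b.count;
        }
        return count == 0
                ? BigDecimal.ZERO
                : sum.divide(BigDecimal.valueOf(count), 2, RoundingMode.HALF_UP);
    }
}
```

With this layout the memory per account is bounded by the number of days in the window rather than by the number of transactions.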

Mauricio Salatino | 14 Jun 2012 19:58

Re: Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

Yes, that's another option too. It really depends on how you can accumulate your data or split it up for analysis.
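
One way to read "split your data" is to route each account to one of several independent sessions, so that no single session has to hold all events. The Java sketch below shows only the routing step. AccountPartitioner is a hypothetical name, and the approach assumes that every rule looks at one account at a time, so that no rule needs facts from two partitions.

```java
/** Hypothetical router: maps an account number to one of N independent partitions. */
class AccountPartitioner {
    private final int partitions;

    AccountPartitioner(int partitions) {
        if (partitions <= 0) {
            throw new IllegalArgumentException("partitions must be positive");
        }
        this.partitions = partitions;
    }

    /** The same account number always maps to the same partition index. */
    int partitionFor(String accountNumber) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(accountNumber.hashCode(), partitions);
    }
}
```

Each partition index would then correspond to a separate session, possibly in a separate JVM, that receives only the events of its own accounts.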

Cheers

chrisLi | 15 Jun 2012 03:32

Re: Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

Hi, 

   Thank you very much for such a quick response. I can hardly believe it!

   As far as I know, the Fusion engine has to store the event details for
sliding windows, because when an event in the window expires, the engine
still needs that event's details to update the accumulate result. So I think
storing only the accumulate results per day would not fit Fusion's logic.
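
The point about expiry can be illustrated with a small Java sketch of a generic sliding-window sum. This is not how Drools Fusion is implemented internally; SlidingSum is a made-up class, and it assumes events arrive in timestamp order. It shows why each event's amount must stay available until the event leaves the window: that amount is what gets subtracted on expiry.

```java
import java.math.BigDecimal;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window sum that retains every event until it expires. */
class SlidingSum {
    private static final class Event {
        final long timestampMillis;
        final BigDecimal amount;

        Event(long timestampMillis, BigDecimal amount) {
            this.timestampMillis = timestampMillis;
            this.amount = amount;
        }
    }

    private final long windowMillis;
    private final Deque<Event> events = new ArrayDeque<>();
    private BigDecimal sum = BigDecimal.ZERO;

    SlidingSum(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Assumes events are added in non-decreasing timestamp order. */
    void add(long timestampMillis, BigDecimal amount) {
        events.addLast(new Event(timestampMillis, amount));
        sum = sum.add(amount);
    }

    /** Removes events that have left the window and subtracts their amounts. */
    void expire(long nowMillis) {
        while (!events.isEmpty()
                && events.peekFirst().timestampMillis <= nowMillis - windowMillis) {
            sum = sum.subtract(events.removeFirst().amount);
        }
    }

    BigDecimal sum() {
        return sum;
    }

    int size() {
        return events.size();
    }
}
```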

   
   Thank you, all!

--
View this message in context: http://drools.46999.n3.nabble.com/Is-a-single-StatefulKnowledgeSession-with-Distributed-Memory-cache-possible-tp4017968p4017977.html
Sent from the Drools: User forum mailing list archive at Nabble.com.

Wolfgang Laun | 15 Jun 2012 06:03

Re: Is a single StatefulKnowledgeSession with Distributed Memory cache possible?

It's difficult to propose alternatives without knowing all the requirements.

If there are no rules that require the presence of transactions with
unknown/indeterminate account numbers, a caching strategy might
be considered. Clearly, this would mean no longer relying on Fusion
features, and (temporary) retractions would have to be done
explicitly. The most frequently used accounts would remain in WM
for the period you have to consider. Others would drop out after
being idle for a certain period. A new transaction for a
swapped-out account would trigger reloading of its old transactions.
(This is akin to the page-swapping strategies used to implement
virtual memory in operating systems.)
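
As a rough illustration of such a cache, here is a Java sketch built on an access-ordered LinkedHashMap. It simplifies the strategy described above in one respect: accounts are evicted when a capacity limit is exceeded (least recently used first) rather than after a fixed idle period. AccountHistoryCache, the loader function, and the onEvict callback are all hypothetical names; in a rule engine the callback would be the place to retract the account's facts, and the loader would read old transactions from whatever external store holds them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Hypothetical LRU cache of per-account transaction histories. */
class AccountHistoryCache<T> {
    private final Function<String, List<T>> loader;
    private final LinkedHashMap<String, List<T>> resident;

    AccountHistoryCache(int maxAccounts,
                        Function<String, List<T>> loader,
                        BiConsumer<String, List<T>> onEvict) {
        this.loader = loader;
        // accessOrder = true: iteration order runs from least to most recently used.
        this.resident = new LinkedHashMap<String, List<T>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<T>> eldest) {
                if (size() > maxAccounts) {
                    // Swap out the least recently used account.
                    onEvict.accept(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    /** Records a transaction, reloading the account's history if it was swapped out. */
    List<T> record(String account, T transaction) {
        List<T> history = resident.get(account);
        if (history == null) {
            history = new ArrayList<>(loader.apply(account));
            resident.put(account, history);
        }
        history.add(transaction);
        return history;
    }

    boolean isResident(String account) {
        return resident.containsKey(account);
    }
}
```

The sketch assumes that every transaction is also written to the external store elsewhere, so that an evicted history can be reloaded in full.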

This isn't a simple task, but given the huge number of transactions
you have to cope with, you may not have any feasible alternative
"out of the box".

-W


