Gerhard Lipp | 23 Apr 14:53 2012

missing events ZMQ_FD / ZMQ_EVENTS

Hello,

I can observe the same behavior as stated here
(http://lists.zeromq.org/pipermail/zeromq-dev/2011-November/014615.html).
What I observe is also an XREP/XREQ (ROUTER/DEALER) problem, where the
XREQ waits forever to receive a message (which has definitely been
sent). When I poll ZMQ_EVENTS (timer based), the XREQ is readable as
expected. I am using libev (select based) for doing IO and I am aware
of the edge-triggered behaviour (I am reading/forwarding messages
until ZMQ_EVENTS no longer includes the ZMQ_POLLIN bit).

What is the status of this issue?
Unfortunately my setup is a bit complicated to share, but I would like
to help as much as possible.

Regards,
Gerhard

A libev workaround is to use both EV_READ and EV_WRITE bits, though
this adds a lot of unnecessary wake ups / callbacks etc.

Paul Colomiets | 23 Apr 21:13 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Mon, Apr 23, 2012 at 3:53 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> Hello,
>
> I can observe the same behavior as stated here
> (http://lists.zeromq.org/pipermail/zeromq-dev/2011-November/014615.html).
> What I observe is also a XREP/XREQ (ROUTER/DEALER) prob, where the
> XREQ is waiting forever to receive a message (which has been
> definitely sent). When I poll (timer based) the ZMQ_EVENTs, the XREQ
> is readable as expected. I am using libev (select based) for doing IO
> and I am aware of the edge-based trigger behaviour (I am
> reading/forwarding messages until ZMQ_EVENTs does not include the
> ZMQ_POLLIN bit any more).
>
> What is the status of this issue?
> Unfortunately my setup is a bit complicated to share, but i would like
> to help as much as possible.
>

We are using zeromq with libev without any issues. The only non-obvious
thing is that even when you send to a socket, you need to check whether
it has become readable (and vice versa). You can look at the code at:

    https://github.com/tailhook/zerogw/blob/master/src/http.c:300

It looks like:

    // Must wake up the reader on each send, because of the way zmq sockets work
    ev_feed_event(root.loop, &route->zmq_forward.socket._watch, EV_READ);

Martin Hurton | 24 Apr 00:50 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard, is there an open issue for this? If not, can you file one
and provide a simple program reproducing this problem.
I would like to look into this.

- Martin

On Mon, Apr 23, 2012 at 2:53 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> Hello,
>
> I can observe the same behavior as stated here
> (http://lists.zeromq.org/pipermail/zeromq-dev/2011-November/014615.html).
> What I observe is also a XREP/XREQ (ROUTER/DEALER) prob, where the
> XREQ is waiting forever to receive a message (which has been
> definitely sent). When I poll (timer based) the ZMQ_EVENTs, the XREQ
> is readable as expected. I am using libev (select based) for doing IO
> and I am aware of the edge-based trigger behaviour (I am
> reading/forwarding messages until ZMQ_EVENTs does not include the
> ZMQ_POLLIN bit any more).
>
> What is the status of this issue?
> Unfortunately my setup is a bit complicated to share, but i would like
> to help as much as possible.
>
> Regards,
> Gerhard
>
> A libev workaround is to use both EV_READ and EV_WRITE bits, though
> this adds a lot of unnecessary wake ups / callbacks etc.

Gerhard Lipp | 25 Apr 17:08 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

I managed to boil this down to an example that shows the bug.
It consists of three files:
1) x.lua doing the XREP/XREQ stuff, must be started once
2) rep.lua implementing a simple echo replier, must be started once
3) req.lua making the request to rep.lua through x.lua, must be
started TWICE to produce the error. THIS PROCESS LOCKS. Uncommenting
the ev.WRITE is a bad workaround for this issue.

-----------
-- x.lua:
------------
local zmq = require'zmq'
local ev = require'ev'
local c = zmq.init(1)
local xreq = c:socket(zmq.XREQ)
xreq:bind('tcp://127.0.0.1:13333')
local xrep = c:socket(zmq.XREP)
xrep:bind('tcp://127.0.0.1:13334')
local forward_io =
   function(src,dst)
      return ev.IO.new(
         function(loop,io)
            while true do
               local events = src:getopt(zmq.EVENTS)
               if events == zmq.POLLIN or events == (zmq.POLLIN + zmq.POLLOUT) then
                  local more
                  repeat
                     local data = src:recv()
                     local more = src:getopt(zmq.RCVMORE) > 0

Paul Colomiets | 25 Apr 20:29 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> i figured out to boil down an example, which shows this bug.
> it consists of three files:
> 1) x.lua doing the XREP XREQ stuff, must be started once
> 2) rep.lua implementing a simple echo replier, must be started once
> 3) req.lua making the request to rep.lua through x.lua. must be
> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
> the ev.WRITE is a bad workaround to this issue.

As far as I can see, it's not a workaround. It's just the way ZMQ_FD works.
Use zmq_poll if you don't feel comfortable with that. The only way you can change
that is returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE.

-- 
Paul

Gerhard Lipp | 26 Apr 09:02 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hello Paul!

On Wed, Apr 25, 2012 at 8:29 PM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> i figured out to boil down an example, which shows this bug.
>> it consists of three files:
>> 1) x.lua doing the XREP XREQ stuff, must be started once
>> 2) rep.lua implementing a simple echo replier, must be started once
>> 3) req.lua making the request to rep.lua through x.lua. must be
>> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
>> the ev.WRITE is a bad workaround to this issue.
>
> As far as I can see, it's not a workaround. It's just the way ZMQ_FD works.
> Use zmq_poll if you don't feel comfortable with that. The only way you can change
> that is returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE.
>

According to the manual, the fd returned by zmq_getsockopt(ZMQ_FD)
"signals any pending events on the socket in an edge-triggered fashion
by making the file descriptor become ready for reading".

If ev.WRITE is required to get all ZMQ_POLLIN and/or ZMQ_POLLOUT
events, the doc should be clearer. Anyhow, judging from the source,
the ZMQ_FD is the fd associated with the socket's "mailbox", which is
used for all kinds of communication (state transitions?) inside of ZMQ. A
"selecting/polling" user process should not wake up unnecessarily, to
avoid context switches, which are really expensive on our (embedded)
device. Thus I'd like to minimize the wakeups by just specifying

Paul Colomiets | 27 Apr 10:29 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Thu, Apr 26, 2012 at 10:02 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> Hello Paul!
>
> On Wed, Apr 25, 2012 at 8:29 PM, Paul Colomiets <paul <at> colomiets.name> wrote:
>> Hi Gerhard,
>>
>> On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>>> i figured out to boil down an example, which shows this bug.
>>> it consists of three files:
>>> 1) x.lua doing the XREP XREQ stuff, must be started once
>>> 2) rep.lua implementing a simple echo replier, must be started once
>>> 3) req.lua making the request to rep.lua through x.lua. must be
>>> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
>>> the ev.WRITE is a bad workaround to this issue.
>>
>> As far as I can see, it's not a workaround. It's just the way ZMQ_FD works.
>> Use zmq_poll if you don't feel comfortable with that. The only way you can change
>> that is returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE.
>>
>
> According to the manual, the fd returned by zmq_getsockopt(ZMQ_FD)
> "signals any pending events on the socket in an edge-triggered fashion
> by making the file descriptor become ready for reading".
>
> If ev.WRITE is required to get all ZMQ_POLLIN and/or ZMQ_POLLOUT
> events, the doc should be clearer. Anyhow, judging from the source,
> the ZMQ_FD is the fd associated with the socket's "mailbox", which is
> used for all kinds of communication (state transitions?) inside of ZMQ. A

Gerhard Lipp | 27 Apr 10:41 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

On Fri, Apr 27, 2012 at 10:29 AM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Thu, Apr 26, 2012 at 10:02 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> Hello Paul!
>>
>> On Wed, Apr 25, 2012 at 8:29 PM, Paul Colomiets <paul <at> colomiets.name> wrote:
>>> Hi Gerhard,
>>>
>>> On Wed, Apr 25, 2012 at 6:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>>>> i figured out to boil down an example, which shows this bug.
>>>> it consists of three files:
>>>> 1) x.lua doing the XREP XREQ stuff, must be started once
>>>> 2) rep.lua implementing a simple echo replier, must be started once
>>>> 3) req.lua making the request to rep.lua through x.lua. must be
>>>> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
>>>> the ev.WRITE is a bad workaround to this issue.
>>>
>>> As far as I can see, it's not a workaround. It's just the way ZMQ_FD works.
>>> Use zmq_poll if you don't feel comfortable with that. The only way you can change
>>> that is returning getopt(zmq.EVENTS) instead of hardcoding ev.READ + ev.WRITE.
>>>
>>
>> According to the manual, the fd returned by zmq_getsockopt(ZMQ_FD)
>> "signals any pending events on the socket in an edge-triggered fashion
>> by making the file descriptor become ready for reading".
>>
>> If ev.WRITE is required to get all ZMQ_POLLIN and/or ZMQ_POLLOUT
>> events, the doc should be clearer. Anyhow, judging from the source,
>> the ZMQ_FD is the fd associated with the socket's "mailbox", which is

Paul Colomiets | 27 Apr 11:07 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Fri, Apr 27, 2012 at 11:41 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> Probably I don't understand the code. You must poll only for reading
>> on ZMQ_FD. But every zmq_send and zmq_recv consumes the mailbox,
>> which means you must update your application's state of the readable
>> and writable flags (I mean your IO framework doesn't know that the
>> socket became readable or writable).
>>
>> If you don't care about the ZMQ_POLLOUT event, you still must check
>> ZMQ_EVENTS for reading on each zmq_send.
>
> You are right, I am actually just waiting to be able to zmq_recv with
> ZMQ_NOBLOCK. I don't care about ZMQ_POLLOUT in this example. As the
> docs state, either event is signaled by the mailbox (ZMQ_FD) becoming
> ready to read (ev.READ). That is why I am checking for ZMQ_POLLIN
> before entering the zmq_recv/zmq_send.
>

So the real problem is misleading documentation? I think it would
be nice if you'd update documentation in a way that's understandable
for you, and send a pull request.

-- 
Paul

Gerhard Lipp | 27 Apr 11:37 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

I don't think it is a docu thing.

What the docu says (and what the source looks like):
zmq_getsockopt(ZMQ_FD) returns a fd (the mailbox's), which becomes
readable, whenever the corresponding socket might have become readable
and/or writeable for operation with the NOBLOCK option. To check which
of these conditions are true, you have to use
zmq_getsockopt(ZMQ_EVENTS) and check for ZMQ_POLLIN / ZMQ_POLLOUT
respectively.

If this is true, users should ONLY select/poll for the read event,
e.g. using libev EV_READ, regardless if the user wants to
zmq_recv(ZMQ_NOBLOCK) or zmq_send(ZMQ_NOBLOCK). Then the example code
shows a bug or I am using the XREQ/XREP in a wrong way.

Else, you are right and the documentation has to be updated to
select/poll for read AND write events.

I guess the "solution/workaround" of the example (using ev.READ +
ev.WRITE) does not work reliably under all circumstances, but only
in this primitive scenario.

greets

On Fri, Apr 27, 2012 at 11:07 AM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Fri, Apr 27, 2012 at 11:41 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>>> Probably I don't understand the code. You must poll only for reading
>>> on ZMQ_FD. But every zmq_send and zmq_recv consumes the mailbox.

Paul Colomiets | 27 Apr 11:57 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Fri, Apr 27, 2012 at 12:37 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> I don't think it is a docu thing.
>
> What the docu says  (and what the source looks like)
> zmq_getsockopt(ZMQ_FD) returns a fd (the mailbox's), which becomes
> readable, whenever the corresponding socket might have become readable
> and/or writeable for operation with the NOBLOCK option. To check which
> of these conditions are true, you have to use
> zmq_getsockopt(ZMQ_EVENTS) and check for ZMQ_POLLIN / ZMQ_POLLOUT
> respectively.
>

This is totally true. But it's silent on some things.

> If this is true, users should ONLY select/poll for the read event,
> e.g. using libev EV_READ, regardless if the user wants to
> zmq_recv(ZMQ_NOBLOCK) or zmq_send(ZMQ_NOBLOCK).

Yes. Only for EV_READ. I don't know how Lua works with
libev, so I gave bad advice, sorry (see below).

>
> I guess the "solution/workaround" of the example (using ev.READ +
> ev.WRITE) does not work reliably under all circumstances, but only
> in this primitive scenario.
>

Adding ev.WRITE helps only because the socket is *always* ready

Gerhard Lipp | 27 Apr 13:10 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

On Fri, Apr 27, 2012 at 11:57 AM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Fri, Apr 27, 2012 at 12:37 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> I don't think it is a docu thing.
>>
>> What the docu says  (and what the source looks like)
>> zmq_getsockopt(ZMQ_FD) returns a fd (the mailbox's), which becomes
>> readable, whenever the corresponding socket might have become readable
>> and/or writeable for operation with the NOBLOCK option. To check which
>> of these conditions are true, you have to use
>> zmq_getsockopt(ZMQ_EVENTS) and check for ZMQ_POLLIN / ZMQ_POLLOUT
>> respectively.
>>
>
> This is totally true. But it's silent on some things.
>
>> If this is true, users should ONLY select/poll for the read event,
>> e.g. using libev EV_READ, regardless if the user wants to
>> zmq_recv(ZMQ_NOBLOCK) or zmq_send(ZMQ_NOBLOCK).
>
> Yes. Only for EV_READ. I don't know how Lua works with
> libev, so I gave bad advice, sorry (see below).
>
>>
>> I guess the "solution/workaround" of the example (using ev.READ +
>> ev.WRITE) does not work reliably under all circumstances, but only
>> in this primitive scenario.
>>
>

Paul Colomiets | 27 Apr 21:10 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Fri, Apr 27, 2012 at 2:10 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> Ok, so i must always check if there are more events to process before
> returning from the io handler (frankly I don't understand the
> explanation). A short test still shows the lock explained earlier:

Try the following:

local zmq = require'zmq'
local ev = require'ev'
local c = zmq.init(1)
local xreq = c:socket(zmq.XREQ)
xreq:bind('tcp://127.0.0.1:13333')
local xrep = c:socket(zmq.XREP)
xrep:bind('tcp://127.0.0.1:13334')

local is_readable =
  function(sock)
     local events = sock:getopt(zmq.EVENTS)
     return events == zmq.POLLIN or events == (zmq.POLLIN + zmq.POLLOUT)
  end

local forward_io =
  function(src,dst)
     return ev.IO.new(
        function(loop,io) -- called whenever src:getopt(zmq.FD) becomes readable
            while is_readable(src) or is_readable(dst) do
               if is_readable(src) then
                  repeat

Gerhard Lipp | 2 May 09:11 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

hello paul!

i dont understand the background of your approach. why should the src
fd's io handler check the dst's events (and vice versa)?
even if this worked in this scenario, wouldn't it be a coincidence?
well, at least it is better than busy waiting / polling ...
regards

On Fri, Apr 27, 2012 at 9:10 PM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Fri, Apr 27, 2012 at 2:10 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> Ok, so i must always check if there are more events to process before
>> returning from the io handler (frankly I don't understand the
>> explanation). A short test still shows the lock explained earlier:
>
> Try the following:
>
> local zmq = require'zmq'
> local ev = require'ev'
> local c = zmq.init(1)
> local xreq = c:socket(zmq.XREQ)
> xreq:bind('tcp://127.0.0.1:13333')
> local xrep = c:socket(zmq.XREP)
> xrep:bind('tcp://127.0.0.1:13334')
>
> local is_readable =
>  function(sock)
>     local events = sock:getopt(zmq.EVENTS)
>     return events == zmq.POLLIN or events == (zmq.POLLIN + zmq.POLLOUT)

Paul Colomiets | 2 May 11:27 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Wed, May 2, 2012 at 10:11 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> hello paul!
>
> i dont understand the background of your approach. why should the src
> fd's io handler check the dst's events (and vice versa)?

It's the simplest way I've found to solve a problem.

> even if this worked in this scenario, wouldn't it be a coincidence?

No, it's not coincidence.

> well, at least it is better than busy waiting / polling ...

Yes. There are various ways to optimize the presented code; I just
picked something off the top of my head.

On Wed, May 2, 2012 at 11:26 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> btw, using the built-in poller just works:

Yes, the built-in poller works in an intuitive way.
ZMQ_FD is meant for experts. So if you don't
understand how it works, you can just use the
built-in poller without these problems.

-- 
Paul

Gerhard Lipp | 2 May 11:45 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

On Wed, May 2, 2012 at 11:27 AM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Wed, May 2, 2012 at 10:11 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> hello paul!
>>
>> i dont understand the background of your approach. why should the src
>> fd's io handler check the dst's events (and vice versa)?
>
> It's the simplest way I've found to solve a problem.
>
>> even if this worked in this scenario, wouldn't it be a coincidence?
>
> No, it's not coincidence.
>
>> well, at least it is better than busy waiting / polling ...
>
> Yes. There are various ways to optimize the presented code; I just
> picked something off the top of my head.

I really appreciate any help and ideas to solve this issue! I just did
not get the idea behind this attempt.
Could you explain it in more detail (something particular to observe)?

>
> On Wed, May 2, 2012 at 11:26 AM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> btw, using the built-in poller just works:
>
> Yes, the built-in poller works in an intuitive way.
> ZMQ_FD is meant for experts. So if you don't

Paul Colomiets | 2 May 12:27 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Wed, May 2, 2012 at 12:45 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>
> I really appreciate any help and ideas to solve this issue! I just did
> not get the idea behind this attempt.
> Could you explain it in more detail (something particular to observe)?
>

Ok. Behind the scenes, ZMQ_FD is basically a counter, which wakes up
poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS,
zmq_send and zmq_recv.

The following diagram shows a race condition with two sockets A and B,
in a scenario similar to yours:

https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit

Note: the last poll is entered with both counters set to zero, so it
will not wake up, despite the fact that there is a pending message.
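
The race can be replayed with a toy model. The sketch below is not libzmq code: the toy_sock type and all toy_* / forward_* names are hypothetical, and the model assumes only what is stated above (a per-socket counter that wakes poll when non-zero and is reset by getsockopt ZMQ_EVENTS, zmq_send and zmq_recv). forward_naive drains only the socket that woke the handler, while forward_both keeps going while either socket is readable, in the spirit of the loop suggested earlier in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one socket. NOT libzmq code; all names are hypothetical. */
typedef struct {
    int counter;  /* stands in for the ZMQ_FD wake-up counter */
    int pending;  /* messages waiting to be received */
} toy_sock;

/* A peer delivers a message: it is queued and the counter is bumped. */
static void toy_deliver(toy_sock *s) { s->pending++; s->counter++; }

/* What poll()/select() on the fd would report. */
static bool toy_fd_ready(const toy_sock *s) { return s->counter > 0; }

/* Models getsockopt(ZMQ_EVENTS): resets the counter, reports readability. */
static bool toy_readable(toy_sock *s) { s->counter = 0; return s->pending > 0; }

/* Models zmq_recv: resets the counter and takes one message. */
static void toy_recv(toy_sock *s) {
    assert(s->pending > 0);
    s->counter = 0;
    s->pending--;
}

/* Models zmq_send: resets the counter of the socket written to. */
static void toy_send(toy_sock *s) { s->counter = 0; }

/* Naive handler: woken for src, forwards from src only. */
static void forward_naive(toy_sock *src, toy_sock *dst) {
    while (toy_readable(src)) { toy_recv(src); toy_send(dst); }
}

/* Handler that does not return while either socket is still readable. */
static void forward_both(toy_sock *a, toy_sock *b) {
    for (;;) {
        bool progressed = false;
        if (toy_readable(a)) { toy_recv(a); toy_send(b); progressed = true; }
        if (toy_readable(b)) { toy_recv(b); toy_send(a); progressed = true; }
        if (!progressed) break;
    }
}
```

If a message is delivered to both sockets before the handler for A runs, forward_naive leaves both counters at zero while B still has a pending message, so the next poll would never wake up. forward_both leaves nothing pending in the same situation.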

-- 
Paul

Gerhard Lipp | 14 May 18:49 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hello Paul,

thanks for the diagram! I would like to locate the variables cntA  /
cntB in source to understand what is going on (and why). Could you
please point me in the right direction?

Regards

On Wed, May 2, 2012 at 12:27 PM, Paul Colomiets <paul <at> colomiets.name> wrote:
> Hi Gerhard,
>
> On Wed, May 2, 2012 at 12:45 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>>
>> I really appreciate any help and ideas to solve this issue! I just did
>> not get the idea behind this attempt.
>> Could you explain it in more detail (something particular to observe)?
>>
>
> Ok. Behind the scenes, ZMQ_FD is basically a counter, which wakes up
> poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS,
> zmq_send and zmq_recv.
>
> The following diagram shows a race condition with two sockets A and B,
> in a scenario similar to yours:
>
> https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit
>
> Note: the last poll is entered with both counters set to zero, so it
> will not wake up, despite the fact that there is a pending message.
>

Paul Colomiets | 14 May 21:15 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Gerhard,

On Mon, May 14, 2012 at 7:49 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> thanks for the diagram! I would like to locate the variables cntA  /
> cntB in source to understand what is going on (and why). Could you
> please point me in the right direction?
>

Look at src/signaler.cpp. On Linux, when eventfd is supported, the real
counter is inside that eventfd. In other implementations the counter is
the number of bytes currently in the pipe's buffer. In any case its
value is read inside signaler_t::recv.
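
To make the pipe variant concrete, here is a small POSIX sketch of "the counter is the number of bytes in the pipe's buffer". It is not libzmq's implementation; the helper names (make_signal_pipe, signal_send, fd_readable, signal_drain) are made up for illustration. It only shows that the read end polls as readable while bytes are buffered and stops doing so once they have been read.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Create a non-blocking pipe: fds[0] is the read end, fds[1] the write
 * end. Returns 0 on success, -1 on failure. */
static int make_signal_pipe(int fds[2]) {
    if (pipe(fds) != 0) return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFL, 0);
        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0)
            return -1;
    }
    return 0;
}

/* "Signal": add one byte to the pipe buffer (counter + 1). */
static bool signal_send(int wfd) {
    char byte = 0;
    return write(wfd, &byte, 1) == 1;
}

/* What an event loop sees: is the read end readable right now? */
static bool fd_readable(int rfd) {
    assert(rfd >= 0);
    struct pollfd p = { .fd = rfd, .events = POLLIN, .revents = 0 };
    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}

/* Consume every buffered signal (counter back to 0).
 * Returns the number of bytes that were read. */
static int signal_drain(int rfd) {
    char buf[64];
    int total = 0;
    ssize_t n;
    while ((n = read(rfd, buf, sizeof buf)) > 0) total += (int)n;
    return total;
}
```

After two signal_send calls the read end is readable; after signal_drain it is not readable again until the next signal, whatever the state of the message queue the fd is meant to announce.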

-- 
Paul

Justin Karneges | 26 Jun 01:16 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

On Wednesday, May 02, 2012 03:27:42 AM Paul Colomiets wrote:
> Ok. Behind the scenes, ZMQ_FD is basically a counter, which wakes up
> poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS,
> zmq_send and zmq_recv.
> 
> The following diagram shows a race condition with two sockets A and B,
> in a scenario similar to yours:
> 
> https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit
> 
> Note: the last poll is entered with both counters set to zero, so it
> will not wake up, despite the fact that there is a pending message.

Was there ever a resolution on this?

I am using ZMQ_FD now to integrate into an event loop, and I am seeing some 
odd behavior when testing a hello world REQ/REP on the REP side.

The REP server binds and waits for data. The fd is indicated as readable 
twice. First, the events are 0 (maybe this happens when the client connects?), 
then the events are 1 (ZMQ_POLLIN). The server considers the REP socket 
readable and so it reads a message without blocking. Now it wants to reply, 
but it considers the socket not yet writable. I was expecting that after 
reading from the socket, the fd would be indicated as readable and the events 
would be 2 (ZMQ_POLLOUT). However, this event never comes and so the server 
just idles.

Now here's where it gets weird: if I kill the client (which was also waiting 
around, as it never got a reply), then the server gets new events with 

Paul Colomiets | 27 Jun 21:44 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi Justin,

On Tue, Jun 26, 2012 at 2:16 AM, Justin Karneges <justin <at> affinix.com> wrote:
> On Wednesday, May 02, 2012 03:27:42 AM Paul Colomiets wrote:
>> Ok. Behind the scenes, ZMQ_FD is basically a counter, which wakes up
>> poll when it is non-zero. The counter is reset on each getsockopt ZMQ_EVENTS,
>> zmq_send and zmq_recv.
>>
>> The following diagram shows a race condition with two sockets A and B,
>> in a scenario similar to yours:
>>
>> https://docs.google.com/drawings/d/1F97jpdbYMjjb6-2VzRtiL2LpHy638-AEOyrUX84HL78/edit
>>
>> Note: the last poll is entered with both counters set to zero, so it
>> will not wake up, despite the fact that there is a pending message.
>
> Was there ever a resolution on this?
>
> I am using ZMQ_FD now to integrate into an event loop, and I am seeing some
> odd behavior when testing a hello world REQ/REP on the REP side.
>
> The REP server binds and waits for data. The fd is indicated as readable
> twice. First, the events are 0 (maybe this happens when the client connects?),
> then the events are 1 (ZMQ_POLLIN). The server considers the REP socket
> readable and so it reads a message without blocking. Now it wants to reply,
> but it considers the socket not yet writable. I was expecting that after
> reading from the socket, the fd would be indicated as readable and the events
> would be 2 (ZMQ_POLLOUT). However, this event never comes and so the server
> just idles.

Justin Karneges | 27 Jun 21:57 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

On Wednesday, June 27, 2012 12:44:45 PM Paul Colomiets wrote:
> On Tue, Jun 26, 2012 at 2:16 AM, Justin Karneges <justin <at> affinix.com> wrote:
> > Does this mean that maybe I need to check ZMQ_EVENTS not only after read
> > indications on the fd, but also after anytime I call zmq_recv() ?
> 
> I've not tried REP sockets with an asynchronous event loop (XREP is
> usually needed). But I'm pretty sure you're right. You need to recheck
> ZMQ_EVENTS after doing zmq_recv(), as the state of the socket changes
> at that time (before that it's not writable, not because of network
> issues but because of the state machine).

Yeah I understand the ability to write is part of the state change that occurs 
by reading. I just wonder why the ZMQ_FD isn't triggered internally by 
zmq_recv(). That would have been more intuitive I think.

> However, checking ZMQ_EVENTS after each zmq_recv and zmq_send is
> needed anyway, as described in current documentation and in this ML
> thread.

In which document is this described? I do not see this in the ZMQ_EVENTS 
section of the zmq_getsockopt man page in 2.2.0.

In any case, thanks for clarifying. I'd actually gone ahead and changed my 
code to check ZMQ_EVENTS after all three scenarios (post zmq_recv, post 
zmq_send, and upon read indication of the ZMQ_FD), and that managed to get 
things to work properly.
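
For illustration only, the REP case can be modelled as a tiny state machine. This is a sketch under the assumptions stated in this thread (receiving a request makes the socket writable without signalling the fd); the toy_rep type and the rep_* / serve_until_idle names are hypothetical and are not the libzmq API. The bit values 1 and 2 mirror the ZMQ_POLLIN / ZMQ_POLLOUT values mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

enum { REP_POLLIN = 1, REP_POLLOUT = 2 };

/* Toy model of a REP socket. NOT libzmq code; names are hypothetical. */
typedef struct {
    int  queued;       /* requests waiting to be received */
    bool must_reply;   /* true between a recv and the matching send */
    bool fd_signaled;  /* what the event loop would see on the fd */
} toy_rep;

/* A request arrives from the network: this is what signals the fd. */
static void rep_request_arrives(toy_rep *s) {
    s->queued++;
    s->fd_signaled = true;
}

/* Models getsockopt(ZMQ_EVENTS): consumes the signal, reports state. */
static int rep_events(toy_rep *s) {
    s->fd_signaled = false;
    if (s->must_reply) return REP_POLLOUT;
    return s->queued > 0 ? REP_POLLIN : 0;
}

/* Models zmq_recv: the state changes, but the fd is NOT signaled. */
static void rep_recv(toy_rep *s) {
    assert(!s->must_reply && s->queued > 0);
    s->queued--;
    s->must_reply = true;
    s->fd_signaled = false;
}

/* Models zmq_send of the reply. */
static void rep_send(toy_rep *s) {
    assert(s->must_reply);
    s->must_reply = false;
    s->fd_signaled = false;
}

/* Recheck the events after every operation instead of waiting for the
 * fd. Returns the number of replies sent. */
static int serve_until_idle(toy_rep *s) {
    int replies = 0;
    for (;;) {
        int ev = rep_events(s);
        if (ev & REP_POLLIN) {
            rep_recv(s);
        } else if (ev & REP_POLLOUT) {
            rep_send(s);
            replies++;
        } else {
            return replies;
        }
    }
}
```

In this model, a handler that waits for the fd after rep_recv would idle forever, because fd_signaled stays false even though rep_events would report REP_POLLOUT. Rechecking after each operation, as serve_until_idle does, avoids that.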

Justin

Paul Colomiets | 27 Jun 22:19 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Hi,

On Wed, Jun 27, 2012 at 10:57 PM, Justin Karneges <justin <at> affinix.com> wrote:
> On Wednesday, June 27, 2012 12:44:45 PM Paul Colomiets wrote:
>> On Tue, Jun 26, 2012 at 2:16 AM, Justin Karneges <justin <at> affinix.com> wrote:
>> > Does this mean that maybe I need to check ZMQ_EVENTS not only after read
>> > indications on the fd, but also after anytime I call zmq_recv() ?
>>
>> I've not tried REP sockets with an asynchronous event loop (XREP is
>> usually needed). But I'm pretty sure you're right. You need to recheck
>> ZMQ_EVENTS after doing zmq_recv(), as the state of the socket changes
>> at that time (before that it's not writable, not because of network
>> issues but because of the state machine).
>
> Yeah I understand the ability to write is part of the state change that occurs
> by reading. I just wonder why the ZMQ_FD isn't triggered internally by
> zmq_recv(). That would have been more intuitive I think.
>

For performance reasons: it's cheaper to call zmq_getsockopt than to
write to, poll and read from the fd.

>> However, checking ZMQ_EVENTS after each zmq_recv and zmq_send is
>> needed anyway, as described in current documentation and in this ML
>> thread.
>
> In which document is this described? I do not see this in the ZMQ_EVENTS
> section of the zmq_getsockopt man page in 2.2.0.
>


Martin Hurton | 26 Apr 10:45 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Thanks Gerhard, could you please create an issue in Jira? And please
attach those Lua programs too. Thanks.

- mh

On Wed, Apr 25, 2012 at 5:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
> i figured out to boil down an example, which shows this bug.
> it consists of three files:
> 1) x.lua doing the XREP XREQ stuff, must be started once
> 2) rep.lua implementing a simple echo replier, must be started once
> 3) req.lua making the request to rep.lua through x.lua. must be
> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
> the ev.WRITE is a bad workaround to this issue.
>
> -----------
> -- x.lua:
> ------------
> local zmq = require'zmq'
> local ev = require'ev'
> local c = zmq.init(1)
> local xreq = c:socket(zmq.XREQ)
> xreq:bind('tcp://127.0.0.1:13333')
> local xrep = c:socket(zmq.XREP)
> xrep:bind('tcp://127.0.0.1:13334')
> local forward_io =
>   function(src,dst)
>      return ev.IO.new(
>         function(loop,io)
>            while true do
>               local events = src:getopt(zmq.EVENTS)

Gerhard Lipp | 26 Apr 11:32 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

Issue created.

On Thu, Apr 26, 2012 at 10:45 AM, Martin Hurton <hurtonm <at> gmail.com> wrote:
> Thanks Gerhard, could you please create an issue in Jira? And please
> attach those Lua programs too. Thanks.
>
> - mh
>
> On Wed, Apr 25, 2012 at 5:08 PM, Gerhard Lipp <gelipp <at> googlemail.com> wrote:
>> i figured out to boil down an example, which shows this bug.
>> it consists of three files:
>> 1) x.lua doing the XREP XREQ stuff, must be started once
>> 2) rep.lua implementing a simple echo replier, must be started once
>> 3) req.lua making the request to rep.lua through x.lua. must be
>> started TWICE to produce the error. THIS PROCESS LOCKS. uncommenting
>> the ev.WRITE is a bad workaround to this issue.
>>
>> -----------
>> -- x.lua:
>> ------------
>> local zmq = require'zmq'
>> local ev = require'ev'
>> local c = zmq.init(1)
>> local xreq = c:socket(zmq.XREQ)
>> xreq:bind('tcp://127.0.0.1:13333')
>> local xrep = c:socket(zmq.XREP)
>> xrep:bind('tcp://127.0.0.1:13334')
>> local forward_io =
>>   function(src,dst)
>>      return ev.IO.new(

Gerhard Lipp | 2 May 10:26 2012

Re: missing events ZMQ_FD / ZMQ_EVENTS

btw, using the built-in poller just works:

-------
-- xpoller.lua
---------
local zmq = require'zmq'
zmq.poller = require'zmq.poller'
local ev = require'ev'
local c = zmq.init(1)
local xreq = c:socket(zmq.XREQ)
xreq:bind('tcp://127.0.0.1:13333')
local xrep = c:socket(zmq.XREP)
xrep:bind('tcp://127.0.0.1:13334')

local is_readable =
   function(sock)
      local events = sock:getopt(zmq.EVENTS)
      return events == zmq.POLLIN or events == (zmq.POLLIN + zmq.POLLOUT)
   end

local forward =
   function(src,dst)
      while is_readable(src) do
         repeat
            local data = assert(src:recv(zmq.NOBLOCK))
            local more = src:getopt(zmq.RCVMORE) > 0
            dst:send(data,more and zmq.SNDMORE or 0)
         until not more
      end
   end