Julian Bui | 7 Jun 2012 22:59

event_base_dispatch returning -1 - help debugging

Hi everyone,


I am having trouble with my libevent 2.0.18 server.  The dispatch loop keeps returning -1 and I cannot determine the cause.

I have not tried searching the mailing list as there does not seem to be a search option (http://archives.seul.org/libevent/users/)

PROBLEM:
My server works most of the time, but every once in a while, when there are pending requests going into it and I restart the server, it claims it has successfully written to the socket.  As soon as the callback from the EV_WRITE event returns, the dispatch loop exits with -1.  As I understand it, this means there was an error.

OVERVIEW/CONTEXT:
At a high level, I add to my base an event with EV_READ | EV_PERSIST that listens on my server socket and calls onAccept when triggered, and then I start the dispatch loop.

onAccept creates a client socket, and then I add to my base an event with EV_READ | EV_PERSIST that listens on the client socket and calls onRead when triggered.

onRead calls recv and then calls event_del on the read event.  It then adds to my base an event with EV_WRITE | EV_PERSIST that listens on the client socket and calls onWrite when triggered.

onWrite always removes all events (read/write) associated with this client socket and then closes the socket.  As far as I can tell, before the dispatch loop exits, there was no socket error on write/close.

MY STEPS FOR DEBUGGING:
  1. Tried adding logging, but it never seems to print anything.  I created an onLogEvent(int severity, const char* msg) method and passed it to event_set_log_callback(...).  I also created an onFatalError method and passed it to event_set_fatal_callback(...), then enabled debug mode.  In these callback methods, I just print out the msg and/or error code to stdout.
  2. After I break out of the dispatch loop and see -1 returned, I attempt to determine if the loop was broken or exited.  I call if(!event_base_got_break(s_event_base)) and if(!event_base_got_exit(s_event_base)), and both evaluate to true, meaning the dispatch loop was not broken or exited.
  3. After every Windows networking call, I check the return value and, if it is equal to SOCKET_ERROR, I call WSAGetLastError().  However, when my dispatch loop exits, I can see that there was no error returned from the previous networking call.
  4. I test event_pending(my_event, EV_READ | EV_PERSIST, NULL) on the persistent event I set up to listen for incoming connections.  This event should be persistent and I never event_del() it.  However, event_pending keeps returning 0 at all times (even before exiting the dispatch loop).  I suspect that this method is broken or, more likely, that I am not using it correctly.

QUESTIONS:

I am stumped here.  I am also frustrated that my debugging/logging attempts are failing.  

I am looking for suggestions and possible explanations.

Maybe there are still things I need to do after the socket is closed?

Please help me out.

Thanks in advance,
-Julian

Nick Mathewson | 8 Jun 2012 22:10

Re: event_base_dispatch returning -1 - help debugging

On Thu, Jun 7, 2012 at 4:59 PM, Julian Bui <julianbui <at> gmail.com> wrote:
> Hi everyone,
>
> I am having trouble with my libevent 2.0.18 server.  The dispatch loop keeps
> returning -1 and I cannot determine the cause.
>
> I have not tried searching the mailing list as there does not seem to be a
> search option (http://archives.seul.org/libevent/users/)

Step one might be to build with debugging support and try that log
trick again, with debugging logs enabled; that might shed some light
on what's going on.

It sounds (based on your mention of WSAGetLastError()) like you're
using Windows here.  The only ways that I can see for win32select's
win32_dispatch function to return -1 are if select() returns -1, or
if realloc fails (returns NULL).

A select() failure seems likelier.  You could set a debugging
tracepoint at the part of win32_dispatch that says:
   if (res <= 0) {
      return res;
   }
and see if it ever returns with res = -1, and if so what the value of
WSAGetLastError is.  Or you could insert a printf there if you don't
want to mess with the debugger.

Do any of the error codes in the documentation for select() seem
plausible to you?  The likeliest one as far as I can tell is that
there is a nonexistent (or no-longer-existent) socket still in the
list of sockets that select() is looking at.
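
As a rough illustration of that failure mode, here is a minimal sketch that
uses plain POSIX select() rather than libevent or Winsock.  The helper names
probe_fd_with_select and demo_stale_fd are made up for this example.  On
POSIX the stale descriptor is reported as EBADF; on Windows the corresponding
code from WSAGetLastError() would presumably be WSAENOTSOCK, but that is an
assumption and is not exercised here.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Calls select() on a read set containing only `fd`, with a zero timeout so
 * the call never blocks.  Returns 0 if select() succeeded, or the errno
 * value it failed with.
 */
int probe_fd_with_select(int fd)
{
    fd_set readset;
    struct timeval no_wait = {0, 0};

    assert(fd >= 0 && fd < FD_SETSIZE);   /* FD_SET is undefined otherwise */
    FD_ZERO(&readset);
    FD_SET(fd, &readset);

    if (select(fd + 1, &readset, NULL, NULL, &no_wait) < 0)
        return errno;
    return 0;
}

/*
 * Opens a pipe, probes its read end while it is open, closes it, and then
 * probes the stale descriptor number again.  Stores both results and
 * returns 0, or returns -1 if the pipe could not be created.
 */
int demo_stale_fd(int *open_result, int *closed_result)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    *open_result = probe_fd_with_select(fds[0]);
    close(fds[0]);
    *closed_result = probe_fd_with_select(fds[0]);   /* no longer valid */
    close(fds[1]);
    return 0;
}
```

Calling demo_stale_fd(&a, &b) should leave a at 0 and b at EBADF: one closed
descriptor left in the set is enough to make the whole select() call fail.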
***********************************************************************
To unsubscribe, send an e-mail to majordomo <at> freehaven.net with
unsubscribe libevent-users    in the body.

Julian Bui | 12 Jun 2012 18:47

Re: event_base_dispatch returning -1 - help debugging

Thanks for the quick response, Nick.


> Step one might be to build with debugging support

Ah, I didn't realize debugging required you to build with support for it.

I am using windows and have built with just `nmake` in visual studio.  What is the preferred method of building in windows?  When I use the visual studio command line, I cannot seem to use configure.  `configure`  and `configure --enable-debug-mode` show "is not recognized as an internal or external command, operable program or batch file"

> A select() failure seems likelier

I think I might possibly be misunderstanding the usage of libevent.  I thought libevent abstracts away the select mechanism so I don't have to deal with it.  When I was looking at example code, they never once had to deal with the underlying mechanism like select().  Could you possibly comment on this?  What is the general strategy/structure/architecture of a program if I am to use both libevent and select()?  Is there any example code that shows this interaction?

Please let me know.

Thanks for all your help
-Julian



Julian Bui | 12 Jun 2012 19:06

Re: event_base_dispatch returning -1 - help debugging

Oh, one more thing.  I would really like to verify that there is indeed an event still pending - because I can at least restart the dispatch loop if there is an event pending.  But I have not been able to correctly use event_pending for some reason.  My code is below and it always prints "event is not pending" regardless of whether or not the dispatch loop is running.  The code sets up a persistent event that will handle every incoming connection to my server.  I would expect that at some point it would print out that the event is pending...because like I said, this code DOES work most of the time, and if it's not pending I have no idea why the handlers would even still be triggered.


struct event* my_event = event_new(s_event_base, server_socket, EV_READ | EV_PERSIST, &TileSocketServer::onAccept, (void*) client);

      if(my_event != NULL)
      {
         int add_err = event_add(my_event, NULL);

         if(add_err != 0)
            cout << "add_err:" << add_err << endl;

         int is_evt_pending = event_pending(my_event, EV_READ | EV_PERSIST, NULL);
         if(is_evt_pending == 1)
         {
            cout << "event is still pending" << endl;
         }
         else
         {
            cout << "event is still pending" << endl;
         }





Dave Hart | 12 Jun 2012 21:09

Re: event_base_dispatch returning -1 - help debugging

On Tue, Jun 12, 2012 at 5:06 PM, Julian Bui <julianbui <at> gmail.com> wrote:
> Oh, one more thing.  I would really like to verify that there is indeed an
> event still pending - because I can at least restart the dispatch loop if
> there is an event pending.  But I have not been able to correctly use
> event_pending for some reason.  My code is below

You have incomplete pseudocode below.

> and it always prints "event is not pending" regardless of

My reading has it always printing "event is still pending" regardless
of anything.

> whether or not the dispatch loop is running.
>  The code sets up a persistent event that will handle every incoming
> connection to my server.  I would expect that at some point it would print
> out that the event is pending...because like I said, this code DOES work
> most of the time, and if it's not pending I have no idea why the handlers
> would even still be triggered.
>
> struct event* my_event = event_new(s_event_base, server_socket, EV_READ |
> EV_PERSIST, &TileSocketServer::onAccept, (void*) client);
>
>       if(my_event != NULL)
>       {
>          int add_err = event_add(my_event, NULL);
>
>          if(add_err != 0)
>             cout << "add_err:" << add_err << endl;
>
>          int is_evt_pending = event_pending(my_event, EV_READ | EV_PERSIST,
> NULL);
>          if(is_evt_pending == 1)
>          {
>             cout << "event is still pending" << endl;
>          }
>          else
>          {
>             cout << "event is still pending" << endl;
>          }

I suggest providing a complete, freestanding example that others can
actually compile rather than simply speculate about.

Cheers,
Dave Hart

Nick Mathewson | 12 Jun 2012 22:21

Re: event_base_dispatch returning -1 - help debugging

On Tue, Jun 12, 2012 at 3:09 PM, Dave Hart <davehart <at> gmail.com> wrote:
> On Tue, Jun 12, 2012 at 5:06 PM, Julian Bui <julianbui <at> gmail.com> wrote:
>> Oh, one more thing.  I would really like to verify that there is indeed an
>> event still pending - because I can at least restart the dispatch loop if
>> there is an event pending.  But I have not been able to correctly use
>> event_pending for some reason.  My code is below
>
> You have incomplete pseudocode below.
>
>> and it always prints "event is not pending" regardless of
>
> My reading has it always printing "event is still pending" regardless
> of anything.

Actually I think it's a C error.  Check out the documentation for
event_pending():
/*
   @return true if the event is pending on any of the events in 'what', (that
  is to say, it has been added), or 0 if the event is not added.
*/

Note that it says "@return true"; it doesn't say "@return 1".  In C,
any nonzero integer is true.

The documentation should probably be more clear that what it actually
returns is a bitfield of which flags are set, ANDed with its "events"
argument.
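
To make the difference concrete, here is a small self-contained sketch that
does not link against libevent.  demo_event_pending is a hypothetical
stand-in that only models the "flags ANDed with the argument" behaviour
described above, and the DEMO_EV_* constants are assumed to mirror the
values in event2/event.h (check your headers).

```c
#include <assert.h>

/* Assumed to mirror libevent's flag values; verify against event2/event.h. */
#define DEMO_EV_READ    0x02
#define DEMO_EV_PERSIST 0x10

/*
 * Hypothetical stand-in for event_pending(): `added_for` holds the I/O
 * kinds the event is currently added for, `what` is the mask being asked
 * about.  Returns the overlap as a bitmask, or 0 if there is none.
 */
int demo_event_pending(short added_for, short what)
{
    assert(what != 0);   /* asking about no event kinds is meaningless */
    return added_for & what;
}

/* Buggy check: only true if the bitmask happens to equal exactly 1. */
int is_pending_buggy(short added_for, short what)
{
    return demo_event_pending(added_for, what) == 1;
}

/* Correct check: any nonzero result means the event is pending. */
int is_pending(short added_for, short what)
{
    return demo_event_pending(added_for, what) != 0;
}
```

With an event added for reading, is_pending_buggy(DEMO_EV_READ,
DEMO_EV_READ | DEMO_EV_PERSIST) is 0 because the mask comes back as 0x02,
while is_pending() with the same arguments is 1.  In real code the
equivalent change is to test the result of event_pending() with != 0 (or
plain truthiness) instead of == 1.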

yrs,
-- 
Nick

Nir Soffer | 13 Jun 2012 23:44

Re: event_base_dispatch returning -1 - help debugging


On Jun 12, 2012, at 11:21 PM, Nick Mathewson wrote:

> Actually I think it's a C error.  Check out the documentation for
> event_pending():
> /*
>   @return true if the event is pending on any of the events in 'what', (that
>  is to say, it has been added), or 0 if the event is not added.
> */
>
> Note that it says "@return true"; it doesn't say "@return 1".  In C,
> any nonzero integer is true.
>
> The documentation should probably be more clear that what it actually
> returns is a bitfield of which flags are set, ANDed with its "events"
> argument.

I don't think it will be a good idea to document what is returned for "true", because it will prevent changes in the implementation later.

How about "non-zero" instead of "true"?

Julian Bui | 12 Jun 2012 22:31

Re: event_base_dispatch returning -1 - help debugging

You're right, Dave, about the print statements.  I accidentally pasted my code and then modified what was printed out.  The else should read: else { cout << "event is not pending" << endl; } and this line of code always gets executed, despite it still getting triggered on incoming connection.


I'll work on getting a more basic piece of code that I can put on the internet.  It's just difficult to release a complete piece of code with this being a commercial product.



Dave Hart | 12 Jun 2012 21:08

Re: event_base_dispatch returning -1 - help debugging

On Tue, Jun 12, 2012 at 4:47 PM, Julian Bui <julianbui <at> gmail.com> wrote:
> Thanks for the quick response, Nick.
>
>>  Step one might be to build with debugging support
>
> Ah, I didn't realize debugging required you to build with support for it.
>
> I am using windows and have built with just `nmake` in visual studio.  What
> is the preferred method of building in windows?  When I use the visual
> studio command line, I cannot seem to use configure.  `configure`  and
> `configure --enable-debug-mode` show "is not recognized as an internal or
> external command, operable program or batch file"

configure scripts generally don't work on Windows.  If you search
configure.ac for debug-mode you should find the snippet that handles
that option.  On systems where configure works, the resulting
selection is enacted by a #define in config.h.  If you arrange for the
same #define to be in place for libevent and your code, you'll get the
same debugging-capable libevent.

>> A select() failure seems likelier
>
> I think I might possibly be misunderstanding the usage of libevent.  I
> thought libevent abstracts away the select mechanism so I don't have to deal
> with it.  When I was looking at example code, they never once had to deal
> with the underlying mechanism like select().  Could you possibly comment on
> this?  What is the general strategy/structure/architecture of a program if I
> am to use both libevent and select()?  Is there any example code that shows
> this interaction?

You're misunderstanding Nick's response.  He's trying to find the root
cause of the failure, which necessarily means looking inside the
libevent "black box" to its implementation.  libevent's async I/O
abstraction is imperfect and steadily improving, recently due in great
part to Nick's workmanlike efforts.  Nick is a great resource and I
recommend you use him sparingly -- to help guide your investigation
into your problem, and assist if needed in crafting a suitable fix
once the failure is understood.

More often than not, Nick's patient "customer service" of programmers
with libevent issues reveals no obvious improvement needed in the
libevent code, though sometimes highlighting need for better
documentation, which he's also been tackling.  Still, while I'm sure
his assistance is valued by those he helps, in terms of the libevent
project, Nick's talents are underutilized compared to situations where
libevent bugs are exposed and corrected.

Cheers,
Dave Hart

Nick Mathewson | 12 Jun 2012 22:19

Re: event_base_dispatch returning -1 - help debugging

On Tue, Jun 12, 2012 at 3:08 PM, Dave Hart <davehart <at> gmail.com> wrote:
> On Tue, Jun 12, 2012 at 4:47 PM, Julian Bui <julianbui <at> gmail.com> wrote:
>> Thanks for the quick response, Nick.
>>
>>>  Step one might be to build with debugging support
>>
>> Ah, I didn't realize debugging required you to build with support for it.
>>
>> I am using windows and have built with just `nmake` in visual studio.  What
>> is the preferred method of building in windows?  When I use the visual
>> studio command line, I cannot seem to use configure.  `configure`  and
>> `configure --enable-debug-mode` show "is not recognized as an internal or
>> external command, operable program or batch file"
>
> configure scripts generally don't work on Windows.  If you search
> configure.ac for debug-mode you should find the snippet that handles
> that option.  On systems where configure works, the resulting
> selection is enacted by a #define in config.h.  If you arrange for the
> same #define to be in place for libevent and your code, you'll get the
> same debugging-capable libevent.

The autoconf script works fine on windows under mingw.  Under nmake,
the best option is probably to find the right option like Dave says,
and then add it to WIN32-Code/event2/event-config.h .

>>> A select() failure seems likelier
>>
>> I think I might possibly be misunderstanding the usage of libevent.  I
>> thought libevent abstracts away the select mechanism so I don't have to deal
>> with it.  When I was looking at example code, they never once had to deal
>> with the underlying mechanism like select().  Could you possibly comment on
>> this?  What is the general strategy/structure/architecture of a program if I
>> am to use both libevent and select()?  Is there any example code that shows
>> this interaction?
>
> You're misunderstanding Nick's response.  He's trying to find the root
> cause of the failure, which necessarily means looking inside the
> libevent "black box" to its implementation.

For what it's worth, I think the issue is not necessarily a bug in the
Libevent code here.  Like I said, it is much likelier that there is
some event still added whose socket has closed, or which was never
open, or something like that.  I think this is making the select()
call inside win32_dispatch() fail.  If I'm guessing right, the right
fix is probably to find that event and event_del() it before closing
its socket.
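
The ordering matters more than the specific API.  Below is a minimal sketch
of "stop watching, then close", again using plain POSIX select() instead of
libevent; struct watch_list and the watch_* helpers are hypothetical names
invented for this example.  Here FD_CLR() plays the role that event_del()
would play in a libevent program.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical registry of descriptors that the loop polls. */
struct watch_list {
    fd_set readset;
    int max_fd;
};

void watch_init(struct watch_list *w)
{
    FD_ZERO(&w->readset);
    w->max_fd = -1;
}

void watch_add(struct watch_list *w, int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &w->readset);
    if (fd > w->max_fd)
        w->max_fd = fd;
}

/* Stop watching first, then close: the analogue of event_del(), close(). */
void watch_remove_and_close(struct watch_list *w, int fd)
{
    FD_CLR(fd, &w->readset);
    close(fd);
}

/* One non-blocking poll; returns select()'s result (-1 on error). */
int watch_poll_once(const struct watch_list *w)
{
    fd_set ready = w->readset;          /* select() modifies its argument */
    struct timeval no_wait = {0, 0};

    return select(w->max_fd + 1, &ready, NULL, NULL, &no_wait);
}
```

If every descriptor leaves the list through watch_remove_and_close(), the
poll keeps succeeding.  If a descriptor is closed directly while it is still
in the list, the next watch_poll_once() should return -1, which is the
pattern suspected in this thread.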

-- 
Nick

