Jeremy C. Reed | 2 Jul 2009 23:10

trial buildbot.test failures on NetBSD

I am testing BuildBot version 0.7.10p1 on NetBSD/i386 4.99.62. I plan on 
importing this as a "pkgsrc" package. My existing installed pkgsrc 
dependencies are py25-setuptools-0.6c9, py25-twisted-8.1.0, and 
python25-2.5.2nb5. (Plus I hope to use BuildBot to automate builds and 
tests on several platforms for an upcoming modular, high performance, 
lightweight DNS server.) Thank you for your BuildBot software!

The "/usr/bin/env PYTHONPATH=. trial buildbot.test" run had some problems.

FAILED (skips=39, failures=12, errors=14, successes=354)

The following is the output with most of the OK and SKIPPED removed. I 
don't know if this is NetBSD or pkgsrc related or maybe this is normal.

If this is covered in docs or there is a better way for me to share these 
issues, please let me know. (Yes, I am new to Buildbot :)

Traceback (most recent call last):
  File "usr/pkg/lib/python2.5/site-packages/twisted/python/usage.py", line 241, in parseOptions

  File "usr/pkg/lib/python2.5/site-packages/twisted/scripts/trial.py", line 293, in postOptions

  File "usr/pkg/lib/python2.5/site-packages/twisted/scripts/trial.py", line 279, in _loadReporterByName

  File "usr/pkg/lib/python2.5/site-packages/twisted/plugin.py", line 200, in getPlugins

--- <exception caught here> ---
  File "usr/pkg/lib/python2.5/site-packages/twisted/plugin.py", line 179, in getCache

  File "usr/pkg/lib/python2.5/site-packages/twisted/python/filepath.py", line 574, in setContent

Dustin J. Mitchell | 3 Jul 2009 18:22

Re: trial buildbot.test failures on NetBSD

On Thu, Jul 2, 2009 at 5:10 PM, Jeremy C. Reed<reed@...> wrote:
> FAILED (skips=39, failures=12, errors=14, successes=354)

Hmm, I assume there's some common cause behind all of these.  The
thing I noticed is that all of those tests involve starting a new
buildslave and communicating (via TCP) with it.  Is this something
that your system may make difficult?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com

------------------------------------------------------------------------------
Jeremy C. Reed | 3 Jul 2009 20:08

Re: trial buildbot.test failures on NetBSD

On Fri, 3 Jul 2009, Dustin J. Mitchell wrote:

> On Thu, Jul 2, 2009 at 5:10 PM, Jeremy C. Reed<reed@...> wrote:
> > FAILED (skips=39, failures=12, errors=14, successes=354)
> 
> Hmm, I assume there's some common cause behind all of these.  The
> thing I noticed is that all of those tests involve starting a new
> buildslave and communicating (via TCP) with it.  Is this something
> that your system may make difficult?

Nothing that I know about.

Note I am running the trial buildbot.test as a non-root user.

------------------------------------------------------------------------------
Dustin J. Mitchell | 5 Jul 2009 22:49

Re: trial buildbot.test failures on NetBSD

Can you try running just one of the failing tests, and send me the
resulting trial.log?

Dustin Sallings has donated a FreeBSD 7 buildbot to the
new-and-improved metabuildbot.  Do you have a NetBSD system we could
use?

Here are the errors from that system:
  http://buildbot.net/metabuildbot/builders/os-freebsd/builds/0/steps/trial/logs/problems

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com

------------------------------------------------------------------------------
Jeremy C. Reed | 6 Jul 2009 16:40

Re: trial buildbot.test failures on NetBSD

On Sun, 5 Jul 2009, Dustin J. Mitchell wrote:

> Can you try running just one of the failing tests, and send me the
> resulting trial.log?

I can try. Can you provide instructions or point me to docs on running 
an individual test?

> Dustin Sallings has donated a FreeBSD 7 buildbot to the
> new-and-improved metabuildbot.  Do you have a NetBSD system we could
> use?

Do you need to log in? Or just have me set up a "slave"?

> Here are the errors from that system:
>   http://buildbot.net/metabuildbot/builders/os-freebsd/builds/0/steps/trial/logs/problems

Fewer problems than mine. Maybe I am missing some dependencies.

------------------------------------------------------------------------------
Dustin J. Mitchell | 6 Jul 2009 18:59

Re: trial buildbot.test failures on NetBSD

On Mon, Jul 6, 2009 at 10:40 AM, Jeremy C. Reed<reed@...> wrote:
> I can try. Can you provide instructions or point me to docs on running
> an individual test?

trial buildbot.test.test_run.Triggers.testTriggerBuild

> Do you need to login? Or just have me setup a "slave"?

Well, having a login might help me diagnose these problems, but
long-term I'd just like a buildslave connected to
buildbot.net/metabuildbot.

> Fewer problems than mine. Maybe I am missing some dependencies.

I suspect there's some assumption that Buildbot or Twisted is making
about the underlying operating system that's not satisfied in NetBSD
-- most of these errors are basically "things didn't happen in the
order I expected with interactions between multiple processes".  This
isn't the fork/vfork distinction (which process runs first), but might
have something to do with handling of inherited file descriptors or
perhaps pty's.  Does this remind you of any common
NetBSD-compatibility problems?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com

------------------------------------------------------------------------------

Jeremy C. Reed | 7 Jul 2009 17:01

Re: trial buildbot.test failures on NetBSD

On Mon, 6 Jul 2009, Dustin J. Mitchell wrote:

> trial buildbot.test.test_run.Triggers.testTriggerBuild

Okay, I sent the details for trial 
buildbot.test.test_run.BuildPrioritization.testPriority to you off-list.

> > Do you need to login? Or just have me setup a "slave"?
> 
> Well, having a login might help me diagnose these problems, but
> long-term I'd just like a buildslave connected to
> buildbot.net/metabuildbot.

Sure, I will contribute a slave. I am still learning Buildbot and have not 
yet been able to test it locally. I found 
http://buildbot.net/trac/wiki/MetaBuildbot but don't see any docs on what 
steps to run or configure on my local system. Any pointers would be 
appreciated. I am now reading 
http://djmitche.github.com/buildbot/docs/0.7.10/#Creating-a-buildslave
which says "Get the buildmaster host/port, botname, and password"

> > Fewer problems than mine. Maybe I am missing some dependencies.
> 
> I suspect there's some assumption that Buildbot or Twisted is making
> about the underlying operating system that's not satisfied in NetBSD
> -- most of these errors are basically "things didn't happen in the
> order I expected with interactions between multiple processes".  This
> isn't the fork/vfork distinction (which process runs first), but might
> have something to do with handling of inherited file descriptors or
> perhaps pty's.  Does this remind you of any common

Dustin J. Mitchell | 7 Jul 2009 17:25

Re: trial buildbot.test failures on NetBSD

On Tue, Jul 7, 2009 at 11:01 AM, Jeremy C. Reed<reed@...> wrote:
> Nothing that I recognize. I am now running the "trial twisted" tests in
> the twisted source tree and it has many errors too.

Hmm, I suspect something funny is going on here.  See:
  http://twistedmatrix.com/pipermail/twisted-python/2003-December/006630.html
which is about someone using buildbot on NetBSD back in '03 -- he may
be able to help out.

By the way, in the run you sent me, I see:
  exceptions.IOError: [Errno 13] Permission denied:
'/usr/pkg/lib/python2.5/site-packages/twisted/plugins/dropin.cache.new'
which is harmless (http://twistedmatrix.com/trac/ticket/2409), but may
be interfering with the subprocesses the tests are spawning.

One thing I do notice in the log is that build steps which should take
just a shade over three seconds ("sleep 3") are taking 10-12 seconds:

2009-07-07 09:44:23-0500 [Broker,client]  sleep 3
2009-07-07 09:44:23-0500 [Broker,client]   in dir
slavebase-bot1/quickdir1/build (timeout 1200 secs)
...
2009-07-07 09:44:33-0500 [-] command finished with signal None, exit
code 0, elapsedTime: 10.516390

The test is trying to run 20 of these steps, and is only running about
12 of them in the 120-second timeout period.  Any idea why this might
be happening on this system?
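
As a rough sanity check on those numbers, the arithmetic can be sketched in
Python. This is only an illustration: the timestamps are copied from the log
excerpt above, the helper names are made up for the example, and it assumes
the 20 steps run one after another, which the log does not confirm.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above. The -0500 offset is dropped
# because both lines share it, so it cancels out of the difference.
STARTED = "2009-07-07 09:44:23"
FINISHED = "2009-07-07 09:44:33"
LOG_FORMAT = "%Y-%m-%d %H:%M:%S"


def wall_clock_seconds(start: str, end: str) -> float:
    """Return the elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, LOG_FORMAT) - datetime.strptime(start, LOG_FORMAT)
    return delta.total_seconds()


def steps_within(timeout: float, seconds_per_step: float) -> int:
    """Return how many whole steps fit in the timeout (assumes sequential steps)."""
    return int(timeout // seconds_per_step)


print(wall_clock_seconds(STARTED, FINISHED))  # 10.0
print(steps_within(120, 10.0))                # 12
print(steps_within(120, 3.0))                 # 40
```

At roughly 10 seconds per step only about 12 steps fit in the 120-second
window, which matches what the log shows; at the expected 3 seconds per step
all 20 would fit comfortably.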

Dustin

Dustin J. Mitchell | 8 Jul 2009 02:00

Re: trial buildbot.test failures on NetBSD

On Tue, Jul 7, 2009 at 11:25 AM, Dustin J. Mitchell<dustin@...> wrote:
> The test is trying to run 20 of these steps, and is only running about
> 12 of them in the 120-second timeout period.  Any idea why this might
> be happening on this system?

Looking at another failure, this time
buildbot.test.test_run.Disconnect.testBuild2 as run on your system by
the metabuildbot:

2009-07-07 15:31:52-0500 [Broker,0,127.0.0.1] acquireLocks(step
<buildbot.steps.dummy.Dummy instance at 0x7f7ff6e62dd0>, locks [])
2009-07-07 15:31:54-0500 [-] doing shutdownAllSlaves

This test's intent is to start a build (at 15:31:52) and then kill it
0.5s later.  The build is only slated to take one second, yet in
the timing above we see two seconds elapsed.

It looks like the twisted reactor is not doing well at getting the
timing right.  From what you wrote earlier, this is also affecting the
Twisted unit tests.  Can you see what happens when you run these tests
with a different reactor?  trial has the --help-reactors switch that lists
the available reactors, and the --reactor= option to specify one.
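
One way to separate a reactor problem from a general timer problem might be
to measure plain sleeps outside Twisted. The sketch below is only a
diagnostic idea, not part of Buildbot or Twisted: the function name is
hypothetical, and it assumes a Python new enough to have time.monotonic
(3.3 or later), which is newer than the 2.5 in use here.

```python
import time


def measure_sleep_overshoot(requested: float, repeats: int = 5) -> list:
    """Sleep `requested` seconds `repeats` times; return each overshoot in seconds."""
    overshoots = []
    for _ in range(repeats):
        start = time.monotonic()
        time.sleep(requested)
        # Positive values mean the sleep took longer than asked for.
        overshoots.append(time.monotonic() - start - requested)
    return overshoots


if __name__ == "__main__":
    for overshoot in measure_sleep_overshoot(0.5):
        print("requested 0.500s, overshoot %.3fs" % overshoot)
```

If the overshoot here is already large, the delay is probably below Twisted
(scheduler or clock); if it is small, the reactor is the more likely suspect.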

I'm sure there's a Twisted bug on this topic somewhere, but google is
not finding it for me.

Dustin

-- 
Open Source Storage Engineer

Jeremy C. Reed | 8 Jul 2009 02:31

Re: trial buildbot.test failures on NetBSD

On Tue, 7 Jul 2009, Dustin J. Mitchell wrote:

> Looking at another failure, this time
> buildbot.test.test_run.Disconnect.testBuild2 as run on your system by
> the metabuildbot:
> 
> 2009-07-07 15:31:52-0500 [Broker,0,127.0.0.1] acquireLocks(step
> <buildbot.steps.dummy.Dummy instance at 0x7f7ff6e62dd0>, locks [])
> 2009-07-07 15:31:54-0500 [-] doing shutdownAllSlaves
> 
> This test's intent is to start a build (at 15:31:52) and then kill it
> 0.5s later.  The build is only slated to take one second, yet in
> the timing above we see two seconds elapsed.
> 
> It looks like the twisted reactor is not doing well at getting the
> timing right.  From what you wrote earlier, this is also affecting the
> Twisted unit tests.  See what happens when you run these tests with a
> different reactor?  trial has the --help-reactors switch that lists
> the available reactors, and the --reactor= option to specify one.

I still saw two seconds with both the poll and select reactors. (I don't 
know what the default is, and kqueue is not working due to "No module named 
kqsyscall".)

But with --reactor=poll it failed with

Failure: twisted.spread.pb.PBConnectionLost: [Failure instance: Traceback 
(failure with no frames): <class 'twisted.internet.error.ConnectionDone'>: 
Connection was closed cleanly.

And with --reactor=select it failed with:

Dustin J. Mitchell | 8 Jul 2009 18:46

Re: trial buildbot.test failures on NetBSD

I just saw this from the buildbot:

http://buildbot.net/metabuildbot/builders/os-netbsd/builds/1/steps/versions/logs/stdio

python -c 'import sys; print sys.version; import twisted; print twisted.version'

2.4.4 (#1, Dec  1 2007, 15:56:18)
[GCC 4.1.2 20061021 prerelease (NetBSD nb3 20061125)]
Traceback (most recent call last):
  File "<string>", line 1, in ?
ImportError: No module named twisted

Are there conflicting or overlapping versions of Python and Twisted
installed here?

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com

------------------------------------------------------------------------------
Jeremy C. Reed | 8 Jul 2009 19:08

Re: trial buildbot.test failures on NetBSD

On Wed, 8 Jul 2009, Dustin J. Mitchell wrote:

> ImportError: No module named twisted

That was me reinstalling twisted while buildbot was still running. Sorry. 
Should I stop and start the buildbot slave again?

My goal was to try upgrading Twisted to a newer version to see if that 
solved some of the buildbot and trial test failures.

Now I see the #2 test failed the same way. I can reproduce it:

$ pkg_info | grep twisted
py25-twisted-8.2.0  Framework for writing networked applications
$ python -c 'import sys; print sys.version; import twisted; print twisted.version'
2.4.4 (#1, Dec  1 2007, 15:56:18) 
[GCC 4.1.2 20061021 prerelease (NetBSD nb3 20061125)]
Traceback (most recent call last):
  File "<string>", line 1, in ?
ImportError: No module named twisted

Now I see what happened ... "python" was a symlink in my own home bin 
directory and in my path. I removed it, since the tools on my system 
should use the full name, python2.4 or python2.5. I am not sure yet what to 
do for buildbot. Is there anywhere to configure this?
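
For checking this kind of mix-up, a small script along these lines might
help. It is a sketch only: the function name and dictionary keys are made up
for illustration, and it assumes a modern Python 3 (shutil.which and
importlib.util are not available in 2.4 or 2.5). Note that the import check
applies to the interpreter running the script, not to whatever "python"
resolves to on the PATH.

```python
import importlib.util
import os
import shutil
import sys


def describe_python(command: str = "python", module: str = "twisted") -> dict:
    """Report what `command` resolves to on PATH and whether `module` imports here."""
    found = shutil.which(command)  # first match on PATH, or None
    return {
        "path_entry": found,
        # realpath follows symlinks, e.g. one left in ~/bin
        "real_target": os.path.realpath(found) if found else None,
        "running_interpreter": sys.executable,
        "running_version": sys.version.split()[0],
        # checked against the running interpreter only
        "module_importable": importlib.util.find_spec(module) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print("%s: %s" % (key, value))
```

Running it with each interpreter in turn (python2.4, python2.5, and so on,
where the syntax allows) shows which one can actually see Twisted.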

------------------------------------------------------------------------------