4 May 2012 06:00
Timers becoming unscheduled
Keith Browne <tuxedo <at> deepsky.com>
2012-05-04 04:00:16 GMT
We're using timers to get some loose real-time behavior in a couple of applications. We've found that over the long term, our timers silently fail: they stop calling the function registered at timer creation. This happens both with timers scheduled to recur at a fixed repeat interval and with timers that have no repeat interval--in the latter case, we reschedule the timer from within the callback function itself. In all cases, we're setting :thread t when calling sb-ext:make-timer, so that the callbacks execute in their own threads.

The failures we're seeing happen when we're connected to an SBCL instance using Emacs and SLIME. We've run some tests in which we start SBCL directly from the shell, and we haven't seen a failure in that case. The problem may be related to signal handling when SLIME/SWANK is running.

When we're just running a handful of timers that keep rescheduling themselves every few seconds, we can run our application for days or weeks before we see a failure. When one timer fails, though, it seems to take all the rest of the timers in the image with it, which is also suggestive of a problem in signal handling. We've run test cases with hundreds or even thousands of timers turning over at once, and in those cases we usually see a failure within minutes, or within an hour or two at most.

Some sample code demonstrating the problem with recurring timers is available at http://www.deepsky.com/~tuxedo/exercise-timers.tar.gz.

Is this a known problem with the timer implementation in SBCL? To be certain that our application won't stop executing timer-related callbacks, should we have a dedicated thread that sleeps and periodically wakes up to reschedule all the timers? Or an external process that tickles our Lisp program for the same purpose?
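For concreteness, here is a minimal sketch of the two timer setups described above, plus the dedicated watchdog thread being asked about. It is not the code from the tarball. Every name beginning with example- is hypothetical, and the 5-second interval and 30-second silence threshold are arbitrary assumptions. Whether rescheduling actually revives a timer that has stopped firing is exactly the open question, so the watchdog is an illustration of the idea rather than a tested workaround.

```lisp
;;; Sketch only: assumes SBCL built with thread support.
;;; All EXAMPLE- names are hypothetical, not from the original application.

(defvar *example-last-run* (get-universal-time)
  "Universal time at which any example callback last ran.")

(defun example-callback ()
  ;; Stand-in for the real work; records that the timer fired.
  (setf *example-last-run* (get-universal-time)))

;; Case 1: a timer that recurs with a fixed repeat interval.
(defvar *example-recurring-timer*
  (sb-ext:make-timer #'example-callback
                     :name "example-recurring"
                     :thread t))        ; run the callback in its own thread

(defun example-start-recurring (&key (interval 5))
  (sb-ext:schedule-timer *example-recurring-timer* interval
                         :repeat-interval interval))

;; Case 2: a one-shot timer that reschedules itself from its callback.
(defvar *example-one-shot-timer* nil)

(defun example-one-shot-callback (&optional (interval 5))
  (example-callback)
  (sb-ext:schedule-timer *example-one-shot-timer* interval))

(defun example-start-one-shot (&key (interval 5))
  (setf *example-one-shot-timer*
        (sb-ext:make-timer #'example-one-shot-callback
                           :name "example-one-shot"
                           :thread t))
  (sb-ext:schedule-timer *example-one-shot-timer* interval))

;; Proposed workaround: a thread that sleeps, wakes up, and reschedules
;; the recurring timer if no callback has run for MAX-SILENCE seconds.
(defun example-start-watchdog (&key (interval 5) (max-silence 30))
  (sb-thread:make-thread
   (lambda ()
     (loop
       (sleep max-silence)
       (when (> (- (get-universal-time) *example-last-run*) max-silence)
         (sb-ext:unschedule-timer *example-recurring-timer*)
         (sb-ext:schedule-timer *example-recurring-timer* interval
                                :repeat-interval interval))))
   :name "example-timer-watchdog"))
```

Calling (example-start-recurring) followed by (example-start-watchdog) would set up the arrangement in question. The watchdog deliberately uses sleep in an ordinary thread rather than another timer, so that it does not depend on the mechanism it is meant to monitor.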