Gary Byers | 17 Jul 21:54

Re: Random crashing


On Thu, 17 Jul 2008, Osei Poku wrote:

> Hello,
>
> I updated today from svn but this thing happened again.  Again the PC was in 
> the pthread memory region and %rdi was 0.  I verified that the fix (r9997 i 
> think) was in my ccl working directory (somewhere in thread_manager.c 
> right?).

Yes; there are 3 calls to pthread_kill() in that file.  One of
them (in resume_tcr()) is conditionalized out; the other two
(in raise_thread_interrupt() and suspend_tcr()) should check
that the thread they'd pass as the first argument to
pthread_kill() is non-zero before making the call.
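
For readers following along, here is a minimal sketch of the kind of check described above, written as standalone C rather than as the actual code in thread_manager.c.  The helper name kill_thread_if_valid is made up for illustration, and comparing a pthread_t against zero assumes a platform such as Linux where pthread_t is an integer type.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>

/* Hypothetical helper, not the kernel's own function: refuse to call
   pthread_kill() on a zero thread id.  Assumes pthread_t is an integer
   type (true on Linux, not guaranteed by POSIX). */
static int kill_thread_if_valid(pthread_t thread, int signo)
{
    assert(signo >= 0);
    if (thread == (pthread_t)0) {
        return ESRCH;   /* report "no such thread" instead of calling in */
    }
    return pthread_kill(thread, signo);
}
```

Calling it with signal 0 on the current thread (kill_thread_if_valid(pthread_self(), 0)) only performs pthread_kill()'s error checking and should return 0, while a zero thread id comes back as ESRCH without ever reaching libpthread.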

>
> My current version is:
> Clozure Common Lisp Version 1.2-r10073M-RC1  (LinuxX8664)!
>
> Is there anything other than (rebuild-ccl :force t) that I need to do to 
> recompile the c source for the lisp kernel?

As Gail just pointed out, :full t (or :kernel t) is necessary
in order to get the kernel updated. (:force t will recompile
FASLs even if they're newer than the corresponding source;
that's occasionally useful, but not really what you want here.)

If the kernel that you're running had its modified date change
by the rebuild process, it likely incorporates those changes.  If

Osei Poku | 18 Jul 18:25

Re: Random crashing

Ok... It happened again after recompiling the kernel.  I managed to
attach a gdb session to the process and it is still running, so I can
possibly provide more feedback if you need it.  My current gdb session
log is inserted below.

 > /usr/bin/gdb
GNU gdb 6.6.50.20070726-cvs
Copyright (C) 2007 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-suse-linux".
(gdb) attach 3268
Attaching to process 3268
Reading symbols from /home/opoku/local/share/ccl/lx86cl64...done.
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libpthread.so.0...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2adafe820880 (LWP 3268)]
[New Thread 0x410bb950 (LWP 6095)]
[New Thread 0x4131f950 (LWP 6094)]
[New Thread 0x40e57950 (LWP 6093)]

Gary Byers | 18 Jul 18:45

Re: Random crashing


On Fri, 18 Jul 2008, Osei Poku wrote:

> Ok... It happened again after recompiling the kernel.  I managed to attach a
> gdb session to the process and it is still running, so I can possibly provide
> more feedback if you need it.  My current gdb session log is inserted below.
>

It basically shows that one thread is reading (from standard input)
and that all other threads are waiting for a semaphore that'll
allow them to wake from a suspended state.

In other words, you're in the kernel debugger.
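
As an illustration of what "waiting for a semaphore" with sem_timedwait() looks like, here is a small standalone sketch in C.  It is not taken from the lisp kernel: the function name wait_for_resume and the millisecond timeout parameter are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <time.h>

/* Hypothetical helper: wait on sem for at most timeout_ms milliseconds.
   Returns 0 if the semaphore was acquired, or -1 with errno set to
   ETIMEDOUT if the timeout expired first. */
static int wait_for_resume(sem_t *sem, long timeout_ms)
{
    struct timespec deadline;
    int rc;

    assert(sem != NULL && timeout_ms >= 0);

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* Retry if a signal interrupted the wait before the deadline. */
    do {
        rc = sem_timedwait(sem, &deadline);
    } while (rc == -1 && errno == EINTR);

    return rc;
}
```

A suspended thread would sit in a call like this until some other thread does sem_post() on the same semaphore, which is why every thread but the one reading standard input shows up inside sem_timedwait() in the listing.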

> (gdb) info threads
> 9 Thread 0x40263950 (LWP 3271)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 8 Thread 0x404c7950 (LWP 3272)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 7 Thread 0x4072b950 (LWP 3305)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 6 Thread 0x4098f950 (LWP 3306)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 5 Thread 0x40bf3950 (LWP 3307)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 4 Thread 0x40e57950 (LWP 6093)  0x00002adafe591bfb in read () from 
> /lib64/libc.so.6
> 3 Thread 0x4131f950 (LWP 6094)  0x00002adafe2ca2cb in sem_timedwait () from 
> /lib64/libpthread.so.0
> 2 Thread 0x410bb950 (LWP 6095)  0x00002adafe2ca2cb in sem_timedwait () from 

Osei Poku | 21 Jul 22:14

Re: Random crashing

Got it to crash again....

>
> Where are you (where is address *0x00002ADAFE2CA325) and how did
> you get there ('bt' in GDB) ?
>
>
This time %rip = 0x00002ABAFDCCD325.  After I set the break point,  
continued and typed X into the kernel debugger, I arrive here in gdb.   
I will try not to screw up the debugging session like last time so  
that I can provide additional information.

(gdb) bt
#0  0x00002abafdccd325 in sem_post () from /lib64/libpthread.so.0
#1  0x000000000041b3e2 in resume_tcr (tcr=0x40e577d0) at ../thread_manager.c:1376
#2  0x000000000041c0ba in resume_other_threads (for_gc=<value optimized out>) at ../thread_manager.c:1544
#3  0x000000000041d62e in lisp_Debugger (xp=0x4131dd60, info=0x4131e110, why=11, in_foreign_code=1, message=0x4131db10 "Unhandled exception 11 at 0x2abafdccd325, context->regs at #x4131dd88") at ../lisp-debug.c:919
#4  0x000000000041a2c6 in signal_handler (signum=11, info=0x4131e110, context=0x4131dd60, tcr=0x4131f7d0, old_valence=1) at ../x86-exceptions.c:1070
#5  <signal handler called>
#6  0x00002abafdccd325 in sem_post () from /lib64/libpthread.so.0
#7  0x000000000041b3e2 in resume_tcr (tcr=0x417e77d0) at ../thread_manager.c:1376
#8  0x000000000041c146 in lisp_resume_tcr (tcr=0x417e77d0) at ../ 

Gary Byers | 21 Jul 23:53

Re: Random crashing

If you still have the debugging session running, could you do:

(gdb) p/x *(TCR *)0x417e77d0

That address is the value of the "tcr" argument to "resume_tcr()" in
frame #7 in the backtrace below.  If you no longer have the debugging
session and have to reproduce the problem, what we want to see is the
TCR that was passed as the "tcr" argument at the point where
resume_tcr() called sem_post() and crashed.

The gdb command above means "print, in hex, the contents of
what this address points to, interpreting that address as
being of type 'pointer to TCR'" (where a TCR is a "Thread Context
Record" that contains several interesting fields).

'resume_tcr()' basically does 'sem_post(tcr->resume)', and a crash
would make sense if tcr->resume was NULL.  If it was, then one of
the threads that's doing sem_timedwait() on its 'resume' semaphore
would presumably be waiting on a NULL semaphore, and that doesn't
make sense.


Osei Poku | 22 Jul 00:02

Re: Random crashing


On Jul 21, 2008, at 5:53 PM, Gary Byers wrote:

> If you still have the debugging session running, could you do:
>
> (gdb) p/x *(TCR *)0x417e77d0

(gdb) p/x *(TCR *)0x417e77d0
$1 = {next = 0x0, prev = 0x0, single_float_convert = {tag = 0x1, f =  
0x0}, linear = 0x0, save_rbp = 0x2aaaadd49ab0, lisp_mxcsr = 0x1920,  
foreign_mxcsr = 0x1f80, db_link = 0x0, catch_top = 0x0, save_vsp =  
0x2aaaadd49a58, save_tsp = 0x2aaaade5b000, foreign_sp = 0x417e6da0,  
cs_area = 0x0, vs_area = 0x0, ts_area = 0x0, cs_limit = 0x415b6000,  
bytes_allocated = 0x0,
   log2_allocation_quantum = 0x11, interrupt_pending = 0x0, xframe =  
0x0, errno_loc = 0x417e7770, ffi_exception = 0x1f80, osid = 0x0,  
valence = 0x1, foreign_exception_status = 0x0, native_thread_info =  
0x0, native_thread_id = 0x1847, last_allocptr = 0x3000455e0000,  
save_allocptr = 0x3000455db200, save_allocbase = 0x3000455c0000,  
reset_completion = 0x0, activate = 0x0,
   suspend_count = 0x0, suspend_context = 0x0,  
pending_exception_context = 0x0, suspend = 0x0, resume = 0x0, flags =  
0x0, gc_context = 0x0, termination_semaphore = 0x0, unwinding = 0x0,  
tlb_limit = 0x0, tlb_pointer = 0x0, shutdown_count = 0x0, next_tsp =  
0x2aaaade5b000, safe_ref_address = 0x0}

To save your eyes scanning,

resume = 0x0


Gary Byers | 22 Jul 00:42

Re: Random crashing

Thanks.  Curiouser and curiouser: not only is the "resume" field 0,
but many other fields are as well, including 'next' and 'prev'.  (TCR
structures are maintained in a circular, doubly-linked list; this guy
seems to have died and spliced himself out of that list.)  Enough
fields are set that this looks like a dead thread rather than a
newly-created one.

The backtrace indicates that this was coming from
'lisp_resume_other_threads()', which is called as part of the expansion
of WITH-OTHER-THREADS-SUSPENDED.  And lisp_resume_other_threads()
and lisp_suspend_other_threads() don't bother to grab and release
the lock which allows modification of the tcr list.

I'm not quite sure why what happened happened, but the code that
walks this doubly-linked list suspending and resuming threads should
be confident that other threads aren't splicing themselves on and off
that list while it's being walked.
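
To make the locking point concrete, here is a small self-contained sketch in C of a circular, doubly-linked list whose walkers and splicers all take the same lock.  Everything in it (the node type, list_lock, and the function names) is invented for the example; it only mirrors the shape of the TCR list described above, not the kernel's real data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Simplified stand-in for a TCR: just the list links and one counter. */
typedef struct node {
    struct node *next, *prev;
    int suspend_count;
} node;

/* One lock guards every modification and every walk of the list. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Turn head into a one-element circular list. */
static void list_init(node *head)
{
    assert(head != NULL);
    head->next = head->prev = head;
    head->suspend_count = 0;
}

/* Splice n in just before head, holding the lock. */
static void list_add(node *head, node *n)
{
    pthread_mutex_lock(&list_lock);
    n->suspend_count = 0;
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
    pthread_mutex_unlock(&list_lock);
}

/* Splice n out, holding the lock, and zero its links (as in the dump). */
static void list_remove(node *n)
{
    pthread_mutex_lock(&list_lock);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
    pthread_mutex_unlock(&list_lock);
}

/* Visit every node except self, keeping the lock for the whole walk so
   that no node can splice itself on or off the list in mid-traversal.
   Returns the number of nodes visited. */
static int bump_others(node *self, int delta)
{
    int visited = 0;

    pthread_mutex_lock(&list_lock);
    for (node *n = self->next; n != self; n = n->next) {
        n->suspend_count += delta;
        visited++;
    }
    pthread_mutex_unlock(&list_lock);
    return visited;
}
```

The design choice worth noting is that bump_others() holds the lock across the entire traversal rather than per node; a walker that drops the lock between nodes could still follow a link into an entry that has just been removed and zeroed.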


Osei Poku | 6 Aug 18:59

Re: Random crashing

This thing is not going away....
lisp debugger and gdb session below...

==== lisp debugger session =====================================================

? exception in foreign context
Exception occurred while executing foreign code
? for help
[17455] OpenMCL kernel debugger: ?
(G)  Set specified GPR to new value
(R)  Show raw GPR/SPR register values
(L)  Show Lisp values of tagged registers
(F)  Show FPU registers
(S)  Find and describe symbol matching specified name
(B)  Show backtrace
(T)  Show info about current thread
(X)  Exit from this debugger, asserting that any exception was handled
(K)  Kill OpenMCL process
(?)  Show this help
[17455] OpenMCL kernel debugger: R
%rax = 0x0000000000000000      %r8  = 0x0000000000000000

Wade Humeniuk | 10 Aug 19:19

Re: Random crashing

Maybe a hardware problem with your computer?  Could
be faulty RAM/Processor/Motherboard.....  You said this problem is
happening on a
particular machine.  Perhaps running some diagnostics might show up something
(though I have no suggestions what that diagnostic program might be.)

Wade


Gary Byers | 10 Aug 22:14

Re: Random crashing

I've said this (and been wrong) a few times already, but I think that
I (partly) fixed this in svn a few days ago.  (Or at least fixed the
part that led to the crash.)

Some things that try to examine the status of a process (PROCESS-WHOSTATE)
do so by briefly suspending and resuming the process.  Unfortunately,
the code that does this doesn't reliably ensure that the thread
hasn't exited before we try to suspend it, and trying to (unconditionally)
resume a thread that exited before it was suspended can wind up trying
to signal a NULL semaphore (which is the symptom that Osei is seeing.)

That's sort of a perfect storm of everything that could go wrong
going wrong at the same time.  I'm not 100% sure that PROCESS-WHOSTATE
is the culprit; there's at least one other thing (SYMBOL-VALUE-IN-PROCESS)
that does similar things and has similar race conditions that it doesn't
handle.

Whatever the culprit(s) is or are, there are ways to reach the C
function 'resume_tcr()' in the lisp kernel, and that function can
afford to check to see if the semaphore that it's going to signal
is NULL before blindly signaling it.  (Not checking - on Linux,
at least - leads to the crash that Osei's seeing.)
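
A minimal sketch of that kind of check, in standalone C.  The mini_tcr type and the name resume_if_alive are stand-ins invented for the example (the real TCR has many more fields, and this is not the code that went into svn); the only point is the NULL test in front of sem_post().

```c
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>

/* Cut-down stand-in for a TCR: only the field that matters here. */
typedef struct {
    sem_t *resume;   /* NULL once the thread has exited and cleaned up */
} mini_tcr;

/* Hypothetical guarded resume: returns 1 if the semaphore was signaled,
   0 if there was nothing to signal (or sem_post() itself failed). */
static int resume_if_alive(mini_tcr *tcr)
{
    assert(tcr != NULL);
    if (tcr->resume == NULL) {
        return 0;    /* dead thread: never call sem_post(NULL) */
    }
    return sem_post(tcr->resume) == 0;
}
```

With a check like this, resuming a thread that exited before it was suspended becomes a harmless no-op instead of an exception inside libpthread.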

If you do:

? (process-run-function "do nothing" (lambda ()))

in the listener, you'll probably see the result print as something
like:


Osei Poku | 18 Aug 20:42

Re: Random crashing

Just a quick report...  I updated to r10465M-RC1 and have had no  
crashes yet.   So I'm keeping my fingers crossed :)

Something else strange happened (the same day I updated): it was
not in the debugger, but I could not evaluate any forms in either
emacs/slime or the plain tty repl.  It hasn't happened again since then,
so I think I probably screwed something up.

Anyhow, thanks for all the help tracking down this issue and improving  
the situation.  I was this ( || ) close to ponying up a few thousand  
bucks for LW64 :)

Osei


Alexander Repenning | 10 Nov 20:04

crash without report

this may have been discussed in some other context but I cannot find any trace. CCL 1.2 (Mac) is usually pretty stable: it works well with Cocoa in general and even reports some memory management issues without crashing. But once in a while CCL really does crash, unfortunately without creating a crash log file. What is missing? I have

COREDUMPS=-YES-

in etc/hostconfig

but when I get a Nov 10 11:54:15 Ristretto-to-Go-7 com.apple.launchd[67] ([0x0-0x15015].com.clozure.Clozure CL[119]): Exited: Killed

there is no crash.log

Am I missing something?

all the best,  Alex



Prof. Alexander Repenning


University of Colorado

Computer Science Department

Boulder, CO 80309-430


vCard: http://www.cs.colorado.edu/~ralex/AlexanderRepenning.vcf



_______________________________________________
Openmcl-devel mailing list
Openmcl-devel <at> clozure.com
http://clozure.com/mailman/listinfo/openmcl-devel

Re: crash without report

On Mon, Nov 10, 2008 at 20:04, Alexander Repenning
<ralex <at> cs.colorado.edu> wrote:
> this may have been discussed in some other context but I cannot find any
> trace. Anyway, while usually pretty stable CCL 1.2 (mac) works well with
> Cocoa in general and even reports, without crashing on some memory
> management issues. But once in a while CCL really does crash but
> unfortunately without creating a crashlog file. What is missing? I have
> COREDUMPS=-YES-
> in etc/hostconfig
> but when getting a Nov 10 11:54:15 Ristretto-to-Go-7 com.apple.launchd[67]
> ([0x0-0x15015].com.clozure.Clozure CL[119]): Exited: Killed
> there is no crash.log

To me, this looks as if the operating system has killed the CCL
process, presumably because of a swap space shortage.  Check your
'messages' file (presumably /var/log/messages, but could be somewhere
else on Mac OS X) for "out of swap space" messages?

-Hans

Gary Byers | 10 Nov 21:57

Re: crash without report

If you're asking "what should have been logged somewhere but wasn't?",
I don't know.  (That's kind of like a Zen koan, only instead of
achieving enlightenment by contemplating it you wind up with a bad
headache.)

If lisp code does something that results in an illegal memory reference,
the lisp kernel catches the resulting exception and signals a lisp error.

? (%get-byte (%null-ptr))
> Error: Fault during read of memory address #x0
> While executing: %GET-BYTE, in process Listener(5).
> Type :POP to abort, :R for a list of available restarts.
> Type :? for other options.
1

For a simple case like this, we can slap ourselves in the forehead
(figuratively ...) and remind ourselves not to dereference obviously
null pointers.  Even in more realistic cases, it may be easy to figure
out what caused the memory fault and convince ourselves that the
damage was localized.  (In general, it's possible to scribble randomly
over memory for a while before we try to write to an address that'll
cause a fault, so if we don't understand what caused a memory fault
like this we should view the lisp session with suspicion: if something's
doing incorrect memory accesses, it might have overwritten something
important before writing to an address that caused a fault.)  From
the lisp kernel's point of view, trying to report this as a lisp error
is "worth a try", and it often works well in practice.
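
For anyone curious what "catch the fault and report it instead of dying" can look like at the C level, here is a rough standalone sketch.  It is not how the lisp kernel does it (the kernel maps the exception context back into a lisp condition); it swaps in the much cruder sigsetjmp()/siglongjmp() technique, and the names try_read_byte and on_fault are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Checkpoint to return to when an access faults.  A single global
   buffer keeps the sketch short but makes it non-thread-safe. */
static sigjmp_buf fault_env;

/* Signal handler: abandon the faulting access and jump to the checkpoint. */
static void on_fault(int signo)
{
    (void)signo;
    siglongjmp(fault_env, 1);
}

/* Hypothetical helper: try to read one byte from addr.  Returns 0 and
   stores the byte in *out on success, or -1 if the read faulted. */
static int try_read_byte(const volatile unsigned char *addr, unsigned char *out)
{
    struct sigaction sa, old_segv, old_bus;
    int result;

    assert(out != NULL);

    /* Install the handler for both fault signals, saving the old ones. */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    if (sigsetjmp(fault_env, 1) == 0) {
        *out = *addr;      /* this access may fault */
        result = 0;
    } else {
        result = -1;       /* we got here via siglongjmp() from on_fault() */
    }

    /* Put the previous handlers back before returning. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return result;
}
```

The same caveat as in the text applies: recovering like this only tells you that one particular access failed, not what earlier stray writes may already have damaged.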

If foreign (C) code does an invalid memory access, it's much harder to
know how to recover from that: we don't know what state that foreign
code may have changed and we don't know what the consequences of
signaling a lisp error in the middle of some unknown foreign code
might be.  (E.g., if we get a fault in the middle of #_malloc or
something similar, trying to signal a lisp error at that point might
just lead to a lot of secondary problems and not get very far.)

When any kind of unhandled exception (memory fault or other) happens
in foreign code, the lisp enters its kernel debugger.  It's not much
of a debugger, and what there is of it is oriented towards printing
lisp objects (with varying degrees of success ...) and lisp
backtraces.  There's a little information in the Wiki about debugging
under GDB:

<http://trac.clozure.com/openmcl/wiki/CclUnderGdb>

but it's probably fair to say that trying to figure out how/why
some foreign code crashed can be a hard problem.  (Many great
minds have spent countless hours on this problem ...)

If we're running the lisp as a non-OSX-GUI application and we
do something like:

? (ff-call (%null-ptr) :void)

we get:

Unhandled exception 10 at 0x0, context->regs at #xb029b8f0
Exception occurred while executing foreign code
? for help
[50778] OpenMCL kernel debugger:

Well, yes: we did a foreign function call to an invalid address,
and now we're pretty much stuck.  In a more realistic example -
where we were in some real foreign code and that code caused a
fault - the kernel debugger will try to print the name of a
known foreign function whose address is near the PC at the time
of the exception.

We can ask the kernel debugger to show us the values of the machine
registers (x8664 in this case):

[50778] OpenMCL kernel debugger: r
%rax = 0x0000000000000000      %r8  = 0x000000000000031a
%rcx = 0x00000000006a5a30      %r9  = 0x00000000001047f0
%rdx = 0x00000000b029bde0      %r10 = 0x00003000400090f4
%rbx = 0x0000000000104be0      %r11 = 0x0000000000000000
%rsp = 0x00000000b029bdc8      %r12 = 0x0000000000000000
%rbp = 0x00000000b029bdd0      %r13 = 0x0000000000000000
%rsi = 0x0000000000000200      %r14 = 0x000000000001300b
%rdi = 0x00000000001047c0      %r15 = 0x0000000000000200
%rip = 0x0000000000000000   %rflags = 0x00010206

which shows us that %rip (the instruction pointer/program counter) is
at address 0, and if we try to get a lisp backtrace at this point
we can see how we got here (this may or may not work in 1.2):

(#x00000000006A5A58) #x000030004000821C : #<Function %DO-FF-CALL #x00003000400081CF> + 77
(#x00000000006A5A68) #x00003000400090F4 : #<Function %FF-CALL #x00003000400082CF> + 3621
(#x00000000006A5AE0) #x00003000404C5A84 : #<Function CALL-CHECK-REGS #x00003000404C599F> + 229
(#x00000000006A5B18) #x00003000404BCA9C : #<Function TOPLEVEL-EVAL #x00003000404BC7BF> + 733
(#x00000000006A5BB8) #x00003000404BEB0C : #<Function READ-LOOP #x00003000404BE3EF> + 1821
(#x00000000006A5DD8) #x00003000404C556C : #<Function TOPLEVEL-LOOP #x00003000404C54EF> + 125

from which we -might- be able to conclude that FF-CALLing a null pointer
is a bad idea.  (This example may not convince anyone who's skeptical
of my assertion that it's hard to reliably recover from an exception
in foreign code; I honestly do think that that's a hard problem.)

The kernel debugger just writes to the (Unix) process-level standard error
descriptor and reads from the process's standard input.

An OS X GUI application's standard I/O descriptors are ordinarily
redirected: input usually comes from /dev/null (the null device, which
always returns EOF on input) and output and error (supposedly) go to a
logfile somewhere.  (On Leopard, "somewhere" seems to be
/private/tmp.)  It's probably the case that we get the EOF (reading
from /dev/null) before anything's actually flushed to that logfile
when the kernel debugger's entered from the IDE.
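
The /dev/null behavior is easy to check with a few lines of C.  The sketch below is only an illustration (the function name first_read_is_eof is made up): it opens a path and reports whether the very first read() returns end-of-file, which is what a debugger prompt reading from a redirected standard input would see.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the first read() from path reports
   end-of-file (0 bytes), 0 if it returned data, and -1 on error. */
static int first_read_is_eof(const char *path)
{
    char buf[64];
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;             /* could not open the file at all */
    }
    n = read(fd, buf, sizeof buf);
    close(fd);
    if (n < 0) {
        return -1;             /* read error */
    }
    return n == 0;             /* zero bytes read means EOF */
}
```

first_read_is_eof("/dev/null") should come back as 1 on any Unix, so a program that treats EOF on standard input as "quit" will exit as soon as it tries to prompt.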

While waiting for someone to figure out what to do about that ...
you can run a GUI application in Terminal (or equivalent); when
it's run this way, its standard I/O file descriptors remain unchanged
(and therefore the kernel debugger works.)  The general idea is
to invoke the executable program inside the .app bundle:

shell> /path/to/Clozure\ CL.app/Contents/MacOS/dx86cl64

The good news is that that'll leave standard I/O attached to the
"terminal" (or Emacs shell buffer, or ...) and it's possible to
interact with the kernel debugger (and entering the kernel debugger
won't cause the lisp to exit unless/until it gets an EOF when
reading from standard input).  The bad news is that the standard
error of a GUI application often gets filled with diagnostic
messages that are probably more meaningful to whoever wrote them
than to anyone else, and the fact that the kernel debugger
is better than nothing doesn't mean that it's a whole lot better
than nothing ...

There are a variety of reasons why Apple's Crash Reporter doesn't
get invoked in this case (they're related to the reasons why it
sometimes gets invoked whenever some lisps get exceptions that
they routinely handle.)  If it were invoked, it wouldn't be
able to make a whole lot of sense out of the lisp-specific side
of things.  (If lisp crashes generated Crash Reporter logs, I
wouldn't often find them very useful and I doubt if other people
would, either.)  Generating something somewhat like a crash
reporter log would be useful (even if that's equivalent to
having the kernel debugger invoke as many of its options as
might be useful and save the output somewhere.)  Just exiting
on EOF because the EOF comes from /dev/null is probably less
useful.

In the short term, running the IDE from the terminal might be enough
to let the kernel debugger point you in the general direction of the
problem.

Alexander Repenning | 14 Nov 00:57

Lisp DOC

It is simple to write a little code that produces a LOT of documentation
with Lisp. The trivial little hack below produces documentation for
the entire CL class tree. Especially when classes include
:documentation, even the code below is somewhat useful.
However, my real question is this: is anybody aware of some JavaDoc-like
tool for Lisp? That is, something like the thing below but with
formatted output (e.g., HTML with a style sheet)? It would seem to be
such an obvious and simple thing to do in Lisp that I would assume it
already exists.

Alex

;; ---- lisp-doc.lisp -------

(in-package :ccl)

(defparameter *Classes-Documented* (make-hash-table))

(defun RENDER-CLASS-DOC (Classes &optional (Level 0))
   (when (not (listp Classes))
     (render-class-doc (list Classes) Level)
     (return-from render-class-doc))
   (when (= Level 0) (setf *Classes-Documented* (make-hash-table)))
   (dolist (Class Classes)
     (when (symbolp Class) (setf Class (find-class Class)))
     (unless (gethash (slot-value Class 'name) *Classes-Documented*)
       (setf (gethash (slot-value Class 'name) *Classes-Documented*) Class)
       (dotimes (I (* 2 Level)) (princ #\space))
       (princ (slot-value Class 'name))
       (let ((Documentation (documentation (slot-value Class 'name) 'type)))
         (when Documentation
           (format t ": ~A" Documentation)))
       (terpri)
       (let ((Slot-Names (mapcar #'slot-definition-name (class-direct-slots Class))))
         (when Slot-Names
           (dotimes (I (* 2 (+ Level 2))) (princ #\space))
           (format t "slots: ")
           (dolist (Slot-Name (butlast Slot-Names))
             (format t "~:(~A~), " Slot-Name))
           (format t "~:(~A~)" (first (last Slot-Names)))
           (terpri)))
       (render-class-doc (slot-value Class 'direct-subclasses) (1+ Level)))))

#| Examples:

(render-class-doc 'number)  ;; not much :documentation here in CCL

(render-class-doc 't) ;; same here

|#

R. Matthew Emerson | 14 Nov 01:42

Re: Lisp DOC


On Nov 13, 2008, at 6:57 PM, Alexander Repenning wrote:

> It is simple to write little code producing a LOT of documentation
> with Lisp. The trivial little hack below produces documentation for
> the entire CL class tree. Especially when classes
> include :documentation even the code below is somewhat useful.
> However, my real question is this. Is anybody aware of some Java DOC-
> like tool for Lisp? That is, something like that thing below but with
> formated output (e.g., HTML with style sheet)? It would seem to be
> such an obvious and simple thing to do in Lisp that I would assume it
> already exists?

You might look at the links at the following page:

http://www.cliki.net/Documentation%20tool

(I don't use any of them, so I can't really offer any opinions on  
which ones are good/bad.)

Robert Goldman | 14 Nov 11:35

Re: Lisp DOC

Alexander Repenning wrote:
> It is simple to write little code producing a LOT of documentation  
> with Lisp. The trivial little hack below produces documentation for  
> the entire CL class tree. Especially when classes  
> include :documentation even the code below is somewhat useful.  
> However, my real question is this. Is anybody aware of some Java DOC- 
> like tool for Lisp? That is, something like that thing below but with  
> formated output (e.g., HTML with style sheet)? It would seem to be  
> such an obvious and simple thing to do in Lisp that I would assume it  
> already exists?

Edi Weitz has developed an asdf package, documentation-template
(http://www.weitz.de/documentation-template/), that grovels over the
symbols of a package and assembles them into an HTML manual.  We have
used it at my company, because it's simple, but it's not very general
--- it's very tailored to Edi's own uses, and he's not able to support
it.  We have modified it to be more general (e.g., allow for different
licenses, different download instructions, apply an arbitrary function
to filter the symbols whose documentation is to be incorporated, etc.),
but haven't released our changes.  We could probably be persuaded to do
so, if anyone was interested.

Gary King has a much more ambitious doc tool, but it relies on a very
large tree of software libraries.  We have been too cautious to use it
for that reason.

One thing that would be nice would be if we had a markup language (e.g.,
Markdown, texinfo) to use in documentation strings that would be
readable by just invoking DOCUMENTATION, but that could be postprocessed
to support hyperlinks and rudimentary text attributes.

Best,
Robert

Joshua TAYLOR | 14 Nov 15:14

Re: Lisp DOC

For smaller projects, Edi's documentation template is fairly nice, but
it usually requires a fair amount of modification afterward, e.g., if
you want the table of contents to be something other than alphabetic.
If you want something a bit more like JavaDoc, you might try CLDOC
(http://common-lisp.net/project/cldoc/):

"Unlike  Albert  it does not allow programmers to insert comments at
the source code level which are incorporated into the generated
documentation. Its goal was not to produce a LispDoc ala JavaDoc but
to create a simple and easy way to take advantage of the Lisp
documentation strings. So instead of copying and pasting it in some
commentary section with extra special documentation tool markup stuff,
the idea was to find an elegant way of parsing the doc string. "

I do recognize that I compared it to JavaDoc, and that they point out
that it's /not/ "ala JavaDoc", but between the style it encourages in
docstrings and the HTML output, I think there are some significant
similarities.

The CLDOC documentation (generated by CLDOC, so it's an example) is
available at http://common-lisp.net/project/cldoc/HTMLdoc/ .

//JT
(I have no affiliation with CLDOC, but I've used it in the past and
have been rather happy with the results.)


-- 
=====================
Joshua Taylor
tayloj <at> cs.rpi.edu, jtaylor <at> alum.rpi.edu

"In the Mountains of New Hampshire,
   God Almighty has hung out a sign
     to show that there He makes men."
       Daniel Webster

"A lot of good things went down one time,
  back in the goodle days."
    John Hartford
Daniel Dickison | 14 Nov 16:26

Re: Lisp DOC

On Nov 13, 2008, at 6:57 PM, Alexander Repenning wrote:

> It is simple to write a little code that produces a LOT of documentation
> with Lisp. The trivial little hack below produces documentation for
> the entire CL class tree. Especially when classes
> include :documentation, even the code below is somewhat useful.
> However, my real question is this: is anybody aware of some Java DOC-like
> tool for Lisp? That is, something like the thing below but with
> formatted output (e.g., HTML with a style sheet)? It would seem to be
> such an obvious and simple thing to do in Lisp that I would assume it
> already exists.

There is one called Tinaa by Gary King (http://metabang.com), which  
works at the ASDF level to document a system and its ASDF  
dependencies.  I've used it before and it's quite nice.  If you have  
CL-Markdown loaded, it'll apply Markdown formatting to all of your  
docstrings.

http://common-lisp.net/project/tinaa/
http://common-lisp.net/project/cl-markdown/
Osei Poku | 18 Jul 18:29

Re: Random crashing

The following info might also be useful.

[3268] OpenMCL kernel debugger: R
%rax = 0x0000000000000000      %r8  = 0x0000000000000000
%rcx = 0x0000000000000000      %r9  = 0x0000000040E577D0
%rdx = 0x0000000000000001      %r10 = 0x0000000000000008
%rbx = 0x00000000415837D0      %r11 = 0x0000000000000246
%rsp = 0x0000000040E56218      %r12 = 0x0000000040E577D0
%rbp = 0x0000000040E566F0      %r13 = 0x0000000040E56718
%rsi = 0x0000000000000001      %r14 = 0x0000000000000004
%rdi = 0x0000000000000000      %r15 = 0x0000000040E56AA0
%rip = 0x00002ADAFE2CA325   %rflags = 0x0000000000010246
[3268] OpenMCL kernel debugger: x
Unhandled exception 11 at 0x2adafe2ca325, context->regs at #x40e55d88
Exception occurred while executing foreign code
? for help
[3268] OpenMCL kernel debugger: x
exception in foreign context
Exception occurred while executing foreign code
? for help
[3268] OpenMCL kernel debugger: x
Unhandled exception 11 at 0x2adafe2ca325, context->regs at #x40e55d88
Exception occurred while executing foreign code
? for help
[3268] OpenMCL kernel debugger: t
Current Thread Context Record (tcr) = 0x40e577d0
Control (C) stack area:  low = 0x40c04000, high = 0x40e58000
Value (lisp) stack area: low = 0x2aaaacfa1000, high = 0x2aaaad1b2000
Exception stack pointer = 0x40e56218

On Jul 17, 2008, at 3:54 PM, Gary Byers wrote:

>
>
> On Thu, 17 Jul 2008, Osei Poku wrote:
>
>> Hello,
>>
>> I updated today from svn but this thing happened again.  Again the  
>> PC was in the pthread memory region and %rdi was 0.  I verified  
>> that the fix (r9997 i think) was in my ccl working directory  
>> (somewhere in thread_manager.c right?).
>
> Yes; there are 3 calls to pthread_kill() in that file.  One of them
> (in resume_tcr()) is conditionalized out; the other two
> (in raise_thread_interrupt() and suspend_tcr()) should check
> to make sure that the thread that they'd pass as the first
> argument to pthread_kill() is non-zero before doing the call.
>
>>
>> My current version is:
>> Clozure Common Lisp Version 1.2-r10073M-RC1  (LinuxX8664)!
>>
>> Is there anything other than (rebuild-ccl :force t) that I need to  
>> do to recompile the c source for the lisp kernel?
>
> As Gail just pointed out, :full t (or :kernel t) is necessary
> in order to get the kernel updated. (:force t will recompile
> FASLs even if they're newer than the corresponding source;
> that's occasionally useful, but not really what you want here.)
>
> If the kernel that you're running had its modified date change
> by the rebuild process, it likely incorporates those changes.  If
> those changes didn't fix the problem, then I don't have a good
> guess as to what the problem is: there aren't too many places
> where the lisp calls into the threads library: it creates threads
> and sends them signals via pthread_kill().  (There's another place  
> where a thread will send itself a signal via pthread_kill(),
> but that is pretty much guaranteed to be a valid thread ...)
>
>
>>
>> Thanks,
>> Osei
>>
>> On Jul 9, 2008, at 3:05 PM, Gary Byers wrote:
>>
>>> --On July 9, 2008 2:26:56 PM -0400 Osei Poku <osei.poku <at> gmail.com> wrote:
>>>> Hi,
>>>> It crashed again for me.  This time I managed to grab the contents of
>>>> /proc/pid/maps before I killed it.  Logs of the tty session and memory
>>>> maps are attached.  I had also managed to update from the repository to
>>>> r9890-RC1.
>>>> Osei
>>> It seems to have crashed in the threads library (libpthread.so).
>>> There's a race condition in the code which suspends threads
>>> on entry to the GC: the thread that's running the GC looks
>>> at each thread that it wants to suspend to see if it's
>>> still alive (the data structure that represents a thread
>>> might still be around, even if the OS-level thread has
>>> exited.)  The suspending thread looks at the tcr->osid
>>> field of the target, notes that it's non-zero, then
>>> calls a function to send the OS-level thread a signal.
>>> That function accesses the tcr->osid field again (which,
>>> when non-zero, represents a POSIX thread ID) and calls
>>> pthread_kill().
>>> When a thread dies, it clears its tcr->osid field, so
>>> if the target thread dies between the point when the
>>> suspending thread looks and the point where it leaps,
>>> we wind up calling pthread_kill() with a first argument
>>> of 0, and it crashes.  That's consistent with the
>>> register information: we're somewhere in the threads
>>> library (possibly in pthread_kill()), and the register
>>> in which C functions receive their first argument (%rdi)
>>> is 0.
>>> I'll try to check in a fix for that (look before leaping)
>>> soon.  As I understand it, SLIME will sometimes (depending
>>> on the setting of a "communication style" variable)
>>> spawn a thread in which to run each form being evaluated
>>> (via C-M-x or whatever); whether that's a good idea or
>>> not, consing short-lived threads all the time is probably
>>> a good way to trigger this bug.  I don't use SLIME, and
>>> don't know what the consequences of changing the communication
>>> style variable would be.
>>
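
The race described in the quoted July 9 message is a check-then-act on tcr->osid: the field is read once for the test and again for the call. A minimal sketch of that shape and of the "look before leaping" fix follows. It is not the actual thread_manager.c code: demo_tcr, demo_signal_racy and demo_signal_checked are hypothetical names, and storing the thread ID as an integer assumes a platform such as Linux where pthread_t converts to and from uintptr_t.

```c
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a thread context record; NOT CCL's real TCR. */
typedef struct demo_tcr {
    _Atomic uintptr_t osid;   /* 0 once the OS-level thread has exited */
} demo_tcr;

/* Racy shape: the field is read twice, so the target thread can clear it
   between the check and the call, and pthread_kill() then receives 0. */
int demo_signal_racy(demo_tcr *tcr, int signo)
{
    if (atomic_load(&tcr->osid) != 0) {
        /* window: the target may exit and zero osid right here */
        return pthread_kill((pthread_t)atomic_load(&tcr->osid), signo);
    }
    return -1;
}

/* Look before leaping: read the field once, then test and use that
   same snapshot. Returns -1 if the thread was already marked dead. */
int demo_signal_checked(demo_tcr *tcr, int signo)
{
    uintptr_t id = atomic_load(&tcr->osid);
    if (id == 0) {
        return -1;
    }
    return pthread_kill((pthread_t)id, signo);
}
```

The snapshot only guarantees that pthread_kill() never sees 0. A thread that exits after the snapshot leaves a stale non-zero ID behind, and this sketch does nothing about that case.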
Osei Poku | 18 Jul 18:32

Re: Random crashing

More debug info... Sorry about the multiple emails; I'm figuring
things out as I go.

(gdb) info threads
   9 Thread 0x40263950 (LWP 3271)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   8 Thread 0x404c7950 (LWP 3272)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   7 Thread 0x4072b950 (LWP 3305)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   6 Thread 0x4098f950 (LWP 3306)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   5 Thread 0x40bf3950 (LWP 3307)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   4 Thread 0x40e57950 (LWP 6093)  0x00002adafe591bfb in read () from /lib64/libc.so.6
   3 Thread 0x4131f950 (LWP 6094)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   2 Thread 0x410bb950 (LWP 6095)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
   1 Thread 0x2adafe820880 (LWP 3268)  0x00002adafe2ca2cb in sem_timedwait () from /lib64/libpthread.so.0
(gdb) thread 4
[Switching to thread 4 (Thread 0x40e57950 (LWP 6093))]#0  0x00002adafe591bfb in read () from /lib64/libc.so.6
(gdb) bt
#0  0x00002adafe591bfb in read () from /lib64/libc.so.6
#1  0x00002adafe545553 in _IO_file_underflow () from /lib64/libc.so.6
#2  0x00002adafe545d0e in _IO_default_uflow () from /lib64/libc.so.6
#3  0x00002adafe541404 in getc () from /lib64/libc.so.6
#4  0x000000000041d43d in readc () at /usr/include/bits/stdio.h:43
#5  0x000000000041d590 in lisp_Debugger (xp=0x40e55d60, info=0x40e56110, why=11, in_foreign_code=1, message=0x40e55b10 "Unhandled exception 11 at 0x2adafe2ca325, context->regs at #x40e55d88") at ../lisp-debug.c:914
#6  0x000000000041a2c6 in signal_handler (signum=11, info=0x40e56110, context=0x40e55d60, tcr=0x40e577d0, old_valence=1) at ../x86-exceptions.c:1070
#7  <signal handler called>
#8  0x00002adafe2ca325 in sem_post () from /lib64/libpthread.so.0
#9  0x000000000041b3e2 in resume_tcr (tcr=0x415837d0) at ../thread_manager.c:1376
#10 0x000000000041c146 in lisp_resume_tcr (tcr=0x415837d0) at ../thread_manager.c:1418
#11 0x000000000041a0c8 in handle_exception (signum=<value optimized out>, info=0x40e56aa0, context=0x40e566f0, tcr=0x40e577d0, old_valence=0) at ../x86-exceptions.c:910
#12 0x000000000041a218 in signal_handler (signum=4, info=0x40e56aa0, context=0x40e566f0, tcr=0x40e577d0, old_valence=0) at ../x86-exceptions.c:1064
#13 <signal handler called>
#14 0x00003000400110ab in ?? ()
#15 0x000030004042660c in ?? ()
#16 0x000000000040e0ac in _SPnthrowvalues () at ../x86-spentry64.s:1404
#17 0x00002aaaad1b1110 in ?? ()
#18 0x0000000000000008 in ?? ()
#19 0x0000000000000000 in ?? ()
(gdb)
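
Frame #8 puts the fault inside sem_post(), called from resume_tcr(), at the same address (0x2adafe2ca325) as %rip in the register dump from my previous message, where %rdi was 0. That suggests, but does not prove, that sem_post() was handed a null semaphore pointer. As a sketch only, a guarded post could look like the code below. demo_tcr, its resume field and demo_resume are hypothetical names and are not taken from thread_manager.c.

```c
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical per-thread record; NOT the real CCL TCR layout. */
typedef struct demo_tcr {
    sem_t *resume;   /* semaphore a suspended thread waits on; may be NULL */
} demo_tcr;

/* Post the resume semaphore only if both the record and the semaphore
   exist. Returns 0 on success, -1 with errno set otherwise. */
int demo_resume(demo_tcr *tcr)
{
    if (tcr == NULL || tcr->resume == NULL) {
        errno = EINVAL;
        return -1;
    }
    return sem_post(tcr->resume);
}
```

A guard like this would only turn the crash into an error return. It would not explain why the pointer was null in the first place.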

