3 Jun 22:30
DIRECTORY calling DELETE-DUPLICATES is slow
From: Lynn Quam <quam <at> AI.SRI.COM>
Subject: DIRECTORY calling DELETE-DUPLICATES is slow
Newsgroups: gmane.lisp.cmucl.devel
Date: 2008-06-03 20:33:05 GMT
Subject: DIRECTORY calling DELETE-DUPLICATES is slow
Newsgroups: gmane.lisp.cmucl.devel
Date: 2008-06-03 20:33:05 GMT
I waited about 20 minutes for DIRECTORY to finish getting the pathnames of a large directory containing about 27000 files. The problem is that DIRECTORY calls DELETE-DUPLICATES which is implemented using an order n-squared algorithm! For large sequences, the use of a hash-table would make sense, but would cause consing. I better change would be to the DIRECTORY function itself. Note that the result of DELETE-DUPLICATES is passed to SORT. If the list were sorted first, then it is easy (order n) to remove duplicates.
RSS Feed