Joerg Sonnenberger | 7 Jun 2012 08:36

db(3) removal and lastlogx

Hi all,
there is a bit of mechanical work left to remove db(3) dependencies, e.g. for
the password database, but that's mostly that. The last real consumer of
the db(3) API in libc is the lastlogx handling in utmpx. Basically, this
logs a record in a file indexed by uid. I can think of three different
approaches for dealing with this:

(1) Just use a sparse file. This requires by far the least amount of
code, just some verification logic for a file header and writing to "uid
* size of entry". Writes should be short enough to ensure atomic writes
even on NFS. Same for reads. No manual locking needed.

(2) Implement a simple hierarchical byte-level index, e.g. start with a
table of 256 off_t entries, index it with the LSB, and continue from
there. For a read, hitting an off_t of 0 means the entry doesn't exist;
for a write, get a byte-range lock for that area, append a new index
page / the entry to the file, and use a single write to update it.
Provide an iterator interface to find all entries in the file. Somewhat
more involved, but not too much.

(3) Implement on-disk PATRICIA lookup. Still much simpler than a b-tree,
but also more work than (2).
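
A minimal sketch of approach (1), assuming a fixed-size record stored at
"header size + uid * size of entry" and accessed with one pread/pwrite
each. The record layout, LL_HEADER_SIZE and the ll_* function names are
hypothetical and only illustrate the idea; this is not the libc code, and
header verification is left out.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-size record; the real lastlogx entry differs. */
struct ll_record {
	int64_t	ll_time;	/* login time, seconds since the epoch */
	char	ll_line[32];	/* tty name */
	char	ll_host[216];	/* remote host */
};

#define LL_HEADER_SIZE	64	/* assumed room for magic, record size, padding */

/* Open (or create) the file; header verification is omitted here. */
static int
ll_open(const char *path)
{
	return open(path, O_RDWR | O_CREAT, 0644);
}

/* Byte offset of the record that belongs to uid. */
static off_t
ll_offset(uint32_t uid)
{
	return (off_t)LL_HEADER_SIZE +
	    (off_t)uid * (off_t)sizeof(struct ll_record);
}

/* Store a record with one positioned write: 0 on success, -1 on error. */
static int
ll_put(int fd, uint32_t uid, const struct ll_record *rec)
{
	assert(rec != NULL);
	return pwrite(fd, rec, sizeof(*rec), ll_offset(uid)) ==
	    (ssize_t)sizeof(*rec) ? 0 : -1;
}

/* Fetch a record: 1 if present, 0 if never written, -1 on error. */
static int
ll_get(int fd, uint32_t uid, struct ll_record *rec)
{
	static const struct ll_record zero;
	ssize_t n = pread(fd, rec, sizeof(*rec), ll_offset(uid));

	if (n < 0)
		return -1;
	if (n != (ssize_t)sizeof(*rec))
		return 0;	/* at or past end of file */
	return memcmp(rec, &zero, sizeof(*rec)) != 0;	/* a hole reads as zeros */
}
```

A caller would ll_open() the file once, ll_put() on login and ll_get() on
lookup; no lock is taken anywhere, which is the point of this variant.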

Comments?

Joerg

Alan Barrett | 7 Jun 2012 09:07

Re: db(3) removal and lastlogx

On Thu, 07 Jun 2012, Joerg Sonnenberger wrote:
> The last real consumer of the db(3) API in libc is the lastlogx
> handling in utmpx. Basically, this logs a record in a file 
> indexed by uid. I can think of three different approaches for 
> dealing with this:
>
> (1) Just use a sparse file. This requires by far the least 
> amount of code, just some verification logic for a file header 
> and writing to "uid * size of entry". Writes should be short 
> enough to ensure atomic writes even on NFS. Same for reads. No 
> manual locking needed.

With 32-bit uid_t, the file size could appear to be multiple 
gigabytes, even though not that much space is actually used.  This 
might confuse people and naive copy or backup processes.
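
To put a number on this: the apparent size is header + (largest uid + 1) *
record size, so a full 32-bit uid space costs 4 GiB of apparent size per
byte of record. The sketch below computes that and probes a sparse file
with fstat. The function names, the 64-byte header and the 256-byte record
used in the note are assumptions for illustration only.

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Apparent size (what ls -l reports) once a record for max_uid exists. */
static uint64_t
apparent_size(uint64_t header, uint64_t recsize, uint32_t max_uid)
{
	return header + ((uint64_t)max_uid + 1) * recsize;
}

/*
 * Write one byte at the end of the slot for uid and report what stat sees.
 * Returns 0 on success, -1 on error.
 */
static int
sparse_probe(const char *path, uint64_t header, uint64_t recsize,
    uint32_t uid, struct stat *st)
{
	int fd, rv = -1;
	off_t last = (off_t)(apparent_size(header, recsize, uid) - 1);

	assert(recsize > 0);
	if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) < 0)
		return -1;
	if (pwrite(fd, "", 1, last) == 1 && fstat(fd, st) == 0)
		rv = 0;
	close(fd);
	return rv;
}
```

With a hypothetical 256-byte record, apparent_size(0, 256, UINT32_MAX) is
2^40 bytes (1 TiB). After sparse_probe(), comparing st_size with
st_blocks * 512 (the customary unit, not something POSIX guarantees) shows
how far apparent and allocated size diverge on a file system with holes.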

>(2) Implement a simple hierarchical byte-level index, [...]
>(3) Implement on-disk PATRICIA lookup. [...]

(4) Use sqlite.

(0) continue to use db(3).

--apb (Alan Barrett)

Joerg Sonnenberger | 7 Jun 2012 09:25

Re: db(3) removal and lastlogx

On Thu, Jun 07, 2012 at 09:07:44AM +0200, Alan Barrett wrote:
> On Thu, 07 Jun 2012, Joerg Sonnenberger wrote:
> >The last real consumer of the db(3) API in libc is the lastlogx
> >handling in utmpx. Basically, this logs a record in a file indexed
> >by uid. I can think of three different approaches for dealing with
> >this:
> >
> >(1) Just use a sparse file. This requires by far the least amount
> >of code, just some verification logic for a file header and
> >writing to "uid * size of entry". Writes should be short enough to
> >ensure atomic writes even on NFS. Same for reads. No manual
> >locking needed.
> 
> With 32-bit uid_t, the file size could appear to be multiple
> gigabytes, even though not that much space is actually used.  This
> might confuse people and naive copy or backup processes.

Only if you actually use uids that large.

> >(2) Implement a simple hierarchical byte-level index, [...]
> >(3) Implement on-disk PATRICIA lookup. [...]
> 
> (4) Use sqlite.

This is still libc we are talking about. So no, sqlite is not
acceptable.

> (0) continue to use db(3).

I don't really consider this an option either. Especially for a file
where updates happen all the time and e.g. the crash reliability of
db(3) is a real concern.

David Holland | 7 Jun 2012 10:11

Re: db(3) removal and lastlogx

On Thu, Jun 07, 2012 at 09:25:41AM +0200, Joerg Sonnenberger wrote:
 > > >(1) Just use a sparse file. This requires by far the least amount
 > > >of code, just some verification logic for a file header and
 > > >writing to "uid * size of entry". Writes should be short enough to
 > > >ensure atomic writes even on NFS. Same for reads. No manual
 > > >locking needed.

Historically, the lastlog file has always been exactly this. I see no
reason to do anything else.

 > I don't really consider this an option either. Especially for a file
 > where updates happen all the time and e.g. the crash reliability of
 > db(3) is a real concern.

The lastlog file isn't exactly mission-critical data.

-- 
David A. Holland
dholland <at> netbsd.org

David Laight | 7 Jun 2012 09:09

Re: db(3) removal and lastlogx

On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
> Hi all,
> there is a bit of mechanical work left to remove db(3) dependencies, e.g. for
> the password database, but that's mostly that. The last real consumer of
> the db(3) API in libc is the lastlogx handling in utmpx. Basically, this
> logs a record in a file indexed by uid. I can think of three different
> approaches for dealing with this:
> 
> (1) Just use a sparse file. This requires by far the least amount of
> code, just some verification logic for a file header and writing to "uid
> * size of entry". Writes should be short enough to ensure atomic writes
> even on NFS. Same for reads. No manual locking needed.

Isn't that the 'traditional' format?
Presumably, at some point, that was replaced by the db format.

> (2) Implement a simple hierarchical byte-level index, e.g. start with a
> table of 256 off_t entries, index it with the LSB, and continue from
> there. For a read, hitting an off_t of 0 means the entry doesn't exist;
> for a write, get a byte-range lock for that area, append a new index
> page / the entry to the file, and use a single write to update it.
> Provide an iterator interface to find all entries in the file. Somewhat
> more involved, but not too much.

A variant on that is to make the indexed 'off_t' either a pointer to
the actual data, or a pointer to the next level index block.
The data blocks need to contain their own value - and a check done.
(or put the number into the index array).
The lookup sequence is a simple loop - you don't need to know the
table level.
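
The lookup loop described above can be sketched in memory as follows.
Everything here is an assumption made for illustration: blocks are array
elements rather than file offsets, each block carries a type tag so the
loop can tell index from data, and the names (struct node, lookup) are
made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NODE_INDEX = 1, NODE_DATA = 2 };	/* assumed per-block type tag */

struct node {
	uint32_t type;		/* NODE_INDEX or NODE_DATA */
	uint32_t uid;		/* NODE_DATA: the uid this record belongs to */
	uint32_t child[256];	/* NODE_INDEX: node number per byte, 0 = empty */
};

/*
 * Walk from the root (node 0), consuming the uid one byte at a time starting
 * with the least significant byte. The loop never needs to know the level:
 * it stops at the first data node and checks that the stored uid matches.
 */
static const struct node *
lookup(const struct node *file, size_t nnodes, uint32_t uid)
{
	const struct node *n = file;
	uint32_t key = uid;

	assert(nnodes > 0);
	for (int depth = 0; depth < 5; depth++) {	/* 4 index levels + data */
		if (n->type == NODE_DATA)
			return n->uid == uid ? n : NULL;
		if (n->type != NODE_INDEX)
			return NULL;			/* corrupt block */
		uint32_t next = n->child[key & 0xff];
		if (next == 0 || next >= nnodes)
			return NULL;			/* no such entry */
		n = &file[next];
		key >>= 8;
	}
	return NULL;
}
```

The uid check on the data node is what lets a slot point straight at a
record: a different uid sharing the same low byte lands on that record and
is rejected instead of being returned by mistake.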

Joerg Sonnenberger | 8 Jun 2012 21:56

Re: db(3) removal and lastlogx

On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
> (1) Just use a sparse file. This requires by far the least amount of
> code, just some verification logic for a file header and writing to "uid
> * size of entry". Writes should be short enough to ensure atomic writes
> even on NFS. Same for reads. No manual locking needed.

After some thinking, I will go with this route. File format will look
like:

magic number
record size
64KB of used bytes, set if any uid / 64K is present.

64K blocks of:
64KB of used bytes, set if uid / 64K matches the block number and uid %
64K is present.

64K records of given size

Each record has a version number at the beginning.

lastlogx2 (or whatever it ends up being called) gets created at boot, so
that the next time the format changes, statically linked programs can
just continue to
write the old format, they might just not be able to read entries.
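
One way to read the layout above is as plain offset arithmetic. The sketch
assumes a 64-byte header (magic, record size, padding), one byte per marker
and blocks laid out back to back; LL_HDRSIZE and the function names are
hypothetical, and the actual on-disk format may differ.

```c
#include <assert.h>
#include <stdint.h>

#define LL_SLOTS	65536u	/* 64K entries per byte map and per block */
#define LL_HDRSIZE	64u	/* assumed: magic, record size, zero padding */

/* Offset of the top-level "used" byte covering uid / 64K. */
static uint64_t
top_marker_off(uint32_t uid)
{
	return LL_HDRSIZE + uid / LL_SLOTS;
}

/* Offset of the block (byte map followed by records) that holds uid. */
static uint64_t
block_off(uint32_t uid, uint32_t recsize)
{
	uint64_t blocksize = LL_SLOTS + (uint64_t)LL_SLOTS * recsize;

	assert(recsize > 0);
	return LL_HDRSIZE + LL_SLOTS + (uint64_t)(uid / LL_SLOTS) * blocksize;
}

/* Offset of the per-block "used" byte for uid % 64K. */
static uint64_t
block_marker_off(uint32_t uid, uint32_t recsize)
{
	return block_off(uid, recsize) + uid % LL_SLOTS;
}

/* Offset of the record itself. */
static uint64_t
record_off(uint32_t uid, uint32_t recsize)
{
	return block_off(uid, recsize) + LL_SLOTS +
	    (uint64_t)(uid % LL_SLOTS) * recsize;
}
```

Under these assumptions a scan for used uids reads the 64 KB top-level map
and then only the per-block maps whose top-level byte is set, without
touching the record areas.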

Joerg

David Laight | 8 Jun 2012 22:24

Re: db(3) removal and lastlogx

On Fri, Jun 08, 2012 at 09:56:28PM +0200, Joerg Sonnenberger wrote:
> On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
> > (1) Just use a sparse file. This requires by far the least amount of
> > code, just some verification logic for a file header and writing to "uid
> > * size of entry". Writes should be short enough to ensure atomic writes
> > even on NFS. Same for reads. No manual locking needed.
> 
> After some thinking, I will go with this route. File format will look
> like:
> 
> magic number
> record size

Must be worth adding some extra 0 bytes here.

> 64KB of used bytes, set if any uid / 64K is present.
> 
> 64K blocks of:
{
> 64KB of used bytes, set if uid / 64K matches the block number and uid %
> 64K is present.
> 
> 64K records of given size
}

I'm not sure I follow the above, possibly some { ... } might help!

I presume the byte maps are intended to make it possible to scan for
used uid values without using the 'skip hole in file' functions?


Joerg Sonnenberger | 8 Jun 2012 22:38

Re: db(3) removal and lastlogx

On Fri, Jun 08, 2012 at 09:24:54PM +0100, David Laight wrote:
> On Fri, Jun 08, 2012 at 09:56:28PM +0200, Joerg Sonnenberger wrote:
> > On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
> > > (1) Just use a sparse file. This requires by far the least amount of
> > > code, just some verification logic for a file header and writing to "uid
> > > * size of entry". Writes should be short enough to ensure atomic writes
> > > even on NFS. Same for reads. No manual locking needed.
> > 
> > After some thinking, I will go with this route. File format will look
> > like:
> > 
> > magic number
> > record size
> 
> Must be worth adding some extra 0 bytes here.

Sure.

> > 64KB of used bytes, set if any uid / 64K is present.
> > 
> > 64K blocks of:
> {
> > 64KB of used bytes, set if uid / 64K matches the block number and uid %
> > 64K is present.
> > 
> > 64K records of given size
> }
> 
> I'm not sure I follow the above, possibly some { ... } might help!
> 

David Laight | 8 Jun 2012 22:46

Re: db(3) removal and lastlogx

On Fri, Jun 08, 2012 at 10:38:56PM +0200, Joerg Sonnenberger wrote:
> 
> > I think your access scheme is:
> > 
> > Most accesses are writes, so the code must read the file header first
> > to verify the magic etc.
> > So the header could contain the offset to the first block as well.
> > It can then read the required entry before writing it back, if the
> > read value was all-zero it must write out the two marker bytes.
> 
> Actually, the other way around. First write the entry and then check if
> the marker bytes are set.

You can save the read of the marker bytes by checking that the entry
itself is non-zero. So you only need to read the part of the disk
containing the markers for new entries.
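
A sketch of that write path, combining both suggestions: write the entry
first, and only look at the marker bytes when the slot was previously all
zero. The offsets are passed in by the caller; REC_MAX, set_marker and
put_entry are hypothetical names, and the sketch assumes a valid record is
never all zero (for instance because it starts with a non-zero version
number).

```c
#define _FILE_OFFSET_BITS 64
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define REC_MAX	512	/* assumed upper bound on the record size */

/* Set the one-byte marker at off unless it is already set. */
static int
set_marker(int fd, off_t off)
{
	unsigned char m = 0;

	if (pread(fd, &m, 1, off) < 0)
		return -1;
	if (m != 0)
		return 0;
	m = 1;
	return pwrite(fd, &m, 1, off) == 1 ? 0 : -1;
}

/*
 * Write the entry first, then touch the markers only if the slot was
 * previously all zero (that is, this is a new entry).
 */
static int
put_entry(int fd, off_t rec_off, off_t blk_mark, off_t top_mark,
    const void *rec, size_t len)
{
	static const unsigned char zero[REC_MAX];
	unsigned char old[REC_MAX];

	assert(len > 0 && len <= REC_MAX);
	memset(old, 0, sizeof(old));		/* short reads leave zeros */
	if (pread(fd, old, len, rec_off) < 0)
		return -1;
	if (pwrite(fd, rec, len, rec_off) != (ssize_t)len)
		return -1;
	if (memcmp(old, zero, len) != 0)
		return 0;			/* existing entry: skip markers */
	if (set_marker(fd, blk_mark) < 0)
		return -1;
	return set_marker(fd, top_mark);
}
```

One consequence of the shortcut is that a crash between the entry write and
the marker writes leaves an entry whose markers are never repaired by later
logins, since the slot is no longer zero. Whether that matters depends on
how much a scanner relies on the maps.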

Hmmm.... how big a uid can you save on a 1k/8k ffs filesystem?
(ie how big do the triple-indirect blocks go).
You might have a uid limit below 2^32.
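
A rough way to answer this, assuming the classic UFS inode with 12 direct
block pointers plus single, double and triple indirect blocks, and block
pointers of 4 bytes (UFS1) or 8 bytes (UFS2). The record size, the flat
"header + uid * record size" layout and the function names are
illustrative, and any other limit the file system or kernel imposes is
ignored.

```c
#include <assert.h>
#include <stdint.h>

#define NDADDR	12	/* direct block pointers in a UFS inode */

/* File blocks addressable via direct plus 1x/2x/3x indirect blocks. */
static uint64_t
max_file_blocks(uint64_t bsize, uint64_t ptrsize)
{
	uint64_t n;

	assert(ptrsize > 0 && bsize >= ptrsize);
	n = bsize / ptrsize;		/* pointers per indirect block */
	return NDADDR + n + n * n + n * n * n;
}

/* Largest uid whose record still fits below the maximum file size. */
static uint64_t
max_uid(uint64_t bsize, uint64_t ptrsize, uint64_t header, uint64_t recsize)
{
	uint64_t bytes = max_file_blocks(bsize, ptrsize) * bsize;

	return (bytes - header) / recsize - 1;
}
```

With 1 KiB blocks and 4-byte pointers this gives 16843020 blocks, roughly
16 GiB, so a hypothetical 256-byte record caps the uid at about 67 million,
well below 2^32. With 8 KiB blocks the same arithmetic yields tens of
terabytes and the indirect-block limit no longer gets in the way.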

And, if someone does create a large uid you do have a very large (if
sparse) file.

	David

-- 
David Laight: david <at> l8s.co.uk


Martin Husemann | 9 Jun 2012 08:44

Re: db(3) removal and lastlogx

I missed the start of this thread and have a stupid question: why are we
trying to get rid of db(3) in libc?

Martin

Matthew Mondor | 10 Jun 2012 20:14

Re: db(3) removal and lastlogx

On Sat, 9 Jun 2012 08:44:54 +0200
Martin Husemann <martin <at> duskware.de> wrote:

> I missed the start of this thread and have a stupid question: why are we
> trying to get rid of db(3) in libc?

I wonder exactly the same thing:

Why get rid of a mature, versatile system, already in libc under an
acceptable license and size, that even currently does a good job at
maintaining the lastlog without the sparse file issues?  I also expect
db(3) to be used by third parties, as it's been part of BSD for a long
time (and db4 is under a more restrictive license)...

Do we have kernel support for efficiently detecting file system holes,
and are our copying/moving/archiving tools dealing gracefully with
sparse files by default (unlikely)?

Thanks,
-- 
Matt

David Holland | 22 Jul 2012 06:07

Re: db(3) removal and lastlogx

On Sun, Jun 10, 2012 at 02:14:27PM -0400, Matthew Mondor wrote:
 > > I missed the start of this thread and have a stupid question: why are we
 > > trying to get rid of db(3) in libc?
 > 
 > I wonder exactly the same thing:

Because the db 1.85 we have in libc is severely outdated and has a
bunch of known and basically unfixable bugs, and we can't move to db4
or db5 because of licensing.

-- 
David A. Holland
dholland <at> netbsd.org

Christos Zoulas | 22 Jul 2012 16:19

Re: db(3) removal and lastlogx

In article <20120722040706.GA28095 <at> netbsd.org>,
David Holland  <dholland-tech <at> netbsd.org> wrote:
>On Sun, Jun 10, 2012 at 02:14:27PM -0400, Matthew Mondor wrote:
> > > I missed the start of this thread and have a stupid question: why are we
> > > trying to get rid of db(3) in libc?
> > 
> > I wonder exactly the same thing:
>
>Because the db 1.85 we have in libc is severely outdated

Yes

>and has a bunch of known and basically unfixable bugs,

Like which ones? Are there PRs? To my knowledge we have the only db1.85
that works. I don't like db much, I am just curious..

>and we can't move to db4 or db5 because of licensing.

Yes.

christos

David Holland | 22 Jul 2012 19:10

Re: db(3) removal and lastlogx

On Sun, Jul 22, 2012 at 02:19:27PM +0000, Christos Zoulas wrote:
 > > > > I missed the start of this thread and have a stupid question:
 > > > > why are we
 > > > > trying to get rid of db(3) in libc?
 > > > 
 > > > I wonder exactly the same thing:
 > >
 > >Because the db 1.85 we have in libc is severely outdated
 > 
 > Yes
 > 
 > >and has a bunch of known and basically unfixable bugs,
 > 
 > Like which ones? Are there PRs? To my knowledge we have the only db1.85
 > that works. I don't like db much, I am just curious..

Well, the major problem is that the old suggested way of handling
multiple writers doesn't quite work and can't really be made to work.

There's one I think I have on file somewhere (although there doesn't
appear to be a PR) and according to internal sources at Sleepycat a
long time ago there's a small but nonzero number of problems with 1.85
that can't be fixed without changing the file format... even if we
wanted to get into that, it's not really a good idea.

 > >and we can't move to db4 or db5 because of licensing.
 > 
 > Yes.

Well, we could; just not in libc.

Christos Zoulas | 22 Jul 2012 20:52

Re: db(3) removal and lastlogx

On Jul 22,  5:10pm, dholland-tech <at> netbsd.org (David Holland) wrote:
-- Subject: Re: db(3) removal and lastlogx

| Well, the major problem is that the old suggested way of handling
| multiple writers doesn't quite work and can't really be made to work.

Yes, multiple writers cannot be supported.

| There's one I think I have on file somewhere (although there doesn't
| appear to be a PR) and according to internal sources at Sleepycat a
| long time ago there's a small but nonzero number of problems with 1.85
| that can't be fixed without changing the file format... even if we
| wanted to get into that, it's not really a good idea.

Yes, I heard about that too a long while ago. Yes, there were bugs in
the page handling code and could be others in the way pages are spilled,
but we have not seen any recently. We probably have the only functioning
1.85 that passes the torture tests...

|  > >and we can't move to db4 or db5 because of licensing.
| 
| Well, we could; just not in libc.

Not worth it I think; they have the Oracle license now.

christos

Matthew Mondor | 22 Jul 2012 23:29

Re: db(3) removal and lastlogx

On Sun, 22 Jul 2012 14:52:49 -0400
christos <at> zoulas.com (Christos Zoulas) wrote:

> |  > >and we can't move to db4 or db5 because of licensing.
> | 
> | Well, we could; just not in libc.
> 
> Not worth it I think; they have the Oracle license now.

Indeed, db4 is being increasingly replaced by sqlite in projects,
partly because of the licensing issues, but also because of its
simple interface, and I agree that db4 is unsuitable for the base system.

I was not aware of the problems with the old BDB we have, thanks to
those who discussed the details.
-- 
Matt

Joerg Sonnenberger | 11 Jun 2012 13:44

Re: db(3) removal and lastlogx

On Sat, Jun 09, 2012 at 08:44:54AM +0200, Martin Husemann wrote:
> I missed the start of this thread and have a stupid question: why are we
> trying to get rid of db(3) in libc?

Because as far as database implementation goes, it is extremely flawed.
The biggest issue is that every program using db(3) in a read-write
environment has to deal with inconsistent data, if the system might have
crashed during a change cycle. I'm not even talking about transactional
integrity, but just plain old "random output".

Using db(3) as a constant database is inefficient at best. It comes with
both a significant overhead in terms of database size and CPU cycles.

Considering lastlogx, it seems like the db(3) use provides an
interesting DoS vector. If I run login and SIGSTOP it after it opened
the file with an exclusive lock, but before it finished writing the entry,
it should prevent other instances from doing the same. In other words,
we really want to have a file format that works without any exclusive
lock on the (database) file.
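
The stall can be shown in miniature with flock(2): as long as one open
file holds an exclusive lock, every other attempt is refused (or, without
LOCK_NB, blocks). try_exclusive is a hypothetical helper, and flock here
stands in for whatever exclusive-lock mechanism the db(3) open path uses.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive lock without blocking: 1 got it, 0 busy, -1 error. */
static int
try_exclusive(int fd)
{
	assert(fd >= 0);
	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		return 1;
	return errno == EWOULDBLOCK ? 0 : -1;
}
```

If the descriptor that won belongs to a process that has been stopped with
SIGSTOP, nothing releases the lock until that process is continued or
killed, so every later login waits on it.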

Joerg

YAMAMOTO Takashi | 12 Jun 2012 04:25

Re: db(3) removal and lastlogx

hi,

> Considering lastlogx, it seems like the db(3) use provides an
> interesting DoS vector. If I run login and SIGSTOP it after it opened
> the file with an exclusive lock, but before it finished writing the entry,
> it should prevent other instances from doing the same. In other words,
> we really want to have a file format that works without any exclusive
> lock on the (database) file.

does your proposed format work without locks?

YAMAMOTO Takashi

Joerg Sonnenberger | 12 Jun 2012 18:03

Re: db(3) removal and lastlogx

On Tue, Jun 12, 2012 at 02:25:42AM +0000, YAMAMOTO Takashi wrote:
> hi,
> 
> > Considering lastlogx, it seems like the db(3) use provides an
> > interesting DoS vector. If I run login and SIGSTOP it after it opened
> > the file with an exclusive lock, but before it finished writing the entry,
> > it should prevent other instances from doing the same. In other words,
> > we really want to have a file format that works without any exclusive
> > lock on the (database) file.
> 
> does your proposed format work without locks?

Variant 1 (sparse file) yes. The others could be made to work using CAS
and mmap, but would leak space in the race case for initial inserts.
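
The race can be sketched with a C11 compare-and-swap on a single index
slot. This is an in-memory illustration only: in the scheme described, the
slot would live in a shared mapping of the file, and publish_slot is a
hypothetical name.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Publish new_off in an empty (zero) index slot. Returns the offset that
 * ends up in the slot: ours if we won the race, otherwise the winner's,
 * in which case the space already appended at new_off is leaked.
 */
static uint64_t
publish_slot(_Atomic uint64_t *slot, uint64_t new_off)
{
	uint64_t expected = 0;

	assert(new_off != 0);		/* 0 is reserved for "no entry" */
	if (atomic_compare_exchange_strong(slot, &expected, new_off))
		return new_off;
	return expected;
}
```

Both writers must append their page or record before attempting the swap,
so the loser has already consumed file space that nothing points to.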

Joerg

Alistair Crooks | 13 Jul 2012 06:35

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 01:44:57PM +0200, Joerg Sonnenberger wrote:
> On Sat, Jun 09, 2012 at 08:44:54AM +0200, Martin Husemann wrote:
> > I missed the start of this thread and have a stupid question: why are we
> > trying to get rid of db(3) in libc?
> 
> Because as far as database implementation goes, it is extremely flawed.
> The biggest issue is that every program using db(3) in a read-write
> environment has to deal with inconsistent data, if the system might have
> crashed during a change cycle. I'm not even talking about transactional
> integrity, but just plain old "random output".

We've been using db for this for almost 20 years, I fail to see why
it's just become a problem just recently.

> Using db(3) as a constant database is inefficient at best. It comes with
> both a significant overhead in terms of database size and CPU cycles.

And yet disk space is not quite as stretched as it once was, and CPUs
are way more powerful than they used to be.  Even tier 3 platforms, or
those from 20 years ago, could perform db queries efficiently.  So
these aren't real issues.

All in all, I'm still trying to find out what problem you're trying to
solve.

It would be a non-issue if a db-wrapper for the cdb was used.  Then we
would not be worrying about modifying source code needlessly.

Regards,
Alistair

Alistair Crooks | 21 Jul 2012 19:56

Re: db(3) removal and lastlogx

On Fri, Jul 13, 2012 at 06:35:27AM +0200, Alistair Crooks wrote:
> On Mon, Jun 11, 2012 at 01:44:57PM +0200, Joerg Sonnenberger wrote:
> > On Sat, Jun 09, 2012 at 08:44:54AM +0200, Martin Husemann wrote:
> > > I missed the start of this thread and have a stupid question: why are we
> > > trying to get rid of db(3) in libc?
> > 
> > Because as far as database implementation goes, it is extremely flawed.
> > The biggest issue is that every program using db(3) in a read-write
> > environment has to deal with inconsistent data, if the system might have
> > crashed during a change cycle. I'm not even talking about transactional
> > integrity, but just plain old "random output".
> 
> We've been using db for this for almost 20 years, I fail to see why
> it's just become a problem just recently.
>  
> > Using db(3) as a constant database is inefficient at best. It comes with
> > both a significant overhead in terms of database size and CPU cycles.
> 
> And yet disk space is not quite as stretched as it once was, and CPUs
> are way more powerful than they used to be.  Even tier 3 platforms, or
> those from 20 years ago, could perform db queries efficiently.  So
> these aren't real issues.
> 
> All in all, I'm still trying to find out what problem you're trying to
> solve.
> 
> It would be a non-issue if a db-wrapper for the cdb was used.  Then we
> would not be worrying about modifying source code needlessly.
> 
> Regards,

Joerg Sonnenberger | 22 Jul 2012 00:45

Re: db(3) removal and lastlogx

On Sat, Jul 21, 2012 at 07:56:45PM +0200, Alistair Crooks wrote:
> On Fri, Jul 13, 2012 at 06:35:27AM +0200, Alistair Crooks wrote:
> > On Mon, Jun 11, 2012 at 01:44:57PM +0200, Joerg Sonnenberger wrote:
> > > On Sat, Jun 09, 2012 at 08:44:54AM +0200, Martin Husemann wrote:
> > > > I missed the start of this thread and have a stupid question: why are we
> > > > trying to get rid of db(3) in libc?
> > > 
> > > Because as far as database implementation goes, it is extremely flawed.
> > > The biggest issue is that every program using db(3) in a read-write
> > > environment has to deal with inconsistent data, if the system might have
> > > crashed during a change cycle. I'm not even talking about transactional
> > > integrity, but just plain old "random output".
> > 
> > We've been using db for this for almost 20 years, I fail to see why
> > it's just become a problem just recently.
> >  
> > > Using db(3) as a constant database is inefficient at best. It comes with
> > > both a significant overhead in terms of database size and CPU cycles.
> > 
> > And yet disk space is not quite as stretched as it once was, and CPUs
> > are way more powerful than they used to be.  Even tier 3 platforms, or
> > those from 20 years ago, could perform db queries efficiently.  So
> > these aren't real issues.
> > 
> > All in all, I'm still trying to find out what problem you're trying to
> > solve.
> > 
> > It would be a non-issue if a db-wrapper for the cdb was used.  Then we
> > would not be worrying about modifying source code needlessly.
> > 

Alistair Crooks | 22 Jul 2012 08:20

Re: db(3) removal and lastlogx

On Sun, Jul 22, 2012 at 12:45:49AM +0200, Joerg Sonnenberger wrote:
> > Well, in lieu of any supporting arguments for the migration of db to cdb
> > format, let's revert them all.
> 
> "I don't agree with your arguments, so you haven't given any." You are
> not being helpful.

I did ask a week ago (July 13th), and got no reply.  Others have asked
the same thing.  I guess we should all try to be more helpful.

I've seen the figures for raw database builds for cdb vs db-1.85, and
they are impressive.  But various databases are built at different
times - some at boot time, some when they change (like the db ones),
some at build time, like the terminfo db.

But these are not common occurrences, and the run time is swamped by
everything else happening on the machine.

You state that db recovery is problematic and may not be possible, but
we've had 20 years in the base system of db recovery - it's never been
an issue until now. cdb recovers a whole lot better? How can we tell?
It sounds like a non-problem has been solved.

When this migration first happened, it was pointed out that a
db-like interface for cdb would mean that much less would have to be
changed in order to transition from db to cdb.  That still seems
valid, and I don't recall any further discussion of this, other than
it being dismissed out of hand.  Why was that?

So, all in all, it would really help me a lot if you could tell us

Christos Zoulas | 22 Jul 2012 01:28

Re: db(3) removal and lastlogx

In article <20120721175645.GM20677 <at> nef.pbox.org>,
Alistair Crooks  <agc <at> pkgsrc.org> wrote:
>
>Well, in lieu of any supporting arguments for the migration of db to cdb
>format, let's revert them all.

Aside from the compatibility issues (which I believe are mostly fine), the cdb
changes for the read-only databases are a strict improvement.

>Especially in view of the fact that marking terminfo.db 'obsolete' has
>broken backwards compatibility for standalone-tcsh, to name but one.

That was not joerg's fault, or related to cdb. This was part of the termcap
/terminfo migration.

christos

Alistair Crooks | 22 Jul 2012 08:27

Re: db(3) removal and lastlogx

On Sat, Jul 21, 2012 at 11:28:38PM +0000, Christos Zoulas wrote:
> In article <20120721175645.GM20677 <at> nef.pbox.org>,
> Alistair Crooks  <agc <at> pkgsrc.org> wrote:
> >
> >Well, in lieu of any supporting arguments for the migration of db to cdb
> >format, let's revert them all.
> 
> Aside from the compatibility issues (which I believe are mostly fine), the cdb
> changes for the read-only databases are a strict improvement.

Creates or reads, or both?

Recovery was another issue which was flagged under db1 as being
problematic - how is it done for cdb?  How is it superior?  How
does this relate to a readonly db? For readwrite dbs like the passwd
ones?

Transactions - how do they relate to a readonly database? I've seen
that used as justification too.

And the compat issues - since we'll have to keep the db1 code in libc
- they're kinda difficult, especially if we have any statically-linked
programs which use termcap/terminfo, or the user databases, or
services, etc.

> >Especially in view of the fact that marking terminfo.db 'obsolete' has
> >broken backwards compatibility for standalone-tcsh, to name but one.
> 
> That was not joerg's fault, or related to cdb. This was part of the termcap
> /terminfo migration.

Christos Zoulas | 22 Jul 2012 16:02

Re: db(3) removal and lastlogx

On Jul 22,  8:27am, agc <at> pkgsrc.org (Alistair Crooks) wrote:
-- Subject: Re: db(3) removal and lastlogx

| Creates or reads, or both?

reads.

| Recovery was another issue which was flagged under db1 as being
| problematic - how is it done for cdb?  How is it superior?  How
| does this relate to a readonly db? For readwrite dbs like the passwd
| ones?

It is not, it is read only.

| Transactions - how do they relate to a readonly database? I've seen
| that used as justification too.

I don't think that this is a valid justification.

| And the compat issues - since we'll have to keep the db1 code in libc
| - they're kinda difficult, especially if we have any statically-linked
| programs which use termcap/terminfo, or the user databases, or
| services, etc.

It is harder to remove the code if it is used. Having said that, I don't
see a pressing reason to remove it.

christos


David Holland | 22 Jul 2012 19:14

Re: db(3) removal and lastlogx

On Sun, Jul 22, 2012 at 08:27:27AM +0200, Alistair Crooks wrote:
 > And the compat issues - since we'll have to keep the db1 code in libc
 > - they're kinda difficult, especially if we have any statically-linked
 > programs which use termcap/terminfo, or the user databases, or
 > services, etc.

If we ever do the mythical libc bump, we'd like to be ready to remove
it then, i.e., have no more users within libc.

-- 
David A. Holland
dholland <at> netbsd.org

Joerg Sonnenberger | 22 Jul 2012 20:58

Re: db(3) removal and lastlogx

On Sun, Jul 22, 2012 at 08:27:27AM +0200, Alistair Crooks wrote:
> On Sat, Jul 21, 2012 at 11:28:38PM +0000, Christos Zoulas wrote:
> > In article <20120721175645.GM20677 <at> nef.pbox.org>,
> > Alistair Crooks  <agc <at> pkgsrc.org> wrote:
> > >
> > >Well, in lieu of any supporting arguments for the migration of db to cdb
> > >format, let's revert them all.
> > 
> > Aside from the compatibility issues (which I believe are mostly fine), the cdb
> > changes for the read-only databases are a strict improvement.
> 
> Creates or reads, or both?

Both. I think we talked enough about creation already. The read path in
cdbr is a slightly fancy memory-mapped hash table. No locking is
involved and all processes directly use the kernel file cache. db185
doesn't use mmap and has its own set of userland buffers. I imagine the
resulting thread-safety issues are one of the reasons for the "giant
lock" in the nsdispatch layer, to name only one popular place.

> Recovery was another issue which was flagged under db1 as being
> problematic - how is it done for cdb?  How is it superior?  How
> does this relate to a readonly db? For readwrite dbs like the passwd
> ones?
> 
> Transactions - how do they relate to a readonly database? I've seen
> that used as justification too.

Just to reduce the general amount of confusion. Both points (recovery
and transaction handling) are about db185 being outdated in general, not

David Holland | 11 Jun 2012 18:55

Re: db(3) removal and lastlogx

On Fri, Jun 08, 2012 at 09:56:28PM +0200, Joerg Sonnenberger wrote:
 > On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
 > > (1) Just use a sparse file. This requires by far the least amount of
 > > code, just some verification logic for a file header and writing to "uid
 > > * size of entry". Writes should be short enough to ensure atomic writes
 > > even on NFS. Same for reads. No manual locking needed.
 > 
 > After some thinking, I will go with this route. File format will look
 > like:
 > 
 > magic number
 > record size
 > 64KB of used bytes, set if any uid / 64K is present.

This seems pointless?

 > 64K blocks of:
 > 64KB of used bytes, set if uid / 64K matches the block number and uid %
 > 64K is present.

As does this.

 > 64K records of given size
 > 
 > Each record has a version number at the beginning.
 > 
 > lastlogx2 or however to call it gets created at boot, so that the next
 > time the format changes, statically linked programs can just continue to
 > write the old format, they might just not be able to read entries.
 > 

Manuel Bouyer | 11 Jun 2012 18:21

Re: db(3) removal and lastlogx

On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
> Hi all,
> there is a bit of mechanical work left to remove db(3) dependencies, e.g. for
> the password database, but that's mostly that. The last real consumer of
> the db(3) API in libc is the lastlogx handling in utmpx. Basically, this
> logs a record in a file indexed by uid. I can think of three different
> approaches for dealing with this:
> 
> (1) Just use a sparse file. This requires by far the least amount of
> code, just some verification logic for a file header and writing to "uid
> * size of entry". Writes should be short enough to ensure atomic writes
> even on NFS. Same for reads. No manual locking needed.

With 32bit uids, this will be very large files. We should avoid this.

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference

David Holland | 11 Jun 2012 18:50

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 06:21:23PM +0200, Manuel Bouyer wrote:
 > On Thu, Jun 07, 2012 at 08:36:36AM +0200, Joerg Sonnenberger wrote:
 > > Hi all,
 > > there is a bit of mechanical work left to remove db(3) dependencies, e.g. for
 > > the password database, but that's mostly that. The last real consumer of
 > > the db(3) API in libc is the lastlogx handling in utmpx. Basically, this
 > > logs a record in a file indexed by uid. I can think of three different
 > > approaches for dealing with this:
 > > 
 > > (1) Just use a sparse file. This requires by far the least amount of
 > > code, just some verification logic for a file header and writing to "uid
 > > * size of entry". Writes should be short enough to ensure atomic writes
 > > even on NFS. Same for reads. No manual locking needed.
 > 
 > With 32bit uids, this will be very large files. We should avoid this.

Only if you carefully choose your UID space relative to the file
system block size to maximize the amount of unused space that must
nonetheless be materialized. That is foolish.

Unless you expect to have a billion users...?
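
The point about materialized space can be illustrated with a small
sketch: the number of data blocks a sparse file needs depends on how many
distinct blocks the written records fall into, not on the largest uid.
The record size and the name ll_blocks_touched() are assumptions for
illustration, and indirect blocks and any file header are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define LL_RECORD_SIZE 512u	/* assumed record size, for illustration */

/*
 * Count the distinct data blocks touched by the records of the given
 * uids. The uids must be sorted ascending without duplicates and
 * block_size must be nonzero. Indirect blocks are not counted.
 */
static uint64_t
ll_blocks_touched(const uint32_t *uids, size_t n, uint32_t block_size)
{
	uint64_t count = 0;
	uint64_t next_free = 0;	/* lowest block index not yet counted */

	for (size_t i = 0; i < n; i++) {
		uint64_t start = (uint64_t)uids[i] * LL_RECORD_SIZE;
		uint64_t first = start / block_size;
		uint64_t last = (start + LL_RECORD_SIZE - 1) / block_size;

		if (first < next_free)
			first = next_free;
		if (last >= first) {
			count += last - first + 1;
			next_free = last + 1;
		}
	}
	return count;
}
```

With 8192-byte blocks and the assumed record size, uids 1000 to 1003
share a single block, while uids 16, 32, 48 and 64 each land in a block
of their own. The worst case is therefore about one block per user,
reached only when every uid is at least a block's worth of records away
from its neighbours.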

-- 
David A. Holland
dholland <at> netbsd.org

Michael van Elst | 11 Jun 2012 20:00

Re: db(3) removal and lastlogx

dholland-tech <at> NetBSD.org (David Holland) writes:

>Unless you expect to have a billion users...?

Unless it's not you who sets the UID space.

-- 
                                Michael van Elst
Internet: mlelstv <at> serpens.de
                                "A potential Snark may lurk in every tree."

Manuel Bouyer | 11 Jun 2012 20:07

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 04:50:54PM +0000, David Holland wrote:
>  > With 32bit uids, this will be very large files. We should avoid this.
> 
> Only if you carefully choose your UID space relative to the file
> system block size to maximize the amount of unused space that must
> nonetheless be materialized. That is foolish.

My UID space is not dense, but I meant large by what's displayed with ls -l.

> 
> Unless you expect to have a billion users...?

No, just a few users with large uids.

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference

David Holland | 11 Jun 2012 20:35

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 08:07:37PM +0200, Manuel Bouyer wrote:
 > On Mon, Jun 11, 2012 at 04:50:54PM +0000, David Holland wrote:
 > >  > With 32bit uids, this will be very large files. We should avoid this.
 > > 
 > > Only if you carefully choose your UID space relative to the file
 > > system block size to maximize the amount of unused space that must
 > > nonetheless be materialized. That is foolish.
 > 
 > My UID space is not dense, but I meant large by what's displayed with ls -l.

And that's important how?

-- 
David A. Holland
dholland <at> netbsd.org

Manuel Bouyer | 11 Jun 2012 20:46

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 06:35:38PM +0000, David Holland wrote:
> On Mon, Jun 11, 2012 at 08:07:37PM +0200, Manuel Bouyer wrote:
>  > On Mon, Jun 11, 2012 at 04:50:54PM +0000, David Holland wrote:
>  > >  > With 32bit uids, this will be very large files. We should avoid this.
>  > > 
>  > > Only if you carefully choose your UID space relative to the file
>  > > system block size to maximize the amount of unused space that must
>  > > nonetheless be materialized. That is foolish.
>  > 
>  > My UID space is not dense, but I meant large by what's displayed with ls -l.
> 
> And that's important how?

- people usually use ls -l to find what could free some space on a full
 filesystem
- some tools don't deal with sparse files.
- once a block has been allocated, it won't ever be freed unless you kill
  the file.

and I'm probably forgetting more problems with sparse files.
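
The gap between the two numbers is easy to reproduce. The sketch below
writes one record at a large offset and reports both the apparent size
(st_size, what ls -l shows) and the allocated space (st_blocks, what du
counts). The function name sparse_demo() and the record size are
assumptions for illustration; st_blocks is taken to be in 512-byte units,
as on the BSDs and Linux.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define LL_RECORD_SIZE 512	/* assumed record size, for illustration */

/*
 * Create path, write one record at the given offset and report the
 * apparent size and the allocated size in bytes.
 * Returns 0 on success, -1 on error.
 */
static int
sparse_demo(const char *path, off_t offset, off_t *apparent, off_t *allocated)
{
	char record[LL_RECORD_SIZE];
	struct stat st;
	int fd, rv = -1;

	memset(record, 'x', sizeof(record));
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd == -1)
		return -1;
	if (pwrite(fd, record, sizeof(record), offset) ==
	    (ssize_t)sizeof(record) && fstat(fd, &st) == 0) {
		*apparent = st.st_size;			/* shown by ls -l */
		*allocated = (off_t)st.st_blocks * 512;	/* counted by du */
		rv = 0;
	}
	close(fd);
	return rv;
}
```

On a file system that supports holes, calling this with an offset of
1 GiB should report an apparent size just over 1 GiB while the allocated
size stays at a few blocks. The exact allocated figure depends on the
file system.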

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference

David Holland | 24 Jun 2012 23:24

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 08:46:21PM +0200, Manuel Bouyer wrote:
 >>>>> With 32bit uids, this will be very large files. We should avoid this.
 >>>> 
 >>>> Only if you carefully choose your UID space relative to the file
 >>>> system block size to maximize the amount of unused space that must
 >>>> nonetheless be materialized. That is foolish.
 >>> 
 >>> My UID space is not dense, but I meant large by what's displayed with ls -l.
 >> 
 >> And that's important how?
 > 
 > - people usually use ls -l to find what could free some space on a full
 >  filesystem

They should know to use du.

 > - some tools don't deal with sparse files.

Which mostly doesn't matter. Especially with the lastlog file, which
is not exactly valuable data and can be (and routinely is) blown away
if a problem arises.

 > - once a block has been allocated, it won't ever be freed unless you kill
 >   the file.

It is easier to work around this than to worry about it, e.g. by
zeroing out lastlog records when accounts are removed, and then once a
year running a program that reclaims zero space as holes.
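
A minimal sketch of such a reclaiming pass, assuming it is acceptable to
rewrite the file into a fresh copy: chunks that contain only zero bytes
are skipped rather than written, so the file system can leave holes in
their place. The name copy_with_holes() and the chunk size are
assumptions for illustration, and no locking against concurrent logins is
attempted here.

```c
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK 4096	/* assumed; ideally the file system block size */

/* Return 1 if the buffer holds only zero bytes, 0 otherwise. */
static int
all_zero(const char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] != 0)
			return 0;
	return 1;
}

/*
 * Copy src to dst, skipping all-zero chunks instead of writing them.
 * Returns 0 on success, -1 on error.
 */
static int
copy_with_holes(const char *src, const char *dst)
{
	char buf[CHUNK];
	off_t total = 0;
	ssize_t n;
	int in, out, rv = -1;

	if ((in = open(src, O_RDONLY)) == -1)
		return -1;
	if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600)) == -1) {
		close(in);
		return -1;
	}
	while ((n = read(in, buf, sizeof(buf))) > 0) {
		if (!all_zero(buf, (size_t)n) &&
		    pwrite(out, buf, (size_t)n, total) != n)
			goto done;
		total += n;
	}
	/* Set the final length in case the file ends in a hole. */
	if (n == 0 && ftruncate(out, total) == 0)
		rv = 0;
done:
	close(in);
	close(out);
	return rv;
}
```

The copy would then have to be renamed over the original while no login
is writing to it, which is the part a real tool would need to get right.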

[...]

Manuel Bouyer | 6 Jul 2012 20:25

Re: db(3) removal and lastlogx

On Sun, Jun 24, 2012 at 09:24:52PM +0000, David Holland wrote:
> On Mon, Jun 11, 2012 at 08:46:21PM +0200, Manuel Bouyer wrote:
>  >>>>> With 32bit uids, this will be very large files. We should avoid this.
>  >>>> 
>  >>>> Only if you carefully choose your UID space relative to the file
>  >>>> system block size to maximize the amount of unused space that must
>  >>>> nonetheless be materialized. That is foolish.
>  >>> 
>  >>> My UID space is not dense, but I meant large by what's displayed with ls -l.
>  >> 
>  >> And that's important how?
>  > 
>  > - people usually use ls -l to find what could free some space on a full
>  >  filesystem
> 
> They should know to use du.

You usually use du to find usage per-directory and ls -l in each
directory to find big files. 

> 
>  > - some tools don't deal with sparse files.
> 
> Which mostly doesn't matter. Especially with the lastlog file, which
> is not exactly valuable data and can be (and routinely is) blown away
> if a problem arises.

But before you've blown it away you created a problem.

> 
[...]

Christos Zoulas | 7 Jul 2012 01:50

Re: db(3) removal and lastlogx

In article <20120706182539.GA866 <at> antioche.eu.org>,
Manuel Bouyer  <bouyer <at> antioche.eu.org> wrote:
>
>So we have a new program to work around the issue.

That program is called tar. And see all the trouble sparse files have caused
it.

>this shows that sparse files give more trouble than they're worth.

Agreed.

christos

David Holland | 22 Jul 2012 19:20

Re: db(3) removal and lastlogx

On Fri, Jul 06, 2012 at 08:25:39PM +0200, Manuel Bouyer wrote:
 >>>>>>> With 32bit uids, this will be very large files. We should avoid this.
 >>>>>> 
 >>>>>> Only if you carefully choose your UID space relative to the file
 >>>>>> system block size to maximize the amount of unused space that must
 >>>>>> nonetheless be materialized. That is foolish.
 >>>>> 
 >>>>> My UID space is not dense, but I meant large by what's
 >>>>> displayed with ls -l.
 >>>> 
 >>>> And that's important how?
 >>> 
 >>> - people usually use ls -l to find what could free some space on a full
 >>>  filesystem
 >> 
 >> They should know to use du.
 > 
 > You usually use du to find usage per-directory and ls -l in each
 > directory to find big files. 

Do I? I usually do du * | sort -n, but anyway, the argument here seems
to be that people who don't know what they're doing might be confused
by sparse files, so therefore they shouldn't be used.

This seems like a poor line of reasoning.

 >>> - some tools don't deal with sparse files.
 >> 
 >> Which mostly doesn't matter. Especially with the lastlog file, which
 >> is not exactly valuable data and can be (and routinely is) blown away
[...]

Michael van Elst | 11 Jun 2012 19:54

Re: db(3) removal and lastlogx

bouyer <at> antioche.eu.org (Manuel Bouyer) writes:

>With 32bit uids, this will be very large files. We should avoid this.

In particular because we went to use db to avoid sparse files.

-- 
                                Michael van Elst
Internet: mlelstv <at> serpens.de
                                "A potential Snark may lurk in every tree."

Manuel Bouyer | 11 Jun 2012 20:08

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 05:54:51PM +0000, Michael van Elst wrote:
> bouyer <at> antioche.eu.org (Manuel Bouyer) writes:
> 
> >With 32bit uids, this will be very large files. We should avoid this.
> 
> In particular because we went to use db to avoid sparse files.

Yes, the proposed change sounds like a regression to me.

-- 
Manuel Bouyer <bouyer <at> antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference

Joerg Sonnenberger | 11 Jun 2012 23:31

Re: db(3) removal and lastlogx

On Mon, Jun 11, 2012 at 05:54:51PM +0000, Michael van Elst wrote:
> bouyer <at> antioche.eu.org (Manuel Bouyer) writes:
> 
> >With 32bit uids, this will be very large files. We should avoid this.
> 
> In particular because we went to use db to avoid sparse files.

It's impossible to tell from the history, which is why I brought this
up. Note that the lack of used markers in lastlog (not lastlogx) made it
hard to find which entries are present. That was a serious limitation of
the old format.

Joerg

