Eric Paris | 8 Aug 2012 21:31

A filename to label translation daemon

We know that utilities like install disable their SELinux support
because of the enormous amount of time it takes to load the matchpathcon
regex database.  We know that systemd spends time loading the database
at least twice.  Other utilities like the krb5libs complain about the
size and time it takes to load the database.  We've added hacks (I
believe all in Fedora, but maybe upstream as well) which try to pare
down the database to some prefix(es) on database load.  If systemd only
needs to label in /var why load all the stuff for /etc?  These prefix
hacks don't work particularly well as fallback labels (such as
default_t) are hard to capture and the prefixes cannot be long as the
regexes are usually quite short.  They also don't work well with label
equivalencies.

So today I wrote a little daemon which listens in the abstract namespace
for requests and returns the context.  It's really, really rough, I admit,
but it works quite well.  My first perf numbers looking at /home/eparis
make sense:

$ ./initonce /home/eparis
 0.180 seconds used by the processor.
$ ./initalways /home/eparis
 19.200 seconds used by the processor.
$ ./client /home/eparis
 0.570 seconds used by the processor.

If I init the DB one time and do the same lookup (for /home/eparis) 1000
times it takes .18 seconds.  Doing 1000 lookups, init-ing and fini-ing
the db every time, took 19.2 seconds.  Connecting to the server and asking
1000 times took .57 seconds.  This means that if you have to do more than
about 48 lookups, it's faster to do your own init.  If <48, you should use
the server.
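
The break-even figure can be checked with a little arithmetic.  The sketch
below assumes costs scale linearly, so each 1000-iteration timing above is
divided by 1000 to get a per-operation cost in milliseconds.  The struct and
function names are made up for illustration and are not part of the daemon.

```c
/* Per-operation costs in milliseconds (assumption: linear scaling of the
 * 1000-iteration timings quoted in the message). */
struct lookup_costs {
    double local_lookup_ms;   /* one lookup with the DB already loaded */
    double init_lookup_ms;    /* one init + lookup + fini cycle */
    double daemon_lookup_ms;  /* one round trip to the daemon */
};

/* Number of lookups at which loading the database once costs the same as
 * asking the daemon every time.  The caller must make sure the daemon
 * round trip is slower than a local lookup, otherwise this divides by
 * zero or goes negative. */
double break_even_lookups(const struct lookup_costs *c)
{
    double init_ms = c->init_lookup_ms - c->local_lookup_ms;
    double extra_per_lookup_ms = c->daemon_lookup_ms - c->local_lookup_ms;

    return init_ms / extra_per_lookup_ms;
}
```

With 0.18, 19.2 and 0.57 ms as inputs this gives 19.02 / 0.39, a little under
49 lookups, which agrees with the "about 48" estimate.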

Stephen Smalley | 8 Aug 2012 22:05

Re: A filename to label translation daemon

On Wed, 2012-08-08 at 15:31 -0400, Eric Paris wrote:
> We know that utilities like install disable their SELinux support
> because of the enormous amount of time it takes to load the matchpathcon
> regex database.  We know that systemd spends time loading the database
> at least twice.  Other utilities like the krb5libs complain about the
> size and time it takes to load the database.  We've added hacks (I
> believe all in Fedora, but maybe upstream as well) which try to pare
> down the database to some prefix(es) on database load.  If systemd only
> needs to label in /var why load all the stuff for /etc?  These prefix
> hacks don't work particularly well as fallback labels (such as
> default_t) are hard to capture and the prefixes cannot be long as the
> regexes are usually quite short.  They also don't work well with label
> equivalencies.
> 
> So today I wrote a little daemon which listens in the abstract namespace
> for requests and returns the context.  It's really, really rough, I admit,
> but it works quite well.  My first perf numbers looking at /home/eparis
> make sense:
> 
> $ ./initonce /home/eparis
>  0.180 seconds used by the processor.
> $ ./initalways /home/eparis
>  19.200 seconds used by the processor.
> $ ./client /home/eparis
>  0.570 seconds used by the processor.
> 
> If I init the DB one time and do the same lookup (for /home/eparis) 1000
> times it takes .18 seconds.  Doing 1000 lookups init-ing and fini-ing
> the db every time it took 19.2.  Connecting to the server and asking
> 1000 times took .57 seconds.  This means that if you have to do about 48

Daniel J Walsh | 8 Aug 2012 22:52

Re: A filename to label translation daemon


On 08/08/2012 04:05 PM, Stephen Smalley wrote:
> On Wed, 2012-08-08 at 15:31 -0400, Eric Paris wrote:
>> We know that utilities like install disable their SELinux support because
>> of the enormous amount of time it takes to load the matchpathcon regex
>> database.  We know that systemd spends time loading the database at least
>> twice.  Other utilities like the krb5libs complain about the size and
>> time it takes to load the database.  We've added hacks (I believe all in
>> Fedora, but maybe upstream as well) which try to pare down the database
>> to some prefix(es) on database load.  If systemd only needs to label in
>> /var why load all the stuff for /etc?  These prefix hacks don't work
>> particularly well as fallback labels (such as default_t) are hard to
>> capture and the prefixes cannot be long as the regexes are usually quite
>> short.  They also don't work well with label equivalencies.
>> 
>> So today I wrote a little daemon which listens in the abstract namespace
>> for requests and returns the context.  It's really, really rough, I admit,
>> but it works quite well.  My first perf numbers looking at /home/eparis
>> make sense:
>> 
>> $ ./initonce /home/eparis
>>  0.180 seconds used by the processor.
>> $ ./initalways /home/eparis
>>  19.200 seconds used by the processor.
>> $ ./client /home/eparis
>>  0.570 seconds used by the processor.
>> 
>> If I init the DB one time and do the same lookup (for /home/eparis) 1000 
>> times it takes .18 seconds.  Doing 1000 lookups init-ing and fini-ing the
>> db every time it took 19.2.  Connecting to the server and asking 1000
>> times took .57 seconds.  This means that if you have to do about 48 
>> lookups, it's faster to do your own init.  If <48, you should use the 
>> server.

Eric Paris | 8 Aug 2012 22:55

Re: A filename to label translation daemon

On Wed, 2012-08-08 at 16:05 -0400, Stephen Smalley wrote:
> On Wed, 2012-08-08 at 15:31 -0400, Eric Paris wrote:

> Not sure how this helps systemd, as it runs first (by definition) and
> loads the file_contexts configuration before it starts any other
> daemons, right?  Now if you wanted systemd to export this as a service
> to everything else, that might make sense.

If we agree a label daemon is useful and practical I'm sure we can find
a way to get systemd to either use it or be the label daemon.  It might
be as easy as getting systemd to activate it earlier than it normally
activates things.  Although maybe the code needs to live in systemd
itself.  I don't know right now.  I'm still in the "is this as good an
idea as it seems?" stage.

-Eric

Colin Walters | 8 Aug 2012 23:26

Re: A filename to label translation daemon

Seems to make sense...though someone could also probably get fairly far
by writing a regular expression optimizer.  It might not even be that
hard to write a multi-regexp matching engine which took a set of regexps
at once and constructed a single matching DFA for them.

On Wed, 2012-08-08 at 15:31 -0400, Eric Paris wrote:
>         /* just to be safe! */
>         buffer[MAX_REQUEST_LEN] = '\0'; 

Should be buffer[len] = '\0';  right?
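
A minimal sketch of the point being made, assuming len is the byte count
returned by the read on the request socket.  Terminating at
buffer[MAX_REQUEST_LEN] leaves whatever was in the buffer between len and the
end; terminating at buffer[len] cuts the string exactly where the data stops.
read_request() and the value of MAX_REQUEST_LEN are hypothetical here, not
taken from the posted daemon.

```c
#include <sys/types.h>
#include <unistd.h>

#define MAX_REQUEST_LEN 4096 /* hypothetical limit, for illustration only */

/* Read one request from fd into buffer, which must have room for
 * MAX_REQUEST_LEN + 1 bytes.  Returns the number of bytes read, or -1 on
 * error.  On success the buffer is nul-terminated right after the data. */
ssize_t read_request(int fd, char *buffer)
{
    ssize_t len = read(fd, buffer, MAX_REQUEST_LEN);

    if (len < 0)
        return -1;

    /* len is at most MAX_REQUEST_LEN, so this index is always in bounds. */
    buffer[len] = '\0';
    return len;
}
```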

>                /* calculating the length of the address is magic since it starts with nul.
>                * there be dragons in here! */

See also https://bugzilla.gnome.org/show_bug.cgi?id=615960
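
For readers wondering what the "magic" is: an abstract-namespace address on
Linux has a nul byte as the first character of sun_path, so strlen() on the
path is useless and the length handed to bind() or connect() has to be
counted by hand.  The sketch below shows one common way to do that.  The
helper name abstract_addr() is made up, and this is not necessarily how the
posted daemon computes it.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill addr for an abstract-namespace socket called name (Linux-specific).
 * Returns the address length to pass to bind()/connect(), or 0 if the name
 * is empty or does not fit. */
socklen_t abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t name_len = strlen(name);

    if (name_len == 0 || name_len > sizeof(addr->sun_path) - 1)
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0': that is what selects the abstract namespace. */
    memcpy(addr->sun_path + 1, name, name_len);

    /* Header, plus the leading nul, plus the name without a terminator. */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}
```

Passing sizeof(struct sockaddr_un) instead would also work, but then the
trailing nul padding becomes part of the name, and both sides must agree on
which convention they use.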

Russell Coker | 9 Aug 2012 16:37

Re: A filename to label translation daemon

On Thu, 9 Aug 2012, Colin Walters <walters@...> wrote:
> Seems to make sense...though someone could also probably get fairly far
> by writing a regular expression optimizer.  It might not even be that
> hard to write a multi-regexp matching engine which took a set of regexps
> at once and constructed a single matching DFA for them.

Is this really going to help?  My slowest system is a P3-866 which takes less 
than 30ms of user time for "restorecon /bin/bash" and takes a total of 136ms 
of wall time if the cache is cold.  On a 1.8GHz 64bit system it's only 8ms of 
user time.

What benefit are we expecting to get here?

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

Daniel J Walsh | 9 Aug 2012 19:06

Re: A filename to label translation daemon


On 08/09/2012 10:37 AM, Russell Coker wrote:
> On Thu, 9 Aug 2012, Colin Walters <walters@...> wrote:
>> Seems to make sense...though someone could also probably get fairly far 
>> by writing a regular expression optimizer.  It might not even be that 
>> hard to write a multi-regexp matching engine which took a set of regexps 
>> at once and constructed a single matching DFA for them.
> 
> Is this really going to help?  My slowest system is a P3-866 which takes
> less than 30ms of user time for "restorecon /bin/bash" and takes a total of
> 136ms of wall time if the cache is cold.  On a 1.8GHz 64bit system it's
> only 8ms of user time.
> 
> What benefit are we expecting to get here?
> 
The Kerberos library currently does a matchpathcon on /tmp/BLAH files and
sets the label correctly.  With this change in the library we are seeing huge
performance hits in Apache services caused by loading the regexes.

Running make install takes a huge hit when it runs thousands of install
commands, which caused the removal of labeling from the install command.

Systemd has been executing the load many, many times and is showing up to a
1 second slowdown on startup.  If the startup is 10 seconds, it is kind of
hard to justify a 10% slowdown on boot.

I believe we should just add support for this service and have the labeling
fall back to the default if the labeling socket does not exist, and then
distributions can decide whether or not they want to use it.
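
The fallback described above could look roughly like the following sketch:
try to connect to the daemon's abstract socket and, if that fails for any
reason, use the normal in-process database load.  Everything here is an
assumption made for illustration, including the function names, the enum,
and the use of SOCK_STREAM.  The thread does not say what socket type or
name the daemon uses.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Try to connect to an abstract-namespace socket called name.
 * Returns a connected fd, or -1 if the daemon is not reachable. */
int label_daemon_connect(const char *name)
{
    struct sockaddr_un addr;
    size_t name_len = strlen(name);
    socklen_t addr_len;
    int fd;

    if (name_len == 0 || name_len > sizeof(addr.sun_path) - 1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path + 1, name, name_len); /* sun_path[0] stays nul */
    addr_len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);

    fd = socket(AF_UNIX, SOCK_STREAM, 0); /* socket type is an assumption */
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, addr_len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

enum label_source { LABEL_FROM_DAEMON, LABEL_FROM_LOCAL_DB };

/* Decide where labels come from.  On LABEL_FROM_DAEMON, *fd_out is the
 * connected socket; otherwise it is -1 and the caller loads the database
 * itself, exactly as it does today. */
enum label_source choose_label_source(const char *name, int *fd_out)
{
    *fd_out = label_daemon_connect(name);
    return *fd_out >= 0 ? LABEL_FROM_DAEMON : LABEL_FROM_LOCAL_DB;
}
```

A distribution that never ships the daemon would always take the
LABEL_FROM_LOCAL_DB path, so behaviour there would be unchanged.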

Colin Walters | 9 Aug 2012 19:51

Re: A filename to label translation daemon

On Thu, 2012-08-09 at 13:06 -0400, Daniel J Walsh wrote:

> I believe we should just add support for this service and have the labeling
> fall back to the default if the labeling socket does not exist, and then
> distributions can decide whether or not they want to use it.

There are other possible intermediate steps though - for example,
caching the precompiled regular expressions in a file accessible via
mmap().

Basically:

* Your mmap file is in some data format - you can make up your own, but
  I like using http://developer.gnome.org/glib/stable/glib-GVariant.html
* Check the timestamp on the regexp text file versus the cached copy, if
  newer, use the text file
* Otherwise, mmap the cached blob, loop through each regexp, passing
  a pointer to the mmap cache file for regexec()

The mmap cache file would probably need to be tied to a specific version
of glibc though; you wouldn't want to upgrade and use old compiled
regexps that the new glibc doesn't understand.
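
The timestamp check and the mapping step from the list above can be sketched
as follows.  This covers only those two steps.  The blob format is left
open, since POSIX offers no standard way to serialise a compiled regex_t, so
how the mapped bytes would be fed to regexec() is not shown.  Function names
are hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if cache_path exists and is at least as new as text_path,
 * 0 otherwise (including when either file cannot be examined). */
int cache_is_fresh(const char *text_path, const char *cache_path)
{
    struct stat text_st, cache_st;

    if (stat(text_path, &text_st) != 0 || stat(cache_path, &cache_st) != 0)
        return 0;
    return cache_st.st_mtime >= text_st.st_mtime;
}

/* Map the whole cache file read-only.  Returns the mapping and stores its
 * size in *size_out, or returns NULL on any failure.  The caller releases
 * the mapping with munmap(). */
const void *map_cache(const char *cache_path, size_t *size_out)
{
    struct stat st;
    void *blob;
    int fd = open(cache_path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    blob = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the fd is closed */
    if (blob == MAP_FAILED)
        return NULL;

    *size_out = (size_t)st.st_size;
    return blob;
}
```

A caller would fall back to parsing the text file whenever cache_is_fresh()
returns 0 or map_cache() returns NULL.  The glibc-version concern raised
above would need an extra check, for example a version tag stored at the
start of the blob.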

Russell Coker | 10 Aug 2012 04:28

Re: A filename to label translation daemon

On Fri, 10 Aug 2012, Daniel J Walsh <dwalsh@...> wrote:
> On 08/09/2012 10:37 AM, Russell Coker wrote:
> > On Thu, 9 Aug 2012, Colin Walters <walters@...> wrote:
> >> Seems to make sense...though someone could also probably get fairly far
> >> by writing a regular expression optimizer.  It might not even be that
> >> hard to write a multi-regexp matching engine which took a set of regexps
> >> at once and constructed a single matching DFA for them.
> > 
> > Is this really going to help?  My slowest system is a P3-866 which takes
> > less than 30ms of user time for "restorecon /bin/bash" and takes a total
> > of 136ms of wall time if the cache is cold.  On a 1.8GHz 64bit system
> > it's only 8ms of user time.
> > 
> > What benefit are we expecting to get here?
> 
> kerberos library currently does a matchpathcon on /tmp/BLAH files and sets
> the label correctly.  With this change in the library we are seeing huge
> performance hits of apache services caused by loading the regex.

What is kerberos doing under /tmp and why is it being done repeatedly by 
different processes?

> Running make install takes a huge hit when it runs thousands of install
> commands, which caused the removal of labeling from the install command.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638304

I believe that is a design bug in the SE Linux code in install, I've filed the 
above Debian bug report about it.

Daniel J Walsh | 10 Aug 2012 14:39

Re: A filename to label translation daemon


On 08/09/2012 10:28 PM, Russell Coker wrote:
> On Fri, 10 Aug 2012, Daniel J Walsh <dwalsh@...> wrote:
>> On 08/09/2012 10:37 AM, Russell Coker wrote:
>>> On Thu, 9 Aug 2012, Colin Walters <walters@...> wrote:
>>>> Seems to make sense...though someone could also probably get fairly
>>>> far by writing a regular expression optimizer.  It might not even be
>>>> that hard to write a multi-regexp matching engine which took a set of
>>>> regexps at once and constructed a single matching DFA for them.
>>> 
>>> Is this really going to help?  My slowest system is a P3-866 which
>>> takes less than 30ms of user time for "restorecon /bin/bash" and takes
>>> a total of 136ms of wall time if the cache is cold.  On a 1.8GHz 64bit
>>> system it's only 8ms of user time.
>>> 
>>> What benefit are we expecting to get here?
>> 
>> kerberos library currently does a matchpathcon on /tmp/BLAH files and
>> sets the label correctly.  With this change in the library we are seeing
>> huge performance hits of apache services caused by loading the regex.
> 
> What is kerberos doing under /tmp and why is it being done repeatedly by 
> different processes?
> 
Actually /var/tmp/HOST_0, /var/tmp/HTTP_23 ...: the Kerberos replay cache.
Every time someone contacts an Apache server using Kerberos it needs to
update this file; it does this via mktemp (/tmpHTTPD_23XXXX) followed by
rename.

/var/cache/krb5rcache(/.*)?	system_u:object_r:krb5_host_rcache_t:s0
/var/tmp/nfs_0	--	system_u:object_r:krb5_host_rcache_t:s0

Russell Coker | 10 Aug 2012 15:35

Re: A filename to label translation daemon

On Fri, 10 Aug 2012, Daniel J Walsh <dwalsh@...> wrote:
> > What is kerberos doing under /tmp and why is it being done repeatedly by
> > different processes?
> 
> Actually /var/tmp/HOST_0 /var/tmp/HTTP_23 ...  Kerberos Replay Cache.  
> Every time someone contacts an apache server using kerberos it needs to
> update this file, it does this via mktemp (/tmpHTTPD_23XXXX), rename.

When replacing an existing file wouldn't it be better to just copy the context 
of the existing file when creating the replacement?  If there was some good 
reason for running chcon on such a file (and I can't imagine a reason but it's 
best to leave the options open IMHO) then having the context change back the 
next time someone connects seems like a bug.

> >> Running make install takes a huge hit when it runs thousands of install
> >> commands, which caused the removal of labeling from the install command.
> > 
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638304
> > 
> > I believe that is a design bug in the SE Linux code in install, I've
> > filed the above Debian bug report about it.
> > 
> > I think that correct design of install wouldn't have a "make install"
> > performed as part of a dpkg or rpm build do any SE Linux checks.  That
> > would be faster than any other option.
> 
> Programmers and testers regularly run make install and this ends up badly
> mislabeling files all over the place; telling everyone they have to use rpm
> or dpkg is not going to fly.

Daniel J Walsh | 12 Aug 2012 13:02

Re: A filename to label translation daemon


On 08/10/2012 09:35 AM, Russell Coker wrote:
> On Fri, 10 Aug 2012, Daniel J Walsh <dwalsh@...> wrote:
>>> What is kerberos doing under /tmp and why is it being done repeatedly
>>> by different processes?
>> 
>> Actually /var/tmp/HOST_0 /var/tmp/HTTP_23 ...  Kerberos Replay Cache. 
>> Every time someone contacts an apache server using kerberos it needs to 
>> update this file, it does this via mktemp (/tmpHTTPD_23XXXX), rename.
> 
> When replacing an existing file wouldn't it be better to just copy the
> context of the existing file when creating the replacement?  If there was
> some good reason for running chcon on such a file (and I can't imagine a
> reason but it's best to leave the options open IMHO) then having the
> context change back the next time someone connects seems like a bug.
> 
They use setfscreatecon; if the file does not exist, it gets labeled
incorrectly.

>>>> Running make install takes a huge hit when it runs thousands of
>>>> install commands, which caused the removal of labeling from the
>>>> install command.
>>> 
>>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638304
>>> 
>>> I believe that is a design bug in the SE Linux code in install, I've 
>>> filed the above Debian bug report about it.
>>> 
>>> I think that correct design of install wouldn't have a "make install" 
>>> performed as part of a dpkg or rpm build do any SE Linux checks.  That 
>>> would be faster than any other option.
>> 

