Willy Tarreau | 11 Jun 01:17 2012

Significantly reducing headers footprint

Hi,

I recently managed to collect requests from some enterprise proxies to
experiment with binary encoding as described in our draft [1].

After some experimentation and discussions with some people, I managed to
get significant gains [2] which could still be improved.

What's currently performed is the following :
  - message framing
  - binary encoding of the HTTP version (2 bits)
  - binary encoding of the method (4 bits)
  - move Host header to the URI
  - encoding of the URI relative to the previous one
  - binary encoding of each header field name (1 byte)
  - encoding of each header relative to the previous one.
  - binary encoding of the If-Modified-Since date
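
To make the first two items a bit more concrete, here is a rough C sketch of
one possible byte layout. The field widths match the list above, but the bit
positions and the method/name code values are only assumptions of mine, not
what the PoC or the draft actually use:

  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical method codes (4 bits) and version codes (2 bits). */
  enum { M_GET = 0, M_POST = 1, M_HEAD = 2, M_PUT = 3, M_DELETE = 4 };
  enum { V_10 = 0, V_11 = 1, V_20 = 2 };

  /* Hypothetical 1-byte header field name registry. */
  enum { H_HOST = 0x01, H_ACCEPT = 0x02, H_COOKIE = 0x03, H_IMS = 0x04 };

  /* Pack version and method into a single leading byte:
   *   bits 7-6 = version, bits 5-2 = method, bits 1-0 = reserved. */
  static uint8_t pack_req_byte(uint8_t version, uint8_t method)
  {
      return (uint8_t)((version & 0x3) << 6 | (method & 0xF) << 2);
  }

  int main(void)
  {
      uint8_t b = pack_req_byte(V_11, M_GET);
      printf("request byte: 0x%02x\n", b);   /* 0x40 for an HTTP/1.1 GET */
      return 0;
  }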

The code achieving this is available at [2]. It's an ugly PoC but it's
a useful experimentation tool for me, feel free to use it to experiment
with your own implementations if you like.

I'm already observing request compression ratios of 90-92% on various
requests, including on a site with a huge page with large cookies and
URIs ; 132 kB of requests were reduced to 10kB. In fact while the draft
suggests use of multiple header contexts (connection, common and message),
now I'm feeling like we don't need to store 3 contexts anymore, one single
is enough if requests remain relative to previous one.

But I think that by typing a bit more the protocol, we could improve even
further and at the same time improve interoperability.

Roberto Peon | 11 Jun 01:39 2012

Re: Significantly reducing headers footprint



On Sun, Jun 10, 2012 at 4:17 PM, Willy Tarreau <w@1wt.eu> wrote:
Hi,

I recently managed to collect requests from some enterprise proxies to
experiment with binary encoding as described in our draft [1].

After some experimentation and discussions with some people, I managed to
get significant gains [2] which could still be improved.

What's currently performed is the following :
 - message framing
 - binary encoding of the HTTP version (2 bits)
 - binary encoding of the method (4 bits)
 - move Host header to the URI
 - encoding of the URI relative to the previous one
 - binary encoding of each header field name (1 byte)
 - encoding of each header relative to the previous one.
 - binary encoding of the If-Modified-Since date

The code achieving this is available at [2]. It's an ugly PoC but it's
a useful experimentation tool for me, feel free to use it to experiment
with your own implementations if you like.

I'm already observing request compression ratios of 90-92% on various
requests, including on a site with a huge page with large cookies and
URIs ; 132 kB of requests were reduced to 10kB. In fact while the draft
suggests use of multiple header contexts (connection, common and message),
now I'm feeling like we don't need to store 3 contexts anymore, one single
is enough if requests remain relative to previous one.

For my deployment, I'm fairly certain this would not be all that common.
Two contexts may be enough, 'connection' and 'common', but I think you had it right the first time.
The more clients you have aggregating through to elsewhere, the more advantageous that scheme becomes.
 

But I think that by typing a bit more the protocol, we could improve even
further and at the same time improve interoperability. Among the things
I am observing which still take some space in the page load of an online
newspaper (127 objects, data were anonymized) :

 - User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
   => Well, this one is only sent once over the connection, but we could
      reduce this further by using a registry of known vendors/products
      and encourage vendors to emit just a few bytes (vendor/product/version).

 - Accept: text/css,*/*;q=0.1
   => this one changes depending on what object the browser requests, so it
      is less efficiently compressed :

       1 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
       4 Accept: text/css,*/*;q=0.1
       8 Accept: */*
       1 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       2 Accept: */*
       9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       2 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
      90 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       1 Accept: */*
       9 Accept: image/png,image/*;q=0.8,*/*;q=0.5

   => With better request reordering, we could have this :

      11 Accept: */*
     109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       4 Accept: text/css,*/*;q=0.1
       3 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

Achieving this seems difficult? How would we get a reordering to occur in a reasonable manner?
 

   I'm already wondering if we have *that* many content-types and if we need
   to use long words such as "application" everywhere.

We were quite wordy in the past :)
 

 - Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
   Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
   Accept-Encoding: gzip,deflate

   => Same comment as above concerning the number of possible values. However
      these ones were all sent identical so the gain is more for the remote
      parser than for the upstream link.

 - Referer: http://www.example.com/
   => referrers do compress quite well relative to each other. Still there
      are many blogs and newspapers on the net today with very large URLs,
      and their URLs cause very large referrers to be sent along with each
      object composing the page. At least a better ordering of the requests
      saves a few more hundred bytes for the whole page. In the end I only
      got 4 different values :
      http://www.example.com/
      http://www.example.com/sites/news/files/css/css_RWicSr_h9UxCJrAbE57UbNf_oNYhtaF5YghFXJemVNQ.css
      http://www.example.com/sites/news/files/css/css_lKoFARDAyB20ibb5wNG8nMhflDNNW_Nb9DsNprYt8mk.css
      http://www.example.com/sites/news/files/css/css_qSyFGRLc-tslOV1oF9GCzEe1eGDn4PP7vOM1HGymNYU.css

   Among the improvements I'm thinking about, we could decide to use relative
   URIs when the site is the same. I don't know either if it's of any use on
   the server side to know that the request was emitted for a specific CSS.
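
   To illustrate the relative encoding, here is a trivial sketch that emits
   a URI as a common-prefix length plus a suffix. The function name and the
   output format are mine; the actual encoding in the PoC may differ:

      #include <stdio.h>
      #include <string.h>

      /* Emit a URI as (length of prefix shared with the previous URI,
       * new suffix). */
      static void emit_relative_uri(const char *prev, const char *cur)
      {
          size_t common = 0;

          while (prev[common] && cur[common] && prev[common] == cur[common])
              common++;
          printf("reuse %zu bytes, append \"%s\"\n", common, cur + common);
      }

      int main(void)
      {
          const char *a = "/sites/news/files/css/css_RWicSr.css";
          const char *b = "/sites/news/files/css/css_lKoFAR.css";

          emit_relative_uri(a, b);   /* reuse 26 bytes, append "lKoFAR.css" */
          return 0;
      }

   With the CSS URLs above, everything up to ".../css/css_" would be reused
   from one request to the next.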

 - If-Modified-Since: Fri, 27 Apr 2012 14:41:31 GMT
   => I have encoded this one on 32 and 64 bits and immediately saved 3.1 and
      2.6 kB respectively. Well, storing 4 more bytes per request might be
      wasted considering that we probably don't need a nanosecond resolution
      for 585 years. But 40-48 bits might be fine.
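
   For illustration, a 40-48 bit encoding could simply be Unix time in
   seconds truncated to 6 bytes. A rough sketch, assuming a POSIX-ish libc
   with strptime() and timegm(); the byte layout is an assumption of mine:

      #define _GNU_SOURCE
      #include <stdint.h>
      #include <stdio.h>
      #include <time.h>

      /* Encode an If-Modified-Since date as 48 bits of Unix time,
       * big endian.  strptime()/timegm() are POSIX/glibc extensions. */
      static int encode_ims(const char *http_date, uint8_t out[6])
      {
          struct tm tm = {0};

          if (!strptime(http_date, "%a, %d %b %Y %H:%M:%S GMT", &tm))
              return -1;
          uint64_t t = (uint64_t)timegm(&tm);
          for (int i = 0; i < 6; i++)
              out[i] = (uint8_t)(t >> (8 * (5 - i)));
          return 0;
      }

      int main(void)
      {
          uint8_t buf[6];

          if (encode_ims("Fri, 27 Apr 2012 14:41:31 GMT", buf) == 0) {
              for (int i = 0; i < 6; i++)
                  printf("%02x", buf[i]);
              printf("\n");   /* 6 bytes instead of 29 bytes of text */
          }
          return 0;
      }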

 - Cache-Control: max-age=0
   => I suspect the user hit the Refresh button, this was present in about
      half the requests. Anyway, this raises the question of the length it
      requires for something which is just a boolean here ("ignore cache").
      Probably that a client has very few Cache-Control header values to
      send, and that reducing this to a smaller set would be beneficial.

 - If-None-Match: "3013140661"
   => I guess there is nothing we can do on this one, except suggest that
      implementors use more bits and fewer bytes to emit their etags.

 - Cookie: xtvrn=$OaiJty$; xtan327981=c; xtant327981=c; has_js=c; __utma=KBjWnx24Q.7qFKqmB7v.i0JDH91L_R.0kU2W1uL49.JM4KtFLV0b.C; __utmc=Rae9ZgQHz; __utmz=NRSZOcCWV.d5MlK5RJsi.-.f.N8J73w=S1SLuT_j0m.O8|VsIxwE=(jHw58obb)|r9SgsT=WQfZe8jr|pFSZGH=/ <at> /qwDyMw3I; __gads=td=ASP_D5ml4Ebevrej:R=pvxltafqZK:x=E4FUn3YiNldW3rhxzX6YlCptZp8zF-b5qc; _chartbeat2=oQvb8k_G9tduhauf.LqOukjnlaaE7K.uDBaR79E1WT4t.Kr9L_lIrOtruE8; __qca=LC9oiRpFSWShYlxUtD37GJ2k8AL; __utmb=vG8UMEjrz.Qf.At.pXD61lUeHZ; pm8196_1=c; pm8194_1=c

   => amazingly, this one compresses extremely well with the above scheme,
      because additions are performed at the end so consecutive cookies keep
      a lot in common, and changes are not too frequent. However, given the
      omnipresent usage of cookies, I was wondering why we should not create
      a new entity of its own for the cookies instead of abusing the Cookie
      header. It would make it a lot easier for both ends to find what they
      need. For instance, a load balancer just needs to find a server name
      in the thing above. What a waste of on-wire bits and of CPU cycles !

You're suggesting breaking the above into smaller, addressable bits?
 

BTW, binary encoding would probably also help addressing a request I often
hear in banking environments : the need to sign/encrypt/compress only certain
headers or cookies. Right now when people do this, they have to base64-encode
the result, which is another transformation at both ends and inflates the
data. If we make provisions in the protocol for announcing encrypted or
compressed headers using 2-3 bits, it might become more usable. I'm not
convinced it provides any benefit between a browser and an origin server
though. So maybe it will remain application-specific and the transport
just has to make it easier to emit 8-bit data in header field values.
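
As a purely hypothetical sketch, such an announcement could be as small as
two bits carried next to the field-name code; none of the values below come
from the draft:

  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical 2-bit "value encoding" flags carried with each field. */
  enum value_enc {
      ENC_RAW        = 0,  /* plain octets                              */
      ENC_COMPRESSED = 1,  /* value compressed by the application       */
      ENC_ENCRYPTED  = 2,  /* value encrypted/signed by the application */
      ENC_RESERVED   = 3
  };

  /* First byte of a field: bits 7-6 = encoding flags, bits 5-0 = name code. */
  static uint8_t field_first_byte(enum value_enc enc, uint8_t name_code)
  {
      return (uint8_t)(((unsigned)enc & 0x3) << 6 | (name_code & 0x3F));
  }

  int main(void)
  {
      /* e.g. an encrypted cookie field, assuming name code 0x03 for Cookie */
      printf("0x%02x\n", field_first_byte(ENC_ENCRYPTED, 0x03));   /* 0x83 */
      return 0;
  }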

 
Has anyone any opinion on the subject above ? Or ideas about other things
that terribly clobber the upstream pipe and that should be fixed in 2.0 ?

I like binary framing because it is significantly easier to get right and works well when we're considering things other than just plain HTTP.
Token-based parsing is quite annoying in comparison: it requires significant implementation complexity to minimize memory use. With length-based framing, the implementation complexity is arguably decreased for everyone, and certainly in cases where you wish to be efficient with buffers.

-=R


I hope I'll soon find some time to update our draft to reflect recent updates
and findings.

Regards,
Willy

--
[1] http://tools.ietf.org/id/draft-tarreau-httpbis-network-friendly-00.txt
[2] http://1wt.eu/http2/



Willy Tarreau | 11 Jun 11:16 2012

Re: Significantly reducing headers footprint

Hi Roberto !

I was sure you would be the first to respond :-)

On Sun, Jun 10, 2012 at 04:39:37PM -0700, Roberto Peon wrote:
> > I'm already observing request compression ratios of 90-92% on various
> > requests, including on a site with a huge page with large cookies and
> > URIs ; 132 kB of requests were reduced to 10kB. In fact while the draft
> > suggests use of multiple header contexts (connection, common and message),
> > now I'm feeling like we don't need to store 3 contexts anymore, one single
> > is enough if requests remain relative to previous one.
> >
> 
> For my deployment, I'm fairly certain this would not be all that common.
> Two contexts may be enough, 'connection' and 'common', but I think you had
> it right the first time.

Connection indeed has some uses, but we found that these are sometimes
limited. Between a client and a server, the UA and connection information
may be transmitted. Whether it's transmitted as a connection-specific header
or as a normal header that is retained for all other messages doesn't make a
difference.

For a proxy, connection headers may be used to transmit Via and the
Forwarded-For header. This last one goes away if connections are
multiplexed between multiple clients.

Concerning the merge of common+message into message, I found in the traffic
I analysed that a number of header fields are transmitted for a few requests
in a row only. Initially I thought that sending a set of headers which are
planned to be common for multiple consecutive requests was the way to do it.
But after seeing the traces, I'm realizing that sending differences between
consecutive requests achieves the same result with more flexibility and better
resistance to frequent changes. Also, one of the difficulties for a proxy was
to decide what to put into the common section. By only sending differences
between requests, this problem doesn't exist anymore.
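
To illustrate, here is a toy model of the "differences between consecutive
requests" idea: keep the previous request's value for each field code and
emit a field only when it changes. The code values and the output format
are mine, not the PoC's:

  #include <stdio.h>
  #include <string.h>

  #define MAX_CODES 256
  #define MAX_VAL   256

  /* Previous request's header values, indexed by a 1-byte name code. */
  static char ctx[MAX_CODES][MAX_VAL];

  /* Emit one field of the current request: nothing if it is identical to
   * the previous request, the new value otherwise. */
  static void emit_field(int code, const char *value)
  {
      if (strcmp(ctx[code], value) == 0)
          return;                           /* unchanged: emit nothing */
      printf("code %02x -> %s\n", code, value);
      snprintf(ctx[code], MAX_VAL, "%s", value);
  }

  int main(void)
  {
      /* request 1: both fields are new, both are emitted */
      emit_field(0x02, "image/png,image/*;q=0.8,*/*;q=0.5");   /* Accept */
      emit_field(0x05, "gzip,deflate");                /* Accept-Encoding */

      /* request 2: same values, nothing is emitted at all */
      emit_field(0x02, "image/png,image/*;q=0.8,*/*;q=0.5");
      emit_field(0x05, "gzip,deflate");
      return 0;
  }

In the common case where Accept, Accept-Encoding and friends are identical
from one request to the next, nothing at all has to be sent for them.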

> The more clients you have aggregating through to elsewhere, the more
> advantageous that scheme becomes.

Warning, for me there always was only one common section, since we can't make
a server support an infinite number of contexts.

> >  - Accept: text/css,*/*;q=0.1
> >    => this one changes depending on what object the browser requests, so it
> >       is less efficiently compressed :
> >
> >        1 Accept:
> > text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> >        4 Accept: text/css,*/*;q=0.1
> >        8 Accept: */*
> >        1 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        2 Accept: */*
> >        9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        2 Accept:
> > text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> >       90 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        1 Accept: */*
> >        9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >
> >    => With better request reordering, we could have this :
> >
> >       11 Accept: */*
> >      109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        4 Accept: text/css,*/*;q=0.1
> >        3 Accept:
> > text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> >
> 
> Achieving this seems difficult? How would we get a reordering to occur in a
> reasonable manner?

I don't think it's that difficult, but I'm not a browser developer and I'm
sure they're facing a huge number of complex issues. For instance, maybe
it's not always possible to fetch all images at once, or to fetch the CSS
first and then the images. I must say I don't know :-/

> >    I'm already wondering if we have *that* many content-types and if we
> > need
> >    to use long words such as "application" everywhere.
> 
> We were quite wordy in the past :)

Yes, indeed.

> >  - Cookie: xtvrn=$OaiJty$; xtan327981=c; xtant327981=c; has_js=c;
> > __utma=KBjWnx24Q.7qFKqmB7v.i0JDH91L_R.0kU2W1uL49.JM4KtFLV0b.C;
> > __utmc=Rae9ZgQHz;
> > __utmz=NRSZOcCWV.d5MlK5RJsi.-.f.N8J73w=S1SLuT_j0m.O8|VsIxwE=(jHw58obb)|r9SgsT=WQfZe8jr|pFSZGH=/ <at> /qwDyMw3I;
> > __gads=td=ASP_D5ml4Ebevrej:R=pvxltafqZK:x=E4FUn3YiNldW3rhxzX6YlCptZp8zF-b5qc;
> > _chartbeat2=oQvb8k_G9tduhauf.LqOukjnlaaE7K.uDBaR79E1WT4t.Kr9L_lIrOtruE8;
> > __qca=LC9oiRpFSWShYlxUtD37GJ2k8AL; __utmb=vG8UMEjrz.Qf.At.pXD61lUeHZ;
> > pm8196_1=c; pm8194_1=c
> >
> >    => amazingly, this one compresses extremely well with the above scheme,
> >       because additions are performed at the end so consecutive cookies
> > keep
> >       a lot in common, and changes are not too frequent. However, given the
> >       omnipresent usage of cookies, I was wondering why we should not
> > create
> >       a new entity of its own for the cookies instead of abusing the Cookie
> >       header. It would make it a lot easier for both ends to find what they
> >       need. For instance, a load balancer just needs to find a server name
> >       in the thing above. What a waste of on-wire bits and of CPU cycles !
> >
> 
> You're suggesting breaking the above into smaller, addressable bits?

Yes, possibly. I'm not completely sure yet because the overhead of the "; "
and "=" delimiters is small. That said, we're seeing many hex-encoded or
base64-encoded cookies everywhere, and such use cases would benefit from
being length-delimited and from supporting binary contents.
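
Here is a rough sketch of what a length-delimited cookie entry could look
like; the 1-byte and 2-byte length fields are purely an assumption of mine:

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Hypothetical length-delimited cookie entry:
   *   1 byte name length, name, 2 bytes value length, value (may be binary) */
  static size_t put_cookie(uint8_t *out, const char *name,
                           const void *val, uint16_t vlen)
  {
      size_t nlen = strlen(name), p = 0;

      out[p++] = (uint8_t)nlen;
      memcpy(out + p, name, nlen);
      p += nlen;
      out[p++] = (uint8_t)(vlen >> 8);
      out[p++] = (uint8_t)(vlen & 0xff);
      memcpy(out + p, val, vlen);
      p += vlen;
      return p;                              /* bytes written */
  }

  int main(void)
  {
      uint8_t buf[256];
      uint8_t raw_id[4] = { 0xde, 0xad, 0xbe, 0xef };  /* binary, no base64 */
      size_t n = put_cookie(buf, "sid", raw_id, sizeof(raw_id));

      printf("%zu bytes on the wire\n", n);            /* 1 + 3 + 2 + 4 = 10 */
      return 0;
  }

A recipient could then walk the entries by length instead of scanning for
"; " delimiters, and values could carry raw binary without base64.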

> > Has anyone any opinion on the subject above ? Or ideas about other things
> > that terribly clobber the upstream pipe and that should be fixed in 2.0 ?
> 
> I like binary framing because it is significantly easier to get right and
> works well when we're considering things other than just plain HTTP.
>
> Token-based parsing is quite annoying in comparison: it requires
> significant implementation complexity to minimize memory use.

And it forces us to support borderline variants (e.g. LF vs CRLF, case
matching, support of empty names and spaces around names, etc.). And it
requires the recipient to parse data it doesn't care about, just to find
delimiters.

> With length-based
> framing, the implementation complexity is arguably decreased for everyone
> and certainly in cases where you wish to be efficient with buffers.

Exactly. And it's harder to get it wrong :-)
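
A minimal sketch of what that buys the recipient, assuming a simple
(code, length, value) layout which is mine and not the draft's: fields you
don't care about are skipped by their length instead of being scanned byte
by byte.

  #include <stdint.h>
  #include <stdio.h>

  /* Walk fields encoded as: 1-byte name code, 2-byte value length, value.
   * Fields we don't care about are skipped by their length, without ever
   * looking at the value bytes. */
  static void walk(const uint8_t *p, size_t len, uint8_t wanted)
  {
      while (len >= 3) {
          uint8_t  code = p[0];
          uint16_t vlen = (uint16_t)(p[1] << 8 | p[2]);

          if (len < (size_t)3 + vlen)
              break;                         /* truncated input, stop */
          if (code == wanted)
              printf("found code %02x, %u byte value\n",
                     (unsigned)code, (unsigned)vlen);
          p   += 3 + vlen;                   /* O(1) skip, no byte scanning */
          len -= 3 + vlen;
      }
  }

  int main(void)
  {
      /* two fields: code 0x01 -> "xyz", code 0x03 -> "ab" */
      const uint8_t msg[] = { 0x01, 0x00, 0x03, 'x', 'y', 'z',
                              0x03, 0x00, 0x02, 'a', 'b' };

      walk(msg, sizeof(msg), 0x03);
      return 0;
  }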

Thanks,
Willy

patrick mcmanus | 11 Jun 20:50 2012

Re: Significantly reducing headers footprint

On 6/11/2012 5:16 AM, Willy Tarreau wrote:
> On Sun, Jun 10, 2012 at 04:39:37PM -0700, Roberto Peon wrote:
>     =>  With better request reordering, we could have this :
>
>        11 Accept: */*
>       109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
>         4 Accept: text/css,*/*;q=0.1
>         3 Accept:
> text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
>
>> Achieving this seems difficult? How would we get a reordering to occur in a
>> reasonable manner?
> I don't think it's that difficult, but I'm not a browser developer and I'm
> sure they're facing a huge amount of complex issues. For instance, maybe
> it's not always possible to fetch all images at a time, or to fetch css
> first then images. I must say I don't know :-/
>

Reordering adds latency: you need to discover the full set of things that
should be reordered when you're doing streamed parsing. This is a
situation SPDY actually improves. In HTTP/1 you might not send a
resource request as soon as you discover it (adding latency) in order to
speculatively preserve bandwidth for resources you hope will be
discovered "soon"; in SPDY you can just send them all ASAP with
appropriate priorities attached to manage the bandwidth. Reintroducing
a motivation for queuing is undesirable, IMO.

So this seems like an unnecessary constraint to solve a situation that
gzip windows already effectively address.

Willy Tarreau | 12 Jun 00:10 2012

Re: Significantly reducing headers footprint

Hi Patrick,

On Mon, Jun 11, 2012 at 02:50:45PM -0400, patrick mcmanus wrote:
> On 6/11/2012 5:16 AM, Willy Tarreau wrote:
> >On Sun, Jun 10, 2012 at 04:39:37PM -0700, Roberto Peon wrote:
> >    =>  With better request reordering, we could have this :
> >
> >       11 Accept: */*
> >      109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        4 Accept: text/css,*/*;q=0.1
> >        3 Accept:
> >text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> >
> >>Achieving this seems difficult? How would we get a reordering to occur in a
> >>reasonable manner?
> >I don't think it's that difficult, but I'm not a browser developer and I'm
> >sure they're facing a huge number of complex issues. For instance, maybe
> >it's not always possible to fetch all images at once, or to fetch the CSS
> >first and then the images. I must say I don't know :-/
> >
> 
> reordering adds latency in order to discover the full set of things that 
> should be reordered when you're doing streamed parsing.

This is more or less what I was suspecting, but of course it's better
with your explanation !

> This is a 
> situation SPDY actually improves - in HTTP/1 you might not send a 
> resource request as soon as you discover it (adding latency) in order to 
> speculatively preserve bandwidth for resources you hope will be 
> discovered "soon".. in spdy you can just send them all asap with 
> appropriate priorities attached to manage the bandwidth - reintroducing 
> a motivation for queuing is undesirable imo.
> 
> So this seems like an unnecessary constraint to solve a situation that
> gzip windows already effectively address.

OK thanks for your insights !

Willy

Mike Belshe | 11 Jun 16:32 2012

Re: Significantly reducing headers footprint

This is good work, Willy.


Any perf results on how much this will impact the user?  Given the stateful nature of gzip already in use, I'm betting this has almost no impact for most users?  

There is a tradeoff; completely custom compression will introduce more interop issues.  Registries of "well known headers" are notoriously painful to maintain and keep versioned.

A few more comments below:


On Sun, Jun 10, 2012 at 4:17 PM, Willy Tarreau <w@1wt.eu> wrote:
Hi,

I recently managed to collect requests from some enterprise proxies to
experiment with binary encoding as described in our draft [1].

After some experimentation and discussions with some people, I managed to
get significant gains [2] which could still be improved.

What's currently performed is the following :
 - message framing
 - binary encoding of the HTTP version (2 bits)
 - binary encoding of the method (4 bits)
 - move Host header to the URI
 - encoding of the URI relative to the previous one
 - binary encoding of each header field name (1 byte)
 - encoding of each header relative to the previous one.
 - binary encoding of the If-Modified-Since date

The code achieving this is available at [2]. It's an ugly PoC but it's
a useful experimentation tool for me, feel free to use it to experiment
with your own implementations if you like.

I'm already observing request compression ratios of 90-92% on various
requests, including on a site with a huge page with large cookies and
URIs ; 132 kB of requests were reduced to 10kB. In fact while the draft
suggests use of multiple header contexts (connection, common and message),
now I'm feeling like we don't need to store 3 contexts anymore, one single
is enough if requests remain relative to previous one.

But I think that by typing a bit more the protocol, we could improve even
further and at the same time improve interoperability. Among the things
I am observing which still take some space in the page load of an online
newspaper (127 objects, data were anonymized) :

 - User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
   => Well, this one is only sent once over the connection, but we could
      reduce this further by using a registry of known vendors/products
      and encourage vendors to emit just a few bytes (vendor/product/version).

I don't think the compressor should be learning about vendor-specific information.  This gives advantages to certain browser incumbents and is unfair to startups.  We absolutely MUST NOT give advantages to the current popular browsers.


 - Accept: text/css,*/*;q=0.1
   => this one changes depending on what object the browser requests, so it
      is less efficiently compressed :

       1 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
       4 Accept: text/css,*/*;q=0.1
       8 Accept: */*
       1 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       2 Accept: */*
       9 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       2 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
      90 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       1 Accept: */*
       9 Accept: image/png,image/*;q=0.8,*/*;q=0.5

   => With better request reordering, we could have this :

      11 Accept: */*
     109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
       4 Accept: text/css,*/*;q=0.1
       3 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

As long as the browser uses the same Accept header from request to request (which it generally does), this compresses to almost zero after the first header block.
 

   I'm already wondering if we have *that* many content-types and if we need
   to use long words such as "application" everywhere.

 - Accept-Language: fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
   Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
   Accept-Encoding: gzip,deflate

   => Same comment as above concerning the number of possible values. However
      these ones were all sent identical so the gain is more for the remote
      parser than for the upstream link.

 - Referer: http://www.example.com/
   => referrers do compress quite well relative to each other. Still there
      are many blogs and newspapers on the net today with very large URLs,
      and their URLs cause very large referrers to be sent along with each
      object composing the page. At least a better ordering of the requests
      saves a few more hundred bytes for the whole page. In the end I only
      got 4 different values :
      http://www.example.com/
      http://www.example.com/sites/news/files/css/css_RWicSr_h9UxCJrAbE57UbNf_oNYhtaF5YghFXJemVNQ.css
      http://www.example.com/sites/news/files/css/css_lKoFARDAyB20ibb5wNG8nMhflDNNW_Nb9DsNprYt8mk.css
      http://www.example.com/sites/news/files/css/css_qSyFGRLc-tslOV1oF9GCzEe1eGDn4PP7vOM1HGymNYU.css

   Among the improvements I'm thinking about, we could decide to use relative
   URIs when the site is the same. I don't know either if it's of any use on
   the server side to know that the request was emitted for a specific CSS.

 - If-Modified-Since: Fri, 27 Apr 2012 14:41:31 GMT
   => I have encoded this one on 32 and 64 bits and immediately saved 3.1 and
      2.6 kB respectively. Well, storing 4 more bytes per request might be
      wasted considering that we probably don't need a nanosecond resolution
      for 585 years. But 40-48 bits might be fine.

 - Cache-Control: max-age=0
   => I suspect the user hit the Refresh button, this was present in about
      half the requests. Anyway, this raises the question of the length it
      requires for something which is just a boolean here ("ignore cache").
      Probably that a client has very few Cache-Control header values to
      send, and that reducing this to a smaller set would be beneficial.

Trying to change the motivation or semantics of headers is a large endeavor....  Not sure if the compression of the bits is the right motivation for doing so.



 - If-None-Match: "3013140661"
   => I guess there is nothing we can do on this one, except suggest that
      implementors use more bits and fewer bytes to emit their etags.

 - Cookie: xtvrn=$OaiJty$; xtan327981=c; xtant327981=c; has_js=c; __utma=KBjWnx24Q.7qFKqmB7v.i0JDH91L_R.0kU2W1uL49.JM4KtFLV0b.C; __utmc=Rae9ZgQHz; __utmz=NRSZOcCWV.d5MlK5RJsi.-.f.N8J73w=S1SLuT_j0m.O8|VsIxwE=(jHw58obb)|r9SgsT=WQfZe8jr|pFSZGH=/ <at> /qwDyMw3I; __gads=td=ASP_D5ml4Ebevrej:R=pvxltafqZK:x=E4FUn3YiNldW3rhxzX6YlCptZp8zF-b5qc; _chartbeat2=oQvb8k_G9tduhauf.LqOukjnlaaE7K.uDBaR79E1WT4t.Kr9L_lIrOtruE8; __qca=LC9oiRpFSWShYlxUtD37GJ2k8AL; __utmb=vG8UMEjrz.Qf.At.pXD61lUeHZ; pm8196_1=c; pm8194_1=c

   => amazingly, this one compresses extremely well with the above scheme,
      because additions are performed at the end so consecutive cookies keep
      a lot in common, and changes are not too frequent. However, given the
      omnipresent usage of cookies, I was wondering why we should not create
      a new entity of its own for the cookies instead of abusing the Cookie
      header. It would make it a lot easier for both ends to find what they
      need. For instance, a load balancer just needs to find a server name
      in the thing above. What a waste of on-wire bits and of CPU cycles !

BTW, binary encoding would probably also help addressing a request I often
hear in banking environments : the need to sign/encrypt/compress only certain
headers or cookies. Right now when people do this, they have to base64-encode
the result, which is another transformation at both ends and inflates the
data. If we make provisions in the protocol for announcing encrypted or
compressed headers using 2-3 bits, it might become more usable. I'm not
convinced it provides any benefit between a browser and an origin server
though. So maybe it will remain application-specific and the transport
just has to make it easier to emit 8-bit data in header field values.

Happens all the time, yes.  Just make sure that the HTTP/2 -> HTTP/1.1 mapping is preserved so that gateways still work.
 

Has anyone any opinion on the subject above ? Or ideas about other things
that terribly clobber the upstream pipe and that should be fixed in 2.0 ?

I hope I'll soon find some time to update our draft to reflect recent updates
and findings.

Again, I think we could spend a lot of time debating the compressor.  And with one more registry or one more semantic header change from HTTP, there will always be one more bit to compress out.  But these are, IMHO, already diminishing returns for performance.  I hope we'll all focus on the more important parts of the protocol (flow control, security, 1.x to 2.x upgrades, etc.) rather than on compression.

Mike


Regards,
Willy

--
[1] http://tools.ietf.org/id/draft-tarreau-httpbis-network-friendly-00.txt
[2] http://1wt.eu/http2/



Martin Nilsson | 11 Jun 18:38 2012

Re: Significantly reducing headers footprint

On Mon, 11 Jun 2012 16:32:41 +0200, Mike Belshe <mike@...> wrote:

> This is good work, Willy.
>
> Any perf results on how much this will impact the user?  Given the  
> stateful
> nature of gzip already in use, I'm betting this has almost no impact for
> most users?
>

Gzip works better on text files than on binary files, at least for small
messages where transferring a custom Huffman table creates a big relative
overhead, so doing things in binary isn't necessarily better. If you start
using unaligned bits it gets really bad, as gzip works at the byte level (cf.
Flash files). You can of course try to construct your binary format so
that it looks ASCII-ish for all common values...

/Martin Nilsson

--

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/

Willy Tarreau | 11 Jun 19:04 2012

Re: Significantly reducing headers footprint

Hi Martin,

On Mon, Jun 11, 2012 at 06:38:39PM +0200, Martin Nilsson wrote:
> On Mon, 11 Jun 2012 16:32:41 +0200, Mike Belshe <mike@...> wrote:
> 
> >This is good work, Willy.
> >
> >Any perf results on how much this will impact the user?  Given the  
> >stateful
> >nature of gzip already in use, I'm betting this has almost no impact for
> >most users?
> >
> 
> Gzip works better on text files than binary files, at least for small  
> messages where transferring a custom huffman table creates a big relative  
> overhead, so doing things binary doesn't have to be better. If you start  
> doing unaligned bits it gets really bad, as gzip works on byte level (c.f.  
> flash files). You can of course try to construct your binary format so  
> that it looks ASCII-ish for all common values...

That's not entirely true, in fact. Gzip offers nice savings here, but the main
issue I'm having is that by compressing the full requests, we still present
a complete request to the recipient, which has to process it as a whole. This
basically means decompressing and then parsing all the cookies, etc. While
doing so over HTTP/1.1 probably is the most natural thing to do, I think
that we can improve the 2.0 design to avoid having to do this in the first
place.

Regards,
Willy

Willy Tarreau | 11 Jun 18:59 2012

Re: Significantly reducing headers footprint

Hi Mike,

On Mon, Jun 11, 2012 at 07:32:41AM -0700, Mike Belshe wrote:
> This is good work, Willy.

Thanks.

> Any perf results on how much this will impact the user?  Given the stateful
> nature of gzip already in use, I'm betting this has almost no impact for
> most users?

No, I don't have numbers. All I can say is that on one core of my 3 GHz Core 2,
I could compress around 70k requests per second with the PoC code, so the
CPU cost even at 100 req/s will be extremely low. Also, the compression
ratio was quite high (12.7x, 92%) even with the currently limited set of
features, so I'm quite confident that the impact on upstream will be a
significant gain. For instance, the original 132 kB of requests were around
90 MSS, which represent a significant number of RTTs. The resulting 10kB are
7 MSS, which can be sent at once with the default INITCWND 10. I'd be pleased
if I could put my hands on large amounts of reassembled request streams.
It's something much more difficult to get than I initially believed. And
I'm not a browser developer, so I really wouldn't know where to start to
make a PoC with a real browser.

> There is a tradeoff; completely custom compression will introduce more
> interop issues.  Registries of "well known headers" are notoriously painful
> to maintain and keep versioned.

I agree. But some are clearly protocol elements. Basically everything
that is described in the spec could have its number. We have a syntax
for If-Modified-Since, we can have a number too.

> >  - User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr;
> > rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
> >    => Well, this one is only sent once over the connection, but we could
> >       reduce this further by using a registry of known vendors/products
> >       and encourage vendors to emit just a few bytes (vendor/product/version).
> 
> I don't think the compressor should be learning about vendor-specific
> information.  This gives advantages to certain browser incumbents and is
> unfair to startups.  We absolutely MUST NOT give advantages to the current
> popular browsers.

I'm with you on this, but I was not speaking about making the compressor
aware of the numbers (I probably was not clear on this, it was late). I'd
rather have vendors register IDs and choose to advertise them instead of
the current text. An 1.1 -> 2.0 gateway would just pass along what is above,
as my PoC code did. For instance, the UA above could be advertised by the
browser as 0x0002:0x0306:0xC (just 5 bytes). There could be an experimental
range as you have on USB/PCI/Ethernet so that new users are not disadvantaged.
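
A quick sketch of that 5-byte layout (16-bit vendor, 16-bit product, 8-bit
version); the concrete values are of course hypothetical:

  #include <stdint.h>
  #include <stdio.h>

  /* Pack a registered vendor (16 bits), product (16 bits) and version
   * (8 bits) into 5 bytes, as in the 0x0002:0x0306:0xC example above.
   * The registry values themselves are hypothetical. */
  static void pack_ua(uint8_t out[5], uint16_t vendor, uint16_t product,
                      uint8_t version)
  {
      out[0] = (uint8_t)(vendor >> 8);
      out[1] = (uint8_t)(vendor & 0xff);
      out[2] = (uint8_t)(product >> 8);
      out[3] = (uint8_t)(product & 0xff);
      out[4] = version;
  }

  int main(void)
  {
      uint8_t ua[5];

      pack_ua(ua, 0x0002, 0x0306, 0xC);  /* 5 bytes vs ~100 bytes of UA text */
      for (int i = 0; i < 5; i++)
          printf("%02x ", ua[i]);
      printf("\n");
      return 0;
  }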

> >    => With better request reordering, we could have this :
> >
> >       11 Accept: */*
> >      109 Accept: image/png,image/*;q=0.8,*/*;q=0.5
> >        4 Accept: text/css,*/*;q=0.1
> >        3 Accept:
> > text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> >
> 
> As long as the browser uses the same accept header from request to request
> (which it generally does), this compresses to almost zero after the first
> header block.

Indeed, this was more a note about things that can be improved upstream,
where the requests are originated.

> >  - Cache-Control: max-age=0
> >    => I suspect the user hit the Refresh button, this was present in about
> >       half the requests. Anyway, this raises the question of the length it
> >       requires for something which is just a boolean here ("ignore cache").
> >       Probably that a client has very few Cache-Control header values to
> >       send, and that reducing this to a smaller set would be beneficial.
> >
> 
> Trying to change the motivation or semantics of headers is a large
> endeavor....  Not sure if the compression of the bits is the right
> motivation for doing so.

The compression clearly does not gain much from this. In fact it's more
that I noticed it in the compressed stream, and it made me realize that
improving semantics in 2.0 could improve reliability and interoperability.
I've often seen users send "maxage=0" or "max-age:0", neither of which is
the correct form. In fact what they want is simply to ignore cached
contents.

> >  - Cookie: xtvrn=$OaiJty$; xtan327981=c; xtant327981=c; has_js=c;
> > __utma=KBjWnx24Q.7qFKqmB7v.i0JDH91L_R.0kU2W1uL49.JM4KtFLV0b.C;
> > __utmc=Rae9ZgQHz;
> > __utmz=NRSZOcCWV.d5MlK5RJsi.-.f.N8J73w=S1SLuT_j0m.O8|VsIxwE=(jHw58obb)|r9SgsT=WQfZe8jr|pFSZGH=/ <at> /qwDyMw3I;
> > __gads=td=ASP_D5ml4Ebevrej:R=pvxltafqZK:x=E4FUn3YiNldW3rhxzX6YlCptZp8zF-b5qc;
> > _chartbeat2=oQvb8k_G9tduhauf.LqOukjnlaaE7K.uDBaR79E1WT4t.Kr9L_lIrOtruE8;
> > __qca=LC9oiRpFSWShYlxUtD37GJ2k8AL; __utmb=vG8UMEjrz.Qf.At.pXD61lUeHZ;
> > pm8196_1=c; pm8194_1=c
> >
> >    => amazingly, this one compresses extremely well with the above scheme,
> >       because additions are performed at the end so consecutive cookies
> > keep
> >       a lot in common, and changes are not too frequent. However, given the
> >       omnipresent usage of cookies, I was wondering why we should not
> > create
> >       a new entity of its own for the cookies instead of abusing the Cookie
> >       header. It would make it a lot easier for both ends to find what they
> >       need. For instance, a load balancer just needs to find a server name
> >       in the thing above. What a waste of on-wire bits and of CPU cycles !
> >
> > BTW, binary encoding would probably also help addressing a request I often
> > hear in banking environments : the need to sign/encrypt/compress only
> > certain
> > headers or cookies. Right now when people do this, they have to
> > base64-encode
> > the result, which is another transformation at both ends and inflates the
> > data. If we make provisions in the protocol for announcing encrypted or
> > compressed headers using 2-3 bits, it might become more usable. I'm not
> > convinced it provides any benefit between a browser and an origin server
> > though. So maybe it will remain application-specific and the transport
> > just has to make it easier to emit 8-bit data in header field values.
> >
> 
> Happens all the time, yes.  Just make sure that HTTP2 -> HTTP1.1 definition
> is preserved so that gateways still work.

In fact I'd say that we have to make provisions for those gateways to reliably
encode bytes that cannot be represented in 1.1, and have the conversion back.
This is a bit tricky; it might look a bit like what happened with
quoted-printable text in mails, but it can certainly be done in a much
simpler way.

> > Has anyone any opinion on the subject above ? Or ideas about other things
> > that terribly clobber the upstream pipe and that should be fixed in 2.0 ?
> >
> > I hope I'll soon find some time to update our draft to reflect recent
> > updates
> > and findings.
> >
> 
> Again, I think we could spend a lot of time debating the compressor.  And
> with one more registry or one more semantic header change from HTTP, there
> will always be one more bit to compress out.  But these are, IMHO, already
> diminishing returns for performance.  I hope we'll all focus on the more
> important parts of the protocol (flow control, security, 1.x to 2.x
> upgrades, etc) than compression.

Totally agreed. In fact I feel a bit frustrated to be working on this because
I know there are a lot of other aspects. But I wanted to ensure that we could
squeeze enough bytes out of the stream impacting the end user, in a way
that would be cheaper for intermediaries to process than gzip.

On the other hand, I'm perfectly fine with the way you process streams and
flow control in SPDY, which is another reason why I have no motivation for
working on it too. Upgrades and gatewaying are other very important points
that still need some work.

Thanks for your comments, Mike !

Willy

