Andrew Evans | 2 May 2011 10:03

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Sat, 2011-04-30 at 14:29 -0700, Andrew Evans wrote:
> On Fri, 2011-04-29 at 19:58 -0700, Jesse Gross wrote:
> > I was thinking about this a bit more and realized that we have to be
> > more careful about how we handle invalid packets, not just here but in
> > other places as well.  In some places if the packet is invalid then we
> > don't set length to include the invalid headers.  At first I thought
> > this was correct but it's actually a significant problem.  Take IPv6
> > for example: if the length of the packet is shorter than the IPv6
> > header then we won't include the IPv6 fields.  However, this means
> > that the extracted flow will actually match any IPv6 flow, which is
> > clearly not right.  Instead we need to include the length of the IPv6
> > fields so that we will compare them and know that we are looking at an
> > invalid packet.
> 
> Ah, now your comments about setting key_len to the end of the ipv4/ipv6
> structs makes sense to me. I'll change the code to always set the key
> length to the end of the IPv4/IPv6 structs if any of the respective
> fields are set.

Ok, I think there are only three flow key lengths that make sense:

1. to end of Ethernet headers
2. to end of IPv4 headers
3. to end of IPv6 headers

If that's true, then there's no point in incrementing the flow key
length as we go field-by-field. We should just set it to wherever the
end would be if it were valid based on the protocol information in the
packet. I was able to remove a lot of key_len assignment clutter as a
result.
(Continue reading)

Jesse Gross | 3 May 2011 00:37

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Mon, May 2, 2011 at 1:03 AM, Andrew Evans <aevans@...> wrote:
> On Sat, 2011-04-30 at 14:29 -0700, Andrew Evans wrote:
>> On Fri, 2011-04-29 at 19:58 -0700, Jesse Gross wrote:
>> > I was thinking about this a bit more and realized that we have to be
>> > more careful about how we handle invalid packets, not just here but in
>> > other places as well.  In some places if the packet is invalid then we
>> > don't set length to include the invalid headers.  At first I thought
>> > this was correct but it's actually a significant problem.  Take IPv6
>> > for example: if the length of the packet is shorter than the IPv6
>> > header then we won't include the IPv6 fields.  However, this means
>> > that the extracted flow will actually match any IPv6 flow, which is
>> > clearly not right.  Instead we need to include the length of the IPv6
>> > fields so that we will compare them and know that we are looking at an
>> > invalid packet.
>>
>> Ah, now your comments about setting key_len to the end of the ipv4/ipv6
>> structs makes sense to me. I'll change the code to always set the key
>> length to the end of the IPv4/IPv6 structs if any of the respective
>> fields are set.
>
> Ok, I think there are only three flow key lengths that make sense:
>
> 1. to end of Ethernet headers
> 2. to end of IPv4 headers
> 3. to end of IPv6 headers
>
> If that's true, then there's no point in incrementing the flow key
> length as we go field-by-field. We should just set it to wherever the
> end would be if it were valid based on the protocol information in the
> packet. I was able to remove a lot of key_len assignment clutter as a
(Continue reading)

Andrew Evans | 10 May 2011 23:19

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Mon, 2011-05-02 at 15:37 -0700, Jesse Gross wrote:
> I think what you had previously is actually along the right lines and
> we definitely need to include L4 headers in the length.  The one
> difference is that once we start to parse a header group the length
> needs to be included even if it ultimately turns out to be invalid.  A
> header group is something that is implied by a next header in the
> previous layer (i.e. IPv6 is implied by the EtherType).
> 
> I think the header groups are:
> * L1/L2 (tun_id, in_port, dl_tci, dl_src, dl_dst)
> * IPv4 (nw_proto, nw_tos, ipv4.src, ipv4.dst)
> * IPv6 (nw_proto, nw_tos, ipv6.src, ipv6.dst)
> * ARP (nw_proto, nw_tos, arp.sha, arp.tha)
> * TCP/UDP/ICMP over IPv4 (ipv4.tp.src, ipv4.tp.dst)
> * TCP/UDP/ICMP over IPv6 (ipv6.tp.src, ipv6.tp.dst)
> * IPv6 ND (ipv6.nd_target, ipv6.nd_sll, ipv6.nd_tll)
> 
> If we break it more finely than that, then we run into the problems
> with invalid packets partially matching flows.  However, by breaking
> it at the layer boundary we know that flows can't have additional data
> farther on because that would imply that they have the same current
> layer headers as us, which are zeroed.

That makes sense to me. Ok, here is an incremental diff that should
return flow key lengths at the end of these boundaries, even when errors
occur:

diff --git a/datapath/flow.c b/datapath/flow.c
index 241822a..0f851d6 100644
--- a/datapath/flow.c
(Continue reading)

Jesse Gross | 11 May 2011 05:10

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Tue, May 10, 2011 at 2:19 PM, Andrew Evans <aevans@...> wrote:
> On Mon, 2011-05-02 at 15:37 -0700, Jesse Gross wrote:
>> I think what you had previously is actually along the right lines and
>> we definitely need to include L4 headers in the length.  The one
>> difference is that once we start to parse a header group the length
>> needs to be included even if it ultimately turns out to be invalid.  A
>> header group is something that is implied by a next header in the
>> previous layer (i.e. IPv6 is implied by the EtherType).
>>
>> I think the header groups are:
>> * L1/L2 (tun_id, in_port, dl_tci, dl_src, dl_dst)
>> * IPv4 (nw_proto, nw_tos, ipv4.src, ipv4.dst)
>> * IPv6 (nw_proto, nw_tos, ipv6.src, ipv6.dst)
>> * ARP (nw_proto, nw_tos, arp.sha, arp.tha)
>> * TCP/UDP/ICMP over IPv4 (ipv4.tp.src, ipv4.tp.dst)
>> * TCP/UDP/ICMP over IPv6 (ipv6.tp.src, ipv6.tp.dst)
>> * IPv6 ND (ipv6.nd_target, ipv6.nd_sll, ipv6.nd_tll)
>>
>> If we break it more finely than that, then we run into the problems
>> with invalid packets partially matching flows.  However, by breaking
>> it at the layer boundary we know that flows can't have additional data
>> farther on because that would imply that they have the same current
>> layer headers as us, which are zeroed.
>
> That makes sense to me. Ok, here is an incremental diff that should
> return flow key lengths at the end of these boundaries, even when errors
> occur:

This looks about right to me but there have been so many incrementals
to this patch that I've long since lost track.  Can you send it out in
(Continue reading)

Andrew Evans | 12 May 2011 01:27

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Tue, 2011-05-10 at 20:10 -0700, Jesse Gross wrote:
> This looks about right to me but there have been so many incrementals
> to this patch that I've long since lost track.  Can you send it out in
> whole again?

Yes, I'll do that once we've resolved the following.

> Also, we had talked some about grouping together fields that are
> always assigned together, such as IP source and dest addresses so that
> the length assignment can use the whole block instead of the last
> field.  Did we ever come to a conclusion on that?

I backed out the changes that grouped fields into structs because I
thought you wanted to be consistent with other structs in the datapath,
which don't nest structs except as members of a union. I do still like
the idea of grouping related fields together, since I think it makes it
less likely that future changes to the flow key struct will break flow
key length computation. Do you want me to group them? That's an easy
change to make. Are there other structs you think should be similarly
grouped for the sake of consistency?

> >
> > diff --git a/datapath/flow.c b/datapath/flow.c
> > index 241822a..0f851d6 100644
> > --- a/datapath/flow.c
> > +++ b/datapath/flow.c
> >  <at>  <at>  -547,7 +554,7  <at>  <at>  int flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key,
> >
> >                key_len = SW_FLOW_KEY_OFFSET(ipv6);
> 
(Continue reading)

Jesse Gross | 12 May 2011 05:09

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Wed, May 11, 2011 at 4:27 PM, Andrew Evans <aevans@...> wrote:
> On Tue, 2011-05-10 at 20:10 -0700, Jesse Gross wrote:
>> Also, we had talked some about grouping together fields that are
>> always assigned together, such as IP source and dest addresses so that
>> the length assignment can use the whole block instead of the last
>> field.  Did we ever come to a conclusion on that?
>
> I backed out the changes that grouped fields into structs because I
> thought you wanted to be consistent with other structs in the datapath,
> which don't nest structs except as members of a union. I do still like
> the idea of grouping related fields together, since I think it makes it
> less likely that future changes to the flow key struct will break flow
> key length computation. Do you want me to group them? That's an easy
> change to make. Are there other structs you think should be similarly
> grouped for the sake of consistency?

I think it makes sense to group them, I was just looking for a
consistent policy.  Right now we are consistent in that we only do it
with unions but that doesn't mean there isn't something better.  I
think the right policy is fields that are always assigned together,
which is that list of header groups that I mentioned before.
Andrew Evans | 13 May 2011 20:33

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Wed, 2011-05-11 at 20:09 -0700, Jesse Gross wrote:
> I think it makes sense to group them, I was just looking for a
> consistent policy.  Right now we are consistent in that we only do it
> with unions but that doesn't mean there isn't something better.  I
> think the right policy is fields that are always assigned together,
> which is that list of header groups that I mentioned before.

Ok, does this look reasonable?

struct sw_flow_key {
	struct {
		__be64 tun_id;			/* Encapsulating tunnel ID. */
		u16    in_port;			/* Input switch port. */
		u8     dl_src[ETH_ALEN];	/* Ethernet source address. */
		u8     dl_dst[ETH_ALEN];	/* Ethernet destination address. */
		__be16 dl_tci;			/* 0 if no VLAN, VLAN_TAG_PRESENT set otherwise. */
		__be16 dl_type;			/* Ethernet frame type. */
	} eth;
	struct {
		u8     nw_proto;		/* IP protocol or lower 8 bits of ARP opcode. */
		u8     nw_tos;			/* IP ToS (DSCP field, 6 bits). */
	} ip;
	union {
		struct {
			struct {
				__be32 src;	/* IP source address. */
				__be32 dst;	/* IP destination address. */
			} addr;
			union {
				struct {
(Continue reading)

Jesse Gross | 13 May 2011 20:45

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Fri, May 13, 2011 at 11:33 AM, Andrew Evans <aevans@...> wrote:
> On Wed, 2011-05-11 at 20:09 -0700, Jesse Gross wrote:
>> I think it makes sense to group them, I was just looking for a
>> consistent policy.  Right now we are consistent in that we only do it
>> with unions but that doesn't mean there isn't something better.  I
>> think the right policy is fields that are always assigned together,
>> which is that list of header groups that I mentioned before.
>
> Ok, does this look reasonable?

Thanks, that looks good to me.
Andrew Evans | 14 May 2011 02:54

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Fri, 2011-05-13 at 11:45 -0700, Jesse Gross wrote:
> On Fri, May 13, 2011 at 11:33 AM, Andrew Evans <aevans@...> wrote:
> > On Wed, 2011-05-11 at 20:09 -0700, Jesse Gross wrote:
> >> I think it makes sense to group them, I was just looking for a
> >> consistent policy.  Right now we are consistent in that we only do it
> >> with unions but that doesn't mean there isn't something better.  I
> >> think the right policy is fields that are always assigned together,
> >> which is that list of header groups that I mentioned before.
> >
> > Ok, does this look reasonable?
> 
> Thanks, that looks good to me.

Ok, here's an incremental for the rename. I'll follow this with a repost
of the entire patch.

diff --git a/datapath/actions.c b/datapath/actions.c
index b6b7135..abc06e6 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
 <at>  <at>  -114,13 +114,13  <at>  <at>  static struct sk_buff *modify_vlan_tci(struct sk_buff *skb, __be16 tci)

 static bool is_ip(struct sk_buff *skb)
 {
-	return (OVS_CB(skb)->flow->key.dl_type == htons(ETH_P_IP) &&
+	return (OVS_CB(skb)->flow->key.eth.dl_type == htons(ETH_P_IP) &&
 		skb->transport_header > skb->network_header);
 }

 static __sum16 *get_l4_checksum(struct sk_buff *skb)
(Continue reading)

Jesse Gross | 14 May 2011 03:27

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Fri, May 13, 2011 at 5:54 PM, Andrew Evans <aevans@...> wrote:
> On Fri, 2011-05-13 at 11:45 -0700, Jesse Gross wrote:
>> On Fri, May 13, 2011 at 11:33 AM, Andrew Evans <aevans@...> wrote:
>> > On Wed, 2011-05-11 at 20:09 -0700, Jesse Gross wrote:
>> >> I think it makes sense to group them, I was just looking for a
>> >> consistent policy.  Right now we are consistent in that we only do it
>> >> with unions but that doesn't mean there isn't something better.  I
>> >> think the right policy is fields that are always assigned together,
>> >> which is that list of header groups that I mentioned before.
>> >
>> > Ok, does this look reasonable?
>>
>> Thanks, that looks good to me.
>
> Ok, here's an incremental for the rename. I'll follow this with a repost
> of the entire patch.

I guess we can actually collapse the names for the members in the new
eth and ip structs, similar to what you did for the ND members i.e.
eth.src instead of eth.dl_src.
Andrew Evans | 14 May 2011 05:51

Re: [PATCH 2/2] datapath: Hash only the part of struct sw_flow_key populated by flow_extract().

On Fri, 2011-05-13 at 18:27 -0700, Jesse Gross wrote:
> I guess we can actually collapse the names for the members in the new
> eth and ip structs, similar to what you did for the ND members i.e.
> eth.src instead of eth.dl_src.

Ok, done.

Full diff to follow.

Andrew Evans | 14 May 2011 05:54

[PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

Currently the whole flow key struct is hashed on every packet received from the
network or userspace. The whole struct is also compared byte-for-byte when
doing flow table lookups. This consumes a fair percentage of CPU time, and most
of the time part of the structure is unused (e.g. the IPv6 fields when handling
IPv4 traffic; the IPv4 fields when handling Ethernet frames).

This commit reorders the fields in the flow key struct to put the least
commonly used elements at the end and changes the hash and comparison functions
to look only at the part that contains data.

Signed-off-by: Andrew Evans <aevans@...>
---
 datapath/actions.c  |    6 +-
 datapath/datapath.c |   31 ++--
 datapath/flow.c     |  502 +++++++++++++++++++++++++++++----------------------
 datapath/flow.h     |   70 +++++---
 datapath/table.c    |   14 +-
 datapath/table.h    |    4 +-
 datapath/tunnel.c   |   21 ++-
 7 files changed, 383 insertions(+), 265 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index b6b7135..abc06e6 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
 <at>  <at>  -114,13 +114,13  <at>  <at>  static struct sk_buff *modify_vlan_tci(struct sk_buff *skb, __be16 tci)

 static bool is_ip(struct sk_buff *skb)
 {
-	return (OVS_CB(skb)->flow->key.dl_type == htons(ETH_P_IP) &&
(Continue reading)

Jesse Gross | 16 May 2011 23:35

Re: [PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

On Fri, May 13, 2011 at 8:54 PM, Andrew Evans <aevans <at> nicira.com> wrote:
> Currently the whole flow key struct is hashed on every packet received from the
> network or userspace. The whole struct is also compared byte-for-byte when
> doing flow table lookups. This consumes a fair percentage of CPU time, and most
> of the time part of the structure is unused (e.g. the IPv6 fields when handling
> IPv4 traffic; the IPv4 fields when handling Ethernet frames).
>
> This commit reorders the fields in the flow key struct to put the least
> commonly used elements at the end and changes the hash and comparison functions
> to look only at the part that contains data.
>
> Signed-off-by: Andrew Evans <aevans <at> nicira.com>

Thanks, this largely looks good to me.  I noticed a few fairly small
things but I think it is very close.

Were you going to rename the members of the flow structure where the
struct name and member prefix are redundant, like eth.dl_type?  I
thought that you said you had already done this in the previous
message but I don't see it here.

> diff --git a/datapath/flow.c b/datapath/flow.c
> index d678979..6969a70 100644
> --- a/datapath/flow.c
> +++ b/datapath/flow.c
> +static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key, int *key_lenp)

Can you break this line so it isn't quite so long?

>  <at>  <at>  -429,17 +448,19  <at>  <at>  int flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key,
(Continue reading)

Andrew Evans | 17 May 2011 23:18

Re: [PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

On Mon, 2011-05-16 at 14:35 -0700, Jesse Gross wrote:
> Were you going to rename the members of the flow structure where the
> struct name and member prefix are redundant, like eth.dl_type?  I
> thought that you said you had already done this in the previous
> message but I don't see it here.

Sorry, I forgot to commit that before doing git-send-email against
master.

> On Fri, May 13, 2011 at 8:54 PM, Andrew Evans <aevans@...> wrote:
> > diff --git a/datapath/flow.c b/datapath/flow.c
> > index d678979..6969a70 100644
> > --- a/datapath/flow.c
> > +++ b/datapath/flow.c
> > +static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key, int *key_lenp)
> 
> Can you break this line so it isn't quite so long?

Sure, done.

> >  <at>  <at>  -429,17 +448,19  <at>  <at>  int flow_extract(struct sk_buff *skb, u16 in_port, struct sw_flow_key *key,
> > +       /* dl_type, dl_vlan, dl_vlan_pcp. */
> >        __skb_pull(skb, 2 * ETH_ALEN);
> 
> I deleted this comment in my previous patch because it was obsolete,
> looks like it reappeared due to a rebase error.

Oops, sorry, deleted.

> > -       key->dl_type = parse_ethertype(skb);
(Continue reading)

Jesse Gross | 18 May 2011 00:39

Re: [PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

On Tue, May 17, 2011 at 2:18 PM, Andrew Evans <aevans <at> nicira.com> wrote:
> On Mon, 2011-05-16 at 14:35 -0700, Jesse Gross wrote:
>> Were you going to rename the members of the flow structure where the
>> struct name and member prefix are redundant, like eth.dl_type?  I
>> thought that you said you had already done this in the previous
>> message but I don't see it here.
>
> Sorry, I forgot to commit that before doing git-send-email against
> master.

All of those changes look fine to me.  I think this is ready to go.

Acked-by: Jesse Gross <jesse <at> nicira.com>
_______________________________________________
dev mailing list
dev <at> openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
Andrew Evans | 18 May 2011 20:32

Re: [PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

On Tue, 2011-05-17 at 15:39 -0700, Jesse Gross wrote:
> All of those changes look fine to me.  I think this is ready to go.
> 
> Acked-by: Jesse Gross <jesse@...>

Thanks. I pushed this.

Jesse Gross | 18 May 2011 20:56

Re: [PATCH] datapath: Hash and compare only the part of sw_flow_key actually used.

On Wed, May 18, 2011 at 11:32 AM, Andrew Evans <aevans <at> nicira.com> wrote:
> On Tue, 2011-05-17 at 15:39 -0700, Jesse Gross wrote:
>> All of those changes look fine to me.  I think this is ready to go.
>>
>> Acked-by: Jesse Gross <jesse <at> nicira.com>
>
> Thanks. I pushed this.

I guess I was expecting that you were going to drop the nw_ prefixes
in struct ip as well.
_______________________________________________
dev mailing list
dev <at> openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Gmane