Robert Barclay | 19 Nov 22:15

Standardizing reputation query mechanisms

Lately, despite the low level of activity on this list, there has been
a great deal of activity in the development of a new generation of
reputation services beyond the existing blacklist/whitelist paradigm.
Several (including one I am working on) either are available in some
state currently or are in development. This is an exciting development
for the email industry in general, but does present some challenges.
The primary one is that most of these services are commercial to some
extent or another, which makes sharing information difficult.
Despite this, an overwhelming concern I have heard from ISPs and MTA
vendors is that each of the developing services is publishing its data
in a slightly different way, and beyond that several protocols have
been suggested as standards for querying this data.
An area where it should be possible for all of us to work together
without the problems of commercial damage is in development and
deployment of a standard protocol for publishing and querying
reputation data. This problem is much more complicated than it may on
its face appear, because reputation services will have a much wider
range of semantics than traditional blacklists. They will return
different levels of granularity of data, from a single binary score,
to a huge range of scores over individual data points or even
customized scores for individual queriers. Or they may provide a list
of suggested actions to be taken. They may have a need to allow email
receivers to guide the semantics of the query (e.g. I want elements
x, y, and z but not the other 23).

The advantage of getting us all to agree on the mechanisms to access
and exchange this data is that the mechanism can be built into every
MTA (if desired) and all of the systems will be supported without need
to develop new libraries every time someone creates a system. A
standard protocol also makes it more straightforward to compare
(Continue reading)

Mark C. Langston | 19 Nov 23:06

Re: Standardizing reputation query mechanisms

On Fri, Nov 19, 2004 at 02:15:08PM -0700, Robert Barclay wrote:
> Lately, despite the low level of activity on this list, there has been
> a great deal of activity in the development of a new generation of
> reputation services beyond the existing blacklist/whitelist paradigm.
> Several (including one I am working on) either are available in some
> state currently or are in development. This is an exciting development
> for the email industry in general, but does present some challenges.

On that note, I'd like to make a brief announcement:  GOSSiP 0.8-beta is
available on sourceforge.net.  Please see
http://sourceforge.net/projects/gossip-project/ if you're interested.

Also, and perhaps as importantly, I'd like to announce that I'm stepping
down from leading the GOSSiP Project, and need to find someone willing
to take my place developing, guiding development, and evangelizing
GOSSiP.  I've accepted a new position, and as part of my employment
contract, I've agreed to cease my involvement in GOSSiP to preclude any
possibility of conflict-of-interest in a potentially competing product.

If you'd like to assume command of GOSSiP, please email me privately,
explaining how you intend to continue the work I've started.  If chosen,
you'll receive administrative access to the sourceforge.net project
(I'll need your sourceforge.net account name).

> The primary one is that most of these services are commercial to some
> extent or another which makes sharing information difficult.

This is why I made GOSSiP open-source, under the GPL.

> Despite this, an overwhelming concern I have heard from ISPs and MTA
(Continue reading)

Robert Barclay | 22 Nov 23:52

Re: Standardizing reputation query mechanisms

> I believe there was a proposal floated several months ago on the ASRG
> AIR subgroup.  The draft was posted to that mailing list.  Perhaps it
> would serve as a good starting point?
> 
> I could also briefly describe the GOSSiP query/response protocol, if
> there's interest.
> 
Actually I have seen a few proposals floated, and besides these
many systems are creating their own query/response
protocols. While this makes sense for the developers of each system
(it allows them to tailor the protocol to the exact needs of their
system) ultimately I think this will end up being a bad deal for end
users by making it more difficult to switch or compare services. It
also means an increased burden if you want to use multiple services in
parallel. 

The formal proposals I am so far aware of are SIQ, Vouchlist, DNA
(part of the CSV set of proposals), and the proposal sent to this list
a few weeks ago for DNS Publication of Accreditation Data. Besides
this there are the publication mechanisms and query response protocols
already used by your system and the Cloudmark Rating system as just
two examples.

At the FTC summit and various other email-related events it has
become clear that a large number of people think that, just as
authentication is in the process of becoming, reputation or public
accountability in some form will be core parts of the email
infrastructure in the near future. If this is the case then I think it
argues strongly for all of us working in the reputation space to agree
on at least one standard way to convey our data to everyone, even if
(Continue reading)

John Levine | 23 Nov 04:40

Re: Standardizing reputation query mechanisms

[ we'll probably have lots of reputation systems ]
>If this is the case then I think it argues strongly for all of us
>working in the reputation space to agree on at least one standard way
>to convey our data to everyone, even if the data created by our
>systems is not exactly identical.

Absolutely. That's the main reason I set up IAR.  We're going to have
multiple reputation systems for several reasons.  Different people
will have different approaches to scoring reputations, e.g., looking
for sent spam, visiting senders to check their mailing practices, or
holding money against third party spam complaints.  It was also quite
clear at the FTC auth summit that a lot of reputation systems will be
country specific since the major mailers even in the US and in Canada
are different.

If people can't plug in reputation systems as easily as they can plug
in dnsbls, it's just not going to work.  No doubt some vendors will
try to lock people in with proprietary systems, but I hope we can
avoid that.  People pay money (quite a lot, in fact) for the MAPS
dnsbls, and if need be, they'll pay for good reputation data, too.
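
For readers who have not wired one in, the "plug in" step for an IPv4 dnsbl is mostly a matter of forming a DNS name: the address octets are reversed and prepended to the list's zone, and the result is looked up like any other hostname. The sketch below only builds that name and does no network I/O. The function name and the zone `dnsbl.example` are made up for illustration and are not taken from this thread.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a client would look up for an IPv4 address in a dnsbl zone."""
    # Validate the address first; malformed input raises ValueError.
    addr = ipaddress.IPv4Address(ip)
    # dnsbls are queried with the octets in reverse order, followed by the zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical zone name, used only as a placeholder.
print(dnsbl_query_name("192.0.2.1", "dnsbl.example"))
```

A reputation service that kept this query shape could reuse the resolver code already present in an MTA, which is the kind of ease of adoption being argued for here.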

-- 
John R. Levine, IECC, POB 727, Trumansburg NY 14886 +1 607 330 5711
johnl <at> iecc.com, Mayor, http://johnlevine.com, 
Member, Provisional board, Coalition Against Unsolicited Commercial E-mail

Robert Barclay | 29 Nov 19:27

Re: Standardizing reputation query mechanisms

Below is a really high-level set of goals for a standard
query/response protocol for reputation services, based on discussions I
have had recently with a variety of people at ISPs, MTA vendors, and
various reputation and accreditation data sources.
Note that these are labeled goals, not requirements.  All of these are
debatable to some extent. They are intended to do 2 things:
1) start some discussions on what it would take to get a single
protocol that meets the needs of both reputation providers and email
recipients
2) hopefully serve as a starting point for comparative evaluation of
some of the proposals already extant

The primary work now, I think, is not in debating the merits of any of
the specific proposals, but for MTA vendors to say whether a protocol
that met these requirements would be supportable within their MTA, and
for reputation data providers to say whether the below requirements
support the data they are trying to communicate.

Again, for those of you not already subscribed, these discussions are
taking place on the IAR subcommittee of the Anti-Spam Research Group
(http://asrg.sp.am/subgroups/iar.shtml).

Goals for a standardized reputation query/response protocol:

1) Uses existing code (deployed in common MTAs) to the greatest extent possible
2) Cacheable
3) Both query and response can be fit in a single UDP packet in most cases
4) Response processing has low impact on system utilization for receiving sources
5) Protocol supports receiver-directed queries (by this I mean that it
is possible within this protocol for an email recipient to request a
(Continue reading)

John R Levine | 30 Nov 04:59

Re: [taugh.com-johnl] Re: Standardizing reputation query

Looks pretty reasonable.

> 1) Uses existing code (deployed in common MTA's) to the greatest extent possible
> 2) Cacheable
> 3) Both query and response can be fit in a single UDP packet in most cases

I'd just say minimal web traffic.  I can easily imagine a service where
clients log in and leave a socket open and stream queries to the service
rather than making individual UDP queries.  It probably would be a good
idea to say that questions and answers should be small enough to fit in
single UDP packets for simple query applications.

> 4) Response processing has low impact on system utilization for receiving
> sources
> 5) Protocol supports receiver directed queries (by this I mean that it
> is possible within this protocol for an email recipient to request a
> specific score or data element from a list of ones available, this
> obviously does not mean that all reputation services must support that
> functionality)
> 6) Responses must be able to accommodate the following data types
>    a. Judgemental score (this sender is in the 99th percentile of good
> guys by my criteria)
>    b. Suggested action (block this sender, accept this sender)
>    c. Specific measured data (this sender sent 500K messages today)
>    d. A confidence rating for a score (I think this guy is bad but I
> have only seen enough mail to be 50% sure of it)
>    e. Multiple scores in a single response (here is the overall score
> for the domain you requested, and here is the score for that specific
> IP address when sending for that domain)
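
One way to picture a response that can carry items 6a through 6e is a small record type. The sketch below is an illustration only: the class and field names are hypothetical, the percentile scale and the 0.0 to 1.0 confidence range are assumptions, and none of it is drawn from SIQ, Vouchlist, or any other proposal discussed here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Score:
    subject: str                        # what was scored, e.g. a domain or an IP address
    value: float                        # 6a: judgemental score, assumed here to be a percentile
    confidence: Optional[float] = None  # 6d: assumed 0.0-1.0, how sure the service is


@dataclass
class ReputationResponse:
    scores: List[Score] = field(default_factory=list)      # 6e: several scores in one response
    suggested_action: Optional[str] = None                  # 6b: e.g. "accept" or "block"
    measured: Dict[str, int] = field(default_factory=dict)  # 6c: specific measured data


# Example: an overall domain score plus a score for one sending IP.
resp = ReputationResponse(
    scores=[Score("example.com", 99.0, 0.9), Score("192.0.2.1", 40.0, 0.5)],
    suggested_action="accept",
    measured={"messages_today": 500_000},
)
print(resp.suggested_action, len(resp.scores))
```

Every field other than the list of scores is optional, so a service that publishes only a single binary-style score could still fill in the same shape.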

(Continue reading)

Robert Barclay | 30 Nov 18:00

Re: [taugh.com-johnl] Re: Standardizing reputation query

On 29 Nov 2004 22:59:31 -0500, John R Levine <johnl <at> taugh.com> wrote:
> Looks pretty reasonable.
> 
> > 1) Uses existing code (deployed in common MTA's) to the greatest extent possible
> > 2) Cacheable
> > 3) Both query and response can be fit in a single UDP packet in most cases
> 
> I'd just say minimal web traffic.  I can easily imagine a service where
> clients log in and leave a socket open and stream queries to the service
> rather than making individual UDP queries.  It probably would be a good
> idea to say that questions and answers should be small enough to fit in
> single UDP packets for simple query applications.

I would say that, rather than invalidating the goal of UDP for simple
queries, that goal can be rewritten as

3) The protocol should support a UDP interface allowing each of the
query and response to fit in a single UDP packet for simple queries.

and then add an additional goal (again these being goals to evaluate
proposals not requirements) which would be something like

4) The protocol should additionally support a TCP interface for more
data intensive queries and responses

If I had to pick a priority for these two I would say 3 is close to a
must while 4 is less important. That though is more a gut feeling than
anything else.
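
As a rough illustration of how a client might act on goals 3 and 4 together, the sketch below picks a transport from the message sizes. The 512-byte figure is borrowed from the traditional DNS-over-UDP message limit and is an assumption, as are the function and constant names; no proposal in this thread specifies them.

```python
# Assumption: reuse the classic DNS-over-UDP message size limit as the cutoff.
UDP_PAYLOAD_LIMIT = 512


def choose_transport(query: bytes, expected_response_size: int) -> str:
    """Return "udp" when both directions fit in one packet, otherwise "tcp"."""
    fits_query = len(query) <= UDP_PAYLOAD_LIMIT
    fits_response = expected_response_size <= UDP_PAYLOAD_LIMIT
    if fits_query and fits_response:
        return "udp"   # goal 3: simple queries stay in a single packet each way
    return "tcp"       # goal 4: data-intensive exchanges fall back to a stream


print(choose_transport(b"q" * 40, 100))    # small query, small answer
print(choose_transport(b"q" * 40, 4000))   # small query, large answer
```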

> > 4) Response processing has low impact on system utilization for receiving
(Continue reading)

George Schlossnagle | 30 Nov 05:41

Re: [taugh.com-johnl] Re: Standardizing reputation query

On Nov 29, 2004, at 10:59 PM, John R Levine wrote:

> Looks pretty reasonable.
>
>> 1) Uses existing code (deployed in common MTA's) to the greatest 
>> extent possible
>> 2) Cacheable
>> 3) Both query and response can be fit in a single UDP packet in most 
>> cases
>
> I'd just say minimal web traffic.  I can easily imagine a service where
> clients log in and leave a socket open and stream queries to the 
> service
> rather than making individual UDP queries.  It probably would be a good
> idea to say that questions and answers should be small enough to fit in
> single UDP packets for simple query applications.

A TCP connection puts more burden on the server, as now it (may) need 
to maintain 100s of thousands of open persistent connections.

>> 4) Response processing has low impact on system utilization for 
>> receiving
>> sources
>> 5) Protocol supports receiver directed queries (by this I mean that it
>> is possible within this protocol for an email recipient to request a
>> specific score or data element from a list of ones available, this
>> obviously does not mean that all reputation services must support that
>> functionality)
>> 6) Responses must be able to accommodate the following data types
>>    a. Judgemental score (this sender is in the 99th percentile of good
(Continue reading)

Robert Barclay | 30 Nov 21:49

Re: [taugh.com-johnl] Re: Standardizing reputation query

> >> 6) Responses must be able to accommodate the following data types
> >>    a. Judgemental score (this sender is in the 99th percentile of good
> >> guys by my criteria)
> >>    b. Suggested action (block this sender, accept this sender)
> >>    c. Specific measured data (this sender sent 500K messages today)
> 
> What scheme do you envision for this?
> 
Sorry, but just to clarify, what "this" did you mean? Were you
referring specifically to publishing specific measured data elements
about a sender, or were you referring to the broader ability to publish
(and presumably identify) semantically different types of data, like a
score vs. a point data element?
In either case one or more of the existing proposals has mechanisms to
deal with the problem.
The Vouchlist proposal (which seems to be what IADB uses), and Phillip
Hallam Baker's proposal both deal with the problem by publishing a
schema that describes the data in the response (in the Vouchlist case
there is one schema, Phillip's proposal allows for the creation of
multiple schemas). The SIQ proposal seems to allow for dealing with
both of these problems (though it is a little unclear to me whether
this was the intent) by allowing for extra data within both the query
and response packets.

Each of these has some problem in relation to the other goals though.
The Vouchlist proposal, because of its single schema, seems like it
would have difficulty handling extensions to new data elements not
currently supported. It also does not allow the receiver to easily
define which elements they want returned.
Phillip's proposal requires retrieving two records instead of just one
(Continue reading)

Anthony Howe | 9 Dec 14:46

Re: [taugh.com-johnl] Re: Standardizing reputation query

Robert Barclay wrote:

>>>>6) Responses must be able to accommodate the following data types
>>>>   a. Judgemental score (this sender is in the 99th percentile of good
>>>>guys by my criteria)
>>>>   b. Suggested action (block this sender, accept this sender)
>>>>   c. Specific measured data (this sender sent 500K messages today)
>>
>>What scheme do you envision for this?
>>
> 
> Sorry, but just to clarify what "this" did you mean? Were you
> referring specifically to publishing specific measured data elements
> about a sender or were you referring to the broader ability to publish
> (and presumably identify) semantically different types of data like a
> score vs. a point data element.
> In either case one or more of the existing proposals has mechanisms to
> deal with the problem.
> The vouchlist proposal (which seems to be what IADB uses), and Phillip
> Hallam Baker's proposal both deal with the problem by publishing a
> schema that describes the data in the response (in the Vouchlist case
> there is one schema, Phillip's proposal allows for the creation of
> multiple schemas). The SIQ proposal seems to allow for dealing with
> both of these problems (though it is a little unclear to me whether
> this was the intent) by allowing for extra data within both the query
> and response packets.

The SIQ protocol was designed to allow for custom/specific client-server
extensions, though it could have been more clearly stated in the draft.
In particular the UDP format used by SIQ should clarify this, since the
(Continue reading)

Robert Barclay | 13 Dec 18:21

Re: [taugh.com-johnl] Re: Standardizing reputation query

On Thu, 09 Dec 2004 08:46:41 -0500, Anthony Howe <achowe <at> snert.com> wrote:
> Robert Barclay wrote:
> 
> 
> 
> >>>>6) Responses must be able to accommodate the following data types
> >>>>   a. Judgemental score (this sender is in the 99th percentile of good
> >>>>guys by my criteria)
> >>>>   b. Suggested action (block this sender, accept this sender)
> >>>>   c. Specific measured data (this sender sent 500K messages today)
> >>
> >>What scheme do you envision for this?
> >>
> >
> > Sorry, but just to clarify what "this" did you mean? Were you
> > referring specifically to publishing specific measured data elements
> > about a sender or were you referring to the broader ability to publish
> > (and presumably identify) semantically different types of data like a
> > score vs. a point data element.
> > In either case one or more of the existing proposals has mechanisms to
> > deal with the problem.
> > The vouchlist proposal (which seems to be what IADB uses), and Phillip
> > Hallam Baker's proposal both deal with the problem by publishing a
> > schema that describes the data in the response (in the Vouchlist case
> > there is one schema, Phillip's proposal allows for the creation of
> > multiple schemas). The SIQ proposal seems to allow for dealing with
> > both of these problems (though it is a little unclear to me whether
> > this was the intent) by allowing for extra data within both the query
> > and response packets.
> 
(Continue reading)

Rao, Anup | 4 Dec 02:08

Re: [taugh.com-johnl] Re: Standardizing reputation query

> -----Original Message-----
> From: iar-owner <at> asrg.sp.am [mailto:iar-owner <at> asrg.sp.am] On
> Behalf Of Robert Barclay
> Sent: Tuesday, November 30, 2004 9:01 AM

>
> On 29 Nov 2004 22:59:31 -0500, John R Levine <johnl <at> taugh.com> wrote:
> > Looks pretty reasonable.
> >
> > > 1) Uses existing code (deployed in common MTA's) to the greatest
> > > extent possible
> > > 2) Cacheable
> > > 3) Both query and response can be fit in a single UDP
> packet in most cases
> >
> > I'd just say minimal web traffic.  I can easily imagine a
> service where
> > clients log in and leave a socket open and stream queries to the
> > service rather than making individual UDP queries.  It
> probably would
> > be a good idea to say that questions and answers should be small
> > enough to fit in single UDP packets for simple query applications.
>
> I would say that rather than invalidating the goal of UDP for
> simple queries that goal can be rewritten as
>
> 3) The protocol should support a UDP interface allowing each
> of the query and response to fit in a single UDP packet for
> simple queries.
>
(Continue reading)

Anthony Howe | 9 Dec 14:46

Re: [taugh.com-johnl] Re: Standardizing reputation query

Rao, Anup wrote:
> I think there ought to be a secure mode for this protocol given the
> nature of the data. So if we end up doing only 3), then we need to have
> some form of integrity protection.
> Doing 4) over https  can provide the required integrity in a currently
> deployed manner. Also in general, using TCP makes it easier to deal with
> firewalls. For these reasons, I tend to see the priority of 4) to be
> higher than 3).

The SIQ protocol already allows for secure data transfer over HTTPS.

But I question the value of TLS/SSL methods other than to verify a
subscribed client and to verify you have the correct provider. How
sensitive can the query/response really be?

I think there will be more issues with respect to privacy laws of
different nations, in which case just posing a question to a third-party
reputation service, regardless of whether it's communicated securely,
becomes an issue.

I've commented elsewhere on the matter of privacy, with respect to the
information communicated by the SIQ protocol:

	http://wiki.outboundindex.net/ProtocolDiscussion

and I wonder if it's worth a new thread of discussion here to discuss
this aspect of reputation services, because I'm not sure we can wait for
a test court case before addressing this.

-- 
(Continue reading)

Robert Barclay | 8 Dec 22:36

Re: [taugh.com-johnl] Re: Standardizing reputation query

On Fri, 3 Dec 2004 17:08:33 -0800, Rao, Anup <anrao <at> cisco.com> wrote:

> > > > 3) Both query and response can be fit in a single UDP
> > packet in most cases
> > >
> > > > I'd just say minimal web traffic.  I can easily imagine a
> > service where
> > > clients log in and leave a socket open and stream queries to the
> > > service rather than making individual UDP queries.  It
> > probably would
> > > be a good idea to say that questions and answers should be small
> > > enough to fit in single UDP packets for simple query applications.
> >
> > I would say that rather than invalidating the goal of UDP for
> > simple queries that goal can be rewritten as
> >
> > 3) The protocol should support a UDP interface allowing each
> > of the query and response to fit in a single UDP packet for
> > simple queries.
> >
> > and then add an additional goal (again these being goals to
> > evaluate proposals not requirements) which would be something like
> >
> > 4) The protocol should additionally support a TCP interface
> > for more data intensive queries and responses
> >
> >
> > If I had to pick a priority for these two I would say 3 is
> > close to a must while 4 is less important. That though is
> > more a gut feeling than anything else.
(Continue reading)

Anup Rao | 9 Dec 00:23

Re: [taugh.com-johnl] Re: Standardizing reputation query

What does the group see as the inputs to the reputation query? Is the
sender domain included in the input only if it is authenticated using
IIM/DK, or is the authentication status of the message one of the inputs
to the query in addition to the domain and source IP address? I'm
guessing the latter.

-anup.

Robert Barclay | 9 Dec 06:45

Re: [taugh.com-johnl] Re: Standardizing reputation query

On Wed, 08 Dec 2004 15:23:13 -0800, Anup Rao <anrao <at> cisco.com> wrote:
> 
> What does the group see as the inputs to the reputation query? Is the
> sender domain included in the input only if it is authenticated using
> IIM/DK, or is the authentication status of the message one of the inputs
> to the query in addition to the domain and source IP address? I'm
> guessing the latter.
> 
> -anup.
> 
I would imagine both are likely. In the system I am working on we
assume that if only the domain is input, the querier has already
authenticated the domain to their satisfaction. Alternatively, we will
support queries that contain both the domain and IP, or queries on just
the IP. Ideally some day authentication will be universally deployed
and we could deprecate the latter two, but that does not seem feasible
now.
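
To make the three query shapes concrete, here is a minimal sketch of how a service might classify an incoming query by which inputs are present. The function name and the mode labels are hypothetical and do not describe the actual system mentioned above.

```python
from typing import Optional


def query_mode(domain: Optional[str] = None, ip: Optional[str] = None) -> str:
    """Classify a query by which of the two inputs the querier supplied."""
    if domain and ip:
        return "domain+ip"
    if domain:
        # Domain alone: the querier is assumed to have authenticated it already.
        return "domain-only"
    if ip:
        return "ip-only"
    raise ValueError("a query needs a domain, an IP address, or both")


print(query_mode(domain="example.com"))
print(query_mode(domain="example.com", ip="192.0.2.1"))
print(query_mode(ip="192.0.2.1"))
```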

John R Levine | 30 Nov 05:58

Re: Standardizing reputation query mechanisms

> >> 3) Both query and response can be fit in a single UDP packet in most
> >> cases
> >
> > I'd just say minimal net traffic.  I can easily imagine a service where
> > clients log in and leave a socket open and stream queries to the
> > service
> > rather than making individual UDP queries.  It probably would be a good
> > idea to say that questions and answers should be small enough to fit in
> > single UDP packets for simple query applications.
>
> A TCP connection puts more burden on the server, as now it (may) need
> to maintain 100s of thousands of open persistent connections.

It might, but on the other hand it's a lot easier to keep track of who's
at the other end of a TCP connection if your service is available only to
known users.  The DNS model of responding to anonymous UDP packets from
all over the net is one model but not the only one.  I know DNSBLs that
you can only use by AXFR or rsync or http to pick up the data.

For reputation systems, I expect that it'll be far more common to require
that clients preregister since the data being provided is more likely to
be legally touchy, so I don't want to mandate any details of the access
model other than that the per-message cost has to be low.

Regards,
John Levine, johnl <at> taugh.com, Taughannock Networks, Trumansburg NY
"I dropped the toothpaste", said Tom, crestfallenly.

