On Thu, Aug 16, 2012 at 5:41 PM, Ryosuke Niwa <
rniwa <at> webkit.org> wrote:
> On Thu, Aug 16, 2012 at 5:18 PM, Dirk Pranke <
dpranke <at> chromium.org> wrote:
>>
>> On Thu, Aug 16, 2012 at 3:50 PM, Stephen Chenney <
schenney <at> chromium.org>
>> wrote:
>> > I agree with the priorities above, at least. I also agree with the
>> > overall
>> > goal of making our implementation match our philosophy on testing.
>> >
>> > Ryosuke has raised a very valid point: it is not clear what a Gardener
>> > should do with a test that has changed behavior. Personally, the area I
>> > deal
>> > with is very susceptible to the "single pixel differences matter" issue,
>> > and
>> > I would prefer gardeners to flag updates in some way so that I can look
>> > at
>> > it myself (or one of the very few other experts can look at it). Maybe
>> > I'm
>> > just a control freak.
>> >
>> > In the current situation, gardeners are not likely to roll out a patch
>> > short
>> > of build failures, obvious test or perf regressions, or security issues.
>> > For
>> > the bulk of cases where a test result has changed, the gardener will
>> > either
>> > add it to expectations or go ahead and update the result.
>> >
>> > The result is a mixture of incorrect expected results and files listed
>> > in
>> > TestExpectations that may or may not be correct. The way I deal with
>> > this
>> > right now is to maintain a snapshot of TestExpectations and occasionally
>> > go
>> > through and look for newly added results, and check if they are in fact
>> > correct. And every now and then go look at older entries to see if they
>> > have
>> > been updated. Or I get lucky and notice that a new expected result has
>> > been
>> > checked in that is incorrect (I can mostly catch that by following
>> > checkins
>> > in the code).
>> >
>> > The proposed change does not alter this fundamental dynamic. Depending
>> > on
>> > what policy gardeners adopt, they might now rename the result as failing
>> > and
>> > remove the expected, or maybe they'll just update the expected. I'm
>> > still in
>> > a situation where I don't know the exact status of any given result, and
>> > where I have no easy way of knowing if a gardener has correctly updated
>> > an
>> > expected or failing image.
>>
>> So, if we had -passing/-failing, then people who knew what the results
>> were supposed to be in a given directory (call them "owners") could
>> rename the existing -expected files into one category or another, and
>> then a gardener could add newly-failing tests as -expected; it would
>> then be trivial for the owner to look for new -expected files in that
>> directory.
>>
>> Right now, an owner would have to periodically scan the directory for
>> files changed since the last time they scanned the directory, and
>> would have no way of knowing whether an updated file was "correct" or
>> not (perhaps you could filter out by who committed the change, but
>> it's certainly harder than ls *-expected).
>>
>> So, I would contend that my proposal would make it easier for us to
>> have a process by which gardeners could punt on changes they're unsure
>> about and for owners to subsequently review them. I think you could
>> have a similar model if we just checked in *all* new results into
>> -expected, but it would be harder to implement (perhaps not
>> impossible, though, I haven't given it much thought yet).
>
>
> That seems like a much better model.