Reid Draper | 22 Nov 01:27 2013
Picon

Proposal: free shrinking with QuickCheck

I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

---

I have an idea to eliminate the `shrink` function from the `Arbitrary` type
class. Currently, users can optionally implement shrinking manually, this tends
to be boilerplate code and is "...clearly a generic programming problem" [1].
Further, users often have to create separate types to accommodate their
domain's restrictions. For example, a user may wish to only generate
power-of-two numbers for their test. Their code simply uses `Int`, but with
QuickCheck they must create a `newtype` wrapper and implement this logic in
both the `arbitrary` and `shrink` implementation. My idea is to
eliminate the `shrink` function, and to integrate shrinking with the
`arbitrary` function. Currently, the type for `arbitrary` is:

    arbitrary :: Gen a

and `Gen` has the definition:

    newtype Gen a = MkGen{ unGen :: StdGen -> Int -> a }

I suggest that instead of the inner-function returning `a`s, it should return
`RoseTree a`. The root of the tree is the generated value, and the rest of the
tree is all of the ways to shrink the value. Here's how things would fit
together:

    data RoseTree a = RoseTree a [RoseTree a]
(Continue reading)

Carter Schonwald | 22 Nov 02:05 2013
Picon

Re: Proposal: free shrinking with QuickCheck

alternatively, would it be possible to have a default definition of shrink using generics? (I'm still chewing on your email, but sounds like the current definition doesn't have a default generic method!)


On Thu, Nov 21, 2013 at 7:27 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

---

I have an idea to eliminate the `shrink` function from the `Arbitrary` type
class. Currently, users can optionally implement shrinking manually, this tends
to be boilerplate code and is "...clearly a generic programming problem" [1].
Further, users often have to create separate types to accommodate their
domain's restrictions. For example, a user may wish to only generate
power-of-two numbers for their test. Their code simply uses `Int`, but with
QuickCheck they must create a `newtype` wrapper and implement this logic in
both the `arbitrary` and `shrink` implementation. My idea is to
eliminate the `shrink` function, and to integrate shrinking with the
`arbitrary` function. Currently, the type for `arbitrary` is:


    arbitrary :: Gen a


and `Gen` has the definition:


    newtype Gen a = MkGen{ unGen :: StdGen -> Int -> a }


I suggest that instead of the inner-function returning `a`s, it should return
`RoseTree a`. The root of the tree is the generated value, and the rest of the
tree is all of the ways to shrink the value. Here's how things would fit
together:


    data RoseTree a = RoseTree a [RoseTree a]

    -- this is the same as the current Gen
    newtype Generator a = Gen { unGen :: StdGen -> Int -> a }

    newtype Gen a = MkGen { unGen :: Gen (RoseTree a) }


Conveniently, `Gen` still has a unary type constructor of `a`. Further, if
users have implemented their `arbitrary` implementations in terms of the
QuickCheck combinators, their code won't have to change at all (I don't think…).
The lower-level combinators would be updated to manipulate trees, with
functions like `joinRose` and `filterRose`. Let's next look at how `Gen`
would implement the monad type class:


    instance Monad Gen where
        return = MkGen . return . return

        gen >>= f = MkGen $ helper (unGen gen) (unGen . f)
            -- sequence is from Data.Traversable
            where helper m k = m >>= \y -> fmap joinRose $ sequence $ fmap k y


The implementation of `return` is clear. `>>=` isn't too bad either: the
function provided to bind will return `Gen b`'s, and `joinRose` and `sequence`
help us pull this `Generator (RoseTree b))` 'out', much like `join` does. This
means our users can still write code like:


    (arbitrary :: Gen [Int]) >>= elements


Which brings up a good point. The above code has an issue, `elements` is a
partial function with respect to the empty list. With the current
implementation, we'd use the `NonEmptyList` modifier, which much respect this
predicate both during generation and shrinking. This change would allow all
predicates to be expressed simply with `suchThat`, which, since it acts on both
values _and_ shrink trees, applies the predicate in both places. The
implementation of `suchThat` would have to unwrap `Gen`, and apply `roseFilter`
to the `RoseTree` inside of `Generator`, using `fmap`.

I have implemented the above description in my Clojure port of QuickCheck,
called simple-check [1], and it seems to be working quite nicely. Further, I
suspect Erlang QuickCheck [3] is doing something similar, though their
implementation is not open source, I can't presume too much.

Clearly this would be a big change, and my goal now is simply to start a
discussion: how does this sound? What's wrong, what's likely to break? Feedback
and criticism welcome.

Reid


[1] Scrap your boilerplate with class: extensible generic functions: http://research.microsoft.com/pubs/67439/gmap3.pdf

[2] https://github.com/reiddraper/simple-check

[3] http://www.quviq.com/index.html

_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries

_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Bryan O'Sullivan | 22 Nov 19:02 2013

Re: Proposal: free shrinking with QuickCheck


On Thu, Nov 21, 2013 at 4:27 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

I don't love this idea on first reading. The move to switch to generating a rose tree seems rather intrusive and like it could put a lot of complexity right in the paths of developers. That being said, I'd be interested in seeing the idea fleshed out to the point of determining whether my worry has any teeth - could the end-user writing of an Arbitrary instance remain unaffected in practice?
_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Reid Draper | 22 Nov 20:18 2013
Picon

Re: Proposal: free shrinking with QuickCheck


On Nov 22, 2013, at 12:02 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:


On Thu, Nov 21, 2013 at 4:27 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

I don't love this idea on first reading. The move to switch to generating a rose tree seems rather intrusive and like it could put a lot of complexity right in the paths of developers. That being said, I'd be interested in seeing the idea fleshed out to the point of determining whether my worry has any teeth - could the end-user writing of an Arbitrary instance remain unaffected in practice?

I'll take a stab at a minimal implementation. The goal would certainly be for users to be able to simply delete their `shrink` function. If they were previously using the default (shrink _ = []), they'd then get shrinking for free. For those looking for more motivation, take a look at how boilerplate shrink implementations end up being in practice. Further, care must be given to respect any implicit constraints during shrinking, that are respected during generation. For example, when generating with `choose 10 20`, you must never shrink to a value below 10. This would all be taken care of using the proposed method.

pandoc, which doesn't have shrinking, conceivably it'd be quite tedious: https://github.com/jgm/pandoc/blob/master/tests/Tests/Arbitrary.hs

_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Joseph Abrahamson | 22 Nov 21:57 2013

Re: Proposal: free shrinking with QuickCheck

I really like this idea, but I think I’d personally love to see some examples of it in practice. 75% of my Arbitrary instances are “traverse arbitrarily” and 95% of them keep Gen abstract, so I have a good feeling that it’d be possible to implement with relatively small amounts of user pain.

The thing I wonder about is how often `arbitrary` declarations provide enough information for efficient, meaningful `shrink` declarations. If the two differ it may be more necessary to break into `Gen` to provide the right kind of behavior—which would be increasingly complex.

I’d love to see some examples of complex generators from simple-check implemented in this style. As an ugly example I frequently run into in my own code, it’s often hard to write a good arbitrary instance for UTCTime—it’s very domain specific what defines a good arbitrary time. Would such kinds of complex generator code also lead to a nice shrink function? Or would it require something more complex?

-- 
Joseph Abrahamson
Sent with Airmail

On November 22, 2013 at 2:18:13 PM, Reid Draper (reiddraper <at> gmail.com) wrote:


On Nov 22, 2013, at 12:02 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:


On Thu, Nov 21, 2013 at 4:27 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
I originally sent this message to the QuickCheck mailing list,
but haven't heard anything in a while, and realized there are
only ten subscribers. Below is the original message, any feedback
appreciated.

I don't love this idea on first reading. The move to switch to generating a rose tree seems rather intrusive and like it could put a lot of complexity right in the paths of developers. That being said, I'd be interested in seeing the idea fleshed out to the point of determining whether my worry has any teeth - could the end-user writing of an Arbitrary instance remain unaffected in practice?

I'll take a stab at a minimal implementation. The goal would certainly be for users to be able to simply delete their `shrink` function. If they were previously using the default (shrink _ = []), they'd then get shrinking for free. For those looking for more motivation, take a look at how boilerplate shrink implementations end up being in practice. Further, care must be given to respect any implicit constraints during shrinking, that are respected during generation. For example, when generating with `choose 10 20`, you must never shrink to a value below 10. This would all be taken care of using the proposed method.

pandoc, which doesn't have shrinking, conceivably it'd be quite tedious: https://github.com/jgm/pandoc/blob/master/tests/Tests/Arbitrary.hs

_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Reid Draper | 25 Nov 21:44 2013
Picon

Re: Proposal: free shrinking with QuickCheck


On Nov 22, 2013, at 2:57 PM, Joseph Abrahamson <me <at> jspha.com> wrote:

I really like this idea, but I think I’d personally love to see some examples of it in practice. 75% of my Arbitrary instances are “traverse arbitrarily” and 95% of them keep Gen abstract, so I have a good feeling that it’d be possible to implement with relatively small amounts of user pain.

That's been my experience as well, though I haven't even had the 5% where I've had to break the Gen abstraction.


The thing I wonder about is how often `arbitrary` declarations provide enough information for efficient, meaningful `shrink` declarations. If the two differ it may be more necessary to break into `Gen` to provide the right kind of behavior—which would be increasingly complex.

My experience with both Erlang QuickCheck and simple-check (Clojure) is that it _is_ enough information. In the rare cases it's not, you can always still implement your own logic for shrinking by breaking the Gen abstraction (manipulating the RoseTree yourself).


I’d love to see some examples of complex generators from simple-check implemented in this style. As an ugly example I frequently run into in my own code, it’s often hard to write a good arbitrary instance for UTCTime—it’s very domain specific what defines a good arbitrary time. Would such kinds of complex generator code also lead to a nice shrink function? Or would it require something more complex?

 I think a powerful example is an 'any' generator I have in simple-check, which will generate arbitrary Clojure values [1]. It's constructed in exactly the same way as you'd do it with today's QuickCheck, but it also shrinks as well as a hand-written shrink algorithm. (I've found edge-cases in Clojure's reader/writer this way).


_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Bryan O'Sullivan | 26 Nov 01:09 2013

Re: Proposal: free shrinking with QuickCheck


On Mon, Nov 25, 2013 at 12:44 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
My experience with both Erlang QuickCheck and simple-check (Clojure) is that it _is_ enough information. In the rare cases it's not, you can always still implement your own logic for shrinking by breaking the Gen abstraction (manipulating the RoseTree yourself).

It seems we're cautiously optimistic that this could work. I think the next step is up to you.

If I were in your shoes, here's what I'd do: patch up QuickCheck to make this change, and then for some fairly large handful of widely-used packages on Hackage that have test suites, see how many of them can build more or less unscathed, maybe with the shrink method still present in the typeclass so that packages that define it don't have spurious build failures - and then see how many test suites continue to work okay.

In other words, I think this is worth pursuing, and that the best way to make progress on it is to try it and see how well it goes. That way, you can come back armed with both a diff and evidence that it's good.
_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries
Reid Draper | 26 Nov 03:06 2013
Picon

Re: Proposal: free shrinking with QuickCheck


On Nov 25, 2013, at 6:09 PM, Bryan O'Sullivan <bos <at> serpentine.com> wrote:


On Mon, Nov 25, 2013 at 12:44 PM, Reid Draper <reiddraper <at> gmail.com> wrote:
My experience with both Erlang QuickCheck and simple-check (Clojure) is that it _is_ enough information. In the rare cases it's not, you can always still implement your own logic for shrinking by breaking the Gen abstraction (manipulating the RoseTree yourself).

It seems we're cautiously optimistic that this could work. I think the next step is up to you.

If I were in your shoes, here's what I'd do: patch up QuickCheck to make this change, and then for some fairly large handful of widely-used packages on Hackage that have test suites, see how many of them can build more or less unscathed, maybe with the shrink method still present in the typeclass so that packages that define it don't have spurious build failures - and then see how many test suites continue to work okay.

In other words, I think this is worth pursuing, and that the best way to make progress on it is to try it and see how well it goes. That way, you can come back armed with both a diff and evidence that it's good.

Good advice, thanks.

Reid
_______________________________________________
Libraries mailing list
Libraries <at> haskell.org
http://www.haskell.org/mailman/listinfo/libraries

Gmane