blackbox.dev.ml | 3 Aug 02:49 2013
Picon

Why GHC is written in Happy and not a monadic parser library?

Hi!
Is there any specific reason why GHC is written in a parser GENERATOR (Happy) and not in MONADIC PARSER COMBINATOR (like parsec)?

Is Happy faster / handles better errors / hase some great features or anything else?

thank you :)
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Brandon Allbery | 3 Aug 03:10 2013
Picon

Re: Why GHC is written in Happy and not a monadic parser library?

On Fri, Aug 2, 2013 at 8:49 PM, blackbox.dev.ml <blackbox.dev.ml <at> gmail.com> wrote:
Is there any specific reason why GHC is written in a parser GENERATOR (Happy) and not in MONADIC PARSER COMBINATOR (like parsec)?

Is Happy faster / handles better errors / hase some great features or anything else?

Most probably because GHC predates practical parser combinators. Happy is just a yacc clone, really; ancient tech. And I would suspect that replacing the parser at this point could get pretty painful.

--
brandon s allbery kf8nh                               sine nomine associates
allbery.b <at> gmail.com                                  ballbery <at> sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Jason Dagit | 3 Aug 03:20 2013
Picon

Re: Why GHC is written in Happy and not a monadic parser library?

On Fri, Aug 2, 2013 at 5:49 PM, blackbox.dev.ml
<blackbox.dev.ml <at> gmail.com> wrote:
> Hi!
> Is there any specific reason why GHC is written in a parser GENERATOR
> (Happy) and not in MONADIC PARSER COMBINATOR (like parsec)?
>
> Is Happy faster / handles better errors / hase some great features or
> anything else?

One reason is that  it predates monadic parser libraries.

Pros of parser generators:
  + Static analysis of the grammar (shift/reduce conflicts, debuggers, etc)
  + More succinct definition of grammar / semantic actions (easier to
formally reason about the language)
  + Efficient parser generation
  + attribute grammars

Monadic parsers are not amenable to static analysis in the same ways.
Applicative parsers hypothetically do not have this limitation, but
none of the applicative parsers provide static analysis.

I would expect parsec to be slower than happy (although I haven't
benchmarked it).

The place where monadic parsers shine is when you don't know the
formal grammar or you can't have one.

Personally, I prefer parser generators although I think monadic
parsers have their time and place. Parsec is a wonderful replacement
to writing a recursive descent parser.

Lately, I've been modernizing the happy source code and making it so
that it better integrates with cabal. I hope this will help improve
the image of happy.

Jason
Malcolm Wallace | 3 Aug 12:36 2013

Re: Why GHC is written in Happy and not a monadic parser library?


On 3 Aug 2013, at 02:20, Jason Dagit wrote:

>> Hi!
>> Is there any specific reason why GHC is written in a parser GENERATOR
>> (Happy) and not in MONADIC PARSER COMBINATOR (like parsec)?
>> 
>> Is Happy faster / handles better errors / hase some great features or
>> anything else?
> 
> One reason is that  it predates monadic parser libraries.

I'm not entirely sure this is true.  I reckon the development of applicative parser combinators (used in the
implementation of the nhc12 compiler, way back in 1995 or so), is roughly contemporaneous with the
development of Happy, and its use inside ghc.  (I found a release note from Sept 1997 that said ghc had just
converted its interface-file parser to use Happy.)  Certainly table-driven parsers in non-functional
languages go back a lot further, and functional combinator-based parsing was then the relative newcomer.

As to why ghc switched to Happy, the literature of the time suggests that generated table-driven parsers
were faster than combinator-based parsers.  I'm not sure I have ever seen any performance figures to back
that up however.  And with the general improvement in performance of idiomatic Haskell over the last
twenty years, I'd be interested to see a modern comparison.

Regards,
    Malcolm
Jason Dagit | 3 Aug 22:03 2013
Picon

Re: Why GHC is written in Happy and not a monadic parser library?

On Sat, Aug 3, 2013 at 3:36 AM, Malcolm Wallace <malcolm.wallace <at> me.com> wrote:
>
> On 3 Aug 2013, at 02:20, Jason Dagit wrote:
>
>>> Hi!
>>> Is there any specific reason why GHC is written in a parser GENERATOR
>>> (Happy) and not in MONADIC PARSER COMBINATOR (like parsec)?
>>>
>>> Is Happy faster / handles better errors / hase some great features or
>>> anything else?
>>
>> One reason is that  it predates monadic parser libraries.
>
> I'm not entirely sure this is true.  I reckon the development of applicative parser combinators (used in
the implementation of the nhc12 compiler, way back in 1995 or so), is roughly contemporaneous with the
development of Happy, and its use inside ghc.  (I found a release note from Sept 1997 that said ghc had just
converted its interface-file parser to use Happy.)  Certainly table-driven parsers in non-functional
languages go back a lot further, and functional combinator-based parsing was then the relative newcomer.

Interesting. I know the happy source has copyright dates starting in
1991, so I figured it was developed to help write ghc. Perhaps they
were using it for source parsing before they adopted it for
interface-file parsing in 1997?

> As to why ghc switched to Happy, the literature of the time suggests that generated table-driven parsers
were faster than combinator-based parsers.  I'm not sure I have ever seen any performance figures to back
that up however.  And with the general improvement in performance of idiomatic Haskell over the last
twenty years, I'd be interested to see a modern comparison.

Another con of using parsec that I forgot to mention in my previous
email is that with Parsec you need to be explicit about backtracking
(use of try). Reasoning about the correct places to put try is not
always easy and parsec doesn't help you with the task. In my
experience, this is the main bug that people run into when using
parsec.

Jason
Malcolm Wallace | 4 Aug 10:33 2013

Re: Why GHC is written in Happy and not a monadic parser library?


On 3 Aug 2013, at 21:03, Jason Dagit wrote:

> Another con of using parsec that I forgot to mention in my previous
> email is that with Parsec you need to be explicit about backtracking
> (use of try). Reasoning about the correct places to put try is not
> always easy and parsec doesn't help you with the task. In my
> experience, this is the main bug that people run into when using
> parsec.

Although the original question did not mention parsec explicitly, I find it disappointing that many
people immediately think of it as the epitome of monadic combinator parsing.  The power of good marketing,
eh?  There are so many other good parsing libraries out there.  Parsec happened to cure some known
space-leaks in rival libraries about the time of its release (2000 or so), but the main reason it is popular
is simply because it was distributed alongside ghc for a long time.

Curiously enough, the complaint you make about parsec is exactly the same as the observation that drove the
development of my own set of combinators - polyparse.  But the concept of commitment to a partial parse (to
prevent backtracking) was already present somewhat in Röjemo's applicative parsers way back in 1994. 
He had both `ap` and `apCut` (the naming of "cut" borrowed from Prolog I suppose).  Space performance and
the elimination of space leaks was totally his focus: he mentions being able to recompile the compiler
itself in just 3Mb of RAM.

Regards,
    Malcolm
Roman Cheplyaka | 4 Aug 10:53 2013

Re: Why GHC is written in Happy and not a monadic parser library?

* Malcolm Wallace <malcolm.wallace <at> me.com> [2013-08-04 09:33:22+0100]
> 
> On 3 Aug 2013, at 21:03, Jason Dagit wrote:
> 
> > Another con of using parsec that I forgot to mention in my previous
> > email is that with Parsec you need to be explicit about backtracking
> > (use of try). Reasoning about the correct places to put try is not
> > always easy and parsec doesn't help you with the task. In my
> > experience, this is the main bug that people run into when using
> > parsec.
> 
> Although the original question did not mention parsec explicitly, I
> find it disappointing that many people immediately think of it as the
> epitome of monadic combinator parsing.  The power of good marketing,
> eh?  There are so many other good parsing libraries out there.  Parsec
> happened to cure some known space-leaks in rival libraries about the
> time of its release (2000 or so), but the main reason it is popular is
> simply because it was distributed alongside ghc for a long time.

I would also add to this a catchy and well-marketed name, which is often
used as a generic name for a parser combinator library.

Roman

Gmane