Achim Schneider | 16 May 19:40

Proving my point

"test.l" (line 7, column 1):
unexpected end of input
expecting "(", Lambda abstraction, Let binding, Atom, end of input or
Function application

I obviously don't know anything about Parsec's inner workings. I'm
going to investigate as soon as I've stopped despairing.

-- 
(c) this sig last receiving data processing entity. Inspect headers for
past copyright information. All rights reserved. Unauthorised copying,
hiring, renting, public performance and/or broadcasting of this
signature prohibited.

Andrew Coppin | 16 May 20:52

Re: Proving my point

Achim Schneider wrote:
> "test.l" (line 7, column 1):
> unexpected end of input
> expecting "(", Lambda abstraction, Let binding, Atom, end of input or
> Function application
>
> I obviously don't know anything about Parsec's inner workings. I'm
> going to investigate as soon as I've stopped despairing.
>   

Wait... "unexpected end of input; expecting [...] end of input [...]"

That's just *wrong*...! ;-)

But don't despair - show us your parser and what it's supposed to
parse, and I'm sure somebody [maybe even me] will be able to tell you
what's up.

Achim Schneider | 16 May 21:33

Re: Proving my point

Andrew Coppin <andrewcoppin <at> btinternet.com> wrote:

> Wait... "unexpected end of input; expecting [...] end of input [...]"
> 
> That's just *wrong*...! ;-)
> 
> But don't despair - show us your parser and what it's supposed to
> parse, and I'm sure somebody [maybe even me] will be able to tell you 
> what's up.

This is what I came up with while simplifying the parser:

import Text.Parsec

identifier = do
    whiteSpace
    s <- many1 letter
    whiteSpace
    return s

whiteSpace = do 
    eof <|> ((many $ choice [ char ' ', newline ]) >> return ())

main = do 
    let syn = runParser (do
        char '\\'
        many1 identifier
        char ':'
        whiteSpace
        identifier

Daniel Fischer | 16 May 22:25

Re: Re: Proving my point

Am Freitag, 16. Mai 2008 21:33 schrieb Achim Schneider:
> Andrew Coppin <andrewcoppin <at> btinternet.com> wrote:
> > Wait... "unexpected end of input; expecting [...] end of input [...]"
> >
> > That's just *wrong*...! ;-)
> >
> > But don't despair - show us your parser and what it's supposed to
> > parse, and I'm sure somebody [maybe even me] will be able to tell you
> > what's up.
>
> This is what I came up with while simplifying the parser:
>
> import Text.Parsec
>
> identifier = do
>     whiteSpace
>     s <- many1 letter
>     whiteSpace
>     return s
>
> whiteSpace = do
>     eof <|> ((many $ choice [ char ' ', newline ]) >> return ())
>
> main = do
>     let syn = runParser (do
>         char '\\'
>         many1 identifier
>         char ':'
>         whiteSpace
>         identifier

Achim Schneider | 16 May 22:47

Re: Proving my point

Daniel Fischer <daniel.is.fischer <at> web.de> wrote:

> [very helpful stuff]
>
> > Please, please don't ask me for the rationale of using eof like
> > this, you would get the same answer as if you'd ask me why I cast a
> > stone into the sea.
> 
> And why did you do that?
>
To cast away something I don't understand.

Philippa Cowderoy | 16 May 22:25

Re: Re: Proving my point

On Fri, 16 May 2008, Achim Schneider wrote:

> Andrew Coppin <andrewcoppin <at> btinternet.com> wrote:
> 
> > Wait... "unexpected end of input; expecting [...] end of input [...]"
> > 
> > That's just *wrong*...! ;-)
> > 
> > But don't despair - show us your parser and what it's supposed to
> > parse, and I'm sure somebody [maybe even me] will be able to tell you 
> > what's up.
> 
> This is what I came up with while simplifying the parser:
> 
> import Text.Parsec
> 
> identifier = do
>     whiteSpace
>     s <- many1 letter
>     whiteSpace
>     return s
> 
> whiteSpace = do 
>     eof <|> ((many $ choice [ char ' ', newline ]) >> return ())
> 
> main = do 
>     let syn = runParser (do
>         char '\\'
>         many1 identifier
>         char ':'

Philippa Cowderoy | 16 May 22:40

Re: Re: Proving my point

On Fri, 16 May 2008, Philippa Cowderoy wrote:

> Confusing, isn't it? It's almost the right message, too. I'm pretty sure 
> the misbehaviour's because eof doesn't consume - see what happens if you 
> put an error message on all of whiteSpace?
> 

It is indeed. The error-merging code can't tell eof's "don't consume"
from the "don't consume" that try returns when its argument fails, nor
is there any equivalent distinction in the error values. Which is to say:
it's broken, but at least I know how to fix it in the library.

-- 
flippa <at> flippac.org

"The reason for this is simple yet profound. Equations of the form
x = x are completely useless. All interesting equations are of the
form x = y." -- John C. Baez

Philippa Cowderoy | 16 May 21:39

Re: Proving my point

On Fri, 16 May 2008, Achim Schneider wrote:

> "test.l" (line 7, column 1):
> unexpected end of input
> expecting "(", Lambda abstraction, Let binding, Atom, end of input or
> Function application
> 
> I obviously don't know anything about Parsec's inner workings. I'm
> going to investigate as soon as I've stopped despairing.
> 

One gotcha here, which really wants fixing: if the Show instance for your
token type returns "", Parsec assumes that's EOF for error-reporting
purposes. Guess who ran into that with a separate token for
layout-inserted braces?

-- 
flippa <at> flippac.org

Sometimes you gotta fight fire with fire. Most
of the time you just get burnt worse though.

Achim Schneider | 16 May 22:49

Re: Proving my point

Philippa Cowderoy <flippa <at> flippac.org> wrote:

> On Fri, 16 May 2008, Achim Schneider wrote:
> 
> Guess who ran into that with a separate token for
> layout-inserted braces?
> 
It can't be me, as I attempted to be as lazy as possible, not going for
a tokenising pass, and ended up being too lazy.

Philippa Cowderoy | 16 May 22:57

Re: Re: Proving my point

On Fri, 16 May 2008, Achim Schneider wrote:

> Philippa Cowderoy <flippa <at> flippac.org> wrote:
> 
> > On Fri, 16 May 2008, Achim Schneider wrote:
> > 
> > Guess who ran into that with a separate token for
> > layout-inserted braces?
> > 
> It can't be me, as I attempted to be as lazy as possible, not going for
> a tokenising pass, and ended up being too lazy.
> 

Nah, you just picked the wrong way to attempt discipline. I don't use 
separate tokenising/lexing passes in a lot of my code (though you can't 
really avoid it when you want to do layout), it's a matter of knowing how 
it's done. Unless you've got a lexical structure that prevents it (which 
is to say, there're situations in which two tokens following each other 
aren't allowed to have whitespace between them), it's a good idea to have 
your token productions eat any whitespace following them, and then your 
toplevel becomes:

do {whitespace; r <- realTopLevel; eof; return r}

and then you need never worry about it again.
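
A minimal sketch of that pattern, assuming Parsec 3 and a String input. The names lexeme, topLevel and parseProgram, and the stand-in grammar (a plain list of identifiers), are hypothetical and not taken from the parser under discussion. Note that whiteSpace never mentions eof here; eof is checked exactly once, at the very end.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Skip spaces and newlines. Succeeds on empty input too.
whiteSpace :: Parser ()
whiteSpace = skipMany (oneOf " \n")

-- Run a token parser, then eat any whitespace that follows it.
lexeme :: Parser a -> Parser a
lexeme p = p <* whiteSpace

identifier :: Parser String
identifier = lexeme (many1 letter) <?> "identifier"

-- Stand-in for the real grammar (assumption): one or more identifiers.
realTopLevel :: Parser [String]
realTopLevel = many1 identifier

-- Leading whitespace is skipped once; every token cleans up after itself.
topLevel :: Parser [String]
topLevel = whiteSpace *> realTopLevel <* eof

parseProgram :: String -> Either ParseError [String]
parseProgram = parse topLevel "<input>"
```

Because every token production consumes its own trailing whitespace, no individual parser has to care whether whitespace or end of input comes next.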

-- 
flippa <at> flippac.org

Ivanova is always right.

Achim Schneider | 16 May 23:14

Re: Proving my point

Philippa Cowderoy <flippa <at> flippac.org> wrote:

> On Fri, 16 May 2008, Achim Schneider wrote:
> 
> > Philippa Cowderoy <flippa <at> flippac.org> wrote:
> > 
> > > On Fri, 16 May 2008, Achim Schneider wrote:
> > > 
> > > Guess who ran into that with a separate token for
> > > layout-inserted braces?
> > > 
> > It can't be me, as I attempted to be as lazy as possible, not going
> > for a tokenising pass, and ended up being too lazy.
> > 
> 
> Nah, you just picked the wrong way to attempt discipline. I don't use 
> separate tokenising/lexing passes in a lot of my code (though you
> can't really avoid it when you want to do layout), it's a matter of
> knowing how it's done. Unless you've got a lexical structure that
> prevents it (which is to say, there're situations in which two tokens
> following each other aren't allowed to have whitespace between them),
> it's a good idea to have your token productions eat any whitespace
> following them, and then your toplevel becomes:
> 
> do {whitespace; r <- realTopLevel; eof; return r}
> 
> and then you need never worry about it again.
> 
My problem is that realTopLevel = expr, and that I get into an infinite
recursion, never "closing" enough parens, never hitting eof.

Philippa Cowderoy | 16 May 23:46

Re: Re: Proving my point

On Fri, 16 May 2008, Achim Schneider wrote:

> My problem is that realTopLevel = expr, and that I get into an infinite
> recursion, never "closing" enough parens, never hitting eof.

Have you run into the left-recursion trap, by any chance?

This doesn't work:

expr = do expr; ...

You can cover common cases with combinators like many* and chain* though.
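
To illustrate, here is a small sketch of function application written with chainl1 instead of a left-recursive rule. It assumes Parsec 3; the Expr type and the names term and parseExpr are made up for this example and are not the grammar from the thread.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data Expr = Atom String | App Expr Expr
  deriving (Eq, Show)

-- Run a token parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

atom :: Parser Expr
atom = Atom <$> lexeme (many1 letter)

-- The left-recursive formulation calls itself before consuming any
-- input, so it never terminates:
--   expr = do { f <- expr; x <- term; return (App f x) } <|> term

term :: Parser Expr
term = atom <|> between (lexeme (char '(')) (lexeme (char ')')) expr

-- chainl1 parses one or more terms and folds them to the left, which is
-- the tree shape the left-recursive rule was trying to describe.
-- The "operator" here is juxtaposition, so it consumes nothing.
expr :: Parser Expr
expr = term `chainl1` pure App

parseExpr :: String -> Either ParseError Expr
parseExpr = parse (spaces *> expr <* eof) "<input>"
```

With these definitions, "f x (g y)" should parse as App (App (Atom "f") (Atom "x")) (App (Atom "g") (Atom "y")).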

> Btw: Is there any way to make Parsec return a tree of things it tried?
> The end-user error messages are quite often just not informative enough
> while debugging the parser itself.
> 

If you're willing to accept a little pain, you can write a few helper 
functions akin to <?> that keep a log in Parsec's state and extract it 
from there.
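
One possible shape for such a helper, as a rough sketch assuming Parsec 3 with a list of log lines as the user state. The names logged and runWithLog are hypothetical. An important limitation: Parsec restores the user state when an alternative fails and backtracks, so only entries on the successful path survive, and a failed overall parse returns no state at all. Seeing the failed attempts as well would need something outside the parser state, such as Debug.Trace.

```haskell
import Text.Parsec

-- The user state is a list of log lines, newest first.
type P = Parsec String [String]

-- Hypothetical helper in the spirit of <?>: labels p and records
-- when it is entered and when it succeeds.
logged :: String -> P a -> P a
logged name p = do
  modifyState (("try " ++ name) :)
  r <- p <?> name
  modifyState (("ok  " ++ name) :)
  return r

number :: P String
number = logged "number" (many1 digit <* spaces)

atom :: P String
atom = logged "atom" (many1 letter <* spaces)

item :: P String
item = number <|> atom

-- Parse a list of items and return it together with the log,
-- oldest entry first.
runWithLog :: String -> Either ParseError ([String], [String])
runWithLog = runParser p [] "<input>"
  where
    p = do
      xs <- many1 item
      eof
      entries <- getState
      return (xs, reverse entries)
```

For the input "ab 12", the "try number" entry written before atom was chosen for "ab" does not appear in the log, because that branch failed and its state changes were discarded.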

-- 
flippa <at> flippac.org

Society does not owe people jobs.
Society owes it to itself to find people jobs.

Achim Schneider | 17 May 00:17

Re: Proving my point

Philippa Cowderoy <flippa <at> flippac.org> wrote:

> On Fri, 16 May 2008, Achim Schneider wrote:
> 
> > My problem is that realTopLevel = expr, and that I get into an
> > infinite recursion, never "closing" enough parens, never hitting
> > eof.
> 
> Have you run into the left-recursion trap, by any chance?
> 
> This doesn't work:
> 
> expr = do expr; ...
> 
expr =
    do {e <- parens expr; return $ Nest e}
    <|> lambda
    <|> _let
    <|> try app
    <|> atom

There's at least one token before any recursion, so I guess not. After
all, it terminates. It's my state that does not succeed in directing
the parser not to mess up, so I'm reimplementing the thing as a
two-pass but stateless parser now. Definitely the easier and clearer
thing to do: I can have an end of line token that carries the number of
trailing spaces, so I got perfect indent information without any pain
involved, at all, and don't have to make parsers fail based on state.
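
A rough sketch of what such a first pass might look like, in plain Haskell without Parsec. It reads "trailing spaces" as the spaces that follow the newline, i.e. the indentation of the next line; that reading, the Token type and the tokenize function are assumptions for illustration. Parentheses are not split from adjacent words here, which a real lexer would need to do.

```haskell
import Data.Char (isSpace)

data Token
  = Word String  -- any run of non-whitespace characters
  | Eol Int      -- end of line, carrying the indentation of the next line
  deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize ('\n' : rest) =
  -- Count the spaces after the newline and attach them to the Eol token.
  let (indent, rest') = span (== ' ') rest
  in Eol (length indent) : tokenize rest'
tokenize (c : rest)
  | isSpace c = tokenize rest            -- other whitespace separates words
  | otherwise =
      let (w, rest') = break isSpace (c : rest)
      in Word w : tokenize rest'
```

The second pass can then decide from each Eol token alone whether a line continues the previous one (indentation greater than zero) or starts a new one, without any parser state.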


Philippa Cowderoy | 17 May 00:45

Re: Re: Proving my point

On Sat, 17 May 2008, Achim Schneider wrote:

> There's at least one token before any recursion, so I guess not. After
> all, it terminates. It's my state that does not succeed in directing
> the parser not to mess up, so I'm reimplementing the thing as a
> two-pass but stateless parser now.

In most cases, you're better off stateless unless you've got a really good
reason to use state. Or at least, don't use the state for anything that
affects the parse itself.

> Definitely the easier and clearer
> thing to do: I can have an end of line token that carries the number of
> trailing spaces, so I got perfect indent information without any pain
> involved, at all, and don't have to make parsers fail based on state.
> 

Definitely! Are you doing some form of layout? It's certainly not worth 
doing in one pass IMO, I ended up with a three pass design much like that 
in the Haskell 98 report. Well, that's an understatement - I took the 
algorithm from it! 

-- 
flippa <at> flippac.org

There is no magic bullet. There are, however, plenty of bullets that
magically home in on feet when not used in exactly the right circumstances.

Achim Schneider | 17 May 01:53

Re: Proving my point

Philippa Cowderoy <flippa <at> flippac.org> wrote:

> On Sat, 17 May 2008, Achim Schneider wrote:
> 
> > Definitely the easier and clearer
> > thing to do: I can have an end of line token that carries the
> > number of trailing spaces, so I got perfect indent information
> > without any pain involved, at all, and don't have to make parsers
> > fail based on state.
> > 
> 
> Definitely! Are you doing some form of layout? 

Yes, 

/pair x y m: m x y 
/fst z: z 
    \p q: p
/snd z: z \p q:
    q
/numbers: pair one two
/run: pair (fst numbers) (snd numbers)
run 

is supposed to work (/ indicates a let). I'm trying to purge Scheme out
of my mind by implementing something that looks quite like it, and then
change it. The rule is simple: An indented line continues the previous,
and a non-indented closes every opened paren except one from the
previous line, eof closing all that are left. I still have to think
about recursive lets, but I guess I will go unlambda and just include a
