[stack] peg: a lazy, non-deterministic concatenative language
2012-04-16 14:47:27 GMT
John Nowak <john <at> johnnowak.com> wrote:
> https://github.com/HackerFoo/peg
> Not mine.
Amazing... This is creative. He's parsing right to left, and executing
only what's needed to evaluate the last word. In cases where
evaluation is ambiguous, he chooses "nondeterministically" (I'm not
sure what that means). If execution fails, he backtracks to a previous
ambiguous point. Words are provided to force failures.
The language uses stack manipulation words, but I'm not sure that it's
actually a stack. Anyone have any discussion on that point?
> - jn
-Wm
Agreed. Peg is the most interesting variant I've seen in quite a while. I haven't looked deeply yet but it seems to have an Icon flavor to it. Of course, any concatentative language doesn't have to use a stack, it's just most convenient. -- don On 17 Apr 2012, at 05:26, William Tanksley, Jr wrote: > John Nowak <john <at> johnnowak.com> wrote: > > https://github.com/HackerFoo/peg > > Not mine. > > Amazing... This is creative. He's parsing right to left, and executing > only what's needed to evaluate the last word. In cases where > evaluation is ambiguous, he chooses "nondeterministically" (I'm not > sure what that means). If execution fails, he backtracks to a previous > ambiguous point. Words are provided to force failures. > > The language uses stack manipulation words, but I'm not sure that it's > actually a stack. Anyone have any discussion on that point? > > > - jn > > -Wm > [Non-text portions of this message have been removed] ------------------------------------ Yahoo! Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/concatenative/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/concatenative/join (Yahoo! ID required) <*> To change settings via email: concatenative-digest <at> yahoogroups.com concatenative-fullfeatured <at> yahoogroups.com <*> To unsubscribe from this group, send an email to: concatenative-unsubscribe <at> yahoogroups.com <*> Your use of Yahoo! Groups is subject to: http://docs.yahoo.com/info/terms/
Don Groves <dgpdx <at> comcast.net> wrote:
> Of course, any concatentative language doesn't have to use a stack,
> it's just most convenient.
"Convenient" is kind of vague. A stack is the only parameter-passing
data structure we know of that's computationally complete/Turing
equivalent and doesn't require application. You can also make a
concatenative language using a single integer accumulator, for
example, and it's simple to implement and understand (but it's
ridiculously weak). Since a stack is the only one we know about it's
of course the most convenient one we know about... But what ones don't
we know about?
Here's another question: is peg _formally_ concatenative? That is,
given two valid programs, is their concatenation also a valid program?
Are the semantics of the resulting program the composition of the
semantics of the original two?
I think the answer is "no". A program "a1" and "a2", both of which
deterministically execute something like "False assert" or an
undefined word (which means that they always halt). The concatenation
"a1 a2" will perform (deterministically) the semantics of a2, but
never a1.
This doesn't make "peg" a bad language... But it does suggest my
"formal" rule for testing concatenativity wants a weaker variant.
Another example of the need for a weakening is words that perform
control flow alteration, like "RETURN" and "call/cc", or Forth's R>
and EXIT.
For "peg" itself, it's easy to see that concatenativity DOES hold for
programs where the parse "continues" past the beginning of the
program. For Forth, concatenativity holds when certain words in the
dictionary are not used. Rice's theorem will tangle us up for both
languages, but in general it's easy to see when a Forth program is
concatenative. Parsing a peg program MIGHT be possible to explode to
O(2^n), but I think it'll always terminate.
> don
-Wm
On Tue, Apr 17, 2012 at 3:48 PM, William Tanksley, Jr <wtanksleyjr <at> gmail.com
> wrote:
> **
>
>
> Don Groves <dgpdx <at> comcast.net> wrote:
> > Of course, any concatentative language doesn't have to use a stack,
> > it's just most convenient.
>
> "Convenient" is kind of vague. A stack is the only parameter-passing
> data structure we know of that's computationally complete/Turing
> equivalent and doesn't require application. You can also make a
> concatenative language using a single integer accumulator, for
> example, and it's simple to implement and understand (but it's
> ridiculously weak). Since a stack is the only one we know about it's
> of course the most convenient one we know about... But what ones don't
> we know about?
>
> Your comment and the \/ branch operator made me wonder about the
possibility of a concatenative language that operated on a 'graph-structured
stack'? (and if such a thing would be useful)
> Here's another question: is peg _formally_ concatenative? That is,
> given two valid programs, is their concatenation also a valid program?
> Are the semantics of the resulting program the composition of the
> semantics of the original two?
>
> I think the answer is "no". A program "a1" and "a2", both of which
> deterministically execute something like "False assert" or an
> undefined word (which means that they always halt). The concatenation
> "a1 a2" will perform (deterministically) the semantics of a2, but
> never a1.
>
> This doesn't make "peg" a bad language... But it does suggest my
> "formal" rule for testing concatenativity wants a weaker variant.
> Another example of the need for a weakening is words that perform
> control flow alteration, like "RETURN" and "call/cc", or Forth's R>
> and EXIT.
>
> For "peg" itself, it's easy to see that concatenativity DOES hold for
> programs where the parse "continues" past the beginning of the
> program. For Forth, concatenativity holds when certain words in the
> dictionary are not used. Rice's theorem will tangle us up for both
> languages, but in general it's easy to see when a Forth program is
> concatenative. Parsing a peg program MIGHT be possible to explode to
> O(2^n), but I think it'll always terminate.
>
> > don
>
> -Wm
>
>
>
--
--
Stephen De Gabrielle
stephen.degabrielle <at> acm.org
Telephone +44 (0)20 85670911
Mobile +44 (0)79 85189045
http://www.degabrielle.name/stephen
----
Professor: Oh God! I clicked without reading!
Cubert: And I slightly modified something I own!
Professor: We're monsters!
[Non-text portions of this message have been removed]
Stephen De Gabrielle <spdegabrielle <at> gmail.com> wrote:
> Your comment and the \/ branch operator made me wonder about the
> possibility of a concatenative language that operated on a 'graph-structured
> stack'? (and if such a thing would be useful)
That's a good question. You'd have to provide more structure than a
simple "graph", since a graph has no orientation. Obviously you've got
a root and at least one current position... But how do branches
happen, and how do mergers happen?
> William Tanksley, Jr <wtanksleyjr <at> gmail.com wrote:
>> Here's another question: is peg _formally_ concatenative? That is,
>> given two valid programs, is their concatenation also a valid program?
>> Are the semantics of the resulting program the composition of the
>> semantics of the original two?
>> I think the answer is "no". A program "a1" and "a2", both of which
>> deterministically execute something like "False assert" or an
>> undefined word (which means that they always halt). The concatenation
>> "a1 a2" will perform (deterministically) the semantics of a2, but
>> never a1.
Since hackerfoo just joined the group, we may be soon receiving an
answer! I look forward to it.
-Wm
--- In concatenative <at> yahoogroups.com, "William Tanksley, Jr" <wtanksleyjr <at> ...> wrote:
>
> Don Groves <dgpdx <at> ...> wrote:
> > Of course, any concatentative language doesn't have to use a stack,
> > it's just most convenient.
>
> "Convenient" is kind of vague. A stack is the only parameter-passing
> data structure we know of that's computationally complete/Turing
> equivalent and doesn't require application. You can also make a
> concatenative language using a single integer accumulator, for
> example, and it's simple to implement and understand (but it's
> ridiculously weak). Since a stack is the only one we know about it's
> of course the most convenient one we know about... But what ones don't
> we know about?
The stack works essentially as a sort of caching structure. Whichever caching strategy you choose determines the data structure.
Last in, first out = queue
Highest/lowest weighted out = priority queue
First in, first out = stack
There can be only one = a single item (accumulator)
I can't think of any other strategies that have a single unambiguous "top" element. FIFO seems to be the only reasonable caching strategy, though.
>
> Here's another question: is peg _formally_ concatenative? That is,
> given two valid programs, is their concatenation also a valid program?
> Are the semantics of the resulting program the composition of the
> semantics of the original two?
>
> I think the answer is "no". A program "a1" and "a2", both of which
> deterministically execute something like "False assert" or an
> undefined word (which means that they always halt). The concatenation
> "a1 a2" will perform (deterministically) the semantics of a2, but
> never a1.
>
Peg is a pure functional language, so it doesn't matter if a1 is executed if a2 yields no results; the execution of a1 has no observable effects. The exception is if a1 performs I/O. Unfortunately, I can't see any way to fix this, because the Peg interpreter could never perform any I/O for fear of an unseen 'False assert' being concatenated to the right.
For Peg, it is formally concatenative with some modifications:
1. The expression to the right must be fully evaluated before evaluation of a sub-expression containing an IO token.
2. Concatenation must be performed on the Cartesian products of the results of evaluation of the sub-expressions
dustin.deweese <dustin.deweese <at> gmail.com> wrote:
> --- In concatenative <at> yahoogroups.com, "William Tanksley, Jr" <wtanksleyjr <at> ...> wrote:
>> Don Groves <dgpdx <at> ...> wrote:
>> > Of course, any concatentative language doesn't have to use a stack,
>> > it's just most convenient.
>> "Convenient" is kind of vague. A stack is the only parameter-passing
>> data structure we know of that's computationally complete/Turing
>> equivalent and doesn't require application. You can also make a
>> concatenative language using a single integer accumulator, for
>> example, and it's simple to implement and understand (but it's
>> ridiculously weak). Since a stack is the only one we know about it's
>> of course the most convenient one we know about... But what ones don't
>> we know about?
> The stack works essentially as a sort of caching structure. Whichever caching strategy you choose determines the data structure.
> Last in, first out = queue
> Highest/lowest weighted out = priority queue
> First in, first out = stack
> There can be only one = a single item (accumulator)
> I can't think of any other strategies that have a single unambiguous "top" element. FIFO seems to be the only reasonable caching strategy, though.
That's a clever approach. Although I liked it at first glance, I think
it's not quite complete. We know concatenative languages that use an
accumulator and ones that use a stack; we also know ones that use
multiples of each (for example, ANS Forth can have a separate floating
point stack, and Machine Forth uses a memory fetch register; note that
the return stack in Forth does NOT count, since accessing that makes
the program NOT concatenative). The problem is that the choice between
stack and accumulator changes the language in a profound way. So if it
were possible to arrange a queue in some way instead of a stack, what
would the resulting language look like? We have no semantics for such
a language (yet).
I'm also thinking about the statement that there should be only one
unambiguous "top" element. This is essentially right, of course;
accepting that when there are two top elements they can be
disambiguated by the word that's needing the data (as with ANS Forth's
floating point vocabulary, whose words may pull from a distinct
stack). There are more details, of course, since some words accept
more than one parameter, so at the time the word is called the
structure must provide ALL of its elements to it unambiguously (so not
just the top item).
A stack meets that requirement by imposing a total ordering of all of
its contents at all times. I think it's fair to say that an
accumulator does as well (consider a language over matrices or quantum
mechanical operators instead of over the integers). If the only
possible data structure must impose a total ordering, then a stack is
the only possibility. BUT... does peg impose a total ordering? Doesn't
the nondeterminism hint that there's something other than a total
ordering?
Actually, because a peg program must be parsed backward, it occurs to
me to take a lesson from Stevan Apter, and make the language's parse
structure explicit. In order to be parsed backward, it would make
sense to describe the start of a parse as pushing the REVERSE of the
initial program onto a stack (which plays the same role as Forth's
return stack and xy's dequeue.
I'd better send this -- otherwise I'll never finish writing it.
Undeniably peg is stack-based, but there's a lot more going on in
terms of data structures and algorithms than just the usual return
stack-data stack pair, and I think the additional stuff is
interesting.
>> Here's another question: is peg _formally_ concatenative? That is,
>> given two valid programs, is their concatenation also a valid program?
>> Are the semantics of the resulting program the composition of the
>> semantics of the original two?
>> I think the answer is "no". A program "a1" and "a2", both of which
>> deterministically execute something like "False assert" or an
>> undefined word (which means that they always halt). The concatenation
>> "a1 a2" will perform (deterministically) the semantics of a2, but
>> never a1.
> Peg is a pure functional language, so it doesn't matter if a1 is executed if a2
> yields no results; the execution of a1 has no observable effects.
Oh, so I actually completely misunderstood. "False assert" doesn't
halt the PROGRAM; rather, it halts the PARSE; it actually sends the
program into a "never terminates" state. That makes perfect sense; and
seems quite clever.
> The exception is if a1 performs I/O. Unfortunately, I can't see any way to fix
> this, because the Peg interpreter could never perform any I/O for fear of an
> unseen 'False assert' being concatenated to the right.
Then let's just note that performing IO renders undefined results in
the presence of nonterminating programs. Good ol' "undefined results".
> For Peg, it is formally concatenative with some modifications:
> 1. The expression to the right must be fully evaluated before evaluation of a sub-expression containing an IO token.
That seems like a nice rule.
> 2. Concatenation must be performed on the Cartesian products of the results of evaluation of the sub-expressions
You MIGHT mean composition rather than concatenation. Is that the case?
If so, you've discovered a new subvariant of a "concatenative
language". The usual rule is "the syntax of concatenation implies the
semantics of composition", but for your language the rule is "the
syntax of concatenation implies the semantics of composition of all
Cartesian products".
I like it.
Now, Enchilada did something similar, but IIRC it didn't do it with
mere concatenation (van Dalen?); I don't know enough about Ripple to
say anything, but it either did this or allowed some explicit
operations to cause it (fortytwo, are you listening?).
Either way, I just learned something new.
-Wm
--- In concatenative <at> yahoogroups.com, "William Tanksley, Jr" <wtanksleyjr <at> ...> wrote:
> A stack meets that requirement by imposing a total ordering of all of
> its contents at all times. I think it's fair to say that an
> accumulator does as well (consider a language over matrices or quantum
> mechanical operators instead of over the integers). If the only
> possible data structure must impose a total ordering, then a stack is
> the only possibility. BUT... does peg impose a total ordering? Doesn't
> the nondeterminism hint that there's something other than a total
> ordering?
The stack for any particular execution path has a total ordering. You could use non-determinism to make the stack appear only partially-ordered, though, with an expression such as:
1 2 3 [] [swap] \/ $
> > 2. Concatenation must be performed on the Cartesian products of the results of evaluation of the sub-expressions
>
> You MIGHT mean composition rather than concatenation. Is that the case?
>
The second rule wasn't worded very well. I tried to make it more precise:
2. The composition of two sub-expressions is equivalent to concatenating each of the Cartesian products of the results of evaluating the sub-expressions individualy.
See this wiki page (https://github.com/HackerFoo/peg/wiki/Properties-of-Peg) for an example, as well as other information on the properties of Peg you might find interesting.
On Fri, Apr 20, 2012 at 1:21 PM, William Tanksley, Jr <wtanksleyjr <at> gmail.com > wrote: > > You MIGHT mean composition rather than concatenation. Is that the case? > > If so, you've discovered a new subvariant of a "concatenative > language". The usual rule is "the syntax of concatenation implies the > semantics of composition", but for your language the rule is "the > syntax of concatenation implies the semantics of composition of all > Cartesian products". > > I like it. > > Now, Enchilada did something similar, but IIRC it didn't do it with > mere concatenation (van Dalen?); I don't know enough about Ripple to > say anything, but it either did this or allowed some explicit > operations to cause it (fortytwo, are you listening?). > Present, just lazyThat's an interesting property, but it's not true of Ripple, and lazy evaluation is the reason it's not true. Suppose we split a Ripple program arbitrarily and evaluate the fragments, e.g. 4 sqrt. 10 add. to 4 sqrt. and 10 add. They're both valid programs, yet the right-hand-side program has no solutions while the complete program has two solutions ([8] and [12], or (8) and (12) in Ripple syntax). Any time the right-hand-side does have a solution, e.g. if the program had been (4 sqrt. 10), which is already fully reduced, the left-hand-side will never be reduced in the complete program, so it doesn't matter what its solutions are. If Ripple used an eager evaluation strategy instead, the "cartesian composition" rule would more or less hold. Best, Josh > > Either way, I just learned something new. > > -Wm > > > [Non-text portions of this message have been removed] ------------------------------------ Yahoo! Groups Links <*> To visit your group on the web, go to: http://groups.yahoo.com/group/concatenative/ <*> Your email settings: Individual Email | Traditional <*> To change settings online go to: http://groups.yahoo.com/group/concatenative/join (Yahoo! ID required) <*> To change settings via email: concatenative-digest <at> yahoogroups.com concatenative-fullfeatured <at> yahoogroups.com <*> To unsubscribe from this group, send an email to: concatenative-unsubscribe <at> yahoogroups.com <*> Your use of Yahoo! Groups is subject to: http://docs.yahoo.com/info/terms/
> Last in, first out = queue
> First in, first out = stack
Didn't anybody pick up this 'typo'?
> I can't think of any other strategies that have a single unambiguous
> "top" element.
Think rather in term of the 'most recent' element.
] We know concatenative languages that use an accumulator and ones that
] use a stack; we also know ones that use multiples of each (for example,
] ANS Forth can have a separate floating point stack, and Machine Forth
] uses a memory fetch register; note that the return stack in Forth does
] NOT count, since accessing that makes the program NOT concatenative).
I disagree: it's the concept of LIFO/most-recent, and not the implementation
details, that count..
Consider the following real-life-task:
papers are 'stacked' in their order of creation;
we need to present them to the emperor in their order of creation;
so we could sequentially 'pop' them to another 'pile';
and than 'pop & present' from that 'pile'.
NB. I distinguish between the stack and the pile;
where the pile is stack2 of pop(stack).
If some of the papers may not be handled by the 'presenter',
they could be piled on a separate/floating-point-pile.
But then we've destroyed/lost the <sequence markers>,
by bifubricating the stack/pile.
I'm guessing that in FORTH, when/if you ADD a floating-point-facility,
you must have a mechanism to keep the 2nd [or more] stack 'syncronised'.
So effectively/conceptually there's only ONE stack. Or?
== Chris Glur.
--- In concatenative <at> yahoogroups.com, eas lab <lab.eas <at> ...> wrote:
>
> > Last in, first out = queue
> > First in, first out = stack
> Didn't anybody pick up this 'typo'?
This is why I prefer a pencil to a pen.
Oh well. Point and laugh.
>
> > I can't think of any other strategies that have a single unambiguous
> > "top" element.
> Think rather in term of the 'most recent' element.
>
> ] We know concatenative languages that use an accumulator and ones that
> ] use a stack; we also know ones that use multiples of each (for example,
> ] ANS Forth can have a separate floating point stack, and Machine Forth
> ] uses a memory fetch register; note that the return stack in Forth does
> ] NOT count, since accessing that makes the program NOT concatenative).
> I disagree: it's the concept of LIFO/most-recent, and not the implementation
> details, that count..
>
> Consider the following real-life-task:
> papers are 'stacked' in their order of creation;
> we need to present them to the emperor in their order of creation;
> so we could sequentially 'pop' them to another 'pile';
> and than 'pop & present' from that 'pile'.
>
> NB. I distinguish between the stack and the pile;
> where the pile is stack2 of pop(stack).
>
> If some of the papers may not be handled by the 'presenter',
> they could be piled on a separate/floating-point-pile.
>
> But then we've destroyed/lost the <sequence markers>,
> by bifubricating the stack/pile.
>
> I'm guessing that in FORTH, when/if you ADD a floating-point-facility,
> you must have a mechanism to keep the 2nd [or more] stack 'syncronised'.
> So effectively/conceptually there's only ONE stack. Or?
>
> == Chris Glur.
>
I considered having separate stacks for each type in Peg. Each argument to a word would be taken from the appropriate stack depending on the type required. Then, values could be wrapped in semantic types to minimize stack manipulation.
For instance a word 'greeting' could require a 'name' and a 'title', both of which are simply wrappers for strings, and the greeting would return "Hello [title] [name]!" regardless of the order of arguments. It would just use the title and name on the top of their respective stacks.
I decided against this, because it would complicate many other things.
--- In concatenative <at> yahoogroups.com, Joshua Shinavier <parcour <at> ...> wrote:
>
> That's an interesting property, but it's not true of Ripple, and lazy
> evaluation is the reason it's not true. Suppose we split a Ripple program
> arbitrarily and evaluate the fragments, e.g.
>
> 4 sqrt. 10 add.
>
> to
>
> 4 sqrt.
>
> and
>
> 10 add.
>
> They're both valid programs, yet the right-hand-side program has no
> solutions while the complete program has two solutions ([8] and [12], or
> (8) and (12) in Ripple syntax). Any time the right-hand-side does have a
> solution, e.g. if the program had been (4 sqrt. 10), which is already fully
> reduced, the left-hand-side will never be reduced in the complete program,
> so it doesn't matter what its solutions are.
>
> If Ripple used an eager evaluation strategy instead, the "cartesian
> composition" rule would more or less hold.
>
Peg also uses lazy evaluation, but the pureness of the language means that:
1 2 +
3
7 4 -
are all equivalent and indistinguishable, whether evaluated or not, because observing them requires evaluation.
Because expressions need to terminate first in order to use the Cartesian composition rule, non-termination does not pose a problem either.
Also, in Peg, an expression will not reduce, rather than yielding no solutions, if an argument to a word is missing i.e. the expression is a partially applied function. So,
10 + --> 10 +
which is different from
"ten" + --> no (no results)
Reddit thread (with comments) here:
http://www.reddit.com/r/haskell/comments/sc858/peg_a_lazy_nondeterministic_concatenative/
RSS Feed