Matt Mackall | 20 Apr 00:32 2010

[RFC] revision sets

Right now we have a notion of revision ranges, ie:

 hg log -r 0:1.0

Internally, we iterate over this with something like:

 for rev in commandutil.revrange(opts['rev']):

I've been talking about expanding this into a more powerful system that
would allow specifying dates, keywords, branches, etc. My current
thought is to make it look like this:

 hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"

Which would be identical to:

 hg log -b foo -k bar -d "mar 1 - apr 1"

but would magically work anywhere -r ranges were accepted (export, push,
etc.).

Further, we'd be able to add lots of interesting primitives:

 hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
author(george) and sorted(date) and reversed()"

Read that as: every cset that is descended from the second parent of
revision 1.0 and is also an ancestor of 2.0 and was written by george,
sorted by date in reverse order.
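
One way to picture that reading is as plain set algebra: each predicate yields a set of revisions, "and" intersects them, and the ordering steps run last. The following is only a minimal sketch over a toy in-memory history. `Cset`, `HISTORY`, and the helper functions are hypothetical names made up for illustration, not Mercurial code.

```python
from collections import namedtuple

# Hypothetical toy changeset record; not Mercurial's data model.
Cset = namedtuple("Cset", "rev parents author date")

HISTORY = {
    0: Cset(0, (), "matt", 10),
    1: Cset(1, (0,), "george", 20),
    2: Cset(2, (0,), "george", 15),
    3: Cset(3, (1, 2), "benoit", 30),
    4: Cset(4, (3,), "george", 25),
}

def ancestors(rev):
    """Return rev plus every revision reachable through parent links."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(HISTORY[r].parents)
    return seen

def descendants(rev):
    """Return every revision that has rev among its ancestors (rev included)."""
    return {r for r in HISTORY if rev in ancestors(r)}

def author(name):
    """Return the revisions written by the given author."""
    return {r for r, c in HISTORY.items() if c.author == name}

# "descendant(2) and ancestor(4) and author(george)": intersect the three sets,
# then sort the survivors by date and reverse the order.
matches = descendants(2) & ancestors(4) & author("george")
ordered = sorted(matches, key=lambda r: HISTORY[r].date, reverse=True)
print(ordered)
```

Treating sorting and reversing as a final step on the filtered set, rather than as filters themselves, is a choice made in this sketch only.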


Benoit Boissinot | 20 Apr 01:53 2010

Re: [RFC] revision sets

On Mon, Apr 19, 2010 at 05:32:41PM -0500, Matt Mackall wrote:
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
> 
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
> 
> Which would be identical to:
> 
>  hg log -b foo -k bar -d "mar 1 - apr 1"
> 
> but would magically work anywhere -r ranges were accepted (export, push,
> etc.).

for push, it's a bit different. If a range is specified (instead of a
list of revs), then they should be connected (no "holes" in the ranges).
> 
> Further, we'd be able to add lots of interesting primitives:
> 
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"

We'd need some way to specify interesting topological orderings too, I
guess.

Being able to mix queries with other repos would be interesting too
(e.g. only match outgoing csets).

regards,


Matt Mackall | 20 Apr 06:23 2010

Re: [RFC] revision sets

On Tue, 2010-04-20 at 01:53 +0200, Benoit Boissinot wrote:
> On Mon, Apr 19, 2010 at 05:32:41PM -0500, Matt Mackall wrote:
> > I've been talking about expanding this into a more powerful system that
> > would allow specifying dates, keywords, branches, etc. My current
> > thought is to make it look like this:
> > 
> >  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
> > 
> > Which would be identical to:
> > 
> >  hg log -b foo -k bar -d "mar 1 - apr 1"
> > 
> > but would magically work anywhere -r ranges were accepted (export, push,
> > etc.).
> 
> for push, it's a bit different. If a range is specified (instead of a
> list of revs), then they should be connected (no "holes" in the ranges).

I don't think so. push(set) == push(heads(set)). 
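
The reasoning behind that identity can be sketched as follows: pushing a changeset also sends its ancestors, so only the heads of the selected set decide what goes over the wire, and "holes" in the set make no difference. This is a minimal illustration over a toy parent map. `PARENTS`, `heads`, and `ancestors` are hypothetical helpers, not Mercurial's API.

```python
# Hypothetical parent map: revision -> tuple of parent revisions.
PARENTS = {0: (), 1: (0,), 2: (1,), 3: (1,), 4: (3,), 5: (2,)}

def heads(revs):
    """Return the members of revs that are not a parent of another member."""
    revs = set(revs)
    parents_in_set = {p for r in revs for p in PARENTS[r] if p in revs}
    return revs - parents_in_set

def ancestors(revs):
    """Return revs plus everything reachable through parent links."""
    seen, todo = set(), list(revs)
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(PARENTS[r])
    return seen

selected = {1, 3, 5}  # a set with "holes": 2 and 4 are not selected
print(sorted(heads(selected)))
# Both sides describe the same changesets once ancestors are included.
print(ancestors(selected) == ancestors(heads(selected)))
```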

> > Further, we'd be able to add lots of interesting primitives:
> > 
> >  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> > author(george) and sorted(date) and reversed()"
> 
> We'd need some way to specify interesting topological orderings too, I
> guess.

Perhaps. The default ordering (by revs) is always a topological
ordering. There's not much that makes other orderings interesting except

Bill Barry | 20 Apr 05:22 2010

Re: [RFC] revision sets

Matt Mackall wrote:
> Right now we have a notion of revision ranges, ie:
>
>  hg log -r 0:1.0
>
> Internally, we iterate over this with something like:
>
>  for rev in commandutil.revrange(opts['rev']):
>
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
>
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
>
> Which would be identical to:
>
>  hg log -b foo -k bar -d "mar 1 - apr 1"
>
> but would magically work anywhere -r ranges were accepted (export, push,
> etc.).
>
> Further, we'd be able to add lots of interesting primitives:
>
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
>
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.

Matt Mackall | 20 Apr 06:51 2010

Re: [RFC] revision sets

On Mon, 2010-04-19 at 21:22 -0600, Bill Barry wrote:
> Matt Mackall wrote:
> > Right now we have a notion of revision ranges, ie:
> >
> >  hg log -r 0:1.0
> >
> > Internally, we iterate over this with something like:
> >
> >  for rev in commandutil.revrange(opts['rev']):
> >
> > I've been talking about expanding this into a more powerful system that
> > would allow specifying dates, keywords, branches, etc. My current
> > thought is to make it look like this:
> >
> >  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
> >
> > Which would be identical to:
> >
> >  hg log -b foo -k bar -d "mar 1 - apr 1"
> >
> > but would magically work anywhere -r ranges were accepted (export, push,
> > etc.).
> >
> > Further, we'd be able to add lots of interesting primitives:
> >
> >  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> > author(george) and sorted(date) and reversed()"
> >
> > Read that as: every cset that is descended from the second parent of
> > revision 1.0 and is also an ancestor of 2.0 and was written by george,

Bill Barry | 20 Apr 16:39 2010

Re: [RFC] revision sets

Matt Mackall wrote:
> You must be new here. There's nothing we care about being
> backward-compatible with. Mercurial is not a library and explicitly has
> no stable API. We only care about Mercurial's internal API being
> friendly for Mercurial's future development. If you decide to link to
> Mercurial internals, then you have to expect things to break on a
> regular basis because we're not going to stop improving the core any
> time soon.
>
> There are three kinds of backward compatibility in the world:
>
> - the kind offered by Mercurial's command line API where you should
> expect things to work forever or get lots of advanced notice that
> they'll be breaking

This was exactly what I was talking about. Keeping this without any
required extra effort is a good thing (for one example it has less
opportunity to introduce bugs). I don't care about the backward
compatibility inside the codebase, just at the CLI level.

> > step 4: eval this statement
>
> eval is banned from use in hg. And regex parsing is insufficient to
> properly handle the quoting and nesting requirements here.

It doesn't have to be eval, that was just the easy way out. you could
almost as easily push the method calls onto a stack and use getattr
instead.
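
The getattr idea can be shown in isolation with a small sketch. It assumes the query has already been turned into nested tuples, and it uses recursion in place of an explicit stack. `RevFinder`, the `fn_` prefix, and the sample data are all hypothetical, chosen only for illustration. Names from the query are looked up as methods, so no string is ever passed to eval.

```python
class RevFinder:
    """Hypothetical evaluator: dispatches names to methods, no eval()."""

    def __init__(self, revs):
        self.revs = revs  # dict: rev -> author, a stand-in for a repo

    # Method names carry a prefix so a query cannot reach other attributes.
    def fn_author(self, name):
        return {r for r, a in self.revs.items() if a == name}

    def fn_all(self):
        return set(self.revs)

    def fn_and(self, left, right):
        return left & right

    def fn_or(self, left, right):
        return left | right

    def evaluate(self, node):
        """Evaluate a (name, arg, ...) tuple; plain strings are literals."""
        if isinstance(node, str):
            return node
        name, args = node[0], node[1:]
        func = getattr(self, "fn_" + name, None)
        if func is None:
            raise ValueError("unknown function: %s" % name)
        # Arguments are evaluated first, then the looked-up method is called.
        return func(*[self.evaluate(a) for a in args])

finder = RevFinder({0: "matt", 1: "george", 2: "george", 3: "bill"})
tree = ("or", ("author", "bill"), ("and", ("all",), ("author", "george")))
print(sorted(finder.evaluate(tree)))
```

Because getattr on an instance returns a bound method, `self` never has to be passed explicitly when the method is called.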

Without regexes (though I think they could have been used just as easily as the simple tokenizer below) or eval, and still not a formal grammar of any kind (again, the usual warnings about untested code and the fact that Python isn't a language I feel strong with):

class revfinder(object):
    # characters that end a token outside of a quoted string
    unquoteddelims = "()\t\n \"'"

    def __init__(self, repo, query):
        """goal:
        turn:
            descendant(parent2(1.0)) and ancestor(2.0)
            and author(george) and sorted('date') and reversed()
        into:
            [reversed, [sorted, [andf, [andf, [descendant, parent2, "1.0"],
            ancestor, "2.0"], author, "george"], "date"]]
        """
        self.repo = repo
        charstack = []
        delimiters = self.unquoteddelims
        tokens = []
        # the trailing space flushes a final token that has no closing delimiter
        for char in query + ' ':
            if char not in delimiters:
                charstack.append(char)
                continue
            if char in "\"'" and charstack and charstack[-1] == "\\":
                # backslash-escaped quote: keep the quote as part of the token
                charstack[-1] = char
                continue
            # inside quotes the only delimiter is the quote character itself
            quoted = delimiters != self.unquoteddelims
            token = chartoken = ''.join(charstack)
            charstack = []
            if chartoken:
                if not quoted:
                    if token == 'and' or token == 'or':
                        token = token + 'f'
                    # names matching a method become callables, the rest stay strings
                    token = getattr(self, token, token)
                if tokens and isinstance(tokens[-1], str) and isinstance(token, str):
                    # adjacent literals form one argument, e.g. "mar 1 - apr 1"
                    tokens[-1] = ' '.join([tokens[-1], token])
                # could use tags on the functions to determine which case here
                # doing so would allow plugins to extend (examples: xor, except, ...)
                elif not quoted and chartoken in ('and', 'or'):
                    # infix operation: everything so far becomes the left operand
                    tokens = [token, tokens]
                elif not quoted and chartoken in ('sorted', 'reversed'):
                    # whole-list nonfiltering operations: these replace the
                    # 'and' that introduced them and wrap its left operand
                    tokens = [token, tokens[1]]
                else:
                    # other
                    tokens.append(token)
            if char in "\"'":
                # a quote character closes the current string or opens a new one
                if delimiters == char:
                    delimiters = self.unquoteddelims
                else:
                    delimiters = char
        self.statement = tokens

    def evalstmt(self):
        tokens = self.statement
        return self.run(tokens[0], tokens[1:])

    def run(self, func, args):
        # evaluate nested sub-statements before using them as arguments
        args = [self.run(a[0], a[1:]) if isinstance(a, list) else a
                for a in args]
        while len(args) >= 2 and callable(args[-2]):
            # heres hoping lists and strings are not callable
            # apply the last function to the argument that follows it
            args[-2:] = [args[-2](args[-1])]
        return func(*args)
        # getattr on the instance returns bound methods, so self is
        # passed implicitly


    def andf(self, set1, set2):
        pass #not interested in writing these right now
    def orf(self, set1, set2):
        pass
    def reversed(self, changes):
        pass
    def sorted(self, changes, key):
        pass
    ...
Mathieu Clabaut | 20 Apr 22:25 2010

Re: [RFC] revision sets

Just for the record, Pratt parsers may be of interest here. There is an example Python implementation here: http://effbot.org/zone/simple-top-down-parsing.htm


-Mathieu
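
For readers unfamiliar with the technique, here is a minimal sketch of a Pratt-style (top-down operator precedence) parser for a small subset of the proposed syntax: bare words, function calls, parentheses, and the infix operators "and" and "or". It is not the code from the linked article, and every name in it (`tokenize`, `Parser`, `BINDING`, the tuple shapes of the tree) is an assumption made for illustration. Quoted strings and multi-word arguments such as date ranges are deliberately left out.

```python
import re

TOKEN = re.compile(r"\s*(?:([A-Za-z0-9_.]+)|(.))")
BINDING = {"or": 10, "and": 20}  # binding powers: higher binds tighter

def tokenize(text):
    """Yield ('sym', word), ('op', operator-or-punctuation), then ('end', None)."""
    for word, punct in TOKEN.findall(text):
        if word:
            yield ("op", word) if word in BINDING else ("sym", word)
        elif punct.strip():
            yield ("op", punct)
    yield ("end", None)

class Parser:
    def __init__(self, text):
        self.tokens = list(tokenize(text))
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, value):
        kind, got = self.advance()
        if got != value:
            raise SyntaxError("expected %r, got %r" % (value, got))

    def expression(self, rbp=0):
        """Parse an expression whose operators bind tighter than rbp."""
        left = self.prefix()
        while True:
            kind, value = self.peek()
            if kind != "op" or BINDING.get(value, 0) <= rbp:
                return left
            self.advance()
            # Recursing with the operator's own power makes it left-associative.
            left = (value, left, self.expression(BINDING[value]))

    def prefix(self):
        """Parse a parenthesised group, a function call, or a bare symbol."""
        kind, value = self.advance()
        if (kind, value) == ("op", "("):
            inner = self.expression()
            self.expect(")")
            return inner
        if kind != "sym":
            raise SyntaxError("unexpected token %r" % (value,))
        if self.peek() != ("op", "("):
            return ("symbol", value)
        self.advance()
        args = []
        if self.peek() != ("op", ")"):
            args.append(self.expression())
            while self.peek() == ("op", ","):
                self.advance()
                args.append(self.expression())
        self.expect(")")
        return ("func", value, args)

def parse(text):
    parser = Parser(text)
    tree = parser.expression()
    if parser.peek()[0] != "end":
        raise SyntaxError("trailing input")
    return tree

print(parse("branch(foo) and keyword(bar) or author(george)"))
```

The binding-power table is the only place that knows about precedence, which is why adding an operator in a parser of this shape usually means adding one table entry.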


_______________________________________________
Mercurial-devel mailing list
Mercurial-devel <at> selenic.com
http://selenic.com/mailman/listinfo/mercurial-devel


<div>
<p>Just for the record, Pratt parsers may be of interest here. There is an example python implementation there :&nbsp;<a href="http://effbot.org/zone/simple-top-down-parsing.htm">http://effbot.org/zone/simple-top-down-parsing.htm</a></p>
<div>

<br>
</div>
<div>-Mathieu<br><br><div class="gmail_quote">On Tue, Apr 20, 2010 at 16:39, Bill Barry <span dir="ltr">&lt;<a href="mailto:after.fallout <at> gmail.com">after.fallout <at> gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote">

<div bgcolor="#ffffff" text="#000000">
<div class="im">
Matt Mackall wrote:
<blockquote type="cite">
  You must be new here. There's nothing we care about being
backward-compatible with. Mercurial is not a library and explicitly has
no stable API. We only care about Mercurial's internal API being
friendly for Mercurial's future development. If you decide to link to
Mercurial internals, then you have to expect things to break on a
regular basis because we're not going to stop improving the core any
time soon.

There are three kinds of backward compatibility in the world:

- the kind offered by Mercurial's command line API where you should
expect things to work forever or get lots of advanced notice that
they'll be breaking

</blockquote>
</div>
This was exactly what I was talking about. Keeping this without any
required extra effort is a good thing (for one example it has less
opportunity to introduce bugs). I don't care about the backward
compatibility inside the codebase, just at the CLI level.<div class="im">
<br><blockquote type="cite">
  <blockquote type="cite">
     step 4: eval this statement

  </blockquote>
  eval is banned from use in hg. And regex parsing is insufficient to
properly handle the quoting and nesting requirements here.

</blockquote>
</div>
It doesn't have to be eval, that was just the easy way out. you could
almost as easily push the method calls onto a stack and use getattr
instead.<br><br>
without regexes (though I think they could have been used just as
easily as the simple tokenizer below) or eval, still not a formal
grammer of any kind (again the usual warnings about untested code and
the fact that python isn't a language I feel strong with):<br><br>
class revfinder(object):<br>
&nbsp;&nbsp;&nbsp; unquoteddelims = "()\t\n \"'"<div class="im">
<br>
&nbsp;&nbsp;&nbsp; def __init__(self, repo, query):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; self.repo=repo<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>
</div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; """goal:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; turn:<div class="im">
<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; descendant(parent2(1.0)) and ancestor(2.0)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; and author(george) and sorted('date') and reversed()<br>
</div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; into:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; [reversed, [sorted, [andf, [andf, [descendant, parent2,
"1.0"],<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ancestor, "2.0"], author, "george"], "date"]]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; """<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; charstack = []<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delimiters = unquoteddelims<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens = []<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for char in query:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if delimiters.find(char) != -1:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; charstack.append(char)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; token = chartoken = ''.join(charstack)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if chartoken:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; charstack = []<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if token=='and' or token=='or':<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; token = token+'f'<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if delimiters == unquoteddelims:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; try:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; token = getattr(self, token)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; except AttributeError:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pass<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if isinstance(string, tokens[-1]):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens[-1] = ' '.join([tokens[-1], chartoken])<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # could use tags on the functions to determine which case here<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # doing so would allow plugins to extend (examples: xor, except, ...)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if chartoken=='and' or chartoken=='or':<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # infix operation<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens = [token, tokens]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; elif chartoken=='sorted' or chartoken=='reversed':<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # whole-list nonfiltering operations<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens = [token, tokens[1]]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # other<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens.append(token)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if char=='"':<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if token[-1]=="\\":<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; charstack = [chartoken, char]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; elif delimiters == char:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delimiters = unquoteddelims<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delimiters = char<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; elif char=="'":<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if delimiters == char:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delimiters = unquoteddelims<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delimiters=char<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; self.statement = tokens<br><br>
&nbsp;&nbsp;&nbsp; def evalstmt(self):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tokens = self.statement<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return self.run(tokens[0], tokens[1:])<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>
&nbsp;&nbsp;&nbsp; def run(self, func, args):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if isinstance(args[-1], list):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; args[-1] = self.run(args[-1][0], args[-1][1:])<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if len(args) &gt; 2:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if isinstance(args[1], list):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; args[1] = self.run(args[1][0], args[1][1:])<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while len(args) &gt;= 2 and callable(args[-2]):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # here's hoping lists and strings are not callable<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; args[-2:] = [args[-2](args[-1])]<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return func(*args)<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # does self need to be a parameter here somewhere<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; # or is it implicitly included because of the getattr done?<div class="im">
<br><br><br>
&nbsp;&nbsp;&nbsp; def andf(self, set1, set2):<br>
</div>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pass #not interested in writing these right now<div class="im">
<br>
&nbsp;&nbsp;&nbsp; def orf(self, set1, set2):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pass<br>
&nbsp;&nbsp;&nbsp; def reversed(self, changes):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pass<br>
&nbsp;&nbsp;&nbsp; def sorted(self, changes, key):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pass<br>
&nbsp;&nbsp;&nbsp; ...<br>
</div>
</div>

<br>_______________________________________________<br>
Mercurial-devel mailing list<br><a href="mailto:Mercurial-devel <at> selenic.com">Mercurial-devel <at> selenic.com</a><br><a href="http://selenic.com/mailman/listinfo/mercurial-devel" target="_blank">http://selenic.com/mailman/listinfo/mercurial-devel</a><br><br>
</blockquote>
</div>
<br>
</div>
</div>
Dirkjan Ochtman | 20 Apr 08:45 2010

Re: [RFC] revision sets

On Tue, Apr 20, 2010 at 00:32, Matt Mackall <mpm <at> selenic.com> wrote:
> I've pitched an idea like this before, usually with a weird
> operator-intensive syntax. This time, I think the right thing is an
> easily-read but more verbose query language.

Your current proposal feels very verbose to me.

Couldn't we just translate the options we have to something queryish?

 hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"

could be

 hg log -r "b foo, key bar, d mar 1 - apr 1"

and

 hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
author(george) and sorted(date) and reversed()"

could be

 hg log -r "desc p2 1.0, anc 2.0, auth george, sort date, reverse"

Steps to get from yours to mine:

 1. Lose "and" as a separator, use ',' or '&'
 2. Allow non-ambiguous abbreviations like we do everywhere else
 3. Optional parens if unambiguous
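
A minimal sketch of how step 2 could work, assuming the predicate names
live in one list. The list below and the function name `resolve` are
made up for illustration; nothing like this exists in Mercurial at the
time of this thread.

```python
# Hypothetical predicate names, taken from the examples in this thread.
PREDICATES = ["ancestor", "author", "branch", "date",
              "descendant", "keyword", "parent1", "parent2"]


def resolve(abbrev, names=PREDICATES):
    """Expand an unambiguous prefix to the full predicate name."""
    if abbrev in names:
        return abbrev  # an exact match always wins
    matches = [name for name in names if name.startswith(abbrev)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError("unknown predicate: %s" % abbrev)
    raise LookupError("ambiguous predicate %s: %s"
                      % (abbrev, ", ".join(matches)))


print(resolve("key"))   # keyword
print(resolve("desc"))  # descendant
```

With this particular list, "d" is ambiguous between date and descendant,
and "p2" is not a prefix of parent2, so both would need explicit short
names rather than plain prefix matching.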

Also, this proposal might perhaps benefit from a small list of use cases.

Cheers,

Dirkjan
Peter Arrenbrecht | 20 Apr 08:49 2010

Re: [RFC] revision sets

On Tue, Apr 20, 2010 at 12:32 AM, Matt Mackall <mpm <at> selenic.com> wrote:
> Right now we have a notion of revision ranges, ie:
>
>  hg log -r 0:1.0
>
> Internally, we iterate over this with something like:
>
>  for rev in commandutil.revrange(opts['rev']):
>
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
>
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
>
> Which would be identical to:
>
>  hg log -b foo -k bar -d "mar 1 - apr 1"
>
> but would magically work anywhere -r ranges were accepted (export, push,
> etc.).
>
> Further, we'd be able to add lots of interesting primitives:
>
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
>
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.
>
> revrange would be replaced by a new revset function that would parse the
> query/queries and build an iterator. Some of the operations, like
> keyword() and author(), would obviously be fairly expensive and many
> would fail to work (at least for now) on remote repos.
>
> I've pitched an idea like this before, usually with a weird
> operator-intensive syntax. This time, I think the right thing is an
> easily-read but more verbose query language.
>
> Steps to get from here to there:
>
> - change all callers of revrange to revset
> - design a BNF for the revset query language
> - build a query parser/"compiler"
> - add filters for the query functions
> - simplify some of the existing options (like -d and -k) by turning them
> into queries internally
>
> Thoughts?

Like the verbosity for this as a default. Parametrized templates might
come in handy for people needing the same kind of filter time and
again. So:

  [filters]
  partof(r) = r or ancestor(r)
  incoming(new,old) = partof(new) and not partof(old)

and then

  hg log -r "incoming(1.1, 1.0)"
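
A rough sketch of how such templates could be expanded by plain text
substitution before the query is parsed. The `ALIASES` table mirrors the
[filters] section above; the `expand` function and its regular
expression are assumptions for illustration, and they only handle calls
whose arguments contain no nested parentheses.

```python
import re

# name -> (parameter names, body), mirroring the [filters] example.
ALIASES = {
    "partof": (["r"], "r or ancestor(r)"),
    "incoming": (["new", "old"], "partof(new) and not partof(old)"),
}

# A call whose argument list contains no nested parentheses.
CALL = re.compile(r"(\w+)\(([^()]*)\)")


def expand(query, aliases=ALIASES, max_passes=10):
    """Replace alias calls by their bodies until nothing changes."""
    def substitute(match):
        name, rawargs = match.group(1), match.group(2)
        if name not in aliases:
            return match.group(0)  # a real predicate, leave it alone
        params, body = aliases[name]
        args = [arg.strip() for arg in rawargs.split(",")]
        if len(args) != len(params):
            raise ValueError("%s expects %d argument(s)"
                             % (name, len(params)))
        mapping = dict(zip(params, args))
        replaced = re.sub(r"\b\w+\b",
                          lambda m: mapping.get(m.group(0), m.group(0)),
                          body)
        return "(" + replaced + ")"  # parenthesize to keep precedence

    for _ in range(max_passes):
        expanded = CALL.sub(substitute, query)
        if expanded == query:
            return expanded
        query = expanded
    raise ValueError("alias expansion did not terminate")


print(expand("incoming(1.1, 1.0)"))
```

The pass limit is there so that a self-referencing alias fails with an
error instead of looping forever.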

For things like glog it might be nice to have an API that feeds the
base walk into the filter and gets true/false responses. This is so
glog can still try to show something reasonable when forks or joins
are rejected by the filter.

-parren
Greg Ward | 20 Apr 23:08 2010

Re: [RFC] revision sets

On Mon, Apr 19, 2010 at 6:32 PM, Matt Mackall <mpm <at> selenic.com> wrote:
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
>
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"

Cool.  Feels right to me.  I disagree with Bill about the simplicity
of parsing; this absolutely calls out for EBNF and a real parser.
Hopefully it will be a small and simple grammar with a small and
simple parser, but if you have booleans and nesting, you just gotta
have a formal grammar.
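
To make "small and simple" concrete, here is a sketch of a
recursive-descent parser for booleans with nesting. The grammar it
implements (or < and < not < atom) and the tuple-based tree are
assumptions for illustration, not a proposal for the final syntax.

```python
import re

# Parentheses and commas are their own tokens; anything else is a word.
TOKEN = re.compile(r"\s*(\(|\)|,|[^\s(),]+)")


def parse(text):
    tree, rest = parse_or(TOKEN.findall(text))
    if rest:
        raise ValueError("unexpected token: %s" % rest[0])
    return tree


def parse_or(tokens):
    left, tokens = parse_and(tokens)
    while tokens and tokens[0] == "or":
        right, tokens = parse_and(tokens[1:])
        left = ("or", left, right)
    return left, tokens


def parse_and(tokens):
    left, tokens = parse_not(tokens)
    while tokens and tokens[0] == "and":
        right, tokens = parse_not(tokens[1:])
        left = ("and", left, right)
    return left, tokens


def parse_not(tokens):
    if tokens and tokens[0] == "not":
        operand, tokens = parse_not(tokens[1:])
        return ("not", operand), tokens
    return parse_atom(tokens)


def parse_atom(tokens):
    if not tokens:
        raise ValueError("unexpected end of query")
    head, tokens = tokens[0], tokens[1:]
    if head == "(":  # parenthesized sub-expression
        tree, tokens = parse_or(tokens)
        if not tokens or tokens[0] != ")":
            raise ValueError("expected ')'")
        return tree, tokens[1:]
    if head in (")", ",", "and", "or"):
        raise ValueError("unexpected token: %s" % head)
    if tokens and tokens[0] == "(":  # function-style predicate
        args, tokens = [], tokens[1:]
        while tokens and tokens[0] != ")":
            arg, tokens = parse_or(tokens)
            args.append(arg)
            if tokens and tokens[0] == ",":
                tokens = tokens[1:]
        if not tokens:
            raise ValueError("expected ')'")
        return ("func", head, args), tokens[1:]
    return ("symbol", head), tokens


print(parse("branch(foo) and not keyword(bar)"))
```

A date range like "mar 1 - apr 1" would have to be quoted (and the
tokenizer taught about quotes) before a grammar like this could accept
it as a single argument.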

> Further, we'd be able to add lots of interesting primitives:
>
>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
>
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.

I was right with you up until sorted() and reversed().  The others are
predicates, which makes sense... and then you introduce something that
*looks* like a predicate but actually has a completely different
effect.

Thinking out loud: in SQL, there is a clear distinction between
selection criteria and ordering directives:

  select * from changelog
  where cond1 AND cond2 AND ... AND condN             # selection
  order by date descending                      # ordering

"order by" is a different part of the grammar than "where", and that
is one good thing about SQL.  Based on that idea, here's a different
way to formulate your example:

  descendant(parent2(1.0)) and
  ancestor(2.0) and
  author(george)
  sort(date, reversed)

The idea is that you could throw in an optional "sort(KEY[, ORDER])"
at the end of a query.  There is deliberately no "and" there, because
it's not part of the boolean logic that specifies which changesets you
want to see; it's separate, specifying how to present those
changesets.
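
A small sketch of that split, assuming the ordering directive is always
the last thing in the query. The function name `split_query` and the
default order "ascending" are made up for illustration.

```python
import re

# An optional trailing "sort(KEY[, ORDER])" directive.
SORT_CLAUSE = re.compile(
    r"\s*\bsort\(\s*(\w+)\s*(?:,\s*(\w+)\s*)?\)\s*$")


def split_query(query):
    """Return (selection, ordering); ordering is None when absent."""
    match = SORT_CLAUSE.search(query)
    if match is None:
        return query.strip(), None
    key = match.group(1)
    order = match.group(2) or "ascending"
    return query[:match.start()].strip(), (key, order)


selection, ordering = split_query(
    "descendant(parent2(1.0)) and ancestor(2.0) and author(george)\n"
    "sort(date, reversed)")
print(selection)
print(ordering)  # ('date', 'reversed')
```

Only the selection part would then go through the boolean parser; the
ordering is applied to whatever set comes out of it.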

Responding to Dirkjan's comments:
> 1. Lose "and" as a separator, use ',' or '&'

Sure, but you also have to support "|" for "or".  IMHO this is
screaming out for full boolean logic with nesting.

> 2. Allow non-ambiguous abbreviations like we do everywhere else

Should be doable if everything is a keyword in this mini-language.
But if predicates can be added dynamically (by an extension, say),
then it might get tricky.

> 3. Optional parens if unambiguous

Yuck.  Keep it simple and consistent -- keep the parens.

> Also, this proposal might perhaps benefit from a small list of use cases.

How about, "anything I can do with git-rev-parse I should be able to
do with hg". ;-)

(Yeah, I know, that's a requirement not a use case.  Sue me.)

Greg
Nicolas Dumazet | 21 Apr 02:45 2010

Re: [RFC] revision sets

Hello!

I'm very interested in those ideas :)

2010/4/21 Greg Ward <greg-hg <at> gerg.ca>:
>> Further, we'd be able to add lots of interesting primitives:
>>
>>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
>> author(george) and sorted(date) and reversed()"
>>
>> Read that as: every cset that is descended from the second parent of
>> revision 1.0 and is also an ancestor of 2.0 and was written by george,
>> sorted by date in reverse order.
>
> I was right with you up until sorted() and reversed().  The others are
> predicates, which makes sense... and then you introduce something that
> *looks* like a predicate but actually has a completely different
> effect.

Same feeling here.
We need a clear way to separate the revision predicates that do the
selection from the sorting/ordering constructs.

It could be a simple colon, or semicolon?
* "descendant(parent2(1.0)) and ancestor(2.0) and author(george)
[separator] date, reversed"
* "descendant(parent2(1.0)) and ancestor(2.0) and author(george)
[separator] sorted(date, reversed)"

By the way, this sort of syntax could allow chaining constructs:
" selection : sort : selection : sort "

"descendant(parent2(1.0)) and ancestor(2.0) and author(george) :
sort(date, reversed) : even() and not(merge()) : sort(author)"

I know, use cases are getting rare =)

Cheers,
-- 
Nicolas Dumazet — NicDumZ
Dirkjan Ochtman | 21 Apr 08:21 2010

Re: [RFC] revision sets

On Tue, Apr 20, 2010 at 23:08, Greg Ward <greg-hg <at> gerg.ca> wrote:
> Responding to Dirkjan's comments:
>> 1. Lose "and" as a separator, use ',' or '&'
>
> Sure, but you also have to support "|" for "or".  IMHO this is
> screaming out for full boolean logic with nesting.

Sure, I wasn't debating that. I just think & and | would make nice
additions/replacements for "and" and "or". In code I've grown to like
and/or, but in command-lines, which have much less re-use (and
complexity), I think the density is nicer.

>> 2. Allow non-ambiguous abbreviations like we do everywhere else
>
> Should be doable if everything is a keyword in this mini-language.
> But if predicates can be added dynamically (by an extension, say),
> then it might get tricky.

Even if they are, there'll probably be a central dict (like for
commands) where they have to jack in. Let's not assume that extensions
will actually extend the grammar...
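
A sketch of what such a central dict might look like, with extensions
registering predicates through a decorator. The names `PREDICATES` and
`predicate`, and the dict-shaped changesets, are hypothetical and only
illustrate the idea.

```python
# Central table: predicate name -> function(changeset, argument).
PREDICATES = {}


def predicate(name):
    """Decorator that registers a predicate under the given name."""
    def register(func):
        if name in PREDICATES:
            raise ValueError("predicate %r already registered" % name)
        PREDICATES[name] = func
        return func
    return register


@predicate("author")
def author(changeset, pattern):
    return pattern in changeset["user"]


@predicate("branch")
def branch(changeset, name):
    return changeset["branch"] == name


cset = {"user": "george", "branch": "default"}
print(PREDICATES["author"](cset, "geo"))  # True
```

Because every name goes through one table, abbreviation lookup only has
to consult the keys of that table, and the grammar itself never changes.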

>> 3. Optional parens if unambiguous
>
> Yuck.  Keep it simple and consistent -- keep the parens.

Okay, I'll concede that point.

>> Also, this proposal might perhaps benefit from a small list of use cases.
>
> How about, "anything I can do with git-rev-parse I should be able to
> do with hg". ;-)
>
> (Yeah, I know, that's a requirement not a use case.  Sue me.)

Well, I don't know what git-rev-parse can do. But I think it'd be nice
to have five examples of common things that we want to make much
easier than we have now.

Cheers,

Dirkjan
Henrik Stuart | 23 Apr 22:16 2010

Re: [RFC] revision sets

On 20-04-2010 00:32, Matt Mackall wrote:
> Right now we have a notion of revision ranges, ie:
> 
>  hg log -r 0:1.0
> 
> Internally, we iterate over this with something like:
> 
>  for rev in commandutil.revrange(opts['rev']):
> 
> I've been talking about expanding this into a more powerful system that
> would allow specifying dates, keywords, branches, etc. My current
> thought is to make it look like this:
> 
>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"

A query language seems like a very good idea. It would probably be
prudent to get a list going on what primitives we would like to support
and what they should do. It would, for instance, be lovely if we could
find separate primitives for the entire branch and the branch tip, and
possibly also branch heads.

[snip]

>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> author(george) and sorted(date) and reversed()"
> 
> Read that as: every cset that is descended from the second parent of
> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> sorted by date in reverse order.
> 
> revrange would be replaced by a new revset function that would parse the
> query/queries and build an iterator. Some of the operations, like
> keyword() and author(), would obviously be fairly expensive and many
> would fail to work (at least for now) on remote repos.

We should probably carefully consider whether to support this remotely,
depending on what primitives are chosen so as not to put too great a
strain on the remote server.

> I've pitched an idea like this before, usually with a weird
> operator-intensive syntax. This time, I think the right thing is an
> easily-read but more verbose query language.
> 
> Steps to get from here to there:
> 
> - change all callers of revrange to revset
> - design a BNF for the revset query language
> - build a query parser/"compiler"
> - add filters for the query functions
> - simplify some of the existing options (like -d and -k) by turning them
> into queries internally
> 
> Thoughts?

A few pedantic notes: ancestor() and descendant() seem to indicate to me
that they can pick any one ancestor of something and I'd much favour the
plural versions, ancestors() and descendants(), as that indicates the
full set.

Also, since we are working on sets, "and" is really intersection, and
"or" is really union. One could ponder whether it wouldn't be nice to
support the usual set operations in some manner (subtraction and
difference come to mind).
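
As a quick illustration with Python's built-in sets (the revision
numbers below are invented), the operators map over directly:

```python
# Invented revision numbers standing in for two query results.
ancestors_of_2_0 = {3, 4, 5, 6, 7}
written_by_george = {1, 4, 6, 9}

both = ancestors_of_2_0 & written_by_george            # "and": intersection
either = ancestors_of_2_0 | written_by_george          # "or": union
only_ancestors = ancestors_of_2_0 - written_by_george  # subtraction
exactly_one = ancestors_of_2_0 ^ written_by_george     # symmetric difference

print(sorted(both))            # [4, 6]
print(sorted(only_ancestors))  # [3, 5, 7]
```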

With regards to syntax, I rather favour supporting both the written out
operations and a shortcut like Dirkjan proposed elsewhere in the thread.

And finally a brief stab at defining relevant functions/primitives:

Notation:
  c(_n)?: changeset (a one-element set probably)
  s(_n)?: sets of changesets

hash: changeset with hash
rev: changeset with rev
rev_1:rev_2: set of changesets between rev_1 and rev_2 linearly
.: working context parent1

ancestor(c_1,c_2): common ancestor of c_1 and c_2
ancestors(s): all ancestors of s
children(s): all immediate children of s (one level)
descendants(s): all descendants of s
heads(name): branch heads of branch name
heads(s): branch heads of s
topo-heads(s): topological heads of s
branch(name): set of changes on branch name
tag(name): changeset pointed to by name tag
tip(s): tip-most changeset
tip(name): tip-most name branch changeset
parent1(c): first parent of changeset
parent2(c): second parent of changeset
parents(c): both parents of changeset

keyword(name): changesets where user/commit message/etc. contain name
user(name): changesets where user contains name
date(datespec): changesets with date within datespec
adds(fname): changesets that add fname
removes(fname): changesets that remove fname
modifies(fname): changesets that modify fname
file(fname): changesets that add/remove/modify fname
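
To pin down what a few of the graph primitives would return, here is a
sketch over a toy history stored as a rev -> parents mapping. The
`PARENTS` table is invented, none of this uses Mercurial's real API, and
it assumes that ancestors() and descendants() exclude the starting set.

```python
# Toy history: 0 <- 1, then 1 forks into 2 and 3, which merge into 4.
PARENTS = {0: [], 1: [0], 2: [1], 3: [1], 4: [2, 3]}


def ancestors(revs, parents=PARENTS):
    """All ancestors of the set revs, excluding revs themselves."""
    seen = set()
    stack = [p for rev in revs for p in parents[rev]]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen


def children(revs, parents=PARENTS):
    """Immediate children (one level) of the set revs."""
    return {rev for rev, ps in parents.items()
            if any(p in revs for p in ps)}


def descendants(revs, parents=PARENTS):
    """All descendants of the set revs, excluding revs themselves."""
    found = set()
    frontier = children(revs, parents)
    while frontier - found:
        found |= frontier
        frontier = children(found, parents)
    return found


def topo_heads(revs, parents=PARENTS):
    """Members of revs that have no child inside revs."""
    revs = set(revs)
    return {rev for rev in revs if not (children({rev}, parents) & revs)}


print(sorted(ancestors({4})))            # [0, 1, 2, 3]
print(sorted(topo_heads({0, 1, 2, 3})))  # [2, 3]
```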

This requires the user to explicitly state whether he is looking up a
branch or a tag, which some may dislike, but I much prefer being
explicit about it. Also, this incidentally solves the
ambiguity/precedence between branches and tags for lookup.

I'm not really happy with the topo- prefix. I'm rather open to alternatives.

As Greg pointed out, sorting should probably stay after the query in an
adequately different manner to the rest of the query. I haven't given
that part too much consideration yet.

Finally, if we want to be performant in the face of such a query
language, we will probably need to write not only a compiler, but also a
query optimiser like all those fancy SQL databases. At least I can
imagine generating some rather inefficient queries that can easily be
rewritten to fast ones (of course, we could also just require that the
user write them sensibly, but it will probably be a better idea to
optimise it in the long run).
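
One simple rewrite of that kind, sketched under invented assumptions:
reorder the terms of an "and" so that cheap predicates run first and the
expensive ones only see the survivors. The cost numbers are made up
(they only reflect the remark that keyword() and author() are
expensive), and `optimise` is a hypothetical name.

```python
# Made-up relative costs; unknown predicates are assumed expensive.
ASSUMED_COST = {"branch": 1, "date": 2, "user": 5, "keyword": 10}
DEFAULT_COST = 100


def optimise(conjuncts):
    """Order the (name, argument) terms of an 'and' by assumed cost."""
    return sorted(conjuncts,
                  key=lambda term: ASSUMED_COST.get(term[0], DEFAULT_COST))


query = [("keyword", "bar"), ("branch", "foo"), ("date", "mar 1 - apr 1")]
print(optimise(query))
```

This is only valid because intersection is commutative; terms joined by
"or" or wrapped in "not" would need their own rules.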

I did, on purpose, avoid writing the proposed elements in BNF just yet
until things are a bit further along.

-- 
Kind regards,
  Henrik Stuart
Henrik Stuart | 26 May 21:24 2010

Re: [RFC] revision sets

On 23-04-2010 22:16, Henrik Stuart wrote:
> On 20-04-2010 00:32, Matt Mackall wrote:
>> Right now we have a notion of revision ranges, ie:
>>
>>  hg log -r 0:1.0
>>
>> Internally, we iterate over this with something like:
>>
>>  for rev in commandutil.revrange(opts['rev']):
>>
>> I've been talking about expanding this into a more powerful system that
>> would allow specifying dates, keywords, branches, etc. My current
>> thought is to make it look like this:
>>
>>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
> 
> A query language seems like a very good idea. It would probably be
> prudent to get a list going on what primitives we would like to support
> and what they should do. It would, for instance, be lovely if we could
> find separate primitives for the entire branch and the branch tip, and
> possibly also branch heads.
> 
> [snip]
> 
>>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
>> author(george) and sorted(date) and reversed()"
>>
>> Read that as: every cset that is descended from the second parent of
>> revision 1.0 and is also an ancestor of 2.0 and was written by george,
>> sorted by date in reverse order.
>>
>> revrange would be replaced by a new revset function that would parse the
>> query/queries and build an iterator. Some of the operations, like
>> keyword() and author(), would obviously be fairly expensive and many
>> would fail to work (at least for now) on remote repos.
> 
> We should probably greatly consider whether to support this remotely,
> depending on what primitives are chosen so as not to put too great a
> strain on the remote server.
> 
>> I've pitched an idea like this before, usually with a weird
>> operator-intensive syntax. This time, I think the right thing is an
>> easily-read but more verbose query language.
>>
>> Steps to get from here to there:
>>
>> - change all callers of revrange to revset
>> - design a BNF for the revset query language
>> - build a query parser/"compiler"
>> - add filters for the query functions
>> - simplify some of the existing options (like -d and -k) by turning them
>> into queries internally
>>
>> Thoughts?
> 
> A few pedantic notes: ancestor() and descendant() seem to indicate to me
> that they can pick any one ancestor of something and I'd much favour the
> plural versions, ancestors() and descendants(), as that indicates the
> full set.
> 
> Also, since we are working on sets, "and" is really intersection, and
> "or" is really union. One could ponder whether it wouldn't be nice to
> support the usual set operations in some manner (subtraction and
> difference come to mind).
> 
> With regards to syntax, I rather favour supporting both the written out
> operations and a shortcut like Dirkjan proposed elsewhere in the thread.
> 
> And finally a brief stab at defining relevant functions/primitives:
> 
> Notation:
>   c(_n)?: changeset (a one-element set probably)
>   s(_n)?: sets of changesets
> 
> hash: changeset with hash
> rev: changeset with rev
> rev_1:rev_2: set of changesets between rev_1 and rev_2 linearly
> .: working context parent1
> 
> ancestor(c_1,c_2): common ancestor of c_1 and c_2
> ancestors(s): all ancestors of s
> children(s): all immediate children of s (one level)
> descendants(s): all descendants of s
> heads(name): branch heads of branch name
> heads(s): branch heads of s
> topo-heads(s): topological heads of s
> branch(name): set of changes on branch name
> tag(name): changeset pointed to by name tag
> tip(s): tip-most changeset
> tip(name): tip-most name branch changeset
> parent1(c): first parent of changeset
> parent2(c): second parent of changeset
> parents(c): both parents of changeset
> 
> keyword(name): changesets where user/commit message/etc. contain name
> user(name): changesets where user contains name
> date(datespec): changesets with date within datespec
> adds(fname): changesets that add fname
> removes(fname): changesets that remove fname
> modifies(fname): changesets that modify fname
> file(fname): changesets that add/remove/modify fname

Replying to myself since nothing has happened since...

I have written a proposal in the form of a Mercurial extension that can
parse and present something very close to the outlined functions above
(with a few changes, and some unimplemented filters, namely the
file-related ones, as well as sorting) using the log command (by way of
a new debugrevspec command).

The extension can be cloned from http://bitbucket.org/hstuart/hg-revspec
and instructions are present in the README.

Please note that Matt has an alternative implementation of this thing as
well. I'll leave it to Matt to present you with a link to his version if
he so pleases.

If you are at all interested in this area, now would be the time to
weigh in with your thoughts, preferences, missing features, etc.

-- 
Kind regards,
  Henrik Stuart
Matt Mackall | 27 May 01:28 2010

Re: [RFC] revision sets

On Wed, 2010-05-26 at 21:24 +0200, Henrik Stuart wrote:
> On 23-04-2010 22:16, Henrik Stuart wrote:
> > On 20-04-2010 00:32, Matt Mackall wrote:
> >> Right now we have a notion of revision ranges, ie:
> >>
> >>  hg log -r 0:1.0
> >>
> >> Internally, we iterate over this with something like:
> >>
> >>  for rev in commandutil.revrange(opts['rev']):
> >>
> >> I've been talking about expanding this into a more powerful system that
> >> would allow specifying dates, keywords, branches, etc. My current
> >> thought is to make it look like this:
> >>
> >>  hg log -r "branch(foo) and keyword(bar) and date(mar 1 - apr 1)"
> > 
> > A query language seems like a very good idea. It would probably be
> > prudent to get a list going on what primitives we would like to support
> > and what they should do. It would, for instance, be lovely if we could
> > find separate primitives for the entire branch and the branch tip, and
> > possibly also branch heads.
> > 
> > [snip]
> > 
> >>  hg log -r "descendant(parent2(1.0)) and ancestor(2.0) and
> >> author(george) and sorted(date) and reversed()"
> >>
> >> Read that as: every cset that is descended from the second parent of
> >> revision 1.0 and is also an ancestor of 2.0 and was written by george,
> >> sorted by date in reverse order.
> >>
> >> revrange would be replaced by a new revset function that would parse the
> >> query/queries and build an iterator. Some of the operations, like
> >> keyword() and author(), would obviously be fairly expensive and many
> >> would fail to work (at least for now) on remote repos.
> > 
> > We should probably greatly consider whether to support this remotely,
> > depending on what primitives are chosen so as not to put too great a
> > strain on the remote server.
> > 
> >> I've pitched an idea like this before, usually with a weird
> >> operator-intensive syntax. This time, I think the right thing is an
> >> easily-read but more verbose query language.
> >>
> >> Steps to get from here to there:
> >>
> >> - change all callers of revrange to revset
> >> - design a BNF for the revset query language
> >> - build a query parser/"compiler"
> >> - add filters for the query functions
> >> - simplify some of the existing options (like -d and -k) by turning them
> >> into queries internally
> >>
> >> Thoughts?
> > 
> > A few pedantic notes: ancestor() and descendant() seem to indicate to me
> > that they can pick any one ancestor of something and I'd much favour the
> > plural versions, ancestors() and descendants(), as that indicates the
> > full set.
> > 
> > Also, since we are working on sets, "and" is really intersection, and
> > "or" is really union. One could ponder whether it wouldn't be nice to
> > support the usual set operations in some manner (subtraction and
> > difference come to mind).
> > 
> > With regards to syntax, I rather favour supporting both the written out
> > operations and a shortcut like Dirkjan proposed elsewhere in the thread.
> > 
> > And finally a brief stab at defining relevant functions/primitives:
> > 
> > Notation:
> >   c(_n)?: changeset (a one-element set probably)
> >   s(_n)?: sets of changesets
> > 
> > hash: changeset with hash
> > rev: changeset with rev
> > rev_1:rev_2: set of changesets between rev_1 and rev_2 linearly
> > .: working context parent1
> > 
> > ancestor(c_1,c_2): common ancestor of c_1 and c_2
> > ancestors(s): all ancestors of s
> > children(s): all immediate children of s (one level)
> > descendants(s): all descendants of s
> > heads(name): branch heads of branch name
> > heads(s): branch heads of s
> > topo-heads(s): topological heads of s
> > branch(name): set of changes on branch name
> > tag(name): changeset pointed to by name tag
> > tip(s): tip-most changeset
> > tip(name): tip-most name branch changeset
> > parent1(c): first parent of changeset
> > parent2(c): second parent of changeset
> > parents(c): both parents of changeset
> > 
> > keyword(name): changesets where user/commit message/etc. contain name
> > user(name): changesets where user contains name
> > date(datespec): changesets with date within datespec
> > adds(fname): changesets that add fname
> > removes(fname): changesets that remove fname
> > modifies(fname): changesets that modify fname
> > file(fname): changesets that add/remove/modify fname
> 
> Replying to myself since nothing has happened since...

Actually a lot has happened: we wrote two nearly complete
implementations.

> I have written a proposal in the form of a Mercurial extension that can
> parse and present something very close to the outlined functions above
> (with a few changes, and some unimplemented filters, namely the
> file-related ones, as well as sorting) using the log command (by way of
> a new debugrevspec command).
> 
> The extension can be cloned from http://bitbucket.org/hstuart/hg-revspec
> and instructions are present in the README.
> 
> Please note that Matt has an alternative implementation of this thing as
> well. I'll leave it to Matt to present you with a link to his version if
> he so pleases.
> 
> If you are at all interested in this area, now would be the time to
> weigh in with your thoughts, preferences, missing features, etc.

The thing we're most interested in is probably feedback on the syntax.
Both of our implementations implement very similar query styles:

operators:
  and, or, not, :, .., ()

function-style filters:
  ancestors(a), ancestor(x, y), keyword("foo")

identifiers:
  tip, 1.0, default, "quoted-to-be-parser-friendly"
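
For anyone who wants to play with the syntax, a rough tokenizer sketch
for the three token classes above. This is not taken from either
implementation; the regular expression, the token kinds and the name
`tokenize` are assumptions for illustration.

```python
import re

KEYWORDS = {"and", "or", "not"}

# Quoted strings, punctuation operators, then dotted symbols like 1.0.
TOKEN = re.compile(
    r'\s*(?:(?P<string>"[^"]*")'
    r'|(?P<op>\.\.|[():,])'
    r'|(?P<symbol>\w+(?:\.\w+)*))')


def tokenize(query):
    """Split a query into (kind, text) pairs."""
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        kind = match.lastgroup
        text = match.group(kind)
        if kind == "symbol" and text in KEYWORDS:
            kind = "op"  # word operators share a class with : and ..
        elif kind == "string":
            text = text[1:-1]  # drop the surrounding quotes
        tokens.append((kind, text))
        pos = match.end()
    return tokens


print(tokenize('ancestors(1.1) and not keyword("foo")'))
print(tokenize("1.0..2.0"))
```

Symbols may contain single dots so that 1.0 stays one identifier while
1.0..2.0 still splits into a range.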

See these two URLs for examples:

http://bitbucket.org/hstuart/hg-revspec/changeset/446121faa725
http://www.selenic.com/blog/?p=613

We also need to come up with a scheme for sorting. I'm currently
considering something like:

# sort ancestors by user (ascending), then date (descending)
sort(ancestors(1.1), "user -date")
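
A sketch of how a key string like that could be applied, relying on
Python's stable sort and working from the last key to the first. The
dict-shaped changesets and the name `sort_revs` are invented for
illustration.

```python
def sort_revs(revs, spec):
    """Sort by space-separated keys; a leading '-' means descending."""
    result = list(revs)
    # Apply the least significant key first; stability keeps earlier
    # orderings intact among items that compare equal on later passes.
    for key in reversed(spec.split()):
        descending = key.startswith("-")
        field = key.lstrip("-")
        result.sort(key=lambda rev: rev[field], reverse=descending)
    return result


history = [
    {"rev": 0, "user": "bob", "date": 3},
    {"rev": 1, "user": "alice", "date": 1},
    {"rev": 2, "user": "alice", "date": 2},
    {"rev": 3, "user": "bob", "date": 5},
]
ordered = sort_revs(history, "user -date")
print([cset["rev"] for cset in ordered])  # [2, 1, 3, 0]
```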

-- 
Mathematics is the supreme nostalgia of our time.

