Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 00/19] new git check-ignore sub-command

This v3 re-roll of my check-ignore series is a reasonably substantial
revamp over v2, and applies on top of Junio's current
nd/attr-match-optim-more branch (82dce998c202).

All feedback and patches from the v2 series has been incorporated, and
several other improvements too, including:

  - composite exclude_lists have been split up into
    exclude_list_groups each containing one exclude_list per source

  - smaller commits for easier review

  - minor memory leaks have been fixed and verified via valgrind

  - t0007-ignores.sh has been renumbered to t0008-ignores.sh to avoid
    a conflict with t0007-git-var.sh

  - improvements to documentation and comments

For reference, the v2 series was announced here:

    http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=206074

All tests pass except for t91*, since there seems to be some
pre-existing breakage in 82dce998c202 relating to git-svn.

Adam Spiers (19):
  api-directory-listing.txt: update to match code
  Improve documentation and comments regarding directory traversal API
  dir.c: rename cryptic 'which' variable to more consistent name
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 05/19] dir.c: rename excluded_from_list() to is_excluded_from_list()

Continue adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Also adjust their callers as necessary.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c          | 11 ++++++-----
 dir.h          |  4 ++--
 unpack-trees.c |  8 +++++---
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/dir.c b/dir.c
index f1c0abd..0800491 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -605,9 +605,9  <at>  <at>  int match_pathname(const char *pathname, int pathlen,
 /* Scan the list and let the last match determine the fate.
  * Return 1 for exclude, 0 for include and -1 for undecided.
  */
-int excluded_from_list(const char *pathname,
-		       int pathlen, const char *basename, int *dtype,
-		       struct exclude_list *el)
+int is_excluded_from_list(const char *pathname,
+			  int pathlen, const char *basename, int *dtype,
+			  struct exclude_list *el)
 {
 	int i;
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 02/19] Improve documentation and comments regarding directory traversal API

From the perspective of a newcomer to the codebase, the directory
traversal API has a few potentially confusing properties.  These
comments clarify a few key aspects and will hopefully make it easier
to understand for other newcomers in the future.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
For some reason this patch was dropped from the v2 series when it was
applied to the 'pu' branch.

 Documentation/technical/api-directory-listing.txt |  9 +++++---
 dir.c                                             |  8 ++++++-
 dir.h                                             | 26 +++++++++++++++++++++--
 3 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 0356d25..944fc39 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
 <at>  <at>  -9,8 +9,11  <at>  <at>  Data structure
 --------------

 `struct dir_struct` structure is used to pass directory traversal
-options to the library and to record the paths discovered.  The notable
-options are:
+options to the library and to record the paths discovered.  A single
+`struct dir_struct` is used regardless of whether or not the traversal
+recursively descends into subdirectories.
+
+The notable options are:
(Continue reading)

Junio C Hamano | 1 Jan 21:52 2013
Picon
Picon

Re: [PATCH v3 02/19] Improve documentation and comments regarding directory traversal API

Adam Spiers <git <at> adamspiers.org> writes:

> diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
> index 0356d25..944fc39 100644
> --- a/Documentation/technical/api-directory-listing.txt
> +++ b/Documentation/technical/api-directory-listing.txt
>  <at>  <at>  -9,8 +9,11  <at>  <at>  Data structure
>  --------------
>  
>  `struct dir_struct` structure is used to pass directory traversal
> -options to the library and to record the paths discovered.  The notable
> -options are:
> +options to the library and to record the paths discovered.  A single
> +`struct dir_struct` is used regardless of whether or not the traversal
> +recursively descends into subdirectories.

I am somewhat lukewarm on this part of the change.

The added "regardless of..." does not seem to add as much value as
the two extra lines the patch spends.  If we say something like:

	A `struct dir_struct` structure is used to pass options to
        traverse directories recursively, and to record all the
        paths discovered by the traversal.

it might be much more direct and informative, I suspect, though.

After all, the word "traversal" pretty much implies that the library
goes in and out of the directories recursively.

(Continue reading)

Adam Spiers | 2 Jan 13:54 2013

Re: [PATCH v3 02/19] Improve documentation and comments regarding directory traversal API

On Tue, Jan 1, 2013 at 8:52 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
>> index 0356d25..944fc39 100644
>> --- a/Documentation/technical/api-directory-listing.txt
>> +++ b/Documentation/technical/api-directory-listing.txt
>>  <at>  <at>  -9,8 +9,11  <at>  <at>  Data structure
>>  --------------
>>
>>  `struct dir_struct` structure is used to pass directory traversal
>> -options to the library and to record the paths discovered.  The notable
>> -options are:
>> +options to the library and to record the paths discovered.  A single
>> +`struct dir_struct` is used regardless of whether or not the traversal
>> +recursively descends into subdirectories.
>
> I am somewhat lukewarm on this part of the change.
>
> The added "regardless of..." does not seem to add as much value as
> the two extra lines the patch spends.  If we say something like:
>
>         A `struct dir_struct` structure is used to pass options to
>         traverse directories recursively, and to record all the
>         paths discovered by the traversal.
>
> it might be much more direct and informative, I suspect, though.

I somewhat disagree ;) When I first encountered this code, I naturally
assumed that one struct would be created per sub-directory traversed.
(Continue reading)

Adam Spiers | 6 Jan 13:02 2013

Re: [PATCH v3 02/19] Improve documentation and comments regarding directory traversal API

On Wed, Jan 02, 2013 at 12:54:19PM +0000, Adam Spiers wrote:
> On Tue, Jan 1, 2013 at 8:52 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> > Adam Spiers <git <at> adamspiers.org> writes:
> >> diff --git a/dir.c b/dir.c
> >> index ee8e711..89e27a6 100644
> >> --- a/dir.c
> >> +++ b/dir.c
> >>  <at>  <at>  -2,6 +2,8  <at>  <at> 
> >>   * This handles recursive filename detection with exclude
> >>   * files, index knowledge etc..
> >>   *
> >> + * See Documentation/technical/api-directory-listing.txt
> >> + *
> >>   * Copyright (C) Linus Torvalds, 2005-2006
> >>   *            Junio Hamano, 2005-2006
> >>   */
> >>  <at>  <at>  -476,6 +478,10  <at>  <at>  void add_excludes_from_file(struct dir_struct *dir, const char *fname)
> >>               die("cannot use %s as an exclude file", fname);
> >>  }
> >>
> >> +/*
> >> + * Loads the per-directory exclude list for the substring of base
> >> + * which has a char length of baselen.
> >> + */
> >>  static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
> >>  {
> >>       struct exclude_list *el;
> >>  <at>  <at>  -486,7 +492,7  <at>  <at>  static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
> >>           (baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))
> >>               return; /* too long a path -- ignore */
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 07/19] dir.c: refactor is_excluded_from_list()

The excluded function uses a new helper function called
last_exclude_matching_from_list() to perform the inner loop over all of
the exclude patterns.  The helper just tells us whether the path is
included, excluded, or undecided.

However, it may be useful to know _which_ pattern was triggered.  So
let's pass out the entire exclude match, which contains the status
information we were already passing out.

Further patches can make use of this.

This is a modified forward port of a patch from 2009 by Jeff King:
http://article.gmane.org/gmane.comp.version-control.git/108815

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index 8c99dc4..d1a0413 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -602,22 +602,26  <at>  <at>  int match_pathname(const char *pathname, int pathlen,
 	return fnmatch_icase(pattern, name, FNM_PATHNAME) == 0;
 }

-/* Scan the list and let the last match determine the fate.
- * Return 1 for exclude, 0 for include and -1 for undecided.
+/*
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 08/19] dir.c: refactor is_excluded()

In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching() which returns the last exclude_list
element which matched, or NULL if no match was found.  is_excluded()
becomes a wrapper around this, and just returns 0 or 1 depending on
whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index d1a0413..b9d4234 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -664,24 +664,44  <at>  <at>  int is_excluded_from_list(const char *pathname,
 	return -1; /* undecided */
 }

-static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+/*
+ * Loads the exclude lists for the directory containing pathname, then
+ * scans all exclude lists to determine whether pathname is excluded.
+ * Returns the exclude_list element which matched, or NULL for
+ * undecided.
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 10/19] dir.c: rename free_excludes() to clear_exclude_list()

It is clearer to use a 'clear_' prefix for functions which empty
and deallocate the contents of a data structure without freeing
the structure itself, and a 'free_' prefix for functions which
also free the structure itself.

http://article.gmane.org/gmane.comp.version-control.git/206128

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c          | 6 +++++-
 dir.h          | 2 +-
 unpack-trees.c | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/dir.c b/dir.c
index 16e10b0..41f141c 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -400,7 +400,11  <at>  <at>  static void *read_skip_worktree_file_from_index(const char *path, size_t *size)
 	return data;
 }

-void free_excludes(struct exclude_list *el)
+/*
+ * Frees memory within el which was allocated for exclude patterns and
+ * the file buffer.  Does not free el itself.
+ */
+void clear_exclude_list(struct exclude_list *el)
 {
 	int i;
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 12/19] dir.c: keep track of where patterns came from

For exclude patterns read in from files, the filename is stored in the
exclude list, and the originating line number is stored in the
individual exclude (counting starting at 1).

For exclude patterns provided on the command line, a string describing
the source of the patterns is stored in the exclude list, and the
sequence number assigned to each exclude pattern is negative, with
counting starting at -1.  So for example the 2nd pattern provided via
--exclude would be numbered -2.  This allows any future consumers of
that data to easily distinguish between exclude patterns from files
vs. from the CLI.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 builtin/clean.c    |  4 ++--
 builtin/ls-files.c |  5 +++--
 dir.c              | 26 ++++++++++++++++++++------
 dir.h              | 21 +++++++++++++++++++--
 4 files changed, 44 insertions(+), 12 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index bd18b88..72d2876 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
 <at>  <at>  -97,10 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
 	if (!ignored)
 		setup_standard_excludes(&dir);

-	add_exclude_list(&dir, EXC_CMDL);
+	add_exclude_list(&dir, EXC_CMDL, "--exclude option");
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

Previously each exclude_list could potentially contain patterns
from multiple sources.  For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).

We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source.  This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.

Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 Documentation/technical/api-directory-listing.txt | 12 +++--
 builtin/clean.c                                   |  3 +-
 builtin/ls-files.c                                |  8 +--
 dir.c                                             | 60 ++++++++++++++++-------
 dir.h                                             | 36 ++++++++++----
 unpack-trees.c                                    |  2 +-
(Continue reading)

Junio C Hamano | 4 Jan 22:03 2013
Picon
Picon

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

Adam Spiers <git <at> adamspiers.org> writes:

> diff --git a/builtin/clean.c b/builtin/clean.c
> index 0c7b3d0..bd18b88 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
>  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
>  	if (!ignored)
>  		setup_standard_excludes(&dir);
>  
> +	add_exclude_list(&dir, EXC_CMDL);
>  	for (i = 0; i < exclude_list.nr; i++)
>  		add_exclude(exclude_list.items[i].string, "", 0,
> -			    &dir.exclude_list[EXC_CMDL]);
> +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);

This looks somewhat ugly for two reasons.

 * The abstraction add_exclude() offers to its callers is just to
   let them add one pattern to the list of patterns for the kind
   (here, EXC_CMDL); why should they care about .ary[0] part?  Are
   there cases any sane caller (not the implementation of the
   exclude_list_group machinery e.g. add_excludes_from_... function)
   may want to call it with .ary[1]?  I do not think of any.
   Shouldn't the public API function add_exclude() take a pointer to
   the list group itself?

 * When naming an array of things, we tend to prefer naming it

     type thing[count]
(Continue reading)

Junio C Hamano | 5 Jan 08:54 2013
Picon
Picon

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

Junio C Hamano <gitster <at> pobox.com> writes:

> Adam Spiers <git <at> adamspiers.org> writes:
>
>> diff --git a/builtin/clean.c b/builtin/clean.c
>> index 0c7b3d0..bd18b88 100644
>> --- a/builtin/clean.c
>> +++ b/builtin/clean.c
>>  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
>>  	if (!ignored)
>>  		setup_standard_excludes(&dir);
>>  
>> +	add_exclude_list(&dir, EXC_CMDL);
>>  	for (i = 0; i < exclude_list.nr; i++)
>>  		add_exclude(exclude_list.items[i].string, "", 0,
>> -			    &dir.exclude_list[EXC_CMDL]);
>> +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
>
> This looks somewhat ugly for two reasons.
>
>  * The abstraction add_exclude() offers to its callers is just to
>    let them add one pattern to the list of patterns for the kind
>    (here, EXC_CMDL); why should they care about .ary[0] part?  Are
>    there cases any sane caller (not the implementation of the
>    exclude_list_group machinery e.g. add_excludes_from_... function)
>    may want to call it with .ary[1]?  I do not think of any.
>    Shouldn't the public API function add_exclude() take a pointer to
>    the list group itself?
>
>  * When naming an array of things, we tend to prefer naming it
(Continue reading)

Adam Spiers | 6 Jan 16:27 2013

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

On Fri, Jan 04, 2013 at 11:54:34PM -0800, Junio C Hamano wrote:
> Junio C Hamano <gitster <at> pobox.com> writes:
> > Adam Spiers <git <at> adamspiers.org> writes:
> >
> >> diff --git a/builtin/clean.c b/builtin/clean.c
> >> index 0c7b3d0..bd18b88 100644
> >> --- a/builtin/clean.c
> >> +++ b/builtin/clean.c
> >>  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
> >>  	if (!ignored)
> >>  		setup_standard_excludes(&dir);
> >>  
> >> +	add_exclude_list(&dir, EXC_CMDL);
> >>  	for (i = 0; i < exclude_list.nr; i++)
> >>  		add_exclude(exclude_list.items[i].string, "", 0,
> >> -			    &dir.exclude_list[EXC_CMDL]);
> >> +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
> >
> > This looks somewhat ugly for two reasons.
> >
> >  * The abstraction add_exclude() offers to its callers is just to
> >    let them add one pattern to the list of patterns for the kind
> >    (here, EXC_CMDL); why should they care about .ary[0] part?  Are
> >    there cases any sane caller (not the implementation of the
> >    exclude_list_group machinery e.g. add_excludes_from_... function)
> >    may want to call it with .ary[1]?  I do not think of any.
> >    Shouldn't the public API function add_exclude() take a pointer to
> >    the list group itself?
> >
> >  * When naming an array of things, we tend to prefer naming it
(Continue reading)

Adam Spiers | 6 Jan 16:35 2013

[PATCH] api-allocation-growing.txt: encourage better variable naming

The documentation for the ALLOC_GROW API implicitly encouraged
developers to use "ary" as the variable name for the array which is
dynamically grown.  However "ary" is an unusual abbreviation hardly
used anywhere else in the source tree, and it is also better to name
variables based on their contents not on their type.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 Documentation/technical/api-allocation-growing.txt | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-allocation-growing.txt b/Documentation/technical/api-allocation-growing.txt
index 43dbe09..3894815 100644
--- a/Documentation/technical/api-allocation-growing.txt
+++ b/Documentation/technical/api-allocation-growing.txt
 <at>  <at>  -5,7 +5,9  <at>  <at>  Dynamically growing an array using realloc() is error prone and boring.

 Define your array with:

-* a pointer (`ary`) that points at the array, initialized to `NULL`;
+* a pointer (`array`) that points at the array, initialized to `NULL`
+  (although please name the variable based on its contents, not on its
+  type);

 * an integer variable (`alloc`) that keeps track of how big the current
   allocation is, initialized to `0`;
 <at>  <at>  -13,22 +15,22  <at>  <at>  Define your array with:
 * another integer variable (`nr`) to keep track of how many elements the
   array currently has, initialized to `0`.

(Continue reading)

Junio C Hamano | 6 Jan 21:29 2013
Picon
Picon

Re: [PATCH] api-allocation-growing.txt: encourage better variable naming

Adam Spiers <git <at> adamspiers.org> writes:

> The documentation for the ALLOC_GROW API implicitly encouraged
> developers to use "ary" as the variable name for the array which is
> dynamically grown.  However "ary" is an unusual abbreviation hardly
> used anywhere else in the source tree, and it is also better to name
> variables based on their contents not on their type.

Sounds good.  To follow "not type but contents", a further rewrite
with s/array/item/ is even better, no?

I can obviously squash it in without resending, if you agree, or you
can point out why item[] is not a good idea and array[] is better.

>
> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
> ---
>  Documentation/technical/api-allocation-growing.txt | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/technical/api-allocation-growing.txt b/Documentation/technical/api-allocation-growing.txt
> index 43dbe09..3894815 100644
> --- a/Documentation/technical/api-allocation-growing.txt
> +++ b/Documentation/technical/api-allocation-growing.txt
>  <at>  <at>  -5,7 +5,9  <at>  <at>  Dynamically growing an array using realloc() is error prone and boring.
>  
>  Define your array with:
>  
> -* a pointer (`ary`) that points at the array, initialized to `NULL`;
> +* a pointer (`array`) that points at the array, initialized to `NULL`
(Continue reading)

Adam Spiers | 6 Jan 21:52 2013

Re: [PATCH] api-allocation-growing.txt: encourage better variable naming

On Sun, Jan 06, 2013 at 12:29:33PM -0800, Junio C Hamano wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
> 
> > The documentation for the ALLOC_GROW API implicitly encouraged
> > developers to use "ary" as the variable name for the array which is
> > dynamically grown.  However "ary" is an unusual abbreviation hardly
> > used anywhere else in the source tree, and it is also better to name
> > variables based on their contents not on their type.
> 
> Sounds good.  To follow "not type but contents", a further rewrite
> with s/array/item/ is even better, no?
> 
> I can obviously squash it in without resending, if you agree, or you
> can point out why item[] is not a good idea and array[] is better.

I agree.
Junio C Hamano | 6 Jan 21:58 2013
Picon
Picon

Re: [PATCH] api-allocation-growing.txt: encourage better variable naming

Adam Spiers <git <at> adamspiers.org> writes:

>> Sounds good.  To follow "not type but contents", a further rewrite
>> with s/array/item/ is even better, no?
>
> I agree.

Thanks for a quick response; let's do this then.

-- >8 --
From: Adam Spiers <git <at> adamspiers.org>

The documentation for the ALLOC_GROW API implicitly encouraged
developers to use "ary" as the variable name for the array which is
dynamically grown.  However "ary" is an unusual abbreviation hardly
used anywhere else in the source tree, and it is also better to name
variables based on their contents not on their type.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
Signed-off-by: Junio C Hamano <gitster <at> pobox.com>
---
 Documentation/technical/api-allocation-growing.txt | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-allocation-growing.txt b/Documentation/technical/api-allocation-growing.txt
index 43dbe09..542946b 100644
--- a/Documentation/technical/api-allocation-growing.txt
+++ b/Documentation/technical/api-allocation-growing.txt
 <at>  <at>  -5,7 +5,9  <at>  <at>  Dynamically growing an array using realloc() is error prone and boring.

(Continue reading)

Adam Spiers | 6 Jan 16:20 2013

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

On Fri, Jan 04, 2013 at 01:03:59PM -0800, Junio C Hamano wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
> 
> > diff --git a/builtin/clean.c b/builtin/clean.c
> > index 0c7b3d0..bd18b88 100644
> > --- a/builtin/clean.c
> > +++ b/builtin/clean.c
> >  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
> >  	if (!ignored)
> >  		setup_standard_excludes(&dir);
> >  
> > +	add_exclude_list(&dir, EXC_CMDL);
> >  	for (i = 0; i < exclude_list.nr; i++)
> >  		add_exclude(exclude_list.items[i].string, "", 0,
> > -			    &dir.exclude_list[EXC_CMDL]);
> > +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
> 
> This looks somewhat ugly for two reasons.
> 
>  * The abstraction add_exclude() offers to its callers is just to
>    let them add one pattern to the list of patterns for the kind
>    (here, EXC_CMDL); why should they care about .ary[0] part?

Because the caller has to decide which exclude list the new exclude
should be added to; that is how it has been for a long time, and I am
not proposing we change that.

There are currently three callers:

  builtin/clean.c:    cmd_clean()
(Continue reading)

Junio C Hamano | 6 Jan 21:25 2013
Picon
Picon

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

Adam Spiers <git <at> adamspiers.org> writes:

> On Fri, Jan 04, 2013 at 01:03:59PM -0800, Junio C Hamano wrote:
>> Adam Spiers <git <at> adamspiers.org> writes:
>> 
>> > diff --git a/builtin/clean.c b/builtin/clean.c
>> > index 0c7b3d0..bd18b88 100644
>> > --- a/builtin/clean.c
>> > +++ b/builtin/clean.c
>> >  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
>> >  	if (!ignored)
>> >  		setup_standard_excludes(&dir);
>> >  
>> > +	add_exclude_list(&dir, EXC_CMDL);
>> >  	for (i = 0; i < exclude_list.nr; i++)
>> >  		add_exclude(exclude_list.items[i].string, "", 0,
>> > -			    &dir.exclude_list[EXC_CMDL]);
>> > +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
>> 
>> This looks somewhat ugly for two reasons.
>> 
>>  * The abstraction add_exclude() offers to its callers is just to
>>    let them add one pattern to the list of patterns for the kind
>>    (here, EXC_CMDL); why should they care about .ary[0] part?
>
> Because the caller has to decide which exclude list the new exclude
> should be added to; that is how it has been for a long time, and I am
> not proposing we change that.

Unless I was mistaken, I never objected to the EXC_CMDL, etc
(Continue reading)

Adam Spiers | 6 Jan 23:53 2013

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

On Sun, Jan 06, 2013 at 12:25:48PM -0800, Junio C Hamano wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
> 
> > On Fri, Jan 04, 2013 at 01:03:59PM -0800, Junio C Hamano wrote:
> >> Adam Spiers <git <at> adamspiers.org> writes:
> >> 
> >> > diff --git a/builtin/clean.c b/builtin/clean.c
> >> > index 0c7b3d0..bd18b88 100644
> >> > --- a/builtin/clean.c
> >> > +++ b/builtin/clean.c
> >> >  <at>  <at>  -97,9 +97,10  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
> >> >  	if (!ignored)
> >> >  		setup_standard_excludes(&dir);
> >> >  
> >> > +	add_exclude_list(&dir, EXC_CMDL);
> >> >  	for (i = 0; i < exclude_list.nr; i++)
> >> >  		add_exclude(exclude_list.items[i].string, "", 0,
> >> > -			    &dir.exclude_list[EXC_CMDL]);
> >> > +			    &dir.exclude_list_groups[EXC_CMDL].ary[0]);
> >> 
> >> This looks somewhat ugly for two reasons.
> >> 
> >>  * The abstraction add_exclude() offers to its callers is just to
> >>    let them add one pattern to the list of patterns for the kind
> >>    (here, EXC_CMDL); why should they care about .ary[0] part?
> >
> > Because the caller has to decide which exclude list the new exclude
> > should be added to; that is how it has been for a long time, and I am
> > not proposing we change that.
> 
(Continue reading)

Adam Spiers | 7 Jan 00:17 2013

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

On Sun, Jan 06, 2013 at 10:53:11PM +0000, Adam Spiers wrote:
> That's a valid point.  However, the ary[0] part which assumes external
> knowledge of the internal implementation can trivially be avoided by
> squashing this patch onto the commit we are discussing:

[snipped]

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 0ca9d8e..0406adc 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
>  <at>  <at>  -420,10 +420,11  <at>  <at>  static int option_parse_z(const struct option *opt,
>  static int option_parse_exclude(const struct option *opt,
>  				const char *arg, int unset)
>  {
> -	struct exclude_list_group *group = opt->value;
> +	struct string_list *exclude_list = opt->value;
>  
>  	exc_given = 1;
> -	add_exclude(arg, "", 0, &group->el[0]);
> +	string_list_append(exclude_list, arg);
> +	fprintf(stderr, "append %s\n", arg);

Whoops :-)

[snipped]

>  <at>  <at>  -524,9 +527,13  <at>  <at>  int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  	if (read_cache() < 0)
>  		die("index file corrupt");
(Continue reading)

Junio C Hamano | 7 Jan 00:19 2013
Picon
Picon

Re: [PATCH v3 11/19] dir.c: use a single struct exclude_list per source of excludes

Adam Spiers <git <at> adamspiers.org> writes:

> That's a valid point.  However, the ary[0] part which assumes external
> knowledge of the internal implementation can trivially be avoided by
> squashing this patch onto the commit we are discussing:
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index dd89737..6e21ca6 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
>  <at>  <at>  -45,6 +45,7  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
>  	static const char **pathspec;
>  	struct strbuf buf = STRBUF_INIT;
>  	struct string_list exclude_list = STRING_LIST_INIT_NODUP;
> +	struct exclude_list *el;
>  	const char *qname;
>  	char *seen = NULL;
>  	struct option options[] = {
>  <at>  <at>  -97,10 +98,9  <at>  <at>  int cmd_clean(int argc, const char **argv, const char *prefix)
>  	if (!ignored)
>  		setup_standard_excludes(&dir);
>  
> -	add_exclude_list(&dir, EXC_CMDL);
> +	el = add_exclude_list(&dir, EXC_CMDL);
>  	for (i = 0; i < exclude_list.nr; i++)
> -		add_exclude(exclude_list.items[i].string, "", 0,
> -			    &dir.exclude_list_group[EXC_CMDL].el[0]);
> +		add_exclude(exclude_list.items[i].string, "", 0, el);
>  
>  	pathspec = get_pathspec(prefix, argv);
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 13/19] dir.c: provide clear_directory() for reclaiming dir_struct memory

By the end of a directory traversal, a dir_struct instance will
typically contains pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 Documentation/technical/api-directory-listing.txt |  2 ++
 dir.c                                             | 30 +++++++++++++++++++++++
 dir.h                                             |  1 +
 3 files changed, 33 insertions(+)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index fa9c8ae..fbceb62 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
 <at>  <at>  -81,4 +81,6  <at>  <at>  marked. If you to exclude files, make sure you have loaded index first.

 * Use `dir.entries[]`.

+* Call `free_directory()` when none of the contained elements are no longer in use.
+
 (JC)
diff --git a/dir.c b/dir.c
index aefe2bb..0d1b7e9 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -1555,3 +1555,33  <at>  <at>  void free_pathspec(struct pathspec *pathspec)
 	free(pathspec->items);
 	pathspec->items = NULL;
 }
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 17/19] pathspec.c: extract new validate_path() for reuse

This will be reused by a new git check-ignore command.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 pathspec.c | 20 ++++++++++++++------
 pathspec.h |  1 +
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index 8aea0d2..6724121 100644
--- a/pathspec.c
+++ b/pathspec.c
 <at>  <at>  -77,9 +77,20  <at>  <at>  void treat_gitlinks(const char **pathspec)
 }

 /*
+ * Dies if the given path refers to a file inside a symlinked
+ * directory.
+ */
+void validate_path(const char *path, const char *prefix)
+{
+	if (has_symlink_leading_path(path, strlen(path))) {
+		int len = prefix ? strlen(prefix) : 0;
+		die(_("'%s' is beyond a symbolic link"), path + len);
+	}
+}
+
+/*
  * Normalizes argv relative to prefix, via get_pathspec(), and then
- * dies if any path in the normalized list refers to a file inside a
(Continue reading)

Junio C Hamano | 28 Dec 21:44 2012
Picon
Picon

Re: [PATCH v3 17/19] pathspec.c: extract new validate_path() for reuse

Adam Spiers <git <at> adamspiers.org> writes:

> This will be reused by a new git check-ignore command.
>
> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
> ---
>  pathspec.c | 20 ++++++++++++++------
>  pathspec.h |  1 +
>  2 files changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/pathspec.c b/pathspec.c
> index 8aea0d2..6724121 100644
> --- a/pathspec.c
> +++ b/pathspec.c
>  <at>  <at>  -77,9 +77,20  <at>  <at>  void treat_gitlinks(const char **pathspec)
>  }
>  
>  /*
> + * Dies if the given path refers to a file inside a symlinked
> + * directory.
> + */
> +void validate_path(const char *path, const char *prefix)

The name needs to be a lot more specific.

There may be 47 different kinds of "validations" various callers may
want to do on a path, but this function only caters to one kind of
callers that want to make sure that the path refers to something
that we would directly add to our index.

(Continue reading)

Adam Spiers | 28 Dec 22:08 2012

Re: [PATCH v3 17/19] pathspec.c: extract new validate_path() for reuse

On Fri, Dec 28, 2012 at 8:44 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> This will be reused by a new git check-ignore command.
>>
>> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
>> ---
>>  pathspec.c | 20 ++++++++++++++------
>>  pathspec.h |  1 +
>>  2 files changed, 15 insertions(+), 6 deletions(-)
>>
>> diff --git a/pathspec.c b/pathspec.c
>> index 8aea0d2..6724121 100644
>> --- a/pathspec.c
>> +++ b/pathspec.c
>>  <at>  <at>  -77,9 +77,20  <at>  <at>  void treat_gitlinks(const char **pathspec)
>>  }
>>
>>  /*
>> + * Dies if the given path refers to a file inside a symlinked
>> + * directory.
>> + */
>> +void validate_path(const char *path, const char *prefix)
>
> The name needs to be a lot more specific.
>
> There may be 47 different kinds of "validations" various callers may
> want to do on a path, but this function only caters to one kind of
> callers that want to make sure that the path refers to something
> that we would directly add to our index.
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 18/19] setup.c: document get_pathspec()

Since we have just created a new pathspec-handling library, now is a
good time to add some comments explaining get_pathspec().

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 setup.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/setup.c b/setup.c
index 7663a4c..03d6d5c 100644
--- a/setup.c
+++ b/setup.c
 <at>  <at>  -249,6 +249,21  <at>  <at>  static const char *prefix_pathspec(const char *prefix, int prefixlen, const char
 		return prefix_path(prefix, prefixlen, copyfrom);
 }

+/*
+ * prefix - a path relative to the root of the working tree
+ * pathspec - a list of paths underneath the prefix path
+ *
+ * Iterates over pathspec, prepending each path with prefix,
+ * and return the resulting list.
+ *
+ * If pathspec is empty, return a singleton list containing prefix.
+ *
+ * If pathspec and prefix are both empty, return an empty list.
+ *
+ * This is typically used by built-in commands such as add.c, in order
+ * to normalize argv arguments provided to the built-in into a list of
+ * paths to process, all relative to the root of the working tree.
(Continue reading)

Junio C Hamano | 28 Dec 21:36 2012
Picon
Picon

Re: [PATCH v3 18/19] setup.c: document get_pathspec()

Adam Spiers <git <at> adamspiers.org> writes:

> Since we have just created a new pathspec-handling library, now is a
> good time to add some comments explaining get_pathspec().
>
> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
> ---

Yes, but we would rather not to see new users of this function added
to our codebase in its current form, as explained in the nearby
comment.  We would want to migrate everybody to "struct pathspec"
based interface to support magic pathspecs in the longer term.

>  setup.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/setup.c b/setup.c
> index 7663a4c..03d6d5c 100644
> --- a/setup.c
> +++ b/setup.c
>  <at>  <at>  -249,6 +249,21  <at>  <at>  static const char *prefix_pathspec(const char *prefix, int prefixlen, const char
>  		return prefix_path(prefix, prefixlen, copyfrom);
>  }
>  
> +/*
> + * prefix - a path relative to the root of the working tree
> + * pathspec - a list of paths underneath the prefix path
> + *
> + * Iterates over pathspec, prepending each path with prefix,
> + * and return the resulting list.
(Continue reading)

Adam Spiers | 28 Dec 21:40 2012

Re: [PATCH v3 18/19] setup.c: document get_pathspec()

On Fri, Dec 28, 2012 at 8:36 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> Since we have just created a new pathspec-handling library, now is a
>> good time to add some comments explaining get_pathspec().
>>
>> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
>> ---
>
> Yes, but we would rather not to see new users of this function added
> to our codebase in its current form, as explained in the nearby
> comment.  We would want to migrate everybody to "struct pathspec"
> based interface to support magic pathspecs in the longer term.

I see.  Please feel free to drop that patch from the series or amend
as you see fit.
Adam Spiers | 29 Dec 01:52 2012

Re: [PATCH v3 18/19] setup.c: document get_pathspec()

On Fri, Dec 28, 2012 at 8:40 PM, Adam Spiers <git <at> adamspiers.org> wrote:
> On Fri, Dec 28, 2012 at 8:36 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
>> Adam Spiers <git <at> adamspiers.org> writes:
>>
>>> Since we have just created a new pathspec-handling library, now is a
>>> good time to add some comments explaining get_pathspec().
>>>
>>> Signed-off-by: Adam Spiers <git <at> adamspiers.org>
>>> ---
>>
>> Yes, but we would rather not to see new users of this function added
>> to our codebase in its current form, as explained in the nearby
>> comment.  We would want to migrate everybody to "struct pathspec"
>> based interface to support magic pathspecs in the longer term.
>
> I see.  Please feel free to drop that patch from the series or amend
> as you see fit.

I've added this sentence to the top of the comments above
get_pathspec():

    /*
     * N.B. get_pathspec() is deprecated in favor of the "struct pathspec"
     * based interface - see pathspec_magic above.
     *
    [...]

That should be sufficient to discourage people from adding new users
of get_pathspec().
(Continue reading)

Junio C Hamano | 29 Dec 02:36 2012
Picon
Picon

Re: [PATCH v3 18/19] setup.c: document get_pathspec()

Adam Spiers <git <at> adamspiers.org> writes:

> I've added this sentence to the top of the comments above
> get_pathspec():
>
>     /*
>      * N.B. get_pathspec() is deprecated in favor of the "struct pathspec"
>      * based interface - see pathspec_magic above.
>      *
>     [...]
>
> That should be sufficient to discourage people from adding new users
> of get_pathspec().

Yeah, that sounds sensible.

Thanks, and happy new year to you ;-)
Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

Extract the following functions from builtin/add.c to pathspec.c, in
preparation for reuse by a new git check-ignore command:

  - fill_pathspec_matches()
  - find_used_pathspec()
  - treat_gitlink()
  - treat_gitlinks()
  - validate_pathspec()

The functions being extracted are not changed in any way, except
removal of the 'static' qualifier.

Also add a comment documenting validate_pathspec().

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 Makefile      |  2 ++
 builtin/add.c | 92 +-----------------------------------------------------
 pathspec.c    | 99 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pathspec.h    |  5 +++
 4 files changed, 107 insertions(+), 91 deletions(-)
 create mode 100644 pathspec.c
 create mode 100644 pathspec.h

diff --git a/Makefile b/Makefile
index 13293d3..48facad 100644
--- a/Makefile
+++ b/Makefile
 <at>  <at>  -645,6 +645,7  <at>  <at>  LIB_H += pack-refs.h
 LIB_H += pack-revindex.h
(Continue reading)

Junio C Hamano | 28 Dec 21:32 2012
Picon
Picon

Re: [PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

Adam Spiers <git <at> adamspiers.org> writes:

> diff --git a/pathspec.h b/pathspec.h
> new file mode 100644
> index 0000000..8bb670b
> --- /dev/null
> +++ b/pathspec.h
>  <at>  <at>  -0,0 +1,5  <at>  <at> 
> +extern char *find_used_pathspec(const char **pathspec);
> +extern void fill_pathspec_matches(const char **pathspec, char *seen, int specs);
> +extern const char *treat_gitlink(const char *path);
> +extern void treat_gitlinks(const char **pathspec);
> +extern const char **validate_pathspec(const char **argv, const char *prefix);

Protect this against multiple inclusion with "#ifndef PATHSPEC_H".

Adam Spiers | 28 Dec 21:45 2012

Re: [PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

On Fri, Dec 28, 2012 at 8:32 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> diff --git a/pathspec.h b/pathspec.h
>> new file mode 100644
>> index 0000000..8bb670b
>> --- /dev/null
>> +++ b/pathspec.h
>>  <at>  <at>  -0,0 +1,5  <at>  <at> 
>> +extern char *find_used_pathspec(const char **pathspec);
>> +extern void fill_pathspec_matches(const char **pathspec, char *seen, int specs);
>> +extern const char *treat_gitlink(const char *path);
>> +extern void treat_gitlinks(const char **pathspec);
>> +extern const char **validate_pathspec(const char **argv, const char *prefix);
>
> Protect this against multiple inclusion with "#ifndef PATHSPEC_H".

Yep good idea, how should I submit this?  It will cause a trivially
resolvable conflict with the next patch in the series (17/19):

  pathspec.c: extract new validate_path() for reuse

but I'd prefer not to re-roll 16--19 when just 16 and 17 are sufficient.
Adam Spiers | 29 Dec 01:40 2012

Re: [PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

On Fri, Dec 28, 2012 at 8:45 PM, Adam Spiers <git <at> adamspiers.org> wrote:
> On Fri, Dec 28, 2012 at 8:32 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
>> Adam Spiers <git <at> adamspiers.org> writes:
>>
>>> diff --git a/pathspec.h b/pathspec.h
>>> new file mode 100644
>>> index 0000000..8bb670b
>>> --- /dev/null
>>> +++ b/pathspec.h
>>>  <at>  <at>  -0,0 +1,5  <at>  <at> 
>>> +extern char *find_used_pathspec(const char **pathspec);
>>> +extern void fill_pathspec_matches(const char **pathspec, char *seen, int specs);
>>> +extern const char *treat_gitlink(const char *path);
>>> +extern void treat_gitlinks(const char **pathspec);
>>> +extern const char **validate_pathspec(const char **argv, const char *prefix);
>>
>> Protect this against multiple inclusion with "#ifndef PATHSPEC_H".
>
> Yep good idea, how should I submit this?  It will cause a trivially
> resolvable conflict with the next patch in the series (17/19):
>
>   pathspec.c: extract new validate_path() for reuse

I was wrong about that - it didn't cause a conflict, although it does
marginally change the context at the end of the pathspec.h hunk in the
above patch.

> but I'd prefer not to re-roll 16--19 when just 16 and 17 are sufficient.

Based on your other feedback, all of 16--19 require changes, and as
(Continue reading)

Junio C Hamano | 28 Dec 21:48 2012
Picon
Picon

Re: [PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

Adam Spiers <git <at> adamspiers.org> writes:

> diff --git a/pathspec.c b/pathspec.c
> new file mode 100644
> index 0000000..8aea0d2
> --- /dev/null
> +++ b/pathspec.c
>  <at>  <at>  -0,0 +1,99  <at>  <at> 
> +#include "cache.h"
> +#include "dir.h"
> +#include "pathspec.h"
> +
> +void fill_pathspec_matches(const char **pathspec, char *seen, int specs)
> +{

It did not matter while it was an implementation detail of "git
add", but as a public function, something needs to clarify that this
"fill"s matches *against the index*, not against a tree or the files
in the current directory.  The same comment applies to all the
internal functions this patch exposes to the outside world.
Adam Spiers | 28 Dec 22:15 2012

Re: [PATCH v3 16/19] pathspec.c: move reusable code from builtin/add.c

On Fri, Dec 28, 2012 at 8:48 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> diff --git a/pathspec.c b/pathspec.c
>> new file mode 100644
>> index 0000000..8aea0d2
>> --- /dev/null
>> +++ b/pathspec.c
>>  <at>  <at>  -0,0 +1,99  <at>  <at> 
>> +#include "cache.h"
>> +#include "dir.h"
>> +#include "pathspec.h"
>> +
>> +void fill_pathspec_matches(const char **pathspec, char *seen, int specs)
>> +{
>
> It did not matter while it was an implementation detail of "git
> add", but as a public function, something needs to clarify that this
> "fill"s matches *against the index*, not against a tree or the files
> in the current directory.  The same comment applies to all the
> internal functions this patch exposes to the outside world.

Prior to submitting the v3 series, I attempted to understand
fill_pathspec_matches() and find_used_pathspec() well enough to
document them all, but I failed.  Perhaps some kind soul could explain
what is the exact purpose of these functions, and in particular the
role of char *seen in both?
Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 19/19] Add git-check-ignore sub-command

This works in a similar manner to git-check-attr.

Thanks to Jeff King and Junio C Hamano for the idea:
http://thread.gmane.org/gmane.comp.version-control.git/108671/focus=108815

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
Since v2, the missing documentation has been fixed, and the erroneous
tweak to t/t9902-completion.sh has been removed.

 .gitignore                                        |   1 +
 Documentation/git-check-ignore.txt                |  89 ++++
 Documentation/gitignore.txt                       |   6 +-
 Documentation/technical/api-directory-listing.txt |   2 +-
 Makefile                                          |   1 +
 builtin.h                                         |   1 +
 builtin/check-ignore.c                            | 170 +++++++
 command-list.txt                                  |   1 +
 contrib/completion/git-completion.bash            |   1 +
 git.c                                             |   1 +
 t/t0008-ignores.sh                                | 595 ++++++++++++++++++++++
 11 files changed, 865 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/git-check-ignore.txt
 create mode 100644 builtin/check-ignore.c
 create mode 100755 t/t0008-ignores.sh

diff --git a/.gitignore b/.gitignore
index f1acd3e..20ef4e8 100644
--- a/.gitignore
+++ b/.gitignore
(Continue reading)

Junio C Hamano | 28 Dec 22:21 2012
Picon
Picon

Re: [PATCH v3 19/19] Add git-check-ignore sub-command

Adam Spiers <git <at> adamspiers.org> writes:

> diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
> new file mode 100644
> index 0000000..c825736
> --- /dev/null
> +++ b/builtin/check-ignore.c
>  <at>  <at>  -0,0 +1,170  <at>  <at> 
> +#include "builtin.h"
> +#include "cache.h"
> +#include "dir.h"
> +#include "quote.h"
> +#include "pathspec.h"
> +#include "parse-options.h"
> +
> +static int quiet, verbose, stdin_paths;
> +static const char * const check_ignore_usage[] = {
> +"git check-ignore [options] pathname...",
> +"git check-ignore [options] --stdin < <list-of-paths>",
> +NULL
> +};
> +
> +static int null_term_line;
> +
> +static const struct option check_ignore_options[] = {
> +	OPT__QUIET(&quiet, N_("suppress progress reporting")),
> +	OPT__VERBOSE(&verbose, N_("be verbose")),
> +	OPT_GROUP(""),
> +	OPT_BOOLEAN(0, "stdin", &stdin_paths,
> +		    N_("read file names from stdin")),
(Continue reading)

Adam Spiers | 29 Dec 02:23 2012

Re: [PATCH v3 19/19] Add git-check-ignore sub-command

On Fri, Dec 28, 2012 at 01:21:02PM -0800, Junio C Hamano wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
> > +static void output_exclude(const char *path, struct exclude *exclude)
> > +{
> > +	char *bang = exclude->flags & EXC_FLAG_NEGATIVE ? "!" : "";
> > +	char *dir  = (exclude->flags & EXC_FLAG_MUSTBEDIR) ? "/" : "";
> 
> That's inconsistent.  Either s/bang/negated/ or s/dir/slash/ (but
> obviously not both).

OK.  I'll make the use of parentheses there consistent too.

> > +static int check_ignore(const char *prefix, const char **pathspec)
> > +{
> > +	struct dir_struct dir;
> > +...
> > +	if (pathspec) {
> > +...
> > +	} else {
> > +		printf("no pathspec\n");
> > +	}
> 
> Is this an error message, an informative warning, or what?  The
> command is presumably for script consumption downstream of a pipe.
> Does it make sense to emit this message to their standard input?
> Does it even have to be output?  Especially what should happen when
> the user says "--quiet"?
>
> Perhaps
> 
(Continue reading)

Adam Spiers | 29 Dec 04:32 2012

Re: [PATCH v3 19/19] Add git-check-ignore sub-command

On Sat, Dec 29, 2012 at 01:23:52AM +0000, Adam Spiers wrote:
> FYI, attached is the diff between check-ignore-v3 and my current
> check-ignore, which is available at github:
> 
>     https://github.com/aspiers/git/commits/check-ignore

[snipped]

> diff --git a/pathspec.c b/pathspec.c
> index 6724121..3789b14 100644
> --- a/pathspec.c
> +++ b/pathspec.c

[snipped]

> -void validate_path(const char *path, const char *prefix)
> +void die_if_path_beyond_symlink(const char *path, const char *prefix)

[snipped]

> diff --git a/pathspec.h b/pathspec.h
> index c251441..1961b19 100644
> --- a/pathspec.h
> +++ b/pathspec.h
>  <at>  <at>  -1,6 +1,11  <at>  <at> 
> +#ifndef PATHSPEC_H
> +#define PATHSPEC_H
> +
>  extern char *find_used_pathspec(const char **pathspec);
>  extern void fill_pathspec_matches(const char **pathspec, char *seen, int specs);
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 14/19] add.c: refactor treat_gitlinks()

Extract the body of the for loop in treat_gitlinks() into a separate
treat_gitlink() function so that it can be reused elsewhere.  This
paves the way for a new check-ignore sub-command.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 builtin/add.c | 50 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index c689f37..856d232 100644
--- a/builtin/add.c
+++ b/builtin/add.c
 <at>  <at>  -153,31 +153,45  <at>  <at>  static char *prune_directory(struct dir_struct *dir, const char **pathspec, int
 	return seen;
 }

-static void treat_gitlinks(const char **pathspec)
+/*
+ * Check whether path refers to a submodule, or something inside a
+ * submodule.  If the former, returns the path with any trailing slash
+ * stripped.  If the latter, dies with an error message.
+ */
+const char *treat_gitlink(const char *path)
 {
-	int i;
-
-	if (!pathspec || !*pathspec)
-		return;
-
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 06/19] dir.c: rename excluded() to is_excluded()

Continue adopting clearer names for exclude functions.  This is_*
naming pattern for functions returning booleans was discussed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 attr.c |  2 +-
 dir.c  | 10 +++++-----
 dir.h  |  4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/attr.c b/attr.c
index 2fc6353..5362563 100644
--- a/attr.c
+++ b/attr.c
 <at>  <at>  -284,7 +284,7  <at>  <at>  static struct match_attr *parse_attr_line(const char *line, const char *src,
  * (reading the file from top to bottom), .gitattribute of the root
  * directory (again, reading the file from top to bottom) down to the
  * current directory, and then scan the list backwards to find the first match.
- * This is exactly the same as what excluded() does in dir.c to deal with
+ * This is exactly the same as what is_excluded() does in dir.c to deal with
  * .gitignore
  */

diff --git a/dir.c b/dir.c
index 0800491..8c99dc4 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -645,7 +645,7  <at>  <at>  int is_excluded_from_list(const char *pathname,
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 15/19] add.c: remove unused argument from validate_pathspec()

The 'argc' argument passed to validate_pathspec() was never used.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 builtin/add.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 856d232..1ba2a86 100644
--- a/builtin/add.c
+++ b/builtin/add.c
 <at>  <at>  -211,7 +211,7  <at>  <at>  static void refresh(int verbose, const char **pathspec)
         free(seen);
 }

-static const char **validate_pathspec(int argc, const char **argv, const char *prefix)
+static const char **validate_pathspec(const char **argv, const char *prefix)
 {
 	const char **pathspec = get_pathspec(prefix, argv);

 <at>  <at>  -262,7 +262,7  <at>  <at>  int interactive_add(int argc, const char **argv, const char *prefix, int patch)
 	const char **pathspec = NULL;

 	if (argc) {
-		pathspec = validate_pathspec(argc, argv, prefix);
+		pathspec = validate_pathspec(argv, prefix);
 		if (!pathspec)
 			return -1;
 	}
 <at>  <at>  -428,7 +428,7  <at>  <at>  int cmd_add(int argc, const char **argv, const char *prefix)
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 01/19] api-directory-listing.txt: update to match code

7c4c97c0ac turned the flags in struct dir_struct into a single bitfield
variable, but forgot to update this document.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
For some reason this patch was dropped from the v2 series when it was
applied to the 'pu' branch.

 Documentation/technical/api-directory-listing.txt | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index add6f43..0356d25 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
 <at>  <at>  -17,24 +17,24  <at>  <at>  options are:
 	The name of the file to be read in each directory for excluded
 	files (typically `.gitignore`).

-`collect_ignored`::
+`flags`::

-	Include paths that are to be excluded in the result.
+	A bit-field of options:

-`show_ignored`::
+`DIR_SHOW_IGNORED`:::

 	The traversal is for finding just ignored files, not unignored
 	files.
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 09/19] dir.c: refactor is_path_excluded()

In a similar way to the previous commit, this extracts a new helper
function last_exclude_matching_path() which return the last
exclude_list element which matched, or NULL if no match was found.
is_path_excluded() becomes a wrapper around this, and just returns 0
or 1 depending on whether any matching exclude_list element was found.

This allows callers to find out _why_ a given path was excluded,
rather than just whether it was or not, paving the way for a new git
sub-command which allows users to test their exclude lists from the
command line.

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c | 47 ++++++++++++++++++++++++++++++++++++++---------
 dir.h |  3 +++
 2 files changed, 41 insertions(+), 9 deletions(-)

diff --git a/dir.c b/dir.c
index b9d4234..16e10b0 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -709,6 +709,7  <at>  <at>  void path_exclude_check_init(struct path_exclude_check *check,
 			     struct dir_struct *dir)
 {
 	check->dir = dir;
+	check->exclude = NULL;
 	strbuf_init(&check->path, 256);
 }

 <at>  <at>  -718,18 +719,21  <at>  <at>  void path_exclude_check_clear(struct path_exclude_check *check)
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 04/19] dir.c: rename path_excluded() to is_path_excluded()

Start adopting clearer names for exclude functions.  This 'is_*'
naming pattern for functions returning booleans was agreed here:

http://thread.gmane.org/gmane.comp.version-control.git/204661/focus=204924

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 builtin/add.c      | 2 +-
 builtin/ls-files.c | 2 +-
 dir.c              | 4 ++--
 dir.h              | 2 +-
 unpack-trees.c     | 2 +-
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 89dce56..c689f37 100644
--- a/builtin/add.c
+++ b/builtin/add.c
 <at>  <at>  -453,7 +453,7  <at>  <at>  int cmd_add(int argc, const char **argv, const char *prefix)
 			    && !file_exists(pathspec[i])) {
 				if (ignore_missing) {
 					int dtype = DT_UNKNOWN;
-					if (path_excluded(&check, pathspec[i], -1, &dtype))
+					if (is_path_excluded(&check, pathspec[i], -1, &dtype))
 						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
 				} else
 					die(_("pathspec '%s' did not match any files"),
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 31b3f2d..ef7f99a 100644
--- a/builtin/ls-files.c
(Continue reading)

Adam Spiers | 27 Dec 03:32 2012

[PATCH v3 03/19] dir.c: rename cryptic 'which' variable to more consistent name

'el' is only *slightly* less cryptic, but is already used as the
variable name for a struct exclude_list pointer in numerous other
places, so this reduces the number of cryptic variable names in use by
one :-)

Signed-off-by: Adam Spiers <git <at> adamspiers.org>
---
 dir.c | 10 +++++-----
 dir.h |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 89e27a6..f31aa59 100644
--- a/dir.c
+++ b/dir.c
 <at>  <at>  -349,7 +349,7  <at>  <at>  void parse_exclude_pattern(const char **pattern,
 }

 void add_exclude(const char *string, const char *base,
-		 int baselen, struct exclude_list *which)
+		 int baselen, struct exclude_list *el)
 {
 	struct exclude *x;
 	int patternlen;
 <at>  <at>  -373,8 +373,8  <at>  <at>  void add_exclude(const char *string, const char *base,
 	x->base = base;
 	x->baselen = baselen;
 	x->flags = flags;
-	ALLOC_GROW(which->excludes, which->nr + 1, which->alloc);
-	which->excludes[which->nr++] = x;
(Continue reading)

Michael Leal | 27 Dec 06:15 2012
Picon

Re: [PATCH v3 00/19] new git check-ignore sub-command

Adam Spiers <git <at> 
adamspiers.org> writes:

> 
> This v3 re-roll of my check-
ignore series is a reasonably 
substantial
> revamp over v2, and applies on 
top of Junio's current
> nd/attr-match-optim-more 
branch (82dce998c202).
> 
> All feedback and patches from 
the v2 series has been 
incorporated, and
> several other improvements 
too, including:
> 
>   - composite exclude_lists 
have been split up into
>     exclude_list_groups each 
containing one exclude_list per 
source
> 
>   - smaller commits for easier 
review
> 
>   - minor memory leaks have 
been fixed and verified via 
valgrind
(Continue reading)

Junio C Hamano | 28 Dec 19:50 2012
Picon
Picon

Re: [PATCH v3 00/19] new git check-ignore sub-command

Adam Spiers <git <at> adamspiers.org> writes:

> This v3 re-roll of my check-ignore series is a reasonably substantial
> revamp over v2, and applies on top of Junio's current
> nd/attr-match-optim-more branch (82dce998c202).

Thanks.

Does this (and should this, if it doesn't) interact with the more
recent discussion around "git status --untracked/--ignored" [*1*],
which also wants to touch the recursive directory traversal logic in
"dir.c"?

[Reference]

*1* http://thread.gmane.org/gmane.comp.version-control.git/212127/focus=212136
Adam Spiers | 28 Dec 20:39 2012

Re: [PATCH v3 00/19] new git check-ignore sub-command

On Fri, Dec 28, 2012 at 6:50 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
> Adam Spiers <git <at> adamspiers.org> writes:
>
>> This v3 re-roll of my check-ignore series is a reasonably substantial
>> revamp over v2, and applies on top of Junio's current
>> nd/attr-match-optim-more branch (82dce998c202).
>
> Thanks.
>
> Does this (and should this, if it doesn't) interact with the more
> recent discussion around "git status --untracked/--ignored" [*1*],
> which also wants to touch the recursive directory traversal logic in
> "dir.c"?

I cannot think of a reason why they would or should interact.  If I'm
wrong, I expect that either set of unit tests would show me up :-)
Antoine Pelisse | 28 Dec 21:15 2012
Picon

Re: [PATCH v3 00/19] new git check-ignore sub-command

I think they will interact, but I need to have a deeper look to Adam's series.
If it does, do you want me to base my work on the top of his branch ?

On Fri, Dec 28, 2012 at 8:39 PM, Adam Spiers <git <at> adamspiers.org> wrote:
> On Fri, Dec 28, 2012 at 6:50 PM, Junio C Hamano <gitster <at> pobox.com> wrote:
>> Adam Spiers <git <at> adamspiers.org> writes:
>>
>>> This v3 re-roll of my check-ignore series is a reasonably substantial
>>> revamp over v2, and applies on top of Junio's current
>>> nd/attr-match-optim-more branch (82dce998c202).
>>
>> Thanks.
>>
>> Does this (and should this, if it doesn't) interact with the more
>> recent discussion around "git status --untracked/--ignored" [*1*],
>> which also wants to touch the recursive directory traversal logic in
>> "dir.c"?
>
> I cannot think of a reason why they would or should interact.  If I'm
> wrong, I expect that either set of unit tests would show me up :-)
Junio C Hamano | 28 Dec 22:31 2012
Picon
Picon

Re: [PATCH v3 00/19] new git check-ignore sub-command

Antoine Pelisse <apelisse <at> gmail.com> writes:

> I think they will interact, but I need to have a deeper look to Adam's series.
> If it does, do you want me to base my work on the top of his branch ?

Not necessarily.  If it becomes absolutely necessary to introduce
patch dependencies, I would rather see an addition of new command
rebased on a fix.

I just wanted to make sure that parties touching the same area of
the codebase (and me who will be merging their efforts) are aware of
what is going on.
Junio C Hamano | 28 Dec 22:23 2012
Picon
Picon

Re: [PATCH v3 00/19] new git check-ignore sub-command

After skimming the series twice quickly, I found that the early part
of refactorings are excellently done.  Making existing private
functions into public needs a lot more careful thought on namings, I
think, though.

The end result looks promising.  Thanks for a pleasant read.

Gmane