Rocky Bernstein | 18 Sep 04:50

reading a file into an array. mapfile? (f)?

I'd like to read a file (a zsh script file) into an array quickly. It is not uncommon for a GNU autoconf configure script to be tens of thousands of lines long. (The zsh configure script is over 20,000 lines long.)

I know about redirecting input in a loop or, alternatively, using "read" in a loop. For a large file this will tend to be slow. I also know about mapfile, which reads the file and turns it into a single long zsh string. Question: if the underlying file changes, what does mapfile do? Update its data? Keep the original? Show something which is indeterminate?

There is also the zsh parameter expansion flag (f), "a shorthand for 'pws:\n:'". But I don't see how to use that with either mapfile or input redirection to save the result into an array variable, short of putting it in a loop, in which case it probably would be no better (faster or more efficient) than doing it without the expansion flag.

One last possibility is to write a module to do what I want, possibly building on mapfile. Basically this is what was done in bashdb's readarray (aka bash 4.0's mapfile).

Thoughts? Comments?

Thanks.

Bart Schaefer | 18 Sep 06:44

Re: reading a file into an array. mapfile? (f)?

On Sep 17, 10:53pm, Rocky Bernstein wrote:
} Subject: reading a file into an array. mapfile? (f)?
} 
} I'd like to read a file (a zsh script file) into an array fast.

Ending up with what, one line per array entry?  I'm guessing so since
you mention the (f) expansion flag.

} [...] I also know about mapfile which reads the file and turns it
} into a single long zsh string. Question: if the underlying file
} changes, what does mapfile do? Update its data? Keep the original?
} Show something which is indeterminate?

When you reference a hash key in the mapfile hash, zsh calls mmap()
to access the file contents, but immediately allocates enough memory
to contain the data and copies into it.  The file is then unmapped.
This is done because parameter values are stored with zsh's internal
"metafication" already applied, and it's obviously not possible to
metafy the file in place.

If the file is modified during the brief period when zsh has it mapped
and is copying it, you could get indeterminate results.  It probably
depends on the system's mmap() implementation.  After the file has been
copied, zsh no longer pays attention to it.

If you assign a value to a field in the mapfile hash, zsh attempts
to mmap() the corresponding disk file for writing, and whatever
you assigned replaces the file contents by way of msync().  You can
(I think) assign to slices of the file, but nothing magical is done,
so the entire file is rewritten unless the msync() implementation is
clever.

} There is also the zsh parameter expansion operator (f) "a shorthand
} for 'pws:\n:'". But I don't see how to use that with either mapfile or
} input redirection to save this into an array variable short of putting
} this in a loop

It's much simpler than you seem to believe:

lines=( ${(f)mapfile[/path/to/file]} )

Splitting up /etc/termcap this way (17890 lines on my system) takes
a little less than 0.08 seconds on my 3GHz Pentium 4.  Fully parsing
termcap into "shell words" with (z) takes about 0.13 seconds.  For
/usr/share/dict/words (479829 lines), (f) takes about 0.8 seconds but
(z) takes almost 13 seconds.

Rocky Bernstein | 18 Sep 12:03

Re: reading a file into an array. mapfile? (f)?

Thanks! Works great.

Using mapfile and (f) on zsh's configure script takes 0.189s (which includes loading zsh and mapfile) while doing this via a read loop takes over a minute.

Given this, I find this wording in zshmodules a little misleading:

       Thus it should not automatically be assumed that use of mapfile
       represents a gain in efficiency over use of other mechanisms.

Ok. I won't assume it; I will just make use of its speedup over a read loop.

Before posting I tried googling for this and didn't turn up anything. Since this is so simple and, I think, common (perhaps more common than the case where one wants a file as a single long string), possibly this might be mentioned in the mapfile doc?

I sort of agree with this comment in zshmodules:

       It is unfortunate that the mechanism for loading modules does not yet
       allow the user to specify the name of the shell parameter to be given
       the special behaviour.

Here's how it is done in Ruby, which is extremely simple: if an associative array SCRIPT_LINES__ is defined, the lines of each file are saved into this array when Ruby reads the file. So, translating to zsh-speak:

  typeset -A SCRIPT_LINES__
turns on saving file lines and
  unset SCRIPT_LINES__
turns it off. (It's off by default.)

At any rate, I guess I no longer have an excuse for implementing file listing in zshdb, so I guess that's next up.

Any thoughts on how to get checksum information? I can shell out to "sum" or "md5sum". But given that I have the file data as a string, a solution using zsh only would be preferable.


Peter Stephenson | 18 Sep 12:50

Re: reading a file into an array. mapfile? (f)?

On Thu, 18 Sep 2008 06:03:38 -0400
"Rocky Bernstein" <rocky.bernstein <at> gmail.com> wrote:
> Given this, I find this wording in zshmodules a little misleading:
>
>        Thus it should not automatically be assumed that use of mapfile
>        represents a gain in efficiency over use of other mechanisms.
> 
> Ok. I won't assume it; I will just make use of its speedup over a read loop.

Yes, it more or less is guaranteed to be vastly faster than some other
mechanisms.  What I really meant was `don't assume it's faster than
"$(<filename)"', but it doesn't really need to say that.

> Before posting I tried googling for this and didn't turn up anything. Since
> this is so simple and I think common (perhaps more common than the case
> where one wants a file as a single long string) possibly this might be
> mentioned in the mapfile doc?

Yes, I think so.

By the way, I'm happy to get partial patches which have been modified to
say something like "say something about X HERE", which helps me locate
where a change is necessary even if the text isn't complete.

> I sort of agree with this comment in zshmodules:
>        It is unfortunate that the mechanism for loading modules does not
>        yet allow the user to specify the name of the shell parameter to be
>        given the special behaviour.

We've got a better interface to modules now, so we can pass down extra
information with "zmodload -F".  However, it needs to be done carefully:
it's possible different functions would want to map the behaviour onto
different variables.  That's not impossible but would need thought.

It should probably have been called zsh_mapfile to extend the name space in
a more natural way.  I can easily (as with the builtin zstat in zsh/stat)
make the module provide the variable under two different names, so by
default you get mapfile and zsh_mapfile but you could arrange to get only
zsh_mapfile.

> Any thoughts on how to get checksum information? I can shell out to "sum" or
> "md5sum". But given I have the file data as a string if there is a solution
> using zsh only, that is preferable.

It sounds like this would need to be a new module in order to do it
internally, with some configuration probing for appropriate libraries using
Clint's new system where the libraries only become dependencies for the
module itself.  It looks like openssl provides this.  You can do things
like "md5sum <<<$file_contents", but there's no real gain over using the
file.  On the other hand, cryptographic functions may typically be
intensive enough that running them inside the shell isn't much of a gain.

Index: Doc/Zsh/mod_mapfile.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/mod_mapfile.yo,v
retrieving revision 1.2
diff -u -r1.2 mod_mapfile.yo
--- Doc/Zsh/mod_mapfile.yo	3 Jul 2007 17:06:04 -0000	1.2
+++ Doc/Zsh/mod_mapfile.yo	18 Sep 2008 10:29:54 -0000
@@ -25,15 +25,18 @@

 The parameter tt(mapfile) may be made read-only; in that case, files
 referenced may not be written or deleted.
+
+A file may conveniently be read into an array as one line per element
+with the form `var(array)tt(=LPAR()${(f)mapfile[)var(filename)tt(]RPAR())'.
 )
 enditem()

 subsect(Limitations)

 Although reading and writing of the file in question is efficiently
-handled, zsh's internal memory management may be arbitrarily baroque.  Thus
-it should not automatically be assumed that use of tt(mapfile) represents a
-gain in efficiency over use of other mechanisms.  Note in particular that
+handled, zsh's internal memory management may be arbitrarily baroque;
+however, tt(mapfile) is usually very much more efficient than
+anything involving a loop.  Note in particular that
 the whole contents of the file will always reside physically in memory when
 accessed (possibly multiple times, due to standard parameter substitution
 operations).  In particular, this means handling of sufficiently long files

-- 
Peter Stephenson <pws <at> csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070

Rocky Bernstein | 19 Sep 17:57

Re: reading a file into an array. mapfile? (f)?

On Thu, Sep 18, 2008 at 6:50 AM, Peter Stephenson <pws <at> csr.com> wrote:
> ...
> By the way, I'm happy to get partial patches which have been modified to
> say something like "say something about X HERE", which helps me locate
> where a change is necessary even if the text isn't complete.

Sorry. Will try to do better in the future.

> ...  You can do things
> like "md5sum <<<$file_contents", but there's no real gain over using the
> file.

The gain is consistency and accuracy in knowing that the checksum matches the data you have recorded for it. And that's the big reason that the text is saved and checksum'd in the first place. (And an important reason for Ruby's SCRIPT_LINES__).

Bart Schaefer | 18 Sep 17:10

Re: reading a file into an array. mapfile? (f)?

On Sep 18,  6:03am, Rocky Bernstein wrote:
}
} I sort of agree with this comment in zshmodules:
}        It is unfortunate that the mechanism for loading modules
}        does not yet allow the user to specify the name of the shell
}        parameter to be given the special behaviour.
} 
} Here's how it is done in Ruby which is extremely simple: if there is an
} associative array SCRIPT_LINES__ defined file lines are saved into this
} array when it reads a file.

I think you and the zsh/mapfile manual are talking about two different
things here.

What you seem to be asking for (based on what SCRIPT_LINES__ really
does in Ruby) is to have the zsh script parser stuff the lines it reads
into a variable as it parses them, so that (for example) "autoload"
would magically copy the function text into the SCRIPT_LINES__ array.

The excerpt above means that it's not possible when loading zsh/mapfile
to cause the variable "mapfile" to have a different name.  That has
nothing to do with script parsing.

Rocky Bernstein | 18 Sep 17:35

Re: reading a file into an array. mapfile? (f)?

On Thu, Sep 18, 2008 at 11:10 AM, Bart Schaefer <schaefer <at> brasslantern.com> wrote:
> On Sep 18,  6:03am, Rocky Bernstein wrote:
> }
> } I sort of agree with this comment in zshmodules:
> }        It is unfortunate that the mechanism for loading modules
> }        does not yet allow the user to specify the name of the shell
> }        parameter to be given the special behaviour.
> }
> } Here's how it is done in Ruby which is extremely simple: if there is an
> } associative array SCRIPT_LINES__ defined file lines are saved into this
> } array when it reads a file.
>
> I think you and the zsh/mapfile manual are talking about two different
> things here.

Yes, you are correct. Automatically copying program text is what I mean. It's very useful, and it is used not just in debuggers but in other tools; in Ruby I think rcov (which tests program coverage) uses it, for example.

> What you seem to be asking for (based on what SCRIPT_LINES__ really
> does in Ruby) is to have the zsh script parser stuff the lines it reads
> into a variable as it parses them, so that (for example) "autoload"
> would magically copy the function text into the SCRIPT_LINES__ array.
>
> The excerpt above means that it's not possible when loading zsh/mapfile
> to cause the variable "mapfile" to have a different name.  That has
> nothing to do with script parsing.

Rocky Bernstein | 19 Sep 02:47

Re: reading a file into an array. mapfile? (f)?

On Thu, Sep 18, 2008 at 12:44 AM, Bart Schaefer <schaefer <at> brasslantern.com> wrote:
> It's much simpler than you seem to believe:
>
> lines=( ${(f)mapfile[/path/to/file]} )

Doesn't seem to work if the file (/path/to/file) has 0-length lines. These don't get stored in the array (lines).

Might be more difficult than you seem to believe :-)

Peter Stephenson | 19 Sep 11:07

Re: reading a file into an array. mapfile? (f)?

On Thu, 18 Sep 2008 20:47:00 -0400
"Rocky Bernstein" <rocky.bernstein <at> gmail.com> wrote:
> On Thu, Sep 18, 2008 at 12:44 AM, Bart Schaefer
> <schaefer <at> brasslantern.com>wrote:
> > It's much simpler than you seem to believe:
> >
> > lines=( ${(f)mapfile[/path/to/file]} )
> 
> 
> Doesn't seem to work if the file (/path/to/file) has 0-length lines. These
> don't get stored in the array (lines).

Try

lines=( "${(f)mapfile[/path/to/file]}" )

Explanation:  without the quotes, elements that turn into an empty string
are removed.  (f) includes the effect of the (@) flag, which is a
generalised equivalent of the behaviour of "${array[@]}", so simply adding
the quotes is enough to get the empty lines.

I'd better update the change I made to the mapfile documentation.


Rocky Bernstein | 19 Sep 12:39

Re: reading a file into an array. mapfile? (f)?

This doesn't seem to work for me. What am I doing wrong?

Here is my test program

zmodload -ap zsh/mapfile mapfile
echo 'line 1' > ./file.txt
echo '' >> ./file.txt
echo 'line 2' >> ./file.txt
text=( "${(f)mapfile[./file.txt]}" )
declare -p text
print ${#text[@]} lines

text=( "${(f)mapfile[/src/external-cvs/zsh/configure]}" )
print ${#text[@]} lines
wc -l /src/external-cvs/zsh/configure

And here is the output I get:

typeset -a text
text=('line 1' 'line 2' '')
3 lines
18726 lines
21124 /src/external-cvs/zsh/configure

Further tests show that the behavior seems to be to drop blank lines but put one at the end of the file.

Thanks.

Peter Stephenson | 19 Sep 13:00

Re: reading a file into an array. mapfile? (f)?

"Rocky Bernstein" wrote:
> Further tests show that the behavior seems to be to drop blank lines but put
> one at the end of the file.

Apologies, you're right; it seems you *do* need the @ flag, but my test
was too simple to show it.

array=("${(f@)mapfile[foo.txt]}")

I'm not entirely sure why that is but I must be misremembering the (f)
rules.

Also, it seems you always get an extra blank at the end: I think that's
because mapfile returns the complete file including the "\n" and the (f)
flag splits on that "\n" giving you a blank after it.  This should be
entirely predictable, however (I hope): you get that last blank line if
there was a final newline, and if you don't get it there wasn't a final
newline.

I think that's the one difference from "${(f@)$(<foo.txt)}": in that
case a final newline is always stripped, but consequently you can't tell
if there was one there.


Rocky Bernstein | 19 Sep 16:44

Re: reading a file into an array. mapfile? (f)?

Thanks, Bart and Peter - this works great! I will note and make use of the fact that the lack of a null line at the end means the file doesn't have a final newline.

So now with this, file listing (the "list" command) is now in the debugger. Down the line, file information should be kept. Most important would be modification time and some sort of checksum (or "cryptographic") information. Too bad this isn't already in the zsh/files module.

(And it would be cool and I think relatively straightforward to add a Ruby-like SCRIPT_LINES__ capability. In the context of a debugger or any profiling/introspection tool it means that one reliably has the lines of the script under inspection.)

Phil Pennock | 20 Sep 22:29

Re: reading a file into an array. mapfile? (f)?

On 2008-09-19 at 10:44 -0400, Rocky Bernstein wrote:
> So now with this, file listing (the "list" command) is now in the debugger.
> Down the line, file information should be kept. Most important would be
> modification time and some sort of checksum (or "cryptographic")
> information. Too bad this isn't already in the zsh/files module.

Mod-time you get from zsh/stat.

You probably want to avoid loading the 'stat' built-in name, since
nowadays that's an external command with different behaviour, but:
 zmodload -aF zsh/stat b:zstat
will get you 'zstat', same behaviour.

% zstat -H foo src
% print ${(k)foo}
blksize nlink size rdev mode mtime inode blocks device atime uid link ctime gid
% print $foo[mtime]
1220980489
% zstat -sH foo src
% print $foo[mtime]
Tue Sep  9 10:14:49

Cryptographic, I'm more cautious of, as the cryptographic libraries tend
to be fairly large and the ABI compatibility awkward.  Having an
external command seg-fault because the version of OpenSSL it's loading
doesn't match the version it was linked against is one thing.  Having
your shell seg-fault and die when a module is loaded or you run a
command is another.

While zsh could just implement certain checksum algorithms internally,
picking which ones are "safe" is a cryptanalysis job and the sort of
thing we should stay away from, as the decisions made would last for
years (measured in decades, probably).

IMO adding cryptographic support to the shell would be into the area of
bloat; zsh is still a Unix shell and still well suited to running
external commands.

Is there a reason that a single fork/exec to get a checksum is a problem
(besides worrying about the portability of the checksum commands used)?

-Phil

Rocky Bernstein | 21 Sep 05:32

Re: reading a file into an array. mapfile? (f)?

On Sat, Sep 20, 2008 at 4:29 PM, Phil Pennock <zsh-workers+phil.pennock <at> spodhuis.org> wrote:
> On 2008-09-19 at 10:44 -0400, Rocky Bernstein wrote:
> > So now with this, file listing (the "list" command) is now in the debugger.
> > Down the line, file information should be kept. Most important would be
> > modification time and some sort of checksum (or "cryptographic")
> > information. Too bad this isn't already in the zsh/files module.
>
> Mod-time you get from zsh/stat.
>
> You probably want to avoid loading the 'stat' built-in name, since
> nowadays that's an external command with different behaviour, but:
>  zmodload -aF zsh/stat b:zstat
> will get you 'zstat', same behaviour.
>
> % zstat -H foo src
> % print ${(k)foo}
> blksize nlink size rdev mode mtime inode blocks device atime uid link ctime gid
> % print $foo[mtime]
> 1220980489
> % zstat -sH foo src
> % print $foo[mtime]
> Tue Sep  9 10:14:49

Ok. Thanks! I missed seeing this because it's not mentioned in the last installed manpage for zshmodules that I have. (However, I do see it now in Doc/Zsh/mod/mod_stat.yo.)

> Cryptographic, I'm more cautious of, as the cryptographic libraries tend
> to be fairly large and the ABI compatibility awkward.  Having an
> external command seg-fault because the version of OpenSSL it's loading
> doesn't match the version it was linked against is one thing.  Having
> your shell seg-fault and die when a module is loaded or you run a
> command is another.

Actually I didn't use the word "cryptographic" initially; someone else did. And I put it in quotes. I'm thinking more of a hash of the file data, like a signature or checksum.

> While zsh could just implement certain checksum algorithms internally,
> picking which ones are "safe" is a cryptanalysis job and the sort of
> thing we should stay away from, as the decisions made would last for
> years (measured in decades, probably).

I think you're making this a bigger deal than it need be. A module implementing SHA1 would probably be sufficient, although what implementers tend to find, I think, is that once you have coded SHA1, adding others isn't a big deal. So Perl, Python and Ruby all have such modules for SHA1, along with a vast number of other kinds of digests.

> IMO adding cryptographic support to the shell would be into the area of
> bloat;

Perhaps. All of this is a matter of taste and opinion. To me it's as much bloat as, say, mod_files, given that, as you say, zsh is still well suited to running external commands. :-) If it's put in a module, folks can decide to install it or not.

> zsh is still a Unix shell and still well suited to running
> external commands.
>
> Is there a reason that a single fork/exec to get a checksum is a problem
> (besides worrying about the portability of the checksum commands used)?

Portability is exactly what I was thinking of.

Here is the particular scenario I have in mind. In debuggers for C, C++, Java, Python and Ruby (among probably others) one can debug a program remotely. What often happens here is that the source code may be coming from a different filesystem than the one where the program is in fact running. In this situation one would like a signature or checksum of the data to verify that the source code is the same. I've noticed in the past that "sum" on Solaris gives a different result than "sum" on GNU/Linux. I think, however, that "md5sum" does give the same result.

But in practice, I am not aware of any debugging systems that currently make use of a checksum of the source. Kind of a shame, but that's the way it is. And in truth I can't see zshdb getting good enough to need this anytime soon. And if I really needed it, I could code up a zsh module.

I had just thought that if it happens to be out there, like fast file reading into an array or getting the modification time, I'd use it in preparation for the day when it really could be used.

Although the example I give is drawn from debuggers, the same is true in other problem domains. git, for example, makes use of signatures for files. And again, the data doesn't have to reside strictly in a file; it could be an internal string.

But again, thanks for the info.

> -Phil

Aaron Davies | 22 Sep 04:25

Re: reading a file into an array. mapfile? (f)?

On Sun, Sep 21, 2008 at 11:32 AM, Rocky Bernstein
<rocky.bernstein <at> gmail.com> wrote:

> Here is the particular scenario I'm thinking. In debuggers for C, C++, Java,
> Python and Ruby (among probably others) one can debug a program remotely.
> What often happens here is that the source code may be coming from a
> different filesystem than where the program is in fact running. In this
> situation one would like a signature or checksum of the data to verify that
> the source code is the same. I've noticed in the past that "sum" on Solaris
> gives a different result than "sum" on GNU/Linux. I think however that
> "md5sum" does give the same result.

if all you're interested in is checking data consistency, maybe you
should use crc, which anyone who knows anything about hashing will
immediately realize means you make no guarantees about security. (it's
probably also much easier to write and faster to compute)
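
For a plain consistency check of data already held in a string, a sketch using the POSIX cksum utility (a CRC) might look like the following. It still shells out, the file and variable names are hypothetical, and it assumes zsh/mapfile is available. Note that a here-string (<<<) appends a newline to the data, so the sketch pipes the string with print -rn instead.

```shell
#!/usr/bin/env zsh
# Sketch only: checksums the in-memory copy and the file on disk with cksum.
zmodload zsh/mapfile
src=$(mktemp)                        # hypothetical stand-in for a script file
print -l 'line 1' '' 'line 2' > $src

contents=${mapfile[$src]}            # whole file, including the final newline

# print -rn writes the string verbatim, with no extra trailing newline.
from_string=$(print -rn -- "$contents" | cksum)
from_file=$(cksum < $src)

if [[ $from_string == "$from_file" ]]; then
  print "in-memory copy matches file: $from_string"
else
  print "mismatch: $from_string vs $from_file"
fi
rm -f $src
```

Reading from standard input in both cases keeps the file name out of the cksum output, so the two results can be compared as plain strings.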
-- 
Aaron Davies
aaron.davies <at> gmail.com

Bart Schaefer | 19 Sep 17:19

Re: reading a file into an array. mapfile? (f)?

On Sep 19, 12:00pm, Peter Stephenson wrote:
} Subject: Re: reading a file into an array. mapfile? (f)?
}
} "Rocky Bernstein" wrote:
} > Further tests show that the behavior seems to be to drop blank lines
} > but put one at the end of the file.
} 
} Apologies, you're right; it seems you *do* need the @ flag, but my test
} was too simple to show it.
} 
} array=("${(f@)mapfile[foo.txt]}")
} 
} I'm not entirely sure why that is but I must be misremembering the (f)
} rules.

This is a fairly recent change (well, about a year ago):

     For historical reasons, the usual behaviour that empty array
     elements are retained inside double quotes is disabled for arrays
     generated by splitting; hence the following:

          line="one::three"
          print -l "${(s.:.)line}"

     produces two lines of output for one and three and elides the
     empty field.  To override this behaviour, supply the "(@)" flag as
     well, i.e.  "${(@s.:.)line}".

Peter Stephenson | 19 Sep 17:29

Re: reading a file into an array. mapfile? (f)?

Bart Schaefer wrote:
> On Sep 19, 12:00pm, Peter Stephenson wrote:
> } Apologies, you're right; it seems you *do* need the @ flag, but my test
> } was too simple to show it.
> } 
> } array=("${(f@)mapfile[foo.txt]}")
> } 
> } I'm not entirely sure why that is but I must be misremembering the (f)
> } rules.
> 
> This is a fairly recent change (well, about a year ago):
> 
>      For historical reasons, the usual behaviour that empty array
>      elements are retained inside double quotes is disabled for arrays
>      generated by splitting; hence the following:
> 
>           line="one::three"
>           print -l "${(s.:.)line}"
> 
>      produces two lines of output for one and three and elides the
>      empty field.  To override this behaviour, supply the "(@)" flag as
>      well, i.e.  "${(@s.:.)line}".

Ah, yes, I remember the change.  However, now you've reminded me, I
think what's changed is the ability to add the (@) to get the normal
splitting-in-double-quotes behaviour.  The fact that splitting overrides
the effect of the quotes is longstanding (and in my view wrong, although
it's too late to change it).
