Roger Mason | 21 May 16:52 2013

parsing a CSV file

Hello,

I'm attempting to write a parser for files that look like this:

Bruker Nano GmbH Berlin, Germany
Esprit 1.9

Date: 02/05/2013 10:06:49 AM
Real time: 15000
Energy Counts
-0.474    0
.....

The line before the ellipsis is repeated many times (together, these lines
represent a spectrum).  I need to extract the numbers from lines
of the form <string: > and the number pairs that follow
"Energy Counts\n".  The extracted data will then be written to a file in
a different format.  For now I'll be satisfied with reading the "header"
info, i.e. down to "Energy Counts\n".

Thus far, I have:
-- derived from RWH
-- file: ch16/csv2.hs
import Text.ParserCombinators.Parsec

headerLines = endBy csvFile endHeader
csvFile = endBy line eol
line = sepBy cell (char ',')
cell = many (noneOf ",\n")
eol = char '\n'

Roman Cheplyaka | 21 May 17:06 2013

Re: parsing a CSV file

* Roger Mason <rmason <at> mun.ca> [2013-05-21 12:22:53-0230]
> Thus far, I have:
> -- derived from RWH
> -- file: ch16/csv2.hs
> import Text.ParserCombinators.Parsec
> 
> headerLines = endBy csvFile endHeader
> csvFile = endBy line eol
> line = sepBy cell (char ',')
> cell = many (noneOf ",\n")
> eol = char '\n'
> 
> parseCSV :: String -> Either ParseError [[String]]
> parseCSV input = parse csvFile "(unknown)" input
> 
> parseHDR :: String -> Either ParseError [[String]]
> parseHDR input = parse headerLines "(unknown)" input
> 
> endHeader = string "Energy Counts"
> 
> This loads into GHCi (7.6.2) OK.  However, when I test it:
> 
> parseHDR "Bruker Nano GmbH Berlin, Germany\nEsprit 1.9\n\nDate:
> 02/05/2013 10:06:49 AM\nReal time: 15000\nEnergy Counts"
> 
> Not in scope: `parseHDR'
> 
> which makes sense because
> 
> ghci> :t endHeader

Roger Mason | 21 May 18:03 2013

Re: parsing a CSV file

Hi Roman,

On 05/21/2013 12:36 PM, Roman Cheplyaka wrote:
>
> Clearly, my naive implementation of endHeader is no good.
> Hi Roger,
>
> "Not in scope" means that that thing is not defined.
>
> So it's not a problem with your implementation, but with the way you
> load it.
>
> If you copy-paste your ghci session here, you may get further help.
>
> Roman
Starting with a clean ghci session I get this:

ghci> :l csv.hs
[1 of 1] Compiling Main             ( csv.hs, interpreted )

csv.hs:15:24:
     Couldn't match type `[Char]' with `Char'
     Expected type: Text.Parsec.Prim.Parsec String () [[String]]
       Actual type: Text.Parsec.Prim.ParsecT
                      String () Data.Functor.Identity.Identity [[[[Char]]]]
     In the first argument of `parse', namely `headerLines'
     In the expression: parse headerLines "(unknown)" input
     In an equation for `parseHDR':
         parseHDR input = parse headerLines "(unknown)" input
Failed, modules loaded: none.

Roman Cheplyaka | 21 May 19:45 2013

Re: parsing a CSV file

* Roger Mason <rmason <at> mun.ca> [2013-05-21 13:33:47-0230]
> Hi Roman,
> 
> On 05/21/2013 12:36 PM, Roman Cheplyaka wrote:
> >
> >Clearly, my naive implementation of endHeader is no good.
> >Hi Roger,
> >
> >"Not in scope" means that that thing is not defined.
> >
> >So it's not a problem with your implementation, but with the way you
> >load it.
> >
> >If you copy-paste your ghci session here, you may get further help.
> >
> >Roman
> Starting with a clean ghci session I get this:
> 
> ghci> :l csv.hs
> [1 of 1] Compiling Main             ( csv.hs, interpreted )
> 
> csv.hs:15:24:
>     Couldn't match type `[Char]' with `Char'
>     Expected type: Text.Parsec.Prim.Parsec String () [[String]]
>       Actual type: Text.Parsec.Prim.ParsecT
>                      String () Data.Functor.Identity.Identity [[[[Char]]]]
>     In the first argument of `parse', namely `headerLines'
>     In the expression: parse headerLines "(unknown)" input
>     In an equation for `parseHDR':
>         parseHDR input = parse headerLines "(unknown)" input

Roger Mason | 21 May 20:09 2013

Re: parsing a CSV file

Thank you.

Roger

On 05/21/2013 03:15 PM, Roman Cheplyaka wrote:
> So this is the real error. If you read it carefully, it says that it
> expected [[String]] but got [[[[Char]]]] (i.e. [[[String]]]) as a
> result of the headerLines parser.
>
> I don't have time right now to look closer at your code, but I suggest
> studying the types of combinators you use (such as endBy) and trying to
> write down type signatures for the rest of the values you define. This
> way you'll find the error and better understand your program.
>
> A useful trick is to start ghci with -fdefer-type-errors and use ":t"
> to inspect types of various expressions that you encounter.
>
> Roman
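
For anyone following along, here is a minimal sketch of what writing down
the type signatures might look like. The signatures for cell, line, eol,
csvFile and endHeader simply annotate the definitions quoted earlier in the
thread. The helper headerLine and the manyTill-based headerLines are
assumptions of this sketch: one possible way to get a [[String]] result, not
necessarily the fix that was eventually used.

```haskell
import Text.ParserCombinators.Parsec

-- Original definitions from the thread, with their types written out.
cell :: Parser String
cell = many (noneOf ",\n")

line :: Parser [String]
line = sepBy cell (char ',')

eol :: Parser Char
eol = char '\n'

csvFile :: Parser [[String]]
csvFile = endBy line eol

endHeader :: Parser String
endHeader = string "Energy Counts"

-- endBy takes a parser for ONE item and returns a list of items.
-- csvFile already returns [[String]], so `endBy csvFile endHeader`
-- returns [[[String]]], which is the mismatch GHC reported.

-- Hypothetical helper (not in the original): one line plus its newline.
headerLine :: Parser [String]
headerLine = do
  cells <- line
  _ <- eol
  return cells

-- Assumed alternative: collect lines until "Energy Counts" is seen.
-- `try` lets endHeader fail on lines such as "Esprit 1.9" that share a
-- leading 'E', without leaving that character consumed.
headerLines :: Parser [[String]]
headerLines = manyTill headerLine (try endHeader)

parseHDR :: String -> Either ParseError [[String]]
parseHDR input = parse headerLines "(unknown)" input
```

Under these assumptions, parseHDR applied to the test string from the
original post (written on a single line) should return five rows: the first
line split into two cells at the comma, and one row with a single empty cell
for the blank line. Asking ghci for ":t endBy csvFile endHeader" shows the
extra level of list nesting directly.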

