Craig Innes | 13 Mar 21:32 2013

Annotating an AST with type checking / source line number info

Hi there,

I am not hugely familiar with compilers or the particulars of GHC, but am interested in creating a few programs which manipulate Haskell source code in particular ways. Two things I would like to be able to do are:

- Swap every occurrence of a value of a particular type for a different / dummy value of that type.

- Find the line / column number of every place a value of a particular type is used

From compiling Haskell programs, it seems clear that GHC performs type checking, and when I get a compile error it is able to extract line and column information about where the error occurred, so it appears that GHC annotates the source with this information as it compiles.

I am struggling to find much in the way of learning materials for GHC, but from trawling through the API documentation (I am using GHC version 7.4.2), it seems I can generate an abstract syntax tree via the function:

parseModule :: GhcMonad m => ModSummary -> m ParsedModule

My question is this: what combination of functions do I have to use to get not only an AST for my source, but an AST annotated with typing information and line / column number annotations for values within it?

Also, as I am a bit of a newbie to this whole GHC API thing, any pointers to resources to learn more about it would be enormously appreciated.

Thanks,

Craig


_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users <at> haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Ranjit Jhala | 13 Mar 21:43 2013

Re: Annotating an AST with type checking / source line number info

Hi Craig -- 

you might look at: 

  http://goto.ucsd.edu/~rjhala/llvm-haskell/doc/html/liquidtypes/Language-Haskell-Liquid-GhcInterface.html#v:getGhcInfo
  http://goto.ucsd.edu/~rjhala/llvm-haskell/doc/html/liquidtypes/src/Language-Haskell-Liquid-GhcInterface.html#getGhcModGuts1

for an example of how to turn a `FilePath` into a GHC `[CoreBind]`.

  http://www.haskell.org/ghc/docs/latest/html/libraries/ghc/CoreSyn.html#t:CoreBind

The latter (i.e. `CoreBind` and related `Expr` and friends) are GHC's 
"core" representation of programs AFTER some amount of simplification. 
(This may be too "late" for your purposes, i.e. you may want something prior
to simplification.) Still, look at this chain

       mod_guts   <- coreModule `fmap` (desugarModule =<< typecheckModule =<< parseModule modSummary)

to get a sense of the different steps...
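
The chain above can be fleshed out roughly as follows. This is only a sketch, assuming the GHC 7.4-era API mentioned in this thread and the separate ghc-paths package for `libdir`; the function name `frontEnd` and its two arguments are made up for illustration, and other GHC versions may need different setup or signatures.

```haskell
import GHC
import GHC.Paths (libdir)  -- assumption: the ghc-paths package is installed

-- | Load one source file and run it through the parser, the type checker
-- and the desugarer. `file` and `modName` are hypothetical inputs,
-- e.g. "B.hs" and "B".
frontEnd :: FilePath -> String -> IO (TypecheckedModule, DesugaredModule)
frontEnd file modName = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  _ <- setSessionDynFlags dflags        -- initialise the session before loading
  target <- guessTarget file Nothing
  setTargets [target]
  _ <- load LoadAllTargets
  summary <- getModSummary (mkModuleName modName)
  parsed  <- parseModule summary        -- parsedSource parsed: located syntax tree
  checked <- typecheckModule parsed     -- typecheckedSource checked: tree after type checking
  core    <- desugarModule checked      -- coreModule core: holds the Core bindings
  return (checked, core)
```

Compile with `-package ghc`. The intermediate values are kept separate here so that each stage can be inspected on its own before moving on to Core.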

Hope this helps!

Ranjit.

Richard Eisenberg | 14 Mar 19:00 2013

Re: Annotating an AST with type checking / source line number info

Disclaimer: I have worked on a few projects updating GHC itself; I have never used the GHC API.

You've asked for two different things: location information and type information in an AST.

The first is easy: it's already there, I believe. Many of the GHC types are prefixed with "L". By convention, an (LFoo a b) is a synonym for Located (Foo a b). Located, in turn, is defined as if by

> data Located e = L SrcSpan e

(Located is actually a synonym for something else, but this isn't important here.) So, every time you see a type prefixed with L in the AST, you have location information. These types are very prevalent, so I think you should be able to find what you're looking for. SrcSpan is in ghc/compiler/basicTypes/SrcLoc.lhs, if you want to have a look.
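
To make the L-prefix convention concrete, here is a small self-contained sketch. The `SrcSpan`, `Expr` and `litSpans` definitions are toy stand-ins invented for illustration (GHC's real `SrcSpan` and syntax types are much richer); `getLoc` and `unLoc` mirror the names of GHC's own accessors.

```haskell
-- Toy stand-in for GHC's SrcSpan: just a line and a column.
data SrcSpan = SrcSpan { spanLine :: Int, spanCol :: Int }
  deriving (Eq, Show)

-- A value paired with the place it came from.
data Located e = L SrcSpan e
  deriving (Eq, Show)

-- A tiny expression type whose children are all located,
-- following the LFoo = Located Foo convention.
data Expr
  = Var String
  | Lit Int
  | App LExpr LExpr
  deriving (Eq, Show)

type LExpr = Located Expr

-- Project out the location or the payload.
getLoc :: Located e -> SrcSpan
getLoc (L loc _) = loc

unLoc :: Located e -> e
unLoc (L _ e) = e

-- Collect the span of every literal in the tree, left to right.
litSpans :: LExpr -> [SrcSpan]
litSpans (L loc e) = case e of
  Lit _   -> [loc]
  Var _   -> []
  App f x -> litSpans f ++ litSpans x
```

A traversal over the real AST would have the same shape: pattern match on `L loc e`, then decide from `e` whether `loc` is one of the positions you want.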

As for types: this one is a little harder. Haskell code is translated into an internal language, variously called Core or System FC. The GHC type for Haskell expressions is HsExpr. The GHC type for FC expressions is CoreExpr. These types are quite different. FC expressions have explicit type annotations that can be extracted easily. Haskell expressions don't, as far as I know. As you might expect, FC expressions don't have location information,* so they may not be of much use to you. I don't know of a way to get a type of a HsExpr, but maybe someone else does.

*Though FC expressions don't have location information, FC expressions do contain Vars, which in turn contain Names, which in turn do contain location information. I don't know if this is the location of the declaration of the name or the use of the name, but you may find a useful nugget there.
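
To illustrate why types are easy to read off Core-style expressions, here is a toy model. These are not GHC's actual data types: `Type`, `Var` and `CoreExpr` below are simplified stand-ins made up for this example. Because every variable carries its type, the type of any expression can be computed from the tree alone. In GHC itself the analogous function is, as far as I know, `exprType` in the CoreUtils module.

```haskell
-- Toy types: integers and functions only.
data Type = TInt | TFun Type Type
  deriving (Eq, Show)

-- Every variable carries its own type, as Core variables do.
data Var = Var { varName :: String, varType :: Type }
  deriving (Eq, Show)

-- A toy Core-like expression language.
data CoreExpr
  = V Var
  | Lit Int
  | App CoreExpr CoreExpr
  | Lam Var CoreExpr
  deriving (Eq, Show)

-- Compute an expression's type purely from the annotations in the tree.
-- Returns Nothing if an application is ill-typed.
exprType :: CoreExpr -> Maybe Type
exprType (V v)     = Just (varType v)
exprType (Lit _)   = Just TInt
exprType (Lam v b) = fmap (TFun (varType v)) (exprType b)
exprType (App f x) =
  case (exprType f, exprType x) of
    (Just (TFun arg res), Just argTy) | arg == argTy -> Just res
    _                                                -> Nothing
```

No inference is needed here, which is the point: the annotations are already in the tree. The trade-off described above still applies, since nothing in this representation records where in the source an expression came from.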

Is it important that your tool produces Haskell source code? I believe there's a plugin architecture that would allow you to manipulate structures on their way through the compiler, if that solves your problem.

I hope this helps!
Richard

