Jon Fairbairn | 24 Oct 12:08 2012
X-Face
Picon
Picon

Sparse records/ADTs


Is there a convenient way of handling a data structure with lots
of fields of different types that may or may not be filled in?

Something equivalent to

data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}

but with better space efficiency and a more convenient empty
object.

An easy alternative is

data E = Ea A | Eb B | Ec C | …
type R = [E]

which has a straightforward empty object, but one then must
define

   getA e = listToMaybe [a | Ea a <- e]

for each field, which is tedious (and O(n)). Obviously Templates
would help, but is there an alternative I’ve missed?

--

-- 
Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
(Continue reading)

Roman Cheplyaka | 24 Oct 12:55 2012

Re: Sparse records/ADTs

* Jon Fairbairn <jon.fairbairn <at> cl.cam.ac.uk> [2012-10-24 11:08:29+0100]
> Is there a convenient way of handling a data structure with lots
> of fields of different types that may or may not be filled in?
> 
> Something equivalent to
> 
> data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}
> 
> but with better space efficiency and a more convenient empty
> object.
> 
> An easy alternative is
> 
> data E = Ea A | Eb B | Ec C | …
> type R = [E]
> 
> which has a straightforward empty object, but one then must
> define
> 
>    getA e = listToMaybe [a | Ea a <- e]
> 
> for each field, which is tedious (and O(n)). Obviously Templates
> would help, but is there an alternative I’ve missed?

For runtime efficiency it's better to use Data.Map.

For keys of this map you have two alternatives: either define

    data Key = Ka | Kb | Kc | ...

(Continue reading)

Roman Cheplyaka | 24 Oct 13:06 2012

Re: Sparse records/ADTs

* Roman Cheplyaka <roma <at> ro-che.info> [2012-10-24 13:55:17+0300]
> * Jon Fairbairn <jon.fairbairn <at> cl.cam.ac.uk> [2012-10-24 11:08:29+0100]
> > Is there a convenient way of handling a data structure with lots
> > of fields of different types that may or may not be filled in?
> > 
> > Something equivalent to
> > 
> > data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}
> > 
> > but with better space efficiency and a more convenient empty
> > object.
> > 
> > An easy alternative is
> > 
> > data E = Ea A | Eb B | Ec C | …
> > type R = [E]
> > 
> > which has a straightforward empty object, but one then must
> > define
> > 
> >    getA e = listToMaybe [a | Ea a <- e]
> > 
> > for each field, which is tedious (and O(n)). Obviously Templates
> > would help, but is there an alternative I’ve missed?
> 
> For runtime efficiency it's better to use Data.Map.

Actually, you can use Data.IntMap for even better performance, if you
define an Enum instance for your keys.

(Continue reading)

Daniel Trstenjak | 24 Oct 13:02 2012
Picon

Re: Sparse records/ADTs


Hi Jon

On Wed, Oct 24, 2012 at 11:08:29AM +0100, Jon Fairbairn wrote:
> for each field, which is tedious (and O(n)). Obviously Templates
> would help, but is there an alternative I’ve missed?

perhaps something like:

data Type = Ta | Tb | Tc ...

data E    = Ea A | Eb B | Ec C | ...

type D    = HashMap Type E

get :: Type -> D -> Maybe E
get = lookup

Greetings,
Daniel

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Twan van Laarhoven | 24 Oct 17:10 2012
Picon

Re: Sparse records/ADTs

On 24/10/12 12:08, Jon Fairbairn wrote:
>
> Is there a convenient way of handling a data structure with lots
> of fields of different types that may or may not be filled in?
>

Not sure about convenience, but here is a type safe solution with O(log n) 
lookups and updates. The idea is to define a GADT tree type with a fixed layout:

     -- define the structure
     type MyT = TBranch (TLeaf A) (TBranch (TLeaf B) (TLeaf C))
     -- a value level tree that uses that structure
     type My  = GTree MyT

You still have to define the paths to the members

     pa = GL GH
     pb = GR (GL GH)
     pc = GR (GR GH)

But once you have that you can perform lookups and updates:

     *Main> glookup pc (gupdate pa (Just A) (gupdate pc (Just C) gempty))
     Just C

It shouldn't be too hard to make a template haskell function that generates 
these paths. Or perhaps the corresponding lenses.

Twan
(Continue reading)

Jon Fairbairn | 26 Oct 17:17 2012
X-Face
Picon
Picon

Re: Sparse records/ADTs

Twan van Laarhoven <twanvl <at> gmail.com> writes:

> On 24/10/12 12:08, Jon Fairbairn wrote:
>>
>> Is there a convenient way of handling a data structure with lots
>> of fields of different types that may or may not be filled in?
>>
>
> Not sure about convenience, but here is a type safe solution
> with O(log n) lookups and updates. The idea is to define a
> GADT tree type with a fixed layout:

Thanks for your reply (and for all the others). Since type safe
is something that (for me) goes without saying, this is the best
solution, but it doesn’t really satisfy the convenience aspect.
(I had already looked at solutions using Map and contemplated a
tree structure, but didn’t like anything I had come up with). In
short, it looks like the answer to my question is “No.” :-/

--

-- 
Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Yuri de Wit | 26 Oct 17:25 2012
Picon

Re: Sparse records/ADTs

Would this be relevant?




On Fri, Oct 26, 2012 at 11:17 AM, Jon Fairbairn <jon.fairbairn <at> cl.cam.ac.uk> wrote:
Twan van Laarhoven <twanvl <at> gmail.com> writes:

> On 24/10/12 12:08, Jon Fairbairn wrote:
>>
>> Is there a convenient way of handling a data structure with lots
>> of fields of different types that may or may not be filled in?
>>
>
> Not sure about convenience, but here is a type safe solution
> with O(log n) lookups and updates. The idea is to define a
> GADT tree type with a fixed layout:

Thanks for your reply (and for all the others). Since type safe
is something that (for me) goes without saying, this is the best
solution, but it doesn’t really satisfy the convenience aspect.
(I had already looked at solutions using Map and contemplated a
tree structure, but didn’t like anything I had come up with). In
short, it looks like the answer to my question is “No.” :-/

--
Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk


_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Jon Fairbairn | 27 Oct 12:04 2012
X-Face
Picon
Picon

Re: Sparse records/ADTs

Yuri de Wit <ydewit <at> gmail.com> writes:

> Would this be relevant?
>
> https://github.com/jonsterling/Data.Records

That looks promising, thanks.  It does use rather a lot of
extensions, so it’ll take me a while to understand it.
--

-- 
Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Sjoerd Visscher | 27 Oct 12:19 2012

Re: Sparse records/ADTs

Maybe the vault package works for you?
http://hackage.haskell.org/package/vault

Sjoerd Visscher

On Oct 26, 2012, at 5:17 PM, Jon Fairbairn <jon.fairbairn <at> cl.cam.ac.uk> wrote:

> Twan van Laarhoven <twanvl <at> gmail.com> writes:
> 
>> On 24/10/12 12:08, Jon Fairbairn wrote:
>>> 
>>> Is there a convenient way of handling a data structure with lots
>>> of fields of different types that may or may not be filled in?
>>> 
>> 
>> Not sure about convenience, but here is a type safe solution
>> with O(log n) lookups and updates. The idea is to define a
>> GADT tree type with a fixed layout:
> 
> Thanks for your reply (and for all the others). Since type safe
> is something that (for me) goes without saying, this is the best
> solution, but it doesn’t really satisfy the convenience aspect.
> (I had already looked at solutions using Map and contemplated a
> tree structure, but didn’t like anything I had come up with). In
> short, it looks like the answer to my question is “No.” :-/
> 
> -- 
> Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk
> 
> 
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe <at> haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
AntC | 25 Oct 01:10 2012
Picon

Re: Sparse records/ADTs

Jon Fairbairn <jon.fairbairn <at> cl.cam.ac.uk> writes:

> 
> 
> Is there a convenient way of handling a data structure with lots
> of fields of different types that may or may not be filled in?
> 

Hi Jon, if your question had appeared in a database forum, the answer would 
be ...

Sounds like you need to do normal-form analysis:
- what are the functional dependencies between the fields?
- is your data structure something like a 'universal relation'?
- is the structure better expressed as several records?
  (vertically partitioned)

Relevant treatments could be "How to Handle missing information without using 
NULL"  http://www.dcs.warwick.ac.uk/~hugh/TTM/Missing-info-without-nulls.pdf

I appreciate that SQL's NULL is not the same as Haskell's Maybe/Nothing (and 
SQL's semantics for NULL are utterly horrible). Nevertheless, if you're 
concerned to avoid the wasted space of sparse records, it could be a helpful 
discipline to design data structures without needing Maybe's.

AntC
David Thomas | 25 Oct 02:30 2012
Picon

Re: Sparse records/ADTs

You've got a bunch of great answers, if there's no rhyme or reason to
which fields are missing.

If, on the other hand, they will tend to be present or absent in
groups, you could decompose your data-structure a bit, for fast
lookups, good space efficiency, and maybe even slightly more
interesting checks from the type system.

On Wed, Oct 24, 2012 at 3:08 AM, Jon Fairbairn
<jon.fairbairn <at> cl.cam.ac.uk> wrote:
>
> Is there a convenient way of handling a data structure with lots
> of fields of different types that may or may not be filled in?
>
> Something equivalent to
>
> data D = D {a::Maybe A, b::Maybe B, c::Maybe C, …}
>
> but with better space efficiency and a more convenient empty
> object.
>
> An easy alternative is
>
> data E = Ea A | Eb B | Ec C | …
> type R = [E]
>
> which has a straightforward empty object, but one then must
> define
>
>    getA e = listToMaybe [a | Ea a <- e]
>
> for each field, which is tedious (and O(n)). Obviously Templates
> would help, but is there an alternative I’ve missed?
>
> --
> Jón Fairbairn                                 Jon.Fairbairn <at> cl.cam.ac.uk
>
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe <at> haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe

Gmane