Kyle Hanson | 30 Aug 00:35 2013

Debugging ByteString and Data.Binary.Get memory usage

OK

I have a bunch of BSON documents that I convert to ByteStrings, put in a Map, and write to a socket based on the response. I noticed some high memory usage (in the GBs), so I decided to investigate. I simplified my problem into a small program that demonstrates more clearly what is happening.

I wrote two versions, one with a lazy Map and lazy ByteStrings and one with a strict Map and strict ByteStrings. Both show the same memory behavior (except that the lazy ByteString one is faster).

Here is the strict version:


And here is the lazy version:


I used this to compare the memory and speed behavior of ByteStrings generated by converting a BSON document with that of ByteStrings generated more purely.

The length of the ByteString from a BSON document is 68k and the length of the "pure" BS is 70k. 

This is the weird memory behavior: both the BSON and the "pure" methods use the same amount of memory after inserting 10k of them (90 MB).

However, when I look up a value, the memory usage of the BSON Map explodes to over 250 MB, even if I look up just one value. Looking up any number of values in the "pure BS" Map keeps the memory usage stable (90 MB).

I am hoping someone can help me understand this. I have read some posts about temporary ByteStrings causing memory issues, but I don't know how to get started debugging.

--
Kyle Hanson
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe <at> haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
Johan Tibell | 30 Aug 00:47 2013

Re: Debugging ByteString and Data.Binary.Get memory usage

A good starting point is to estimate how much space you think the data should take using e.g.


If you do that, is the actual space usage close to what you expected?
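
As an illustration of that kind of estimate, here is a rough sketch. The per-object overheads are assumed rules of thumb for 64-bit GHC (they are not from this thread and differ between GHC and library versions), the Int key type is an assumption, and the function names are made up for the example.

```haskell
-- Back-of-envelope heap estimate for a Map Int ByteString.
-- Every constant below is an assumed rule of thumb, not a measurement;
-- adjust them for your GHC, containers and bytestring versions.

wordBytes :: Int
wordBytes = 8            -- assumed: 64-bit machine word

mapNodeWords, boxedIntWords, byteStringWords :: Int
mapNodeWords    = 6      -- assumed: overhead of one Data.Map entry
boxedIntWords   = 2      -- assumed: one boxed Int key
byteStringWords = 9      -- assumed: strict ByteString header and bookkeeping

-- | Estimated bytes for n entries whose values each hold payloadBytes of
-- distinct (unshared), fully evaluated data.
estimateBytes :: Int -> Int -> Int
estimateBytes n payloadBytes =
  n * (wordBytes * (mapNodeWords + boxedIntWords + byteStringWords) + payloadBytes)
```

For example, `estimateBytes 10000 68000` models 10,000 entries of 68,000 bytes each. If the observed heap is far from such a figure in either direction, that gap is a hint that values are shared, not yet evaluated, or retained somewhere unexpected.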


Bob Ippolito | 30 Aug 06:09 2013

Re: Debugging ByteString and Data.Binary.Get memory usage

Building a map with foldr seems unwise. Have you tried doing it with fromListWith instead, or with foldl'? Either way, since you never even put the map into WHNF, none of the computation is done until the first lookup.
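
A minimal sketch of the two alternatives mentioned above. The original program is not shown here, so `encodeDoc` is a hypothetical stand-in for the BSON-to-ByteString conversion, and the Int keys are an assumption.

```haskell
import Data.List (foldl')
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the real BSON-to-ByteString conversion.
encodeDoc :: Int -> B.ByteString
encodeDoc n = B.pack ("document " ++ show n)

-- Strict left fold: the accumulated map is forced to WHNF at every step,
-- and Data.Map.Strict forces each value to WHNF when it is inserted.
buildWithFold :: [Int] -> Map.Map Int B.ByteString
buildWithFold = foldl' (\m k -> Map.insert k (encodeDoc k) m) Map.empty

-- fromListWith: the combining function receives the new value first and the
-- old value second; keeping the new one matches repeated Map.insert.
buildWithFromList :: [Int] -> Map.Map Int B.ByteString
buildWithFromList ks =
  Map.fromListWith (\new _old -> new) [(k, encodeDoc k) | k <- ks]
```

Both functions still return a thunk until something demands the result, so a caller that wants the work done up front has to force the map, for example with `seq` or `$!`.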


Kyle Hanson | 30 Aug 06:33 2013

Re: Debugging ByteString and Data.Binary.Get memory usage

Thanks Bob, 

I made it foldr because it was meant to simulate the sequential IO action that my server uses to populate the Map.

The problem turned out to be that I need to force the map to be evaluated; adding a little $! fixed it.
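
For readers following along, this is one way such a fix can look. The message does not say where the $! went, so the IORef-based store and the name `insertDoc` are assumptions made only for illustration.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.ByteString.Char8 as B
import qualified Data.Map.Strict as Map

-- Hypothetical server-side store: one IO action per incoming document.
insertDoc :: IORef (Map.Map Int B.ByteString) -> Int -> B.ByteString -> IO ()
insertDoc ref k v = do
  m <- readIORef ref
  -- ($!) evaluates the new map to WHNF before it is written back, so the
  -- IORef never accumulates a chain of pending inserts.
  writeIORef ref $! Map.insert k v m
```

Note that $! only evaluates to weak head normal form. Here that is enough, because Data.Map.Strict forces each value as it is inserted and the spine of the map is strict.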

--
Kyle Hanson




Bob Ippolito | 30 Aug 06:43 2013

Re: Debugging ByteString and Data.Binary.Get memory usage

foldl' is the right way to simulate the sequential IO action; foldr would be doing it in reverse (and for large enough input will overflow the stack).
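
A small sketch of the ordering difference, using made-up keys and values. With duplicate keys the last insert wins, which makes the processing order visible.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Inserts in list order, the way a sequence of IO actions would.
viaFoldl :: [(Int, String)] -> Map.Map Int String
viaFoldl = foldl' (\m (k, v) -> Map.insert k v m) Map.empty

-- Inserts in reverse list order: the last element is inserted first,
-- so the first element of the list ends up overwriting it.
viaFoldr :: [(Int, String)] -> Map.Map Int String
viaFoldr = foldr (\(k, v) m -> Map.insert k v m) Map.empty
```

Given `[(1, "first"), (1, "second")]`, `viaFoldl` keeps "second" for key 1, while `viaFoldr` keeps "first".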


