Eric Firing | 7 Oct 03:18

path simplification with nan (or move_to)

Mike, John,

Because path simplification does not work with anything but a continuous 
line, it is turned off if there are any nans in the path.  The result is 
that if one does this:

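import numpy as np
from matplotlib.pyplot import plot

xx = np.arange(200000)
yy = np.random.rand(200000)
#plot(xx, yy)        # without the nan, path simplification stays on
yy[1000] = np.nan    # a single nan turns path simplification off
plot(xx, yy)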

the plot fails with an incomplete rendering and general 
unresponsiveness; apparently some mysterious agg limit is quietly 
exceeded.  With or without the nan, this test case also shows the 
bizarre slowness of add_line that I asked about in a message yesterday, 
and that has me completely baffled.

Both of these are major problems for real-world use.

Do you have any thoughts on timing and strategy for solving this 
problem?  A few weeks ago, when the problem with nans and path 
simplification turned up, I tried to figure out what was going on and 
how to fix it, but I did not get very far.  I could try again, but as 
you know I don't get along well with C++.

I am also wondering whether more than straightforward path 
simplification with nan/moveto might be needed.  Suppose there is a 
nightmarish time series with every third point being bad, so it is 

Michael Droettboom | 7 Oct 18:33

Re: path simplification with nan (or move_to)

Eric Firing wrote:
> Mike, John,
>
> Because path simplification does not work with anything but a 
> continuous line, it is turned off if there are any nans in the path.  
> The result is that if one does this:
>
> import numpy as np
> xx = np.arange(200000)
> yy = np.random.rand(200000)
> #plot(xx, yy)
> yy[1000] = np.nan
> plot(xx, yy)
>
> the plot fails with an incomplete rendering and general 
> unresponsiveness; apparently some mysterious agg limit is quietly 
> exceeded.
The limit in question is "cell_block_limit" in 
agg_rasterizer_cells_aa.h.  The relationship between the number of 
vertices and the number of rasterization cells I suspect depends on 
the nature of the values.

However, if we want to increase the limit, each "cell_block" is 4096 
cells, each with 16 bytes, and currently it maxes out at 1024 cell 
blocks, for a total of 67,108,864 bytes.  So, the question is, how much 
memory should be devoted to rasterization, when the data set is large 
like this?  I think we could safely quadruple this number for a lot of 
modern machines, and this maximum won't affect people plotting smaller 
data sets, since the memory is dynamically allocated anyway.  It works 
for me, but I have 4GB RAM here at work.
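
The arithmetic behind that total can be checked with a short sketch. The three constants are the figures quoted above; the function name is made up for illustration and is not part of agg or matplotlib.

```python
# Figures quoted in the message above.
CELLS_PER_BLOCK = 4096     # cells in one "cell_block"
BYTES_PER_CELL = 16        # size of one cell in bytes
CELL_BLOCK_LIMIT = 1024    # current maximum number of cell blocks


def max_rasterizer_bytes(block_limit):
    """Upper bound on cell memory for a given cell block limit."""
    return CELLS_PER_BLOCK * BYTES_PER_CELL * block_limit


current = max_rasterizer_bytes(CELL_BLOCK_LIMIT)         # 64 MiB
quadrupled = max_rasterizer_bytes(4 * CELL_BLOCK_LIMIT)  # 256 MiB
print(current, quadrupled)
```

Quadrupling the limit would therefore cap cell memory at 256 MiB instead of 64 MiB, and only plots that actually need that many cells would allocate it.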

Eric Firing | 8 Oct 00:07

Re: path simplification with nan (or move_to)

Michael Droettboom wrote:
> Eric Firing wrote:
>> Mike, John,
>>
>> Because path simplification does not work with anything but a 
>> continuous line, it is turned off if there are any nans in the path.  
>> The result is that if one does this:
>>
>> import numpy as np
>> xx = np.arange(200000)
>> yy = np.random.rand(200000)
>> #plot(xx, yy)
>> yy[1000] = np.nan
>> plot(xx, yy)
>>
>> the plot fails with an incomplete rendering and general 
>> unresponsiveness; apparently some mysterious agg limit is quietly 
>> exceeded.
> The limit in question is "cell_block_limit" in 
> agg_rasterizer_cells_aa.h.  The relationship between the number of 
> vertices and the number of rasterization cells I suspect depends on 
> the nature of the values.
> However, if we want to increase the limit, each "cell_block" is 4096 
> cells, each with 16 bytes, and currently it maxes out at 1024 cell 
> blocks, for a total of 67,108,864 bytes.  So, the question is, how much 
> memory should be devoted to rasterization, when the data set is large 
> like this?  I think we could safely quadruple this number for a lot of 
> modern machines, and this maximum won't affect people plotting smaller 
> data sets, since the memory is dynamically allocated anyway.  It works 
> for me, but I have 4GB RAM here at work.

Eric Firing | 8 Oct 10:26

Re: path simplification with nan (or move_to)

The patch in that last message of mine was clearly not quite right.  I 
have gone through several iterations, and have seemed tantalizingly 
close, but I still don't have it right yet.  I need to leave it alone 
for a while, but I do think it is important to get this working 
correctly ASAP--certainly it is for my own work, at least.

What happens with a nan should be somewhat similar to what happens with 
clipping, so perhaps one could take advantage of part of the clipping 
logic, but I have not looked at this approach closely.

Eric
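
The nan/move_to idea in the subject line can be sketched in plain numpy: drop the nan points and mark the first good point after each gap as a "move to", so that a simplifier or renderer sees several continuous runs instead of one broken line. This is only an illustration under stated assumptions, not the patch discussed in this thread. The function name gappy_path is hypothetical, and the two code values are assumed to match matplotlib.path.Path.MOVETO and Path.LINETO.

```python
import numpy as np

# Assumed to mirror matplotlib.path.Path.MOVETO and Path.LINETO.
MOVETO, LINETO = 1, 2


def gappy_path(x, y):
    """Return (vertices, codes) with nan points removed.

    The first good point after any run of nans (and the very first
    good point) gets MOVETO; every other good point gets LINETO.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    good = ~(np.isnan(x) | np.isnan(y))
    # A good point starts a new run if it is first or follows a bad point.
    prev_good = np.concatenate(([False], good[:-1]))
    starts = good & ~prev_good
    vertices = np.column_stack((x[good], y[good]))
    codes = np.where(starts[good], MOVETO, LINETO).astype(np.uint8)
    return vertices, codes


verts, codes = gappy_path([0, 1, 2, 3, 4], [0.0, 1.0, np.nan, 3.0, 4.0])
print(codes)
```

With a bad point at every third position the result is a long sequence of two-point runs, which is the nightmarish case raised in the first message.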

Eric Firing wrote:
> Michael Droettboom wrote:
>> Eric Firing wrote:
>>> Mike, John,
>>>
>>> Because path simplification does not work with anything but a 
>>> continuous line, it is turned off if there are any nans in the path.  
>>> The result is that if one does this:
>>>
>>> import numpy as np
>>> xx = np.arange(200000)
>>> yy = np.random.rand(200000)
>>> #plot(xx, yy)
>>> yy[1000] = np.nan
>>> plot(xx, yy)
>>>
>>> the plot fails with an incomplete rendering and general 
>>> unresponsiveness; apparently some mysterious agg limit is quietly 

Michael Droettboom | 8 Oct 14:38

Re: path simplification with nan (or move_to)

Eric Firing wrote:
> Michael Droettboom wrote:
>> Eric Firing wrote:
>>> Mike, John,
>>>
>>> Because path simplification does not work with anything but a 
>>> continuous line, it is turned off if there are any nans in the 
>>> path.  The result is that if one does this:
>>>
>>> import numpy as np
>>> xx = np.arange(200000)
>>> yy = np.random.rand(200000)
>>> #plot(xx, yy)
>>> yy[1000] = np.nan
>>> plot(xx, yy)
>>>
>>> the plot fails with an incomplete rendering and general 
>>> unresponsiveness; apparently some mysterious agg limit is quietly 
>>> exceeded.
>> The limit in question is "cell_block_limit" in 
>> agg_rasterizer_cells_aa.h.  The relationship between the number of 
>> vertices and the number of rasterization cells I suspect depends on 
>> the nature of the values.
>> However, if we want to increase the limit, each "cell_block" is 4096 
>> cells, each with 16 bytes, and currently it maxes out at 1024 cell 
>> blocks, for a total of 67,108,864 bytes.  So, the question is, how 
>> much memory should be devoted to rasterization, when the data set is 
>> large like this?  I think we could safely quadruple this number for a 
>> lot of modern machines, and this maximum won't affect people plotting 
>> smaller data sets, since the memory is dynamically allocated anyway.  

Michael Droettboom | 8 Oct 18:37

Re: path simplification with nan (or move_to)

Michael Droettboom wrote:
> Eric Firing wrote:
>   
>> Michael Droettboom wrote:
>>     
>>> Eric Firing wrote:
>>>       
>>>> Mike, John,
>>>>
>>>> Because path simplification does not work with anything but a 
>>>> continuous line, it is turned off if there are any nans in the 
>>>> path.  The result is that if one does this:
>>>>
>>>> import numpy as np
>>>> xx = np.arange(200000)
>>>> yy = np.random.rand(200000)
>>>> #plot(xx, yy)
>>>> yy[1000] = np.nan
>>>> plot(xx, yy)
>>>>
>>>> the plot fails with an incomplete rendering and general 
>>>> unresponsiveness; apparently some mysterious agg limit is quietly 
>>>> exceeded.
>>>>         
>>> The limit in question is "cell_block_limit" in 
>>> agg_rasterizer_cells_aa.h.  The relationship between the number of 
>>> vertices and the number of rasterization cells I suspect depends on 
>>> the nature of the values.
>>> However, if we want to increase the limit, each "cell_block" is 4096 
>>> cells, each with 16 bytes, and currently it maxes out at 1024 cell 

John Hunter | 8 Oct 18:58

Re: path simplification with nan (or move_to)

On Wed, Oct 8, 2008 at 11:37 AM, Michael Droettboom <mdroe@...> wrote:

> I figured this out.  When this happens, a RuntimeError("Agg rendering
> complexity exceeded") is thrown.

Do you think it is a good idea to put a little helper note in the
exception along the lines of

  throw "Agg rendering complexity exceeded; you may want to increase
the cell_block_size in agg_rasterizer_cells_aa.h"

in case someone gets this exception two years from now and none of us
can remember this brilliant fix :-)

Michael Droettboom | 8 Oct 19:18

Re: path simplification with nan (or move_to)

John Hunter wrote:
> On Wed, Oct 8, 2008 at 11:37 AM, Michael Droettboom <mdroe@...> wrote:
>
>   
>> I figured this out.  When this happens, a RuntimeError("Agg rendering
>> complexity exceeded") is thrown.
>>     
>
> Do you think it is a good idea to put a little helper note in the
> exception along the lines of
>
>   throw "Agg rendering complexity exceeded; you may want to increase
> the cell_block_size in agg_rasterizer_cells_aa.h"
>
> in case someone gets this exception two years from now and none of us
> can remember this brilliant fix :-)
>   
We can suggest that, or suggest that the size of the data is too large 
(which is easier for most users to fix, I would suspect).  What about:

"Agg rendering complexity exceeded.  Consider downsampling or decimating 
your data."

along with a comment (not thrown), saying

/* If this is thrown too often, increase cell_block_limit. */

Mike
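
For readers who hit that error, one way to follow the "downsample or decimate" advice is to keep only the minimum and maximum of each bucket of samples, which preserves the visible envelope of a noisy series at screen resolution. The sketch below is one possible approach, not something proposed in the thread. The name decimate_minmax is hypothetical, the data are assumed to be nan-free, and any samples left over after the last full bucket are dropped.

```python
import numpy as np


def decimate_minmax(x, y, bucket):
    """Keep the min and max of y in each bucket of `bucket` samples.

    Assumes y contains no nans. Samples beyond the last full bucket
    are discarded.
    """
    n = (len(y) // bucket) * bucket
    xb = np.asarray(x)[:n].reshape(-1, bucket)
    yb = np.asarray(y)[:n].reshape(-1, bucket)
    lo = yb.argmin(axis=1)
    hi = yb.argmax(axis=1)
    # Put the two kept indices in x order within each bucket.
    idx = np.sort(np.column_stack((lo, hi)), axis=1)
    rows = np.arange(yb.shape[0])[:, None]
    return xb[rows, idx].ravel(), yb[rows, idx].ravel()


xx = np.arange(200000)
yy = np.random.rand(200000)
xd, yd = decimate_minmax(xx, yy, 100)   # 200000 points -> 4000 points
```

The reduced arrays can be passed to plot in place of the originals. The bucket size of 100 is arbitrary; a value near the number of samples per horizontal pixel is a reasonable starting point.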


Eric Firing | 9 Oct 03:40

Re: path simplification with nan (or move_to)

Michael Droettboom wrote:
> John Hunter wrote:
>> On Wed, Oct 8, 2008 at 11:37 AM, Michael Droettboom <mdroe@...> wrote:
>>
>>   
>>> I figured this out.  When this happens, a RuntimeError("Agg rendering
>>> complexity exceeded") is thrown.
>>>     
>> Do you think it is a good idea to put a little helper note in the
>> exception along the lines of
>>
>>   throw "Agg rendering complexity exceeded; you may want to increase
>> the cell_block_size in agg_rasterizer_cells_aa.h"
>>
>> in case someone gets this exception two years from now and none of us
>> can remember this brilliant fix :-)
>>   
> We can suggest that, or suggest that the size of the data is too large 
> (which is easier for most users to fix, I would suspect).  What about:
> 
> "Agg rendering complexity exceeded.  Consider downsampling or decimating 
> your data."
> 
> along with a comment (not thrown), saying
> 
> /* If this is thrown too often, increase cell_block_limit. */
> 
> Mike
> 

John Hunter | 9 Oct 03:56

Re: path simplification with nan (or move_to)

On Wed, Oct 8, 2008 at 8:40 PM, Eric Firing <efiring@...> wrote:
> Thanks for doing this--it has already helped me in my testing of the
> gappy-path simplification support, which I have now committed.  As you
> suggested earlier, I included in path.py a check for a compatible codes
> array.
>
> The agg limit still can be a problem.  It looks like chunking could be added
> easily by making the backend_agg draw_path a python method calling the
> renderer method; if the path length exceeds some threshold, then subpaths
> would be generated and passed to the renderer method.
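
The chunking idea in the quoted paragraph might look roughly like the sketch below: a Python-level generator cuts a long run of vertices into pieces no longer than some threshold, repeating the last vertex of each piece as the first vertex of the next so the drawn line stays connected. The name iter_chunks and the threshold value are assumptions for illustration. The actual renderer call is left out, since its signature is not given in the thread.

```python
import numpy as np


def iter_chunks(vertices, max_len):
    """Yield consecutive slices of `vertices`, each at most `max_len` long.

    Adjacent slices share one vertex, so drawing them one after the
    other covers every segment of the original line exactly once.
    """
    vertices = np.asarray(vertices)
    if max_len < 2:
        raise ValueError("max_len must be at least 2")
    n = len(vertices)
    start = 0
    while start < n - 1:
        stop = min(start + max_len, n)
        yield vertices[start:stop]
        start = stop - 1   # repeat the last vertex to keep the line joined


verts = np.column_stack((np.arange(10), np.arange(10) ** 2))
chunks = list(iter_chunks(verts, 4))   # three chunks of four vertices each
```

A wrapper around the renderer's draw_path would loop over these chunks only when the path length exceeds the threshold. Line joins at the chunk boundaries, and overlapping end caps with partly transparent lines, may render slightly differently from a single unbroken path.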

In unrelated news, I am not in favor of the recent change to warn on
non-GUI backends when "show" is called.  I realize this may sometimes
cause head-scratching behavior for some users who call show and no
figure pops up, but I think this must be pretty rare.  99% of users
have a matplotlibrc which defines a GUI default.  AFAIK, the only
exceptions to this are 1) when the user has changed the rc (power
user, needs no protection) or 2) no GUI was available at build time
and the image backend was the default backend chosen (warning more
appropriate at build time).  If I am missing a use case, let me know.

I like the design where the same script can be used to either generate
a UI figure or hardcopy depending on an rc setting or a command flag
(eg backend driver) and would rather not do the warning.  This has
been a very infrequent problem for users over the years (a few times
at most?) so I am not sure that the annoyance of the warning is
justified.

If  2) in the choices above is the case you are concerned about, and
you want this warning feature in this case, we can add an rc param

Michael Droettboom | 9 Oct 13:14

Re: path simplification with nan (or move_to)

John Hunter wrote:
> On Wed, Oct 8, 2008 at 8:40 PM, Eric Firing <efiring@...> wrote:
>   
>> Thanks for doing this--it has already helped me in my testing of the
>> gappy-path simplification support, which I have now committed.  As you
>> suggested earlier, I included in path.py a check for a compatible codes
>> array.
>>
>> The agg limit still can be a problem.  It looks like chunking could be added
>> easily by making the backend_agg draw_path a python method calling the
>> renderer method; if the path length exceeds some threshold, then subpaths
>> would be generated and passed to the renderer method.
>>     
>
> In unrelated news, I am not in favor of the recent change to warn on
> non-GUI backends when "show" is called.  I realize this may sometimes
> cause head-scratching behavior for some users who call show and no
> figure pops up, but I think this must be pretty rare.  99% of users
> have a matplotlibrc which defines a GUI default.  AFAIK, the only
> exceptions to this are 1) when the user has changed the rc (power
> user, needs no protection) or 2) no GUI was available at build time
> and the image backend was the default backend chosen (warning more
> appropriate at build time).  If I am missing a use case, let me know.
>   
The motivation was when a popular Linux distribution misconfigured the 
default to the Agg backend:

https://bugs.launchpad.net/ubuntu/+source/matplotlib/+bug/278764

This warning was meant as future insurance against this happening -- and 

Michael Droettboom | 9 Oct 13:42

Re: path simplification with nan (or move_to)

Michael Droettboom wrote:
> John Hunter wrote:
>   
>> In unrelated news, I am not in favor of the recent change to warn on
>> non-GUI backends when "show" is called.  I realize this may sometimes
>> cause head-scratching behavior for some users who call show and no
>> figure pops up, but I think this must be pretty rare.  99% of users
>> have a matplotlibrc which defines a GUI default.  AFAIK, the only
>> exceptions to this are 1) when the user has changed the rc (power
>> user, needs no protection) or 2) no GUI was available at build time
>> and the image backend was the default backend chosen (warning more
>> appropriate at build time).  If I am missing a use case, let me know.
>>   
>>     
> The motivation was when a popular Linux distribution misconfigured the 
> default to the Agg backend:
>
> https://bugs.launchpad.net/ubuntu/+source/matplotlib/+bug/278764
>
> This warning was meant as future insurance against this happening -- and 
> hoping that if the packagers don't make a GUI backend the default (in an 
> attempt to reduce dependencies), that they at least would include the 
> warning patch so that users aren't left feeling that matplotlib is "broken".
>
> But we can't prevent all downstream packaging errors, so maybe this 
> patch doesn't belong in trunk. ;)
>   
>> I like the design where the same script can be used to either generate
>> a UI figure or hardcopy depending on an rc setting or a command flag
>> (eg backend driver) and would rather not do the warning.  This has

John Hunter | 9 Oct 14:31

Re: path simplification with nan (or move_to)

On Thu, Oct 9, 2008 at 6:14 AM, Michael Droettboom <mdroe@...> wrote:

>> If  2) in the choices above is the case you are concerned about, and
>> you want this warning feature in this case, we can add an rc param
>> which is autoset at build time, something like "show.warn =
>> True|False" since the build script is setting the default image
>> backend and can set "show.warn = True" when it sets an image backend
>> by default, otherwise False.
>>
>
> I intended the warning to warn against misconfiguration, so one shouldn't
> have to explicitly configure anything to get it... ;)

This is mostly academic, since I am happy with your latest changes
because I can run backend driver or do

 > python somefile.py -dAgg

and get no warning.

But .... I wasn't suggesting explicit configuration by the user.  At
build time, mpl looks for a functioning backend in setup.py and if it
fails to find one, sets Agg and creates the default matplotlibrc from
matplotlibrc.template.  In the case where no GUI was detected, the
build script could also set an rc warn-on-show flag.  The ubuntu
packager, who probably built mpl in an environment with no X11 and got
no functioning GUI, would get an rc file with backend Agg and the
warn-on-show flag set.
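
As a rough sketch of that build-time logic: the "show.warn" key is the name floated earlier in this thread and is not an existing rc parameter, and both function names are hypothetical stand-ins for whatever setup.py would actually do.

```python
def default_rc_settings(gui_backends_found):
    """Pick the default backend and the proposed warn-on-show flag.

    `gui_backends_found` is the list of GUI backends detected at
    build time, in order of preference.
    """
    if gui_backends_found:
        return {"backend": gui_backends_found[0], "show.warn": False}
    # No functioning GUI: fall back to Agg and warn when show() is called.
    return {"backend": "Agg", "show.warn": True}


def render_rc(settings):
    """Format the settings as 'key : value' lines for an rc file."""
    return "\n".join(f"{key} : {value}" for key, value in settings.items())


print(render_rc(default_rc_settings([])))
```

A user who later selects Agg on purpose would have an rc file built with a GUI present, so the flag would be False and no warning would appear.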

But I think we can leave things as they are.

