Martin Spacek | 3 Dec 2007 02:22

Re: Loading a > GB file into array

Sebastian Haase wrote:
> reading this thread I have two comments.
> a) *Displaying* at 200Hz probably makes little sense, since humans
> would only see about max. of 30Hz (aka video frame rate).
> Consequently you would want to separate your data frame rate, that (as
> I understand) you want to save data to disk and - asynchronously -
> "display as many frames as you can" (I have used pyOpenGL for this
> with great satisfaction)

Hi Sebastian,

Although 30Hz looks pretty good, if you watch a 60fps movie, you can
easily tell the difference. It's much smoother. Try recording AVIs on a
point and shoot digital camera, if you have one that can do both 30fps
and 60fps (like my fairly old Canon SD200).

And that's just perception. We're doing neurophysiology, recording from
neurons in the visual cortex, which can phase lock to CRT screen rasters
up to 100Hz. This is an artifact we don't want to deal with, so we use a
200Hz monitor. I need to be certain of exactly what's on the monitor on
every refresh, ie every 5ms, so I run python (with Andrew Straw's
package VisionEgg) as a "realtime" priority process in windows on a dual
core computer, which lets me reliably update the video frame buffer in
time for the next refresh, without having to worry about windows
multitasking butting in and stealing CPU cycles for the next 15-20ms.
Python runs on one core in "realtime", windows does its junk on the
other core. Right now, every 3rd video refresh (ie every 15ms, which is
66.7 Hz, close to the original 60fps the movie was recorded at) I update
with a new movie frame. That update needs to happen in less than 5ms,
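every time. If there's any disk access involved during the update, it
inevitably exceeds that time limit, so I have to have it all in RAM
before playback begins. Having a second I/O thread running on the second
core would be great though.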
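
The timing figures in the message above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function names are made up for illustration, and the only inputs taken from the message are the 200 Hz refresh rate and the update on every 3rd refresh.

```python
def refresh_period_ms(refresh_hz):
    """Time between two monitor refreshes, in milliseconds."""
    return 1000.0 / refresh_hz


def movie_frame_rate_hz(refresh_hz, refreshes_per_frame):
    """Rate of new movie frames when one is shown every N refreshes."""
    return refresh_hz / refreshes_per_frame


period = refresh_period_ms(200)       # time budget for one update
frame_interval = 3 * period           # a new movie frame every 3rd refresh
rate = movie_frame_rate_hz(200, 3)    # resulting movie frame rate

print(period, frame_interval, round(rate, 1))  # prints: 5.0 15.0 66.7
```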

Francesc Altet | 3 Dec 2007 14:40

Re: Loading a > GB file into array

On Monday 03 December 2007, Martin Spacek wrote:

> Sebastian Haase wrote:
> > reading this thread I have two comments.
> > a) *Displaying* at 200Hz probably makes little sense, since humans
> > would only see about max. of 30Hz (aka video frame rate).
> > Consequently you would want to separate your data frame rate, that
> > (as I understand) you want to save data to disk and -
> > asynchronously - "display as many frames as you can" (I have used
> > pyOpenGL for this with great satisfaction)
>
> Hi Sebastian,
>
> Although 30Hz looks pretty good, if you watch a 60fps movie, you can
> easily tell the difference. It's much smoother. Try recording AVIs on
> a point and shoot digital camera, if you have one that can do both
> 30fps and 60fps (like my fairly old Canon SD200).
>
> And that's just perception. We're doing neurophysiology, recording
> from neurons in the visual cortex, which can phase lock to CRT screen
> rasters up to 100Hz. This is an artifact we don't want to deal with,
> so we use a 200Hz monitor. I need to be certain of exactly what's on
> the monitor on every refresh, ie every 5ms, so I run python (with
> Andrew Straw's package VisionEgg) as a "realtime" priority process in
> windows on a dual core computer, which lets me reliably update the
> video frame buffer in time for the next refresh, without having to
> worry about windows multitasking butting in and stealing CPU cycles
> for the next 15-20ms. Python runs on one core in "realtime", windows
> does its junk on the other core. Right now, every 3rd video refresh
> (ie every 15ms, which is 66.7 Hz, close to the original 60fps the
> movie was recorded at) I update with a new movie frame. That update
> needs to happen in less than 5ms, every time. If there's any disk
> access involved during the update, it inevitably exceeds that time
> limit, so I have to have it all in RAM before playback begins. Having
> a second I/O thread running on the second core would be great though.

One thing that should improve your timings is to first read through your data file(s), discarding the data as you read it. This serves only to load the file entirely into the OS page cache (if you have enough memory, which seems to be your case). Then, the second time your code has to read the data, the OS only has to retrieve it from its cache (i.e. from memory) rather than from disk.
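
A minimal sketch of such a first pass, assuming plain Python file I/O is enough to populate the cache. The function name warm_page_cache and the chunk size are illustrative choices, not something from this thread, and whether the data actually stays cached depends on free RAM and OS policy.

```python
import os
import tempfile


def warm_page_cache(path, chunk_size=16 * 1024 * 1024):
    """Read `path` sequentially and discard the data.

    The only purpose is to encourage the OS to keep the file in its page
    cache. Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)  # the chunk itself is dropped immediately
    return total


# Small self-contained demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * 1000000)
    demo_path = tmp.name

n_read = warm_page_cache(demo_path)
os.remove(demo_path)
print(n_read)
```

For a real movie file, the call would be made once before playback starts, followed by the normal loading code.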

You can do this with whatever technique you want, but if you want to read from a single container and memmap is giving you headaches on 32-bit platforms, you might try PyTables, because it allows 64-bit disk addressing transparently, even on 32-bit machines.

HTH,

--
>0,0< Francesc Altet     http://www.carabos.com/
V V Cárabos Coop. V.   Enjoy Data
"-"

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion <at> scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Martin Spacek | 3 Dec 2007 21:44

Re: Loading a > GB file into array

Francesc Altet wrote:
> One thing that should improve your timings is to first read through
> your data file(s), discarding the data as you read it. This serves
> only to load the file entirely into the OS page cache (if you have
> enough memory, which seems to be your case). Then, the second time
> your code has to read the data, the OS only has to retrieve it from
> its cache (i.e. from memory) rather than from disk.

I think I tried that, loading the whole file into memory, throwing it 
away, then trying to load on the fly from "disk" (which would now 
hopefully be done more optimally the 2nd time around) while displaying 
the movie, but I still got update times > 5ms. The file's just too big 
to get any improvement by sort of preloading this way.

> You can do this with whatever technique you want, but if you want to
> read from a single container and memmap is giving you headaches on
> 32-bit platforms, you might try PyTables, because it allows 64-bit
> disk addressing transparently, even on 32-bit machines.

PyTables sounds interesting, I might take a look. Thanks.

Martin

Gael Varoquaux | 3 Dec 2007 07:27

Re: Loading a > GB file into array

On Sun, Dec 02, 2007 at 05:22:49PM -0800, Martin Spacek wrote:
> so I run python (with Andrew Straw's
> package VisionEgg) as a "realtime" priority process in windows on a dual
> core computer, which lets me reliably update the video frame buffer in
> time for the next refresh, without having to worry about windows
> multitasking butting in and stealing CPU cycles for the next 15-20ms.

Very interesting. Have you made measurements to see how many times you
lost one of your cycles? I made this kind of measurement on Linux using
the real-time clock with C and it was very interesting (
http://www.gael-varoquaux.info/computers/real-time ). I want to redo them
with Python, as I expect to have similar results with Python. It would be
interesting to see how Windows fits in the picture (I know nothing about
Windows, so I really can't make measurements on Windows).
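
A rough Python sketch of such a measurement might look like this. It is not the C program from the page above: it busy-waits on time.perf_counter() instead of using the real-time clock, and the function name, the 5 ms period and the 1 ms tolerance are assumptions picked for illustration.

```python
import time


def measure_missed_deadlines(period_s=0.005, n_cycles=200, tolerance_s=0.001):
    """Run a fixed-period busy-wait loop and count late wake-ups.

    Returns (missed, worst_lateness_s). A cycle counts as missed when the
    loop notices the deadline more than `tolerance_s` after it passed.
    """
    missed = 0
    worst = 0.0
    next_deadline = time.perf_counter() + period_s
    for _ in range(n_cycles):
        # Busy-wait: burns CPU, but avoids the coarse granularity of sleep().
        while time.perf_counter() < next_deadline:
            pass
        lateness = time.perf_counter() - next_deadline
        worst = max(worst, lateness)
        if lateness > tolerance_s:
            missed += 1
        next_deadline += period_s
    return missed, worst


missed, worst = measure_missed_deadlines()
print("missed %d of 200 cycles, worst lateness %.3f ms" % (missed, worst * 1e3))
```

The numbers depend entirely on the machine, the OS scheduler and the current load, so no expected output is given.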

Cheers,

Gaël 

Martin Spacek | 4 Dec 2007 02:20

Re: Loading a > GB file into array

Gael Varoquaux wrote:
> Very interesting. Have you made measurements to see how many times you
> lost one of your cycles? I made this kind of measurement on Linux using
> the real-time clock with C and it was very interesting (
> http://www.gael-varoquaux.info/computers/real-time ). I want to redo them
> with Python, as I expect to have similar results with Python. It would be
> interesting to see how Windows fits in the picture (I know nothing about
> Windows, so I really can't make measurements on Windows).

Neat, thanks for that, I'll have a look. I'm very slowly transitioning 
my computing "life" over to linux, but I've been told by Andrew Straw (I 
think http://visionegg.org might have some details, see the mailing 
list) that it's harder to get close to a real-time OS with linux (while 
running high level stuff like opengl and python) than it is in windows. 
I hope that's changed, or is changing. I'd love to switch over to 64-bit 
linux. As far as windows is concerned, I'd like 32bit winxp to be my
last iteration.

Martin

David Cournapeau | 4 Dec 2007 06:13

Re: Loading a > GB file into array

Martin Spacek wrote:
> Gael Varoquaux wrote:
>> Very interesting. Have you made measurements to see how many times you
>> lost one of your cycles? I made this kind of measurement on Linux using
>> the real-time clock with C and it was very interesting (
>> http://www.gael-varoquaux.info/computers/real-time ). I want to redo them
>> with Python, as I expect to have similar results with Python. It would be
>> interesting to see how Windows fits in the picture (I know nothing about
>> Windows, so I really can't make measurements on Windows).
>
> Neat, thanks for that, I'll have a look. I'm very slowly transitioning 
> my computing "life" over to linux, but I've been told by Andrew Straw (I 
> think http://visionegg.org might have some details, see the mailing 
> list) that it's harder to get close to a real-time OS with linux (while 
> running high level stuff like opengl and python) than it is in windows.
My impression is that it is rather the contrary; linux implements
many posix facilities for more 'real-time' behaviour: it implements a
FIFO scheduler, you have mlock facilities to avoid paging, etc... and of
course, you can control your environment much more easily (one buggy
driver can kill the whole thing as far as latency is concerned, for
example). I could not find the info you are talking about on visionegg,
though.

Now, for python, this is a different matter. If you need to do things in
real-time, setting a high priority is not enough, and python has several
characteristics which make it less than suitable for real-time (heavy
usage of memory allocation, garbage collector, etc...). I guess that
when you do things every few ms, with enough memory (every ms gives you
millions of cycles on modern machines), you can hope that it is not too
much of a problem (at least for memory allocation; I could not find in
Andrew's slides whether he disabled the GC). But I doubt you can do much
better.

I wonder if python can be compiled with a real time memory allocator, 
and even whether it makes sense at all (I am thinking about something 
like TLSF: http://rtportal.upv.es/rtmalloc/).
>  
> I hope that's changed, or is changing. I'd love to switch over to 64-bit 
> linux. As far as windows is concerned, I'd like 32bit winxp to be my
> last iteration.
With recent kernels, you can get really good latency if you do it right
(around 1-2 ms worst case under high load, including high IO pressure).
I know nothing about video programming, but I would guess that as far as
the kernel is concerned, this does not change much. I have not tried
them myself, but ubuntu studio has its own kernel with 'real-time'
patches (voluntary preempt from Ingo, for example), and is available
for both 32 and 64 bit architectures. One problem I can think of for
video is if you need binary-only drivers: those are generally
pretty bad as far as low latency is concerned (nvidia drivers always
cause some kind of problems with low latency and 'real-time' kernels).

http://ubuntustudio.org/

David

Gael Varoquaux | 4 Dec 2007 10:36

Re: Loading a > GB file into array

On Tue, Dec 04, 2007 at 02:13:53PM +0900, David Cournapeau wrote:
> With recent kernels, you can get really good latency if you do it right 
> (around 1-2 ms worst case under high load, including high IO pressure). 

As you can see on my page, I indeed measured less than 1ms latency on
Linux under load with a kernel more than a year old. These things have
gotten much better recently and with a preemptible kernel you should be
able to get 1ms easily. Going below 0.5ms without using a realtime OS (ie
a realtime kernel, under linux) is really pushing it.

Cheers,

Gaël

Andrew Straw | 4 Dec 2007 11:28

Re: Loading a > GB file into array

Hi all,

I haven't done any serious testing in the past couple years, but for 
this particular task -- drawing frames using OpenGL without ever 
skipping a video update -- it is my impression that as of a few Ubuntu 
releases ago (Edgy?) Windows still beat linux.

Just now, I have investigated on 2.6.22-14-generic x86_64 as packaged by
Ubuntu 7.10, and I didn't skip a frame out of 1500 at 60 Hz. That's not 
much testing, but it is certainly better performance than I've seen in 
the recent past, so I'll certainly be doing some more testing soon. Oh, 
how I'd love to never be forced to use Windows again.

Leaving my computer displaying moving images overnight, (and tomorrow at 
lab on a 200 Hz display),
Andrew

Gael Varoquaux wrote:
> On Tue, Dec 04, 2007 at 02:13:53PM +0900, David Cournapeau wrote:
>   
>> With recent kernels, you can get really good latency if you do it right 
>> (around 1-2 ms worst case under high load, including high IO pressure). 
>>     
>
> As you can see on my page, I indeed measured less than 1ms latency on
> Linux under load with a kernel more than a year old. These things have
> gotten much better recently and with a preemptible kernel you should be
> able to get 1ms easily. Going below 0.5ms without using a realtime OS (ie
> a realtime kernel, under linux) is really pushing it.
>
> Cheers,
>
> Gaël

David Cournapeau | 4 Dec 2007 11:38

Re: Loading a > GB file into array

Andrew Straw wrote:
> Hi all,
>
> I haven't done any serious testing in the past couple years, but for 
> this particular task -- drawing frames using OpenGL without ever 
> skipping a video update -- it is my impression that as of a few Ubuntu 
> releases ago (Edgy?) Windows still beat linux.
>   
The problem is that this kind of thing is really distribution
dependent (because of kernel patches).
> Just now, I have investigated on 2.6.22-14-generic x86_64 as packaged by
> Ubuntu 7.10, and I didn't skip a frame out of 1500 at 60 Hz. That's not 
> much testing, but it is certainly better performance than I've seen in 
> the recent past, so I'll certainly be doing some more testing soon. Oh, 
> how I'd love to never be forced to use Windows again.
>   
You should try the rt kernel: https://wiki.ubuntu.com/RealTime/Gutsy. 
This does make a huge difference.

cheers,

David

David Cournapeau | 4 Dec 2007 11:05

Re: Loading a > GB file into array

Gael Varoquaux wrote:
> On Tue, Dec 04, 2007 at 02:13:53PM +0900, David Cournapeau wrote:
>   
>> With recent kernels, you can get really good latency if you do it right 
>> (around 1-2 ms worst case under high load, including high IO pressure). 
>>     
>
> As you can see on my page, I indeed measured less than 1ms latency on
> Linux under load with a kernel more than a year old. These things have
> gotten much better recently and with a preemptible kernel you should be
> able to get 1ms easily. Going below 0.5ms without using a realtime OS (ie
> a realtime kernel, under linux) is really pushing it.
>   
Yes, 1ms has been possible for quite a long time; the problem was how to
get there (kernel patches, special permissions, etc... Many of those
problems are now gone). I've read that you could get around 0.2 ms and
even below (worst case) with the latest kernels + RT preempt (that is you
still use linux, and not rtlinux). Below 1 ms does not make much sense
for audio applications, so I don't know much below this range :)

But I am really curious if you can get those numbers with python,
because of malloc, the gc and co. I mean for example, 0.5 ms latency for
a 1 GHz CPU means that you get something like 500 000 CPU cycles, and
I can imagine a cycle of garbage collection taking that many cycles,
without even considering pages of virtual memory which are swapped (in
this case, we are talking millions of cycles).
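
The cycle budget above is simple arithmetic. A small sketch (the function name is hypothetical) makes it easy to redo for other latencies and clock speeds:

```python
def cycle_budget(latency_us, clock_mhz):
    """CPU cycles available within a latency window.

    microseconds * (cycles per microsecond) = cycles, so integer inputs
    give an exact integer result.
    """
    return latency_us * clock_mhz


print(cycle_budget(500, 1000))   # 0.5 ms at 1 GHz -> 500000
print(cycle_budget(5000, 1000))  # a 5 ms refresh interval at 1 GHz -> 5000000
```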

cheers,

David

Timothy Hochberg | 4 Dec 2007 17:07

Re: Loading a > GB file into array



On Dec 4, 2007 3:05 AM, David Cournapeau <david <at> ar.media.kyoto-u.ac.jp> wrote:
> Gael Varoquaux wrote:
> > On Tue, Dec 04, 2007 at 02:13:53PM +0900, David Cournapeau wrote:
> > > With recent kernels, you can get really good latency if you do it right
> > > (around 1-2 ms worst case under high load, including high IO pressure).
> >
> > As you can see on my page, I indeed measured less than 1ms latency on
> > Linux under load with a kernel more than a year old. These things have
> > gotten much better recently and with a preemptible kernel you should be
> > able to get 1ms easily. Going below 0.5ms without using a realtime OS (ie
> > a realtime kernel, under linux) is really pushing it.
>
> Yes, 1ms has been possible for quite a long time; the problem was how to
> get there (kernel patches, special permissions, etc... Many of those
> problems are now gone). I've read that you could get around 0.2 ms and
> even below (worst case) with the latest kernels + RT preempt (that is you
> still use linux, and not rtlinux). Below 1 ms does not make much sense
> for audio applications, so I don't know much below this range :)
>
> But I am really curious if you can get those numbers with python,
> because of malloc, the gc and co. I mean for example, 0.5 ms latency for
> a 1 GHz CPU means that you get something like 500 000 CPU cycles, and
> I can imagine a cycle of garbage collection taking that many cycles,
> without even considering pages of virtual memory which are swapped (in
> this case, we are talking millions of cycles).

If the garbage collector is causing a slowdown, it is possible to turn it off. Then you have to be careful to break cycles manually. Non-cyclic garbage will get picked up by reference counting, so you can ignore that. Figuring out references in the context of numpy might be a little tricky given that views imply references, but it's probably not impossible.
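
A minimal sketch of that approach, using only the standard gc module. The Node class and the helper names are made up for illustration. The point is that with the collector disabled, a cycle has to be broken by hand before reference counting can free the objects.

```python
import gc


class Node:
    """Toy object that can point at another Node."""

    def __init__(self):
        self.other = None


def run_without_cyclic_gc(work):
    """Call `work()` with the cyclic collector off, then restore its state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        return work()
    finally:
        if was_enabled:
            gc.enable()


def make_and_break_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a    # reference cycle: refcounts cannot reach zero
    a.other = b.other = None   # break it by hand so refcounting frees both
    return gc.isenabled()      # False while inside run_without_cyclic_gc


collector_on_during_work = run_without_cyclic_gc(make_and_break_cycle)
print(collector_on_during_work)  # prints: False
```

In a display loop, the time-critical part would go inside the function passed to run_without_cyclic_gc, and an explicit gc.collect() could be run between trials if cycles are unavoidable.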

-tim

> cheers,
>
> David

--
.  __
.   |-\
.
.  tim.hochberg <at> ieee.org
