Frank Heckes | 17 Feb 2012 15:59

Are measurements cumulative?

Hi all,

sorry if this has been asked and answered before or if I was too blind to
find it inside the documentation and the mail archive.

How does the tool calculate

-a- quantities (like number of slabs, memory allocated, queuelength,...)
-b- utilisation (CPU used%,...)
-c- rates (disks read/write, iops,...)

when using different sampling intervals (-i ...)?

For rates it seems obvious(?) to divide the difference of counter values
at the end and the start by the sampling time.

Could someone explain what the effect of choosing longer sampling
intervals is, how 'averaging' is done, and which measurements are
considered in the calculation?

Many thanks in advance.

Cheers

-Frank

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich

Mark Seger | 17 Feb 2012 16:10

Re: Are measurements cumulative?



On Fri, Feb 17, 2012 at 9:59 AM, Frank Heckes <f.heckes-97/bSmCnXvjoK6nBLMlh1Q@public.gmane.org> wrote:
> Hi all,
>
> sorry if this has been asked and answered before or if I was too blind to
> find it inside the documentation and the mail archive.

could be you're blind OR it could be it's not there.  ;)
lots of documentation, and when you're a one-person team you miss things.
 
> How does the tool calculate
>
> -a- quantities (like number of slabs, memory allocated, queuelength,...)

these are simply instantaneous values as reported in /proc/xxx
 
> -b- utilisation (CPU used%,...)

all cpu times reported in /proc/stat are in jiffies, so I just look at the change between two samples and add up the user, system and other times, ignoring iowait since that's not real cpu.  this then gives me the total number of jiffies in the interval.  now it's simply a matter of something like

user/total*100 to tell me the % of time spent in user time

some tools actually report based on a single core, so a cpu-bound 8-core system would be reported as 800%, but collectl reports 100%
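
A minimal sketch of the calculation described above, in Python rather than collectl's own code. The function name and the sample counter values are made up for illustration. It assumes the counters come from the aggregate "cpu" line of /proc/stat, that iowait is left out of the total as described, and that idle time is part of the total; the thread does not spell out that last point.

```python
def cpu_percentages(prev, curr, ignore=("iowait",)):
    """Per-state CPU percentages for one sampling interval.

    prev and curr map state names (user, system, idle, ...) to cumulative
    jiffy counters from two consecutive samples. States listed in `ignore`
    are left out of the total (assumption: only iowait, per the thread).
    """
    deltas = {name: curr[name] - prev[name] for name in curr if name not in ignore}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between the samples (or a counter wrapped).
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical counters from two samples; the values are illustrative only.
prev = {"user": 1000, "system": 500, "idle": 8000, "iowait": 200}
curr = {"user": 1300, "system": 600, "idle": 8600, "iowait": 250}
pct = cpu_percentages(prev, curr)
```

Because the aggregate counters already sum over all cores, the percentages stay within 0 to 100 no matter how many cores the machine has, which matches the "100% rather than 800%" remark.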
 
> -c- rates (disks read/write, iops,...)

that's simply the difference between start/finish divided by seconds.  if you include -on, it divides by 1 to give absolute numbers
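
The rate calculation can be sketched as follows. This is an illustration in Python, not collectl's code; the function name and the `normalize` parameter are hypothetical, with `normalize=False` standing in for what the thread describes for -on (divide by 1).

```python
def per_interval_value(start, end, interval_secs, normalize=True):
    """Turn two cumulative counter readings into a reported value.

    With normalize=True the counter difference is divided by the interval
    length, giving a per-second rate. With normalize=False the difference
    is divided by 1, giving the absolute change over the interval.
    """
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")
    delta = end - start
    divisor = interval_secs if normalize else 1
    return delta / divisor


# A disk that steadily reads 500 KB every second, sampled at two intervals.
short = per_interval_value(start=0, end=500 * 10, interval_secs=10)
long_ = per_interval_value(start=0, end=500 * 60, interval_secs=60)
raw = per_interval_value(start=0, end=500 * 60, interval_secs=60, normalize=False)
```

For a steady workload the normalized value is the same at both intervals, while the unnormalized value grows with the interval length.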
 
> when using different sampling intervals (-i ...)?

numbers should all be the same unless -on is specified
 
> For rates it seems obvious(?) to divide the difference of counter values
> at the end and the start by the sampling time.

correct
 
> Could someone explain what the effect of choosing longer sampling
> intervals is, how 'averaging' is done, and which measurements are
> considered in the calculation?

the only difference in using longer sampling intervals is loss of accuracy.  my favorite example is if you have a 30 second spike in the network and are only sampling every couple of minutes, you'll see an elevation but never know you were saturated for 1/2 minute.  even at 10 seconds you'll miss shorter spikes, but 10 seconds has shown to be a good compromise, though some users choose 5 or even 1 second.
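
The spike example can be reproduced with a small simulation. This is a sketch with invented numbers: the traffic levels (10 and 100, in arbitrary units), the spike position, and the function name are all assumptions chosen for illustration.

```python
def interval_averages(per_second, interval):
    """Average a per-second series over consecutive windows of `interval` seconds."""
    averages = []
    for i in range(0, len(per_second), interval):
        window = per_second[i:i + interval]
        averages.append(sum(window) / len(window))
    return averages


BASELINE, SATURATED = 10.0, 100.0

# Two minutes of traffic with a 30 second saturation spike in the middle.
traffic = [SATURATED if 45 <= s < 75 else BASELINE for s in range(120)]

peak_120s = max(interval_averages(traffic, 120))  # one sample per 2 minutes
peak_10s = max(interval_averages(traffic, 10))    # one sample per 10 seconds
```

At a 2 minute interval the spike shows up only as an elevated average (32.5 here), while at 10 seconds at least one sample reports the saturated level (100.0).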

hope this helps

-mark
 

------------------------------------------------------------------------------
Virtualization & Cloud Management Using Capacity Planning
Cloud computing makes use of virtualization - but cloud computing
also focuses on allowing computing to be delivered as a service.
http://www.accelacomm.com/jaw/sfnl/114/51521223/
_______________________________________________
Collectl-interest mailing list
Collectl-interest <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/collectl-interest

Frank Heckes | 17 Feb 2012 17:50

Re: Are measurements cumulative?

Hello Mark,

many thanks for the quick and detailed answer.
Two questions came up while reading your answer (see below, sorry).

Cheers

-Frank

On Fri, 2012-02-17 at 16:10 +0100, Mark Seger wrote:
>
>

>
>         How does the tool calculate
>
>         -a- quantities (like number of slabs, memory allocated,
>         queuelength,...)
>
> these are simply instantaneous values as reported in /proc/xxx
>
Okay, this means that one might miss peak values when increasing the
sampling interval.

>         -b- utilisation (CPU used%,...)
>
>
> all cpu times reported in /proc/stat are in jiffies, so I just look at
> the change between two samples and add up the user, system and other
> times, ignoring iowait since that's not real cpu.  this then gives me
> the total number of jiffies in the interval.  now it's simply a matter
> of something like
>
>
> user/total*100 to tell me the % of time spent in user time
>
>
> some tools actually report based on a single core, so a cpu-bound
> 8-core system would be reported as 800%, but collectl reports 100%
>
Is the disk utilisation also based on jiffies?

>         -c- rates (disks read/write, iops,...)
>
>
> that's simply the difference between start/finish divided by seconds.
>  if you include -on, it divides by 1 to give absolute numbers
>
>         when using different sampling intervals (-i ...)?
>
>
> numbers should all be the same unless -on specified
>
>         For rates it seems obvious(?) to divide the difference of
>         counter values at the end and the start by the sampling time.
>
>
> correct
>
>         Could someone explain what the effect of choosing longer
>         sampling intervals is, how 'averaging' is done, and which
>         measurements are considered in the calculation?
>
>
> the only difference in using longer sampling intervals is loss of
> accuracy.  my favorite example is if you have a 30 second spike in the
> network and are only sampling every couple of minutes, you'll see an
> elevation but never know you were saturated for 1/2 minute.  even at
> 10 seconds you'll miss shorter spikes, but 10 seconds has shown to be
> a good compromise, though some users choose 5 or even 1 second.
Just out of interest: would it make sense for the tool to keep some
'internal' hidden counters, take a (maybe configurable) number of
'hidden' measurements within each interval, sum the values in these
counters, and report the average at the end?
Maybe this is dumb, because one could choose a smaller measurement
interval from the start, which would lead to the same computational
overhead (CPU and memory), but it would help to get more 'accurate'
measurements even for bigger intervals, with a lower amount of
performance measurement data(?).

>
> hope this helps
>
Yes, very much. Many thanks!

Cheers

-Frank

Mark Seger | 17 Feb 2012 19:24

Re: Are measurements cumulative?


>>
>>         How does the tool calculate
>>
>>         -a- quantities (like number of slabs, memory allocated,
>>         queuelength,...)
>>
>> these are simply instantaneous values as reported in /proc/xxx
>>
> Okay, this means that one might miss peak values when increasing the
> sampling interval.

exactly.  that's one of the reasons I go crazy when I see people running sar at a 10 minute interval
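
A small sketch of why instantaneous values lose peaks at long intervals. The readings and the function name are invented for illustration; the point is only that a gauge is read at the sample instant and nothing in between is seen.

```python
def sample_gauge(readings, every):
    """Keep only the readings a sampler would see when it wakes up
    once per `every` time steps (the first reading is always taken)."""
    return readings[::every]


# Hypothetical queue length, one reading per second, with a short burst.
queue_length = [2, 3, 2, 40, 38, 3, 2, 2, 3, 2, 2, 3]

seen_fast = sample_gauge(queue_length, 1)  # sees every reading
seen_slow = sample_gauge(queue_length, 6)  # sees only seconds 0 and 6
```

Here `max(seen_fast)` is 40 while `max(seen_slow)` is 2, so the burst is invisible at the longer interval. Unlike counter-based rates, the burst does not even show up as an elevated average.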
 
>>         -b- utilisation (CPU used%,...)
>>
>>
>> all cpu times reported in /proc/stat are in jiffies, so I just look at
>> the change between two samples and add up the user, system and other
>> times, ignoring iowait since that's not real cpu.  this then gives me
>> the total number of jiffies in the interval.  now it's simply a matter
>> of something like
>>
>>
>> user/total*100 to tell me the % of time spent in user time
>>
>>
>> some tools actually report based on a single core, so a cpu-bound
>> 8-core system would be reported as 800%, but collectl reports 100%
>>
> Is the disk utilisation also based on jiffies?

if Time::HiRes is installed it uses that.  if not installed it does use jiffies

 
>>         -c- rates (disks read/write, iops,...)
>>
>>
>> that's simply the difference between start/finish divided by seconds.
>>  if you include -on, it divides by 1 to give absolute numbers
>>
>>         when using different sampling intervals (-i ...)?
>>
>>
>> numbers should all be the same unless -on specified
>>
>>         For rates it seems obvious(?) to divide the difference of
>>         counter values at the end and the start by the sampling time.
>>
>>
>> correct
>>
>>         Could someone explain what the effect of choosing longer
>>         sampling intervals is, how 'averaging' is done, and which
>>         measurements are considered in the calculation?
>>
>>
>> the only difference in using longer sampling intervals is loss of
>> accuracy.  my favorite example is if you have a 30 second spike in the
>> network and are only sampling every couple of minutes, you'll see an
>> elevation but never know you were saturated for 1/2 minute.  even at
>> 10 seconds you'll miss shorter spikes, but 10 seconds has shown to be
>> a good compromise, though some users choose 5 or even 1 second.
> Just out of interest: would it make sense for the tool to keep some
> 'internal' hidden counters, take a (maybe configurable) number of
> 'hidden' measurements within each interval, sum the values in these
> counters, and report the average at the end?
> Maybe this is dumb, because one could choose a smaller measurement
> interval from the start, which would lead to the same computational
> overhead (CPU and memory), but it would help to get more 'accurate'
> measurements even for bigger intervals, with a lower amount of
> performance measurement data(?).

it could do lots of things but in the spirit of simplicity (many would argue it lost its simplicity long ago ;)), it is what it is.  Also, collectl NEVER looks at the data it collects, at least not when explicitly displaying results as I wanted to keep the collection as light-weight as possible. 
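
For reference, the idea of hidden sub-samples could look roughly like this. It is a hypothetical sketch of the proposal, not something collectl does according to this thread; the function name, the report fields, and the readings are all invented.

```python
def report_with_subsamples(gauge_readings, subsamples):
    """Collapse every `subsamples` hidden readings into one reported record.

    'last' is what a plain sampler at the longer interval would report;
    'mean' and 'max' are the extra information the hidden readings provide.
    """
    reports = []
    for i in range(0, len(gauge_readings), subsamples):
        chunk = gauge_readings[i:i + subsamples]
        reports.append({
            "last": chunk[-1],
            "mean": sum(chunk) / len(chunk),
            "max": max(chunk),
        })
    return reports


# Hypothetical hidden readings of an instantaneous value, 4 per reported interval.
readings = [5, 5, 40, 5, 6, 6, 6, 6]
reports = report_with_subsamples(readings, subsamples=4)
```

Such averaging would only change instantaneous values. For counter-based rates, the mean of equal-length sub-interval rates equals the counter difference over the whole interval divided by its length, so the reported rate would be the same with or without hidden samples.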

-mark 

>>
>> hope this helps
>>
> Yes, very much. Many thanks!
>
> Cheers
>
> -Frank





