Colin Cross | 30 Apr 21:12 2010

[PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

On SMP kernels, the loops_per_jiffy value is not scaled due to the
possibility of one CPU affecting the speed of another CPU while the
second CPU is in a udelay loop.  Since loops_per_jiffy is calculated
once on boot for SMP kernels, udelays are too long if the CPU
frequency is scaled down from the boot frequency, or too short if the
frequency is scaled up.  Some SOCs have a timer with a constant tick
rate that can be used to time udelays, similar to the TSC on x86.
Provide a config flag to allow these SOCs to override the default
ARM udelay implementation.

Signed-off-by: Colin Cross <ccross <at> android.com>
---
 arch/arm/Kconfig             |    3 +++
 arch/arm/include/asm/delay.h |    4 ++++
 arch/arm/lib/Makefile        |    6 +++++-
 3 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 33d2825..d9923b0 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
 <at>  <at>  -175,6 +175,9  <at>  <at>  config ARM_L1_CACHE_SHIFT_6
 	help
 	  Setting ARM L1 cache line size to 64 Bytes.

+config ARCH_PROVIDES_UDELAY
+  bool
+
 if OPROFILE

(Continue reading)

Colin Cross | 30 Apr 21:37 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

An alternative to this patch would be to add a config option to use
sched_clock() to provide the counter instead of the cycle loop.  The
same loops_per_jiffy calibration could be done to  determine the
sched_clock frequency.  Any machine with an available constant tick
rate counter, which is likely to be used for sched_clock() already,
can enable CONFIG_UDELAY_USES_SCHED_CLOCK.

On Fri, Apr 30, 2010 at 12:12 PM, Colin Cross <ccross <at> android.com> wrote:
> On SMP kernels, the loops_per_jiffy value is not scaled due to the
> possibility of one CPU affecting the speed of another CPU while the
> second CPU is in a udelay loop.  Since loops_per_jiffy is calculated
> once on boot for SMP kernels, udelays are too long if the CPU
> frequency is scaled down from the boot frequency, or too short if the
> frequency is scaled up.  Some SOCs have a timer with a constant tick
> rate that can be used to time udelays, similar to the TSC on x86.
> Provide a config flag to allow these SOCs to override the default
> ARM udelay implementation.
>
> Signed-off-by: Colin Cross <ccross <at> android.com>
> ---
>  arch/arm/Kconfig             |    3 +++
>  arch/arm/include/asm/delay.h |    4 ++++
>  arch/arm/lib/Makefile        |    6 +++++-
>  3 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 33d2825..d9923b0 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
>  <at>  <at>  -175,6 +175,9  <at>  <at>  config ARM_L1_CACHE_SHIFT_6
(Continue reading)

Kevin Hilman | 1 May 00:11 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

Colin Cross <ccross <at> android.com> writes:

> An alternative to this patch would be to add a config option to use
> sched_clock() to provide the counter instead of the cycle loop.  The
> same loops_per_jiffy calibration could be done to  determine the
> sched_clock frequency.  Any machine with an available constant tick
> rate counter, which is likely to be used for sched_clock() already,
> can enable CONFIG_UDELAY_USES_SCHED_CLOCK.

Or even better, why not have an option to use the clocksource which is
most likely using the constant tick timer as well.

Kevin

> On Fri, Apr 30, 2010 at 12:12 PM, Colin Cross <ccross <at> android.com> wrote:
>> On SMP kernels, the loops_per_jiffy value is not scaled due to the
>> possibility of one CPU affecting the speed of another CPU while the
>> second CPU is in a udelay loop.  Since loops_per_jiffy is calculated
>> once on boot for SMP kernels, udelays are too long if the CPU
>> frequency is scaled down from the boot frequency, or too short if the
>> frequency is scaled up.  Some SOCs have a timer with a constant tick
>> rate that can be used to time udelays, similar to the TSC on x86.
>> Provide a config flag to allow these SOCs to override the default
>> ARM udelay implementation.
>>
>> Signed-off-by: Colin Cross <ccross <at> android.com>
Saravana Kannan | 1 May 02:04 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

On some machines/boards, the timer used for the sched clock does not 
have usec accuracy. So, we should not assume that a sched clock is 
sufficient.

It won't be wrong, but if we can't ever give usec level accuracy, we 
shouldn't act as if we can.

Cross,

My comment about CPU freq switching while we are delay looping is 
present even for non-SMP kernels. Yes, it's bad, but we don't have a 
scalable solution yet. So, you might want to reword it yet again :-)

Thanks,
Saravana

Kevin Hilman wrote:
> Colin Cross <ccross <at> android.com> writes:
> 
>> An alternative to this patch would be to add a config option to use
>> sched_clock() to provide the counter instead of the cycle loop.  The
>> same loops_per_jiffy calibration could be done to  determine the
>> sched_clock frequency.  Any machine with an available constant tick
>> rate counter, which is likely to be used for sched_clock() already,
>> can enable CONFIG_UDELAY_USES_SCHED_CLOCK.
> 
> Or even better, why not have an option to use the clocksource which is
> most likely using the constant tick timer as well.
> 
> Kevin
(Continue reading)

Russell King - ARM Linux | 1 May 12:01 2010
Picon

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

On Fri, Apr 30, 2010 at 03:11:20PM -0700, Kevin Hilman wrote:
> Colin Cross <ccross <at> android.com> writes:
> 
> > An alternative to this patch would be to add a config option to use
> > sched_clock() to provide the counter instead of the cycle loop.  The
> > same loops_per_jiffy calibration could be done to  determine the
> > sched_clock frequency.  Any machine with an available constant tick
> > rate counter, which is likely to be used for sched_clock() already,
> > can enable CONFIG_UDELAY_USES_SCHED_CLOCK.
> 
> Or even better, why not have an option to use the clocksource which is
> most likely using the constant tick timer as well.

We may be running into the same problem which we did with the printk
clock - that is using a machine provided sched_clock() or clocksource
requires MMIO accesses, which can only be done after the IO mappings
have been initialized.

Let's hope no one ever uses udelay() before the necessary IO mappings
are present.
Saravana Kannan | 22 May 00:01 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

Russell King - ARM Linux wrote:
> On Fri, Apr 30, 2010 at 03:11:20PM -0700, Kevin Hilman wrote:
>> Colin Cross <ccross <at> android.com> writes:
>>
>>> An alternative to this patch would be to add a config option to use
>>> sched_clock() to provide the counter instead of the cycle loop.  The
>>> same loops_per_jiffy calibration could be done to  determine the
>>> sched_clock frequency.  Any machine with an available constant tick
>>> rate counter, which is likely to be used for sched_clock() already,
>>> can enable CONFIG_UDELAY_USES_SCHED_CLOCK.
>> Or even better, why not have an option to use the clocksource which is
>> most likely using the constant tick timer as well.
> 
> We may be running into the same problem which we did with the printk
> clock - that is using a machine provided sched_clock() or clocksource
> requires MMIO accesses, which can only be done after the IO mappings
> have been initialized.
> 
> Let's hope no one ever uses udelay() before the necessary IO mappings
> are present.

Is the patch that uses CONFIG_ARCH_PROVIDES_UDELAY acceptable? I don't 
care much for how each arch decides to implement it, but I think we 
should have this config to let each arch decide how they want to handle 
udelay.

I personally prefer not to use the sched clock source due to the 
unnecessary complexities. If you have a some kind of constant counter, 
it sounds much simpler to just use it instead of adding dependencies 
between udelay and sched clock.
(Continue reading)

Russell King - ARM Linux | 22 May 00:06 2010
Picon

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

On Fri, May 21, 2010 at 03:01:48PM -0700, Saravana Kannan wrote:
> Russell King - ARM Linux wrote:
>> We may be running into the same problem which we did with the printk
>> clock - that is using a machine provided sched_clock() or clocksource
>> requires MMIO accesses, which can only be done after the IO mappings
>> have been initialized.
>>
>> Let's hope no one ever uses udelay() before the necessary IO mappings
>> are present.
>
> Is the patch that uses CONFIG_ARCH_PROVIDES_UDELAY acceptable? I don't  
> care much for how each arch decides to implement it, but I think we  
> should have this config to let each arch decide how they want to handle  
> udelay.
>
> I personally prefer not to use the sched clock source due to the  
> unnecessary complexities. If you have a some kind of constant counter,  
> it sounds much simpler to just use it instead of adding dependencies  
> between udelay and sched clock.

My point is not specific to sched_clock, but to counters which on ARM
are 99.9% always memory mapped, and therefore inaccessible during the
very early kernel boot.  sched_clock was merely an illustration of the
problem.
Saravana Kannan | 22 May 00:10 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

Russell King - ARM Linux wrote:
> On Fri, May 21, 2010 at 03:01:48PM -0700, Saravana Kannan wrote:
>> Russell King - ARM Linux wrote:
>>> We may be running into the same problem which we did with the printk
>>> clock - that is using a machine provided sched_clock() or clocksource
>>> requires MMIO accesses, which can only be done after the IO mappings
>>> have been initialized.
>>>
>>> Let's hope no one ever uses udelay() before the necessary IO mappings
>>> are present.
>> Is the patch that uses CONFIG_ARCH_PROVIDES_UDELAY acceptable? I don't  
>> care much for how each arch decides to implement it, but I think we  
>> should have this config to let each arch decide how they want to handle  
>> udelay.
>>
>> I personally prefer not to use the sched clock source due to the  
>> unnecessary complexities. If you have a some kind of constant counter,  
>> it sounds much simpler to just use it instead of adding dependencies  
>> between udelay and sched clock.
> 
> My point is not specific to sched_clock, but to counters which on ARM
> are 99.9% always memory mapped, and therefore inaccessible during the
> very early kernel boot.  sched_clock was merely an illustration of the
> problem.

Agree with the point about the counters being memory mapped. But does 
any of the really early init code use udelay? AFAIK, udelay is mostly 
used when talking to devices at which point IO mapping needs to have 
been completed to be able to talk to the device in the first place.

(Continue reading)

Saravana Kannan | 28 May 02:41 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

Russell King - ARM Linux wrote:
> My point is not specific to sched_clock, but to counters which on ARM
> are 99.9% always memory mapped, and therefore inaccessible during the
> very early kernel boot.  sched_clock was merely an illustration of the
> problem.

Do you know any code which uses udelay that early? I can't imagine any 
non-driver code using udelay. If it's just drivers using udelay, then 
they need IO mapping to have been completed for the device register 
access to work and hence udelay won't have any problems.

-Saravana
Saravana Kannan | 22 Jun 03:14 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

Russel,

Would appreciate your thoughts on this.

Colin,

Are you continuing to use the approach used in your patch in your 
development tree? I just want to make sure we have a solution that 
everyone can agree to.

Thanks,
Saravana

Saravana Kannan wrote:
> Russell King - ARM Linux wrote:
>> My point is not specific to sched_clock, but to counters which on ARM
>> are 99.9% always memory mapped, and therefore inaccessible during the
>> very early kernel boot.  sched_clock was merely an illustration of the
>> problem.
> 
> Do you know any code which uses udelay that early? I can't imagine any 
> non-driver code using udelay. If it's just drivers using udelay, then 
> they need IO mapping to have been completed for the device register 
> access to work and hence udelay won't have any problems.
> 
> -Saravana
> 
Colin Cross | 28 Jun 04:30 2010

Re: [PATCH V2] [ARM] Add ARCH_PROVIDES_UDELAY config option

I am using this patch in our Tegra development tree.  So far, I
haven't come across any need for supporting udelay before iomap, but
if it is necessary, it could be handled in the per-architecture
implementation by using a busy loop with a worst-case count if a
sched_clock initialization flag is not set.

On Mon, Jun 21, 2010 at 6:14 PM, Saravana Kannan <skannan <at> codeaurora.org> wrote:
> Russel,
>
> Would appreciate your thoughts on this.
>
> Colin,
>
> Are you continuing to use the approach used in your patch in your
> development tree? I just want to make sure we have a solution that everyone
> can agree to.
>
> Thanks,
> Saravana
>
> Saravana Kannan wrote:
>>
>> Russell King - ARM Linux wrote:
>>>
>>> My point is not specific to sched_clock, but to counters which on ARM
>>> are 99.9% always memory mapped, and therefore inaccessible during the
>>> very early kernel boot.  sched_clock was merely an illustration of the
>>> problem.
>>
>> Do you know any code which uses udelay that early? I can't imagine any
(Continue reading)


Gmane