Herbert Valerio Riedel | 6 May 13:29 2013

Why do remaining HECs busy-wait during unsafe-FFI calls?

Hello,

Recently, I stumbled over E.Z.Yang's "Safety first: FFI and
threading"[1] post, and while experimenting with unsafe-imported FFI
functions I noticed some surprising behaviour:

Consider the following contrived program:

--8<---------------cut here---------------start------------->8---
import Foreign.C
import Control.Concurrent
import Control.Monad
import Data.Time.Clock.POSIX (getPOSIXTime)

foreign import ccall unsafe "unistd.h sleep" c_sleep_unsafe :: CUInt -> IO CUInt

main :: IO ()
main = do
    putStrLnTime "main started"
    _ <- forkIO (sleepLoop 10 >> putStrLnTime "sleepLoop finished")
    yield
    putStrLnTime "after forkIO"
    threadDelay (11*1000*1000) -- 11 seconds
    putStrLnTime "end of main"
  where
    putStrLnTime s = do
        t <- getPOSIXTime
        putStrLn $ init (show t) ++ "\t" ++ s

    sleepLoop n = do
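        n' <- c_sleep_unsafe n
        unless (n' == 0) $ do
            putStrLnTime "c_sleep_unsafe got interrupted"
            sleepLoop n'

--8<---------------cut here---------------end--------------->8---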

Andreas Voellmy | 6 May 15:10 2013

Re: Why do remaining HECs busy-wait during unsafe-FFI calls?

When an unsafe call is made, the OS thread currently running on the HEC makes the call without releasing the HEC. If the main thread was on the run queue of the HEC making the foreign unsafe call when the foreign call was made, then no other HECs will pick up the main thread. Hence the two sleep calls in your program happen sequentially instead of concurrently.

I'm not completely sure what is causing the busy wait, but here is one guess: when a GC is triggered on one HEC, it signals to all the other HECs to stop the mutator and run the collection. This waiting may be a busy wait, because the wait is typically brief. If this is true, then since one thread is off in an unsafe foreign call, there is one HEC that cannot join the GC, and all the other HECs busy-wait for it at the start-of-GC barrier. The GC could be triggered by a period of inactivity. Again, this is just a guess - you might try to verify this by turning off the periodic triggering of GC and checking whether the start-of-GC barrier is a busy-wait.
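
For comparison, here is a minimal sketch of the same test program with the import marked 'safe' instead of 'unsafe'. A safe call releases the HEC for the duration of the foreign call, so other Haskell threads can keep running on it. The name c_sleep_safe and the shortened delays (2 and 3 seconds) are choices made for this sketch, not taken from the original program, and it assumes the program is built with -threaded.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO, threadDelay, yield)
import Control.Monad (unless)
import Data.Time.Clock.POSIX (getPOSIXTime)
import Foreign.C.Types (CUInt (..))

-- 'safe' import (hypothetical name): the calling OS thread gives up its HEC
-- while it is inside sleep(), unlike the 'unsafe' import in the original.
foreign import ccall safe "unistd.h sleep" c_sleep_safe :: CUInt -> IO CUInt

-- Print a POSIX timestamp followed by a message.
putStrLnTime :: String -> IO ()
putStrLnTime s = do
    t <- getPOSIXTime
    putStrLn $ init (show t) ++ "\t" ++ s

-- Sleep for n seconds, restarting if sleep() is interrupted by a signal.
sleepLoop :: CUInt -> IO ()
sleepLoop n = do
    n' <- c_sleep_safe n
    unless (n' == 0) $ do
        putStrLnTime "c_sleep_safe got interrupted"
        sleepLoop n'

main :: IO ()
main = do
    putStrLnTime "main started"
    _ <- forkIO (sleepLoop 2 >> putStrLnTime "sleepLoop finished")
    yield
    putStrLnTime "after forkIO"
    threadDelay (3*1000*1000) -- 3 seconds
    putStrLnTime "end of main"
```

With this variant one would expect "after forkIO" to appear right after "main started" rather than only once the sleep has finished. The trade-off is that a safe call has a higher per-call overhead than an unsafe one.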



On Mon, May 6, 2013 at 7:29 AM, Herbert Valerio Riedel <hvr <at> gnu.org> wrote:
Hello,

Recently, I stumbled over E.Z.Yang's "Safety first: FFI and
threading"[1] post, and while experimenting with unsafe-imported FFI
functions I noticed some surprising behaviour:

Consider the following contrived program:

--8<---------------cut here---------------start------------->8---
import Foreign.C
import Control.Concurrent
import Control.Monad
import Data.Time.Clock.POSIX (getPOSIXTime)

foreign import ccall unsafe "unistd.h sleep" c_sleep_unsafe :: CUInt -> IO CUInt

main :: IO ()
main = do
    putStrLnTime "main started"
    _ <- forkIO (sleepLoop 10 >> putStrLnTime "sleepLoop finished")
    yield
    putStrLnTime "after forkIO"
    threadDelay (11*1000*1000) -- 11 seconds
    putStrLnTime "end of main"
  where
    putStrLnTime s = do
        t <- getPOSIXTime
        putStrLn $ init (show t) ++ "\t" ++ s

    sleepLoop n = do
        n' <- c_sleep_unsafe n
        unless (n' == 0) $ do
            putStrLnTime "c_sleep_unsafe got interrupted"
            sleepLoop n'

--8<---------------cut here---------------end--------------->8---

When compiled with GHC-7.6.3/linux/amd64 with "-O2 -threaded" and
executed with "+RTS -N4", the following output is emitted:

 1367838802.137419      main started
 1367838812.137727      after forkIO
 1367838812.137783      sleepLoop finished
 1367838823.148733      end of main

which shows that the unsafe ccall in the forked thread effectively blocks
the main thread for the full 10 seconds.

Moreover, when looking at the process table, I saw that 3 threads were
occupying 100% CPU time each for 10 seconds until the 'after forkIO' was
emitted.

So what is happening here exactly, why do the 3 remaining HECs busy-wait
during that FFI call instead of continuing the execution of the main
thread?

Do *all* foreign unsafe ccalls (even short ones) cause N-1 HECs to spend
time in some kind of busy looping?


 [1]: http://blog.ezyang.com/2010/07/safety-first-ffi-and-threading/

Cheers,
  hvr

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users <at> haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Herbert Valerio Riedel | 6 May 16:26 2013

Re: Why do remaining HECs busy-wait during unsafe-FFI calls?

Andreas Voellmy <andreas.voellmy <at> gmail.com> writes:

> When an unsafe call is made, the OS thread currently running on the HEC
> makes the call without releasing the HEC. If the main thread was on the run
> queue of the HEC making the foreign unsafe call when the foreign call was
> made, then no other HECs will pick up the main thread. Hence the two sleep
> calls in your program happen sequentially instead of concurrently.

Is this the bound-main-thread issue? That is, would wrapping the main
thread in 'runInUnboundThread' help here?
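
To make the question concrete, here is a minimal sketch of what such wrapping looks like. It only shows how 'runInUnboundThread' is used and how to check whether a thread is bound; whether it changes the behaviour observed above is not something this sketch establishes. The helper name reportBound is made up for illustration.

```haskell
import Control.Concurrent
    (isCurrentThreadBound, rtsSupportsBoundThreads, runInUnboundThread)

-- Hypothetical helper: report whether the calling Haskell thread is bound
-- to an OS thread, and return that flag.
reportBound :: String -> IO Bool
reportBound label = do
    bound <- isCurrentThreadBound
    putStrLn (label ++ ": bound = " ++ show bound)
    return bound

main :: IO ()
main = do
    -- True only when the program is linked with -threaded.
    putStrLn ("rtsSupportsBoundThreads = " ++ show rtsSupportsBoundThreads)
    _ <- reportBound "main thread"
    -- Runs the action in an unbound thread and waits for its result.
    _ <- runInUnboundThread (reportBound "inside runInUnboundThread")
    return ()
```

In the test program, the whole body of 'main' would go inside the 'runInUnboundThread' call.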

> I'm not completely sure what is causing the busy wait, but here is one
> guess: when a GC is triggered on one HEC, it signals to all the other HECs
> to stop the mutator and run the collection.  This waiting may be a busy
> wait, because the wait is typically brief.  If this is true, then since one
> thread is off in an unsafe foreign call, there is one HEC that refuses to
> start the GC and all the other HECs are busy-waiting for the signal.  The
> GC could be triggered by a period of inactivity.  Again, this is just a
> guess - you might try to verify this by turning off the periodic triggering
> of GC and checking whether the start GC barrier is a busy-wait.

That seems to be a rather good guess: I inhibited the idle GC by
disabling the idle timer using "+RTS -N4 -I0", and with that the HEC
busy-waiting is gone.

So actually this isn't FFI-specific at all, as I could trigger the very
same effect by using a non-allocating/tight-loop evaluation such as the
following:

  do
    _ <- forkIO (evaluate (busyfun 0 0) >> putStrLnTime "busyfun finished")
  where
    busyfun :: Int -> Int -> Int
    busyfun !n !m = if m < 0 then n else busyfun (n+1) (m+1)
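
For anyone wanting to reproduce this, a self-contained version of the fragment above might look as follows. This is a sketch: the surrounding 'main' is copied from the first program in this thread, and the BangPatterns pragma and the Control.Exception import are what the fragment needs in order to compile. Note that 'busyfun 0 0' only returns once m overflows, so in practice the forked thread spins until the program exits.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Concurrent (forkIO, threadDelay, yield)
import Control.Exception (evaluate)
import Data.Time.Clock.POSIX (getPOSIXTime)

-- Tight loop over strict Ints; with -O2 it is expected to compile to a
-- loop that does not allocate and so never reaches a point where the
-- RTS can stop it for a GC.
busyfun :: Int -> Int -> Int
busyfun !n !m = if m < 0 then n else busyfun (n+1) (m+1)

-- Print a POSIX timestamp followed by a message.
putStrLnTime :: String -> IO ()
putStrLnTime s = do
    t <- getPOSIXTime
    putStrLn $ init (show t) ++ "\t" ++ s

main :: IO ()
main = do
    putStrLnTime "main started"
    _ <- forkIO (evaluate (busyfun 0 0) >> putStrLnTime "busyfun finished")
    yield
    putStrLnTime "after forkIO"
    threadDelay (11*1000*1000) -- 11 seconds
    putStrLnTime "end of main"
```

Compiled with "-O2 -threaded" and run with "+RTS -N4", this should show the same busy-waiting, and adding "-I0" should make it go away as described above. Be prepared to interrupt the program by hand, since any GC that does get requested has to wait for the spinning thread.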

cheers,
  hvr
