Feature request: API to allow setting thread pinning for Isolates #46943

Open
maks opened this issue Aug 19, 2021 · 13 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate type-enhancement A request for a change that isn't a bug

Comments

@maks

maks commented Aug 19, 2021

Following on from discussion in #44228 I wanted to propose an addition to the API for Isolate spawning which would allow setting the thread priority of a thread allocated to the Isolate or new Isolate group.

I'm not familiar with the implementation details of the thread pool and how threads from it are assigned to Isolates, so I'm not sure if there is currently a means to have a specific thread be assigned or tied to a specific Isolate or Isolate group.

My use case for this is that I'm trying to use Dart (on Linux) for audio DSP applications (e.g. synthesizers, audio FX processing, etc.). It seems the recommended way to do low-latency audio output is to use a high-priority thread to feed the OS's audio API while trying to ensure that thread never blocks. Given it's currently not possible to use FFI with async callbacks into Dart Isolates, and that most cross-platform audio libraries (libsoundio, jack, miniaudio) use callbacks from their own high-priority "RT" thread, my thought was that the next best thing would be to provide a high-priority thread for a single Dart Isolate group. That group would then have at least one isolate calling a sync audio API (e.g. ALSA) via FFI, with audio samples generated in other isolates supplied via its SendPort, or via FFI pointers if the SendPort mechanism itself is too slow.

@maks
Author

maks commented Aug 19, 2021

Actually I just realised that with FFI, if the same thread that makes the FFI calls also runs my Dart code in the Isolate, then I could be a bit cheeky and just use the pthreads API in native code to elevate the priority of that thread myself. But this would also assume that the same thread keeps getting scheduled to run the code in my worker "high priority" Isolate, and I'm not clear whether that would be the case, even if I only had main + 1 other worker Isolate.

@mkustermann would you have any thoughts on whether this could work, or am I just on a fool's errand?

@mkustermann mkustermann added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate type-enhancement A request for a change that isn't a bug labels Aug 19, 2021
@mkustermann
Member

There are a few things to dissect here:

  • Thread pinning: Is there a need for the Dart code, as well as any C code it calls via FFI, to always run on the same thread for the lifetime of that isolate? This is sometimes needed if a C library makes use of TLS (thread local storage) - for example: Dart calls C, which sets up a TLS value and calls back into Dart, which might then call back into a C callback (which accesses the TLS key).
    If that was implemented, one could use FFI to call out to the pthread API to change the priority of a thread and it would stick (since the isolate stays on that thread). One unexpected side-effect of this is that any threads spawned from that thread (e.g. Dart spawns garbage collection threads) would inherit the priority.

  • Isolate prioritization: Does a Dart isolate need to run at a specific priority? Firstly, it's not always possible to increase a thread's priority - it might require special (e.g. root) privileges for the process (e.g. setpriority() on Linux allows lowering but not increasing priorities by default). Most likely one would want to run the code anyway - making this prioritization only a best-effort solution (the priority might or might not be applied).

  • Low latency: Running Dart code (together with some FFI calls) might or might not fit your latency requirements. For example, the VM can decide to collect garbage: a) young-generation collections can take 20+ ms, b) in low-memory situations the VM can decide to perform a compaction (which could take 100+ ms). So by running Dart code you already give up control over latency to some extent.

  • Async communication: An isolate can communicate with a C thread asynchronously via ports. So one can decide to have a dedicated C thread, which sets its own thread priority and pings Dart code when it wants to send it some message or wants an answer to some request (in which case it'd need to wait for the answer - which can increase pause times).
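
To make the "best effort" point about prioritization concrete, here is a minimal C sketch, assuming Linux with POSIX threads. The function names are hypothetical and nothing here is part of the Dart VM or dart:ffi. It asks for the lowest SCHED_FIFO real-time priority for the calling thread and simply reports whether the request was granted, so the caller can carry on either way.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical helper: tries to move the calling thread to SCHED_FIFO at the
 * lowest real-time priority. Returns true only if the OS accepted the request;
 * without sufficient privileges it typically fails and we return false. */
bool TryRaiseCurrentThreadPriority(void) {
  int min_priority = sched_get_priority_min(SCHED_FIFO);
  if (min_priority == -1) return false;
  struct sched_param param = {.sched_priority = min_priority};
  return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) == 0;
}

/* Hypothetical helper: reads back the scheduling policy of the calling thread. */
bool CurrentThreadIsFifo(void) {
  int policy = 0;
  struct sched_param param;
  int rc = pthread_getschedparam(pthread_self(), &policy, &param);
  assert(rc == 0);
  (void)rc; /* silences the unused warning when NDEBUG is defined */
  return policy == SCHED_FIFO;
}
```

Note that this only sticks for Dart code if the isolate stays on the thread that made the call, which is exactly the pinning question discussed in this issue.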

@maks
Author

maks commented Aug 19, 2021

Thanks for looking at this so quickly @mkustermann !

You make very good points about low latency, and I have wondered myself whether this is even feasible in Dart. But I got some hope that it might be possible by comparing with my entry into Dart: Flutter, which has somewhat similar low-latency requirements to meet the 16ms deadlines for 60fps, though I guess it does that by giving the UI thread higher priority via the embedding.

In regards to thread pinning, I think you are spot on, and I think this is what this feature request boils down to: with pinning, it would be up to the user to make use of any OS-specific APIs to raise thread priority, rather than requiring any of that functionality from Dart itself. I guess this would be the minimal change required in Dart to allow for this functionality, leaving it up to each user to do what's best for their use case. In my particular case, for example, I'm intending to run on "single-purpose" RPis, so potentially running my Dart process at elevated, even root user, level is not necessarily an issue.

In regards to GC, yes, any pauses could well blow out the time budget, but from my reading it seems that even for native code it's recommended not to do any allocation (malloc et al.) on the high-priority audio thread. So for my particular use case I'd follow a similar strategy and avoid runtime memory allocation in my Dart code as much as possible. Avoiding GC (as much as possible at least) would also, I think, alleviate the issue of a GC thread inheriting the high priority of the thread in the Isolate that spawned it.

So I guess should I change the title of this issue to reflect the change in the requested feature?

@mkustermann
Member

So I guess should I change the title of this issue to reflect the change in the requested feature?

Either that or file a new issue specifically about thread pinning. We can then also merge #38315 into it - which fundamentally is about the same problem.

@maks maks changed the title Feature request: API to allow setting thread priority for new Isolates Feature request: API to allow setting thread pinning for Isolates Aug 31, 2021
@maks
Author

maks commented Aug 31, 2021

Thanks @mkustermann !
I've changed the title of this issue, as I think it helps to have the original context here along with the explanation in your comment. I'm still learning how Core Audio on iOS and macOS works, but I suspect thread pinning would also be needed there to use the realtime audio render thread.

@a-siva
Contributor

a-siva commented Nov 17, 2021

Work is being done to allow embedders to control the spawning of worker threads instead of having the Dart VM do it (see #44228). Would that solve the issues raised here (controlling thread priorities, thread pinning, etc.)?

@a-siva
Contributor

a-siva commented Nov 17, 2021

@maks I saw the comment posted on the other thread stating the use case presented here is not for a custom embedder but rather having this functionality in the regular Dart SDK.

@maks
Author

maks commented Nov 17, 2021

Thanks @a-siva, yes, sorry, I should have been more specific: the request here is for using the regular SDK, e.g. for command-line applications.

@gaaclarke
Contributor

I also ran into this problem. My use case is calling into JNI with dart:ffi. You can't call into JNI from an arbitrary thread, it has to be registered with the JavaVM via AttachCurrentThread.

@maks, regarding the original problem in your description: I've written a few synths and this is how you want to set up your code. The fact that Dart doesn't execute on a dedicated thread shouldn't be a problem for you.

  1. Write your synth in a non-garbage-collected language like C/C++. You don't want garbage collection interrupting your audio callback, and as you make your synthesizer more complex you will be starved for CPU, so you'll want a language that is a thin abstraction over the machine code to make it easier to optimize.
  2. Create a lockless ring buffer that will be used to send control data to the synthesizer running on the audio callback thread. Even using mutexes on that thread can cause buffer underruns (audio glitches resulting from the CPU not responding fast enough).
  3. Call from Dart with dart:ffi into C/C++ code that registers the audio callback and supplies it with the synth it will use to fill the audio buffer and the lockless ring buffer that will receive control data.
  4. When you want to change a parameter of the synthesizer, use dart:ffi to call into C, which places the data on the lockless ring buffer. It doesn't matter which thread executes this; it only matters that the calls that write to it are serial.
  5. In the audio callback, read from the lockless ring buffer to modify the synth before calculating the buffer.

Here's some pseudo code:

typedef struct {
  Synth* synth;
  LocklessRingBuffer* controlBuffer;
} Data;

typedef struct {
  int parameter;
  int value;
} ControlChange;

// Called on some high priority audio thread.
static int AudioCallback(float* buffer, int length, void* voidData) {
  Data* data = (Data*)voidData;
  // Drain pending control changes before rendering this buffer.
  while (LocklessRingBufferHasData(data->controlBuffer)) {
    ControlChange cc = LocklessRingBufferPop(data->controlBuffer);
    SynthSetControlValue(data->synth, cc.parameter, cc.value);
  }
  return SynthFillBuffer(data->synth, buffer, length);
}

// Called on some Dart thread via dart:ffi.
Data* StartSynth() {
  Data* data = CreateSynth();
  AudioApiStartAudioCallback(AudioCallback, data);
  return data;
}

// Called on some Dart thread via dart:ffi.
void KeyDown(Data* data, int key) {
  LocklessRingBufferPush(data->controlBuffer, (ControlChange) {
    .parameter = KEY_DOWN,
    .value = key
  });
}
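
The pseudo code above leaves the lockless ring buffer itself undefined. As a rough illustration only, here is one way a single-producer/single-consumer ring could be sketched with C11 atomics. The names (ControlRing, ControlRingPush, ControlRingPop, RING_CAPACITY) are made up for this sketch and are not from any particular library. It assumes exactly one consumer (the audio callback) and that producer calls are serialized; if different threads push at different times, something outside this sketch must order those calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_CAPACITY 64 /* assumed size for the sketch */
static_assert((RING_CAPACITY & (RING_CAPACITY - 1)) == 0,
              "RING_CAPACITY must be a power of two");

typedef struct {
  int parameter;
  int value;
} ControlChange;

typedef struct {
  ControlChange slots[RING_CAPACITY];
  atomic_size_t head; /* next slot to read; written only by the consumer */
  atomic_size_t tail; /* next slot to write; written only by the producer */
} ControlRing;

static void ControlRingInit(ControlRing* ring) {
  atomic_init(&ring->head, 0);
  atomic_init(&ring->tail, 0);
}

/* Producer side (e.g. called via dart:ffi). Returns false if the ring is full. */
static bool ControlRingPush(ControlRing* ring, ControlChange cc) {
  size_t tail = atomic_load_explicit(&ring->tail, memory_order_relaxed);
  size_t head = atomic_load_explicit(&ring->head, memory_order_acquire);
  if (tail - head == RING_CAPACITY) return false;
  ring->slots[tail & (RING_CAPACITY - 1)] = cc;
  /* Release: the slot write becomes visible before the new tail does. */
  atomic_store_explicit(&ring->tail, tail + 1, memory_order_release);
  return true;
}

/* Consumer side (the audio callback). Returns false if the ring is empty. */
static bool ControlRingPop(ControlRing* ring, ControlChange* out) {
  size_t head = atomic_load_explicit(&ring->head, memory_order_relaxed);
  size_t tail = atomic_load_explicit(&ring->tail, memory_order_acquire);
  if (head == tail) return false;
  *out = ring->slots[head & (RING_CAPACITY - 1)];
  /* Release: the slot is handed back to the producer only after the copy. */
  atomic_store_explicit(&ring->head, head + 1, memory_order_release);
  return true;
}
```

Neither function blocks or allocates, which is the property that matters on the audio thread. A full ring drops the control change rather than waiting.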

@maks
Author

maks commented Jan 27, 2022

Thanks for taking the time to write such a detailed reply @gaaclarke! Very much appreciated, and very nice to meet another synth developer!

The approach you outline is roughly what I found is done by the flutter sequencer package.

And yes I've gradually come to the realisation that using the method you've outlined above is really the only viable way at the moment with Dart.

I've read the seminal "time waits for nothing" about the need to avoid not just GC but even manual memory allocation, to avoid anything that might block. So I had initially thought I could work around Dart's use of GC by having a separate Isolate, doing no runtime allocation, that handles the audio synthesis and output to the platform audio API, using a lockless ring buffer to transfer data to the Isolate via pre-allocated FFI memory, hence filing this issue (thanks to mraleph via Twitter for suggesting that trick to avoid any slowness due to using SendPorts).

But I've since learnt from @mkustermann's feedback that when GC happens it affects all Isolates 😞. In addition, mraleph said in another issue (about compiling Dart to Wasm) that even if I didn't do any explicit object creation at runtime myself, I might trigger allocations that Dart does behind the scenes and hence cause GC.

So I've now (sadly) given up on trying to have the audio synth/effects code in Dart, which I was trying to do in this project. Instead I've switched over to having the WASM synth in that project output its audio data frames to the system audio API (alsa/pulseaudio/jack etc.) entirely in C/Rust/Zig/etc. code, rather than routing it via a Dart FFI binding to an audio library.

Last year I revived and put back on Play Raph Levien's original "msfa" DX7 FM synth app, so I'm likely to try your recommended approach first with its C code, as I have planned to convert it into a Flutter app.

Given the above I don't really have a use case anymore for the thread pinning feature, but it sounds like there may be others who do. If that's not the case, I'm happy for the issue to be closed.

@dcharkes
Contributor

dcharkes commented Jun 7, 2022

  • Thread pinning: Is there a need for the Dart code, as well as any C code it calls via FFI, to always run on the same thread for the lifetime of that isolate? This is sometimes needed if a C library makes use of TLS (thread local storage) - for example: Dart calls C, which sets up a TLS value and calls back into Dart, which might then call back into a C callback (which accesses the TLS key).

We also run into this when using JNI. For example, a thread needs to be attached to the JVM via AttachCurrentThread, and a JNIEnv* is only valid on the thread for which it was obtained. cc @mahesh-hegde

@antonbashir

I also ran into the thread pinning issue: #49463
My case has no dependency on JNI, but I use a native library that relies on thread-local values, so I need to pin an Isolate to a thread.
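
To illustrate why thread-local values make pinning necessary, here is a small self-contained C sketch. The "library" functions LibraryInit and LibraryCurrentSession are hypothetical stand-ins, not a real API. A value stored from one thread is simply not there when the same library is later called from a different thread, which is what could happen if an isolate were rescheduled onto another thread between FFI calls.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread library state; each thread gets its own copy. */
static _Thread_local int tls_session = 0;

void LibraryInit(int session) { tls_session = session; }
int LibraryCurrentSession(void) { return tls_session; }

/* Thread entry point: records the session value visible on this thread. */
static void* ReadSessionOnOtherThread(void* result) {
  *(int*)result = LibraryCurrentSession();
  return NULL;
}

/* Calls the library from a freshly created thread and returns what it saw. */
int SessionSeenFromAnotherThread(void) {
  pthread_t thread;
  int seen = -1;
  int rc = pthread_create(&thread, NULL, ReadSessionOnOtherThread, &seen);
  assert(rc == 0);
  rc = pthread_join(thread, NULL);
  assert(rc == 0);
  (void)rc; /* silences the unused warning when NDEBUG is defined */
  return seen;
}
```

After LibraryInit(42) on the calling thread, LibraryCurrentSession() returns 42 there, while SessionSeenFromAnotherThread() returns the untouched default of 0.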

@maks
Author

maks commented Jan 25, 2023

While the title of this issue is about thread pinning, my original use case for opening it was trying to use Dart via FFI for low-latency audio. On that front, while not a complete solution, the new Dart_PerformanceMode_Latency in the embedder API may be of some help in future.


6 participants