21 Feb 2024 • sem_postmany

On Windows, ReleaseSemaphore lets you raise/post a semaphore n times with a single call, the idea being that the OS can implement it more efficiently than n syscalls in a loop. Indeed, on Linux the underlying syscall has that functionality: futex can (amongst other things) make a thread wait for some token to be signalled, and wake some number of threads blocking on that token. So naturally they never updated the userspace API, and you can only call sem_post in a loop.
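To make the two halves of that concrete, here is a rough Linux-only sketch of the portable "call sem_post in a loop" baseline next to a bare futex wake. The names sem_post_loop and futex_wake are made up for illustration and are not part of any library.

```cpp
// Linux-only sketch. sem_post_loop and futex_wake are hypothetical helper names.
#include <cassert>
#include <linux/futex.h>
#include <semaphore.h>
#include <sys/syscall.h>
#include <unistd.h>

// portable baseline: n separate sem_post calls, each of which may end up
// making its own futex syscall when threads are waiting
static int sem_post_loop( sem_t * sem, int n ) {
        assert( n >= 0 );
        for( int i = 0; i < n; i++ ) {
                if( sem_post( sem ) != 0 ) {
                        return -1;
                }
        }
        return 0;
}

// raw futex wake: wakes up to n threads blocked in FUTEX_WAIT on addr in one
// syscall. returns the number of threads woken, or -1 on error
static long futex_wake( int * addr, int n ) {
        return syscall( SYS_futex, addr, FUTEX_WAKE_PRIVATE, n, nullptr, nullptr, 0 );
}
```

The loop version is what you get from the standard API. The futex call shows that the kernel side already takes a count, which is the gap the rest of this post tries to fill.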

Some years ago I decided to fill the gap myself. I had the delusions both that someone might want to pay money for this and that I would be able to find them and sell it to them, so I didn't read the glibc sources and instead figured it out with printf, strace and the musl sources, making this code not GPL. That said, you probably don't want to use this anyway, because if you're using semaphores you already don't care about supermax performance.

struct GlibcSemaphore {
        // musl sets this to -1 if count == 0 && waiters > 0
        s32 saved_wakes;

        s32 waiters;

        // if pshared in sem_init is 0 then shared is 0. otherwise 128
        // musl is the other way around
        // used to set FUTEX_PRIVATE_FLAG, which was introduced in Linux 2.6.22
        s32 shared;
};

void sem_postmany( sem_t * sem, int n ) {
        GlibcSemaphore * gsem = ( GlibcSemaphore * ) sem;
        // add all n posts in one atomic op. old is the value before this call
        int old = __atomic_fetch_add( &gsem->saved_wakes, n, __ATOMIC_ACQ_REL );
        int waiters = gsem->waiters;
        // wake at most n threads, and none that earlier posts already cover
        int extra_wakes = Min2( waiters - old, n );
        if( extra_wakes > 0 ) {
                int op = gsem->shared == 0 ? FUTEX_WAKE_PRIVATE : FUTEX_WAKE;
                syscall( SYS_futex, &gsem->saved_wakes, op, extra_wakes );
        }
}
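The wake count is the only subtle bit, so here is the arithmetic pulled out into a pure function with a few worked cases. This is just a sketch: extra_wakes_for is a hypothetical helper, std::min stands in for Min2, and reading old as "the value of saved_wakes before the add" is my interpretation of the code above.

```cpp
#include <algorithm>
#include <cassert>

// hypothetical helper mirroring Min2( waiters - old, n )
// old:     value of saved_wakes before n was added
// waiters: number of threads currently blocked on the semaphore
// n:       number of posts being made
// a result <= 0 means no futex call is needed
static int extra_wakes_for( int old, int waiters, int n ) {
        return std::min( waiters - old, n );
}

// worked cases:
//   extra_wakes_for( 0, 3, 2 )  == 2   more waiters than posts, wake n
//   extra_wakes_for( 0, 1, 5 )  == 1   more posts than waiters, wake them all
//   extra_wakes_for( 2, 2, 1 )  == 0   earlier posts already cover both waiters
//   extra_wakes_for( 0, 0, 4 )  == 0   nobody is waiting, skip the syscall
//   extra_wakes_for( -1, 1, 1 ) == 1   the musl-style -1 case from the struct comment
```

Clamping to n means a single call never wakes more threads than it posted.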