Merged
5 changes: 5 additions & 0 deletions ChangeLog.md
@@ -38,6 +38,11 @@ See docs/process.md for more on how version tagging works.
to handle these types of spurious wakeups / interruptions. (#26659, #26735)
- Attempting to use PTHREAD_PROCESS_SHARED when creating pthread primitives such
as locks and condvars will now fail with ENOTSUP. (#26743)
- When building code with both Wasm Workers and pthreads (hybrid mode) it is now
possible to call most of the core pthread APIs (e.g. lock, condvar, etc) from
a Wasm Worker. This mode increases the memory used by each Wasm Worker by
~500 bytes (in the same way that declaring ~500 bytes of TLS data would).
(#26757)

5.0.6 - 04/14/26
----------------
36 changes: 19 additions & 17 deletions docs/design/02-wasm-worker-pthread-compat.md
@@ -1,6 +1,6 @@
# Design Doc: Wasm Worker Pthread Compatibility

- **Status**: Draft
- **Status**: Completed
- **Bug**: https://github.com/emscripten-core/emscripten/issues/26631

## Context
@@ -14,31 +14,33 @@ This is not an issue in pure Wasm Workers programs but we also support hybrid
programs that run both pthreads and Wasm Workers. In this case the pthread
API is available, but will fail in undefined ways if called from Wasm Workers.

This document proposes a plan to improve the hybrid mode by adding the pthread
This document describes how hybrid mode was improved by adding the pthread
metadata (`struct pthread`) to each Wasm Worker, allowing the pthread API (or at
least some subset of it) to be used from Wasm Workers.

## Proposed Changes
## Implemented Changes

### 1. Memory Layout

Currently, Wasm Workers allocate space for TLS and stack: `[TLS data] [Stack]`.
We propose to change this to: `[struct pthread] [TLS data] [Stack]`.
Normally, Wasm Workers allocate space for only TLS and stack: `[TLS data] [Stack]`.
For hybrid mode (when pthreads are enabled as well as Wasm Workers) we changed
this to also include pthread-specific data:
`[struct pthread] [TSD pointers] [TLS data] [Stack]`.

The `struct pthread` will be located at the very beginning of the allocated
The `struct pthread` is located at the very beginning of the allocated
memory block for each Wasm Worker.
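
The hybrid-mode layout above can be sketched as a small offset calculation. This is an illustrative sketch only, not the Emscripten implementation: `worker_layout`, `compute_layout`, and the sizes passed in are hypothetical, and the real `struct pthread` size and alignment constants come from the toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Round x up to a multiple of align (align must be a power of two).
static uintptr_t round_up(uintptr_t x, uintptr_t align) {
  return (x + align - 1) & ~(align - 1);
}

// Hypothetical description of how one worker allocation is carved up.
typedef struct {
  uintptr_t pthread_struct; // start of the block: [struct pthread]
  uintptr_t tsd;            // [TSD pointers], or 0 when no TSD space is needed
  uintptr_t tls_and_stack;  // first byte of [TLS data] [Stack]
  size_t remaining;         // bytes left for TLS data plus stack
} worker_layout;

// Assumed inputs: pthread_size stands in for sizeof(struct pthread) and
// tsd_size for the space reserved for thread-specific data slots.
static worker_layout compute_layout(uintptr_t base, size_t total,
                                    size_t pthread_size, size_t tsd_size,
                                    size_t stack_align) {
  worker_layout l = {0, 0, 0, 0};
  l.pthread_struct = base;
  uintptr_t offset = base + pthread_size;
  if (tsd_size) {
    offset = round_up(offset, sizeof(void*));
    l.tsd = offset;
    offset += tsd_size;
  }
  // TLS data and the stack start at the next stack-aligned address.
  offset = round_up(offset, stack_align);
  assert(offset - base < total);
  l.tls_and_stack = offset;
  l.remaining = total - (offset - base);
  return l;
}
```

Whatever the pthread metadata consumes is subtracted from the space the worker has left for TLS and stack, which is why hybrid mode costs each worker a few hundred bytes.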

### 2. `struct pthread` Initialization

The `struct pthread` will be initialized by the creator thread in `emscripten_create_wasm_worker` (or `emscripten_malloc_wasm_worker`).
The `struct pthread` is initialized by the creator thread in `emscripten_create_wasm_worker` (or `emscripten_malloc_wasm_worker`).
This includes:
- Zero-initializing the structure.
- Setting the `self` pointer to the start of the `struct pthread`.
- Initializing essential fields like `tid`.

On the worker thread side, `_emscripten_wasm_worker_initialize` will need to set
the thread-local pointer (returned by `__get_tp()`) to the `struct pthread`
location.
On the worker thread side, initialization is completed by calling
`__set_thread_state` (via JS `___set_thread_state` in `libwasm_worker.js`) to
set the thread pointer, making it available to `__get_tp`.

### 3. `__get_tp` Support

@@ -62,14 +64,14 @@ APIs that will NOT be supported (or will have limited support):
- `pthread_cancel()`: Not supported in Wasm workers.
- `pthread_kill()`: Not supported in Wasm workers.

## Implementation Plan
## Implementation Details

1. Modify `emscripten_create_wasm_worker` in `system/lib/wasm_worker/library_wasm_worker.c` to account for `sizeof(struct pthread)` in memory allocation and initialize the structure.
2. Update `_emscripten_wasm_worker_initialize` in `system/lib/wasm_worker/wasm_worker_initialize.S` to set the thread pointer.
3. Modify `system/lib/pthread/emscripten_thread_state.S` to enable `__get_tp` for Wasm workers.
4. Review and test essential pthread functions (like TSD) in Wasm workers.
5. Document the supported and unsupported pthread APIs for Wasm workers.
1. Modified `emscripten_create_wasm_worker` in `system/lib/wasm_worker/library_wasm_worker.c` to account for `sizeof(struct pthread)` in memory allocation and initialize the structure.
2. Updated `$_wasmWorkerInitializeRuntime` in `src/lib/libwasm_worker.js` to call `___set_thread_state` to set the thread pointer.
3. Verified that essential pthread functions (like `pthread_self()`) work in Wasm workers in hybrid mode.

## Verification
- Add a new test that makes use of `pthread_self()` and low level synchronization APIs from a Wasm Worker.
- New tests added to ensure `pthread_self()` and low level synchronization APIs
work when called from a Wasm Worker.
- Verify that existing Wasm worker tests still pass.
- Verify no extra overhead for regular (non-hybrid) Wasm Worker builds.
33 changes: 25 additions & 8 deletions site/source/docs/api_reference/wasm_workers.rst
@@ -169,14 +169,31 @@ Note that support for nested Workers varies across browsers. As of 02/2022, nest
supported in Safari <https://webkit.org/b/22723>`_. See `here
<https://github.com/johanholmerin/nested-worker>`_ for a polyfill.

Pthreads can use the Wasm Worker synchronization API, but not vice versa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Pthreads can use the Wasm Worker synchronization API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The multithreading synchronization primitives offered in ``emscripten/wasm_worker.h``
(``emscripten_lock_*``, ``emscripten_semaphore_*``, ``emscripten_condvar_*``) can be freely invoked
from within pthreads if one so wishes, but Wasm Workers cannot utilize any of the synchronization
functionality in the Pthread API (``pthread_mutex_*``, ``pthread_cond_``, ``pthread_rwlock_*``, etc),
since they lack the needed pthread runtime.
from within pthreads if one so wishes.

Wasm Workers cannot use the pthread API unless building in hybrid mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, Wasm Workers cannot call any pthread APIs (``pthread_mutex_*``,
``pthread_cond_*``, ``pthread_self()``, etc.), since they lack the needed
thread metadata.

However, when building in **hybrid mode** (linking with both ``-sWASM_WORKERS``
and ``-pthread``), Wasm Workers are created with the additional metadata needed.
In this mode, Wasm Workers support a subset of the pthread API:

- ``pthread_self()`` and ``pthread_equal()``
- Thread Specific Data (TSD) via ``pthread_getspecific()`` and ``pthread_setspecific()``
- Thread synchronization APIs such as ``pthread_mutex_*`` and ``pthread_cond_*``

APIs that manage the thread lifecycle (like ``pthread_create``,
``pthread_join``, ``pthread_detach``, ``pthread_cancel``) are still not
supported for Wasm Workers, even in hybrid mode.
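
As a rough illustration of the supported subset, the sketch below exercises only ``pthread_self()``, ``pthread_equal()``, TSD and a mutex. It is written as portable C; the names ``setup_shared_state`` and ``worker_body`` are hypothetical, and creating the TSD key on the main thread is an assumption made here to stay within the APIs listed above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_key_t worker_key;
static int shared_counter = 0;

// Assumed to run once on the main thread before any worker starts.
static int setup_shared_state(void) {
  return pthread_key_create(&worker_key, NULL);
}

// Hypothetical worker body: returns the counter value after this call's
// increment, or -1 if one of the identity/TSD checks fails.
static int worker_body(void) {
  pthread_t me = pthread_self();
  if (!pthread_equal(me, pthread_self())) return -1;

  // Thread-specific data: store a pointer and read it back.
  int scratch = 7;
  if (pthread_setspecific(worker_key, &scratch) != 0) return -1;
  if (pthread_getspecific(worker_key) != &scratch) return -1;
  pthread_setspecific(worker_key, NULL); // do not leave a dangling pointer

  // Mutex-protected update of shared state.
  pthread_mutex_lock(&counter_lock);
  int value = ++shared_counter;
  pthread_mutex_unlock(&counter_lock);
  return value;
}
```

In a hybrid build, a ``void(void)`` wrapper around a function like ``worker_body`` could be handed to a Wasm Worker, for example via ``emscripten_wasm_worker_post_function_v()``; the lifecycle APIs stay off limits there.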

Pthreads have a "thread main" function and atexit handlers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -288,7 +305,7 @@ table.

<tr><td class='cellborder'>Thread ID</td>
<td class='cellborder'>Creating a pthread obtains its ID. Call <pre>pthread_self()</pre> to acquire ID of calling thread.</td>
<td class='cellborder'>Creating a Worker obtains its ID. Call <pre>emscripten_wasm_worker_self_id()</pre> acquire ID of calling thread.</td></tr>
<td class='cellborder'>Creating a Worker obtains its ID. Call <pre>emscripten_wasm_worker_self_id()</pre> to acquire the ID of the calling thread.<br>In hybrid mode, <pre>pthread_self()</pre> also works.</td></tr>

<tr><td class='cellborder'>High resolution timer</td>
<td class='cellborder'>``emscripten_get_now()``</td>
@@ -331,7 +348,7 @@ table.

<tr><td class='cellborder'>Recursive mutex</td>
<td class='cellborder'><pre>pthread_mutex_*</pre></td>
<td class='cellborder'>N/A</td></tr>
<td class='cellborder'>Supported in hybrid mode</td></tr>

<tr><td class='cellborder'>Semaphores</td>
<td class='cellborder'>N/A</td>
@@ -343,7 +360,7 @@ table.

<tr><td class='cellborder'>Read-Write locks</td>
<td class='cellborder'><pre>pthread_rwlock_*</pre></td>
<td class='cellborder'>N/A</td></tr>
<td class='cellborder'>Supported in hybrid mode</td></tr>

<tr><td class='cellborder'>Spinlocks</td>
<td class='cellborder'><pre>pthread_spin_*</pre></td>
2 changes: 1 addition & 1 deletion src/lib/libsigs.js
@@ -314,7 +314,7 @@ sigs = {
_embind_register_value_object_field__sig: 'vpppppppppp',
_embind_register_void__sig: 'vpp',
_emscripten_create_audio_worklet__sig: 'viipipp',
_emscripten_create_wasm_worker__sig: 'iipi',
_emscripten_create_wasm_worker__sig: 'iipip',
_emscripten_dlopen_js__sig: 'vpppp',
_emscripten_dlsync_threads__sig: 'v',
_emscripten_fetch_get_response_headers__sig: 'pipp',
16 changes: 10 additions & 6 deletions src/lib/libwasm_worker.js
@@ -89,14 +89,15 @@ addToLibrary({
'_emscripten_wasm_worker_initialize',
#if PTHREADS
'__set_thread_state',
'$alignMemory',
#endif
],
$_wasmWorkerInitializeRuntime: () => {
#if ASSERTIONS
assert(wwParams);
assert(wwParams.wwID);
assert(wwParams.stackLowestAddress % 16 == 0);
assert(wwParams.stackSize % 16 == 0);
assert(wwParams.stackLowestAddress % {{{ STACK_ALIGN }}} == 0);
assert(wwParams.stackSize % {{{ STACK_ALIGN }}} == 0);
#endif
#if RUNTIME_DEBUG
dbg("wasmWorkerInitializeRuntime wwID:", wwParams.wwID);
@@ -123,7 +124,7 @@ addToLibrary({
#if PTHREADS
// Record the pthread configuration, and whether this Wasm Worker supports synchronous blocking in emscripten_futex_wait().
// (regular Wasm Workers do, AudioWorklets don't)
___set_thread_state(/*thread_ptr=*/0, /*is_main_thread=*/0, /*is_runtime_thread=*/0, /*supports_wait=*/ {{{ workerSupportsFutexWait() }}});
___set_thread_state(wwParams.pthreadPtr ?? 0, /*is_main_thread=*/0, /*is_runtime_thread=*/0, /*supports_wait=*/ {{{ workerSupportsFutexWait() }}});
#endif
#if STACK_OVERFLOW_CHECK >= 2
// Fix up stack base. (TLS frame is created at the bottom address end of the stack)
@@ -178,7 +179,7 @@ if (ENVIRONMENT_IS_WASM_WORKER
_wasmWorkers[0] = globalThis;
addEventListener("message", _wasmWorkerAppendToQueue);
}`,
_emscripten_create_wasm_worker: (wwID, stackLowestAddress, stackSize) => {
_emscripten_create_wasm_worker: (wwID, stackLowestAddress, stackSize, pthreadPtr) => {
#if ASSERTIONS
if (!_emscripten_has_threading_support()) {
err('create_wasm_worker: environment does not support SharedArrayBuffer, wasm workers are not available');
@@ -203,8 +204,11 @@ if (ENVIRONMENT_IS_WASM_WORKER
wwID,
wasm: wasmModule,
wasmMemory,
stackLowestAddress, // sb = stack bottom (lowest stack address, SP points at this when stack is full)
stackSize, // sz = stack size
stackLowestAddress,
stackSize,
#if PTHREADS
pthreadPtr,
#endif
});
worker.onmessage = _wasmWorkerRunPostMessage;
#if ENVIRONMENT_MAY_BE_NODE
2 changes: 1 addition & 1 deletion system/lib/libc/emscripten_internal.h
@@ -135,7 +135,7 @@ void emscripten_fetch_free(unsigned int);

// Internal implementation function in JavaScript side that emscripten_create_wasm_worker() calls to
// to perform the wasm worker creation.
bool _emscripten_create_wasm_worker(emscripten_wasm_worker_t wwID, void *stackLowestAddress, uint32_t stackSize);
bool _emscripten_create_wasm_worker(emscripten_wasm_worker_t wwID, void *stackLowestAddress, uint32_t stackSize, void* pthreadPtr);

void _emscripten_create_audio_worklet(emscripten_wasm_worker_t wwID, EMSCRIPTEN_WEBAUDIO_T audioContext, void *stackLowestAddress, uint32_t stackSize, EmscriptenStartWebAudioWorkletCallback callback, void *userData2);

2 changes: 1 addition & 1 deletion system/lib/libc/musl/src/thread/pthread_mutex_consistent.c
@@ -7,7 +7,7 @@ int pthread_mutex_consistent(pthread_mutex_t *m)
int own = old & 0x3fffffff;
if (!(m->_m_type & 4) || !own || !(old & 0x40000000))
return EINVAL;
if (own != CURRENT_THREAD_ID)
if (own != __pthread_self()->tid)
return EPERM;
a_and(&m->_m_lock, ~0x40000000);
return 0;
4 changes: 2 additions & 2 deletions system/lib/libc/musl/src/thread/pthread_mutex_timedlock.c
@@ -87,12 +87,12 @@ int __pthread_mutex_timedlock(pthread_mutex_t *restrict m, const struct timespec
if (!own && (!r || (type&4)))
continue;
if ((type&3) == PTHREAD_MUTEX_ERRORCHECK
&& own == CURRENT_THREAD_ID)
&& own == __pthread_self()->tid)
return EDEADLK;
#if defined(__EMSCRIPTEN__) && !defined(NDEBUG)
// Extra check for deadlock in debug builds, but only if we would block
// forever (at == NULL).
assert(at || own != CURRENT_THREAD_ID && "pthread mutex deadlock detected");
assert(at || own != __pthread_self()->tid && "pthread mutex deadlock detected");
#endif

a_inc(&m->_m_waiters);
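
The owner check in this hunk is what makes an error-checking mutex report a self-deadlock instead of blocking. A minimal portable sketch of that observable behaviour follows; `relock_errorcheck_mutex` is a hypothetical helper, and the code uses only the public POSIX API, not the musl internals shown in the diff.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

// Locks an error-checking mutex twice from the same thread and returns the
// result of the second attempt.
static int relock_errorcheck_mutex(void) {
  pthread_mutexattr_t attr;
  pthread_mutex_t m;

  pthread_mutexattr_init(&attr);
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  pthread_mutex_init(&m, &attr);
  pthread_mutexattr_destroy(&attr);

  int first = pthread_mutex_lock(&m);
  assert(first == 0);

  // The calling thread already owns the mutex, so this does not block.
  int second = pthread_mutex_lock(&m);

  pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return second;
}
```

For this comparison to be meaningful on a Wasm Worker, the worker needs a valid `tid` in its own `struct pthread`, which is what hybrid mode now provides.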
2 changes: 1 addition & 1 deletion system/lib/libc/musl/src/thread/pthread_mutex_trylock.c
@@ -5,7 +5,7 @@ int __pthread_mutex_trylock_owner(pthread_mutex_t *m)
int old, own;
int type = m->_m_type;
pthread_t self = __pthread_self();
int tid = CURRENT_THREAD_ID;
int tid = self->tid;
volatile void *next;

old = m->_m_lock;
85 changes: 84 additions & 1 deletion system/lib/wasm_worker/library_wasm_worker.c
@@ -25,10 +25,22 @@
#include <malloc.h>
#include <sys/param.h> // For MAX()

#ifdef __EMSCRIPTEN_PTHREADS__
#include "pthread_impl.h"
#endif

#ifndef __EMSCRIPTEN_WASM_WORKERS__
#error __EMSCRIPTEN_WASM_WORKERS__ should be defined when building this file!
#endif

// Uncomment this line to enable tracing of thread creation and destruction:
// #define PTHREAD_DEBUG
#ifdef PTHREAD_DEBUG
#define dbg(fmt, ...) emscripten_dbgf(fmt, ##__VA_ARGS__)
#else
#define dbg(fmt, ...)
#endif

#define ROUND_UP(x, ALIGNMENT) (((x)+ALIGNMENT-1)&-ALIGNMENT)
#define SBRK_ALIGN (__alignof__(max_align_t))
#define STACK_ALIGN __BIGGEST_ALIGNMENT__
@@ -58,6 +70,73 @@ static void init_file_lock(FILE *f) {
if (f && f->lock<0) f->lock = 0;
}

#ifdef __EMSCRIPTEN_PTHREADS__
/* pthread_key_create.c overrides this */
static volatile size_t dummy = 0;
weak_alias(dummy, __pthread_tsd_size);

#define TSD_ALIGN (sizeof(void*))

/**
* The layout of the wasm worker stack space in hybrid mode is as follows:
* [ struct pthread ] [ pthread TSD slots ] [ TLS data ] [ ... stack ]
*
* As opposed to the layout for regular Wasm Workers which is just:
* [ TLS data ] [ ... stack ]
*/
static void* init_pthread_struct(void *stackPlusTLSAddress, pid_t tid, size_t* stackPlusTLSSize) {
// TODO: Remove duplication with pthread_create

pthread_t self = pthread_self();
pthread_t new = (pthread_t)stackPlusTLSAddress;

uintptr_t base = (uintptr_t)stackPlusTLSAddress;
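// Reserve space for the struct pthread at the start of the allocation so that
// the TSD slots, TLS data and stack do not overlap it.
uintptr_t offset = base + sizeof(struct pthread);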

// 3. tsd slots
if (__pthread_tsd_size) {
offset = ROUND_UP(offset, TSD_ALIGN);
new->tsd = (void*)offset;
offset += __pthread_tsd_size;
}

// Calculate updated stack size
offset = ROUND_UP(offset, STACK_ALIGN);
size_t new_stack_size = *stackPlusTLSSize - (offset - base);
assert(new_stack_size % STACK_ALIGN == 0);
assert(new_stack_size > 0);

new->self = new;
new->tid = tid;
new->map_base = stackPlusTLSAddress;
new->map_size = *stackPlusTLSSize;
new->stack = (void*)(base + *stackPlusTLSSize);
new->stack_size = new_stack_size;
new->guard_size = __default_guardsize;
new->detach_state = DT_DETACHED;
new->robust_list.head = &new->robust_list.head;

__tl_lock();

new->next = self->next;
new->prev = self;
new->next->prev = new;
new->prev->next = new;

__tl_unlock();

dbg("init_pthread_struct: base=%#lx, end=%#lx, used=%zu "
"stackold=%zu stacknew=%zu",
base,
base + *stackPlusTLSSize,
offset - base,
*stackPlusTLSSize,
new_stack_size);
*stackPlusTLSSize = new_stack_size;
return (void*)offset;
}
#endif
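
The block guarded by `__tl_lock()` / `__tl_unlock()` above splices the new worker into a circular doubly linked thread list directly after the creating thread. A standalone sketch of that splice, using a hypothetical `node` type in place of `struct pthread` and omitting the locking:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for the list-related fields of struct pthread.
typedef struct node {
  struct node *next;
  struct node *prev;
  int tid;
} node;

// A single-element circular list points at itself in both directions.
static void list_init(node *n) {
  n->next = n;
  n->prev = n;
}

// Insert new_node immediately after self, mirroring the four assignments
// in the hunk above.
static void list_insert_after(node *self, node *new_node) {
  new_node->next = self->next;
  new_node->prev = self;
  new_node->next->prev = new_node;
  new_node->prev->next = new_node;
}
```

In the real code the insertion happens under the thread-list lock, since other threads may be walking or modifying the list at the same time.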

emscripten_wasm_worker_t emscripten_create_wasm_worker(void *stackPlusTLSAddress, size_t stackPlusTLSSize) {
assert(stackPlusTLSAddress != 0);
assert((uintptr_t)stackPlusTLSAddress % STACK_ALIGN == 0);
@@ -103,7 +182,11 @@ emscripten_wasm_worker_t emscripten_create_wasm_worker(void *stackPlusTLSAddress
if (!libc.threads_minus_1++) libc.need_locks = 1;

emscripten_wasm_worker_t wwID = _emscripten_get_next_tid();
if (!_emscripten_create_wasm_worker(wwID, stackPlusTLSAddress, stackPlusTLSSize))
void* pthreadPtr = stackPlusTLSAddress;
#ifdef __EMSCRIPTEN_PTHREADS__
stackPlusTLSAddress = init_pthread_struct(stackPlusTLSAddress, wwID, &stackPlusTLSSize);
#endif
if (!_emscripten_create_wasm_worker(wwID, stackPlusTLSAddress, stackPlusTLSSize, pthreadPtr))
return 0;
return wwID;
}
8 changes: 4 additions & 4 deletions test/codesize/test_codesize_minimal_pthreads.json
@@ -1,10 +1,10 @@
{
"a.out.js": 7323,
"a.out.js.gz": 3573,
"a.out.nodebug.wasm": 19053,
"a.out.nodebug.wasm.gz": 8794,
"total": 26376,
"total_gz": 12367,
"a.out.nodebug.wasm": 19051,
"a.out.nodebug.wasm.gz": 8796,
"total": 26374,
"total_gz": 12369,
"sent": [
"a (memory)",
"b (exit)",
8 changes: 4 additions & 4 deletions test/codesize/test_codesize_minimal_pthreads_memgrowth.json
@@ -1,10 +1,10 @@
{
"a.out.js": 7726,
"a.out.js.gz": 3779,
"a.out.nodebug.wasm": 19054,
"a.out.nodebug.wasm.gz": 8796,
"total": 26780,
"total_gz": 12575,
"a.out.nodebug.wasm": 19052,
"a.out.nodebug.wasm.gz": 8797,
"total": 26778,
"total_gz": 12576,
"sent": [
"a (memory)",
"b (exit)",