Version: 1.1
Date: 2026-03-26
Status: Design Proposal (Technically Reviewed)
Supersedes: threads.md, fork.md, multiplicity.md (consolidates all three)
This document presents a unified, realistic plan for concurrency in PerlOnJava.
It replaces the separate threads.md, fork.md, and multiplicity.md documents
with a single plan grounded in what JRuby, Jython, and TruffleRuby have actually
shipped — not theoretical designs.
Key insight from research: Every successful JVM language implementation chose one primary concurrency model and made it work well, rather than trying to faithfully replicate every native-language concurrency mechanism.
- JRuby model (proven): Map Perl threads directly to JVM threads; make the
runtime thread-safe; accept that
fork()cannot work. - ThreadLocal-based runtime context instead of classloader isolation (Jython learned this the hard way — ThreadLocal is simpler and faster).
- Incremental de-static-ification of global state, starting with the smallest subsystems.
- No GIL/global lock — follow JRuby's lead: make core data structures safe, document what isn't safe, let users synchronize where needed.
threads (ithreads): Perl's official mechanism for parallel execution within a
single process. Each Perl thread runs in its own cloned interpreter (not sharing
memory by default). All variables are deep-copied at thread creation time. Shared
variables require explicit :shared attribute. This provides "fork-like semantics
without forking" — isolation by default. Note: The use of interpreter-based threads
is officially "discouraged" in Perl 5 documentation, but it remains the only
officially supported threading model.
fork(): Process-level parallelism by creating a child process with copied memory. Uses copy-on-write semantics for efficiency. Cannot be implemented on the JVM — JRuby confirmed this over 20 years of attempts. The JVM's GC can crash between fork and exec, and JVM state cannot be safely split across processes.
multiplicity: Running multiple Perl interpreter instances within the same
process. In Perl 5, this is the internal mechanism (perl_clone()) that enables
ithreads. In PerlOnJava, this is achieved through PerlRuntime instances. Use
cases include web server request isolation (mod_perl model) and JSR-223 embedding.
Threading model:
- Ruby threads map 1:1 to JVM threads (real OS threads)
- No Global VM Lock — true parallel execution
- The JRuby runtime itself is thread-safe for structural operations: defining methods, modifying constants, modifying global variables
- Core mutable data structures (String, Array, Hash) are not thread-safe; users must synchronize explicitly
volatilewrites for method tables, constant tables, class variables, global variables- Non-volatile: instance variables, local variables in closures
Fork handling:
Kernel#forkraisesNotImplementedErroron JRuby- JRuby tried POSIX
fork+execvia JNA forsystem()/backticks but abandoned it as too dangerous (JVM GC can crash between fork and exec) - Recommendation: use
Process.spawn,system(), or threads instead - This is universally accepted by the JRuby community
Key lesson: Don't waste time on fork emulation. Focus on making threads work well. The ecosystem adapts.
Threading model:
- Python threads map 1:1 to JVM threads
- No GIL (unlike CPython) — true parallel execution
dict,list,setare thread-safe (no corruption, but updates can be lost)- All Python attribute get/set are volatile (backed by ConcurrentHashMap)
- Sequential consistency for Python code
ThreadState problem:
- Jython used ThreadLocal to store execution context (
ThreadState) - They later recognized this was a design mistake that "slows things down and unnecessarily limits what a given thread can do"
- Planned refactoring to remove ThreadState but never completed it (project stalled before this happened)
Key lesson: ThreadLocal for runtime context is a pragmatic starting point but has performance costs. Plan for eventual migration to explicit parameter passing — but don't let that block the initial implementation.
Threading model:
- True parallel threads via GraalVM's managed threading
- Each
Contextis single-threaded; multi-threading requires multiple contexts or explicit opt-in viaallowAllAccess - Polyglot contexts provide natural isolation
Key lesson: Context-per-thread is the cleanest isolation model. PerlOnJava's
eventual PerlRuntime instances serve the same role.
How it works in CPython Perl:
- Each thread gets a cloned copy of the entire interpreter
- All variables are copied by default (deep clone at thread creation)
- Shared variables require explicit
:sharedattribute - The
threads::sharedmodule mediates all cross-thread access CLONEmethod called on objects during thread creation- Heavy — thread creation is expensive (full interpreter clone)
What mod_perl 2.0 does:
- Pre-creates a pool of Perl interpreter instances
- Each Apache worker thread gets its own interpreter
- Same pattern as the multiplicity.md "runtime pool" idea
The following mutable static state must be addressed before any concurrency:
| Class | Mutable Static Fields | Size | Risk |
|---|---|---|---|
GlobalVariable |
globalVariables, globalArrays, globalHashes, globalCodeRefs, globalIORefs, globalFormatRefs, pinnedCodeRefs, stashAliases, globAliases, globalGlobs, isSubs, packageExistsCache, globalClassLoader |
~13 maps + 1 classloader | Critical — all Perl variables live here |
RuntimeIO |
stdout, stderr, stdin |
3 fields | High — I/O misdirection under concurrency |
CallerStack |
callerStack (List) |
1 list | High — stack traces cross threads |
SpecialBlock |
endBlocks, initBlocks, checkBlocks |
3 arrays | Medium — phase blocks interfere |
DynamicVariableManager |
variableStack (Deque) |
1 deque | High — local scoping is per-execution-context |
RuntimeScalar |
dynamicStateStack (Stack) |
1 stack | High — dynamic state per thread |
RuntimeCode |
evalBeginIds, methodHandleCache, evalRuntimeContext (ThreadLocal), argsStack (ThreadLocal) |
2 maps + 2 ThreadLocals | Medium — some already ThreadLocal |
PerlLanguageProvider |
globalInitialized |
1 boolean | Low — initialization guard |
InheritanceResolver |
methodCache, linearizedClassesCache, overloadContextCache, isaStateCache, packageMRO |
5 maps | Medium — stale under concurrent class modification |
RuntimeCode (additional) |
anonSubs, interpretedSubs, evalContext, methodHandleCache |
4 maps | Medium — compiler/eval caches |
RuntimeRegex |
regex state, special vars | state | Medium — $1, $2 etc. must be per-thread |
Already ThreadLocal: RuntimeCode.evalRuntimeContext, RuntimeCode.argsStack
Pure functions (safe): RuntimeScalarCache, most operator methods, Base64Util
Read-only after init (safe): Configuration, ParserTables
The multiplicity.md document proposed classloader-based isolation first. After reviewing JRuby and Jython's experience, this is not recommended as the primary approach:
- Type barriers make inter-runtime communication impossible without reflection
- Memory overhead — duplicate class loading for every runtime instance
- Debugging nightmare — class identity issues, serialization failures
- JRuby abandoned this pattern for user-facing isolation (only used internally for OSGi compatibility)
- Jython never used it — went straight to ThreadLocal
public final class PerlRuntime {
private static final ThreadLocal<PerlRuntime> CURRENT = new ThreadLocal<>();
// Per-runtime mutable state (moved from static fields)
final VariableStorage variables; // was GlobalVariable statics
final IOSystem io; // was RuntimeIO statics
final CallerStackManager callStack; // was CallerStack statics
final DynamicScopeManager dynamicScope; // was DynamicVariableManager + RuntimeScalar statics
final SpecialBlockManager blocks; // was SpecialBlock statics
final RegexState regexState; // $1, $2, match state
public static PerlRuntime current() {
return CURRENT.get();
}
public static void setCurrent(PerlRuntime rt) {
CURRENT.set(rt);
}
}Migration path: Replace GlobalVariable.getGlobalVariable("main::x") with
PerlRuntime.current().variables.getGlobalVariable("main::x"). This can be done
incrementally, subsystem by subsystem.
This is the only concurrency model that Perl 5 officially supports via CPAN. It maps naturally to JVM threads.
use threads;
use threads::shared;
my $counter :shared = 0;
my @threads;
for (1..4) {
push @threads, threads->create(sub {
lock($counter);
$counter++;
});
}
$_->join() for @threads;
print "Counter: $counter\n"; # 4Implementation:
public static RuntimeScalar threadsCreate(RuntimeScalar codeRef, RuntimeArray args) {
PerlRuntime parent = PerlRuntime.current();
// Clone runtime state for new thread (deep copy by default)
PerlRuntime child = parent.clone(CloneMode.THREAD);
// Copy shared variable references (not values) to child
child.linkSharedVariables(parent);
Thread javaThread = Thread.ofVirtual().start(() -> {
PerlRuntime.setCurrent(child);
try {
RuntimeCode.apply(codeRef, args, RuntimeContextType.VOID);
} finally {
PerlRuntime.setCurrent(null);
}
});
return new RuntimeScalar(new PerlThread(javaThread, child));
}Shared variables use a wrapper that synchronizes access:
public class SharedScalar extends RuntimeScalar {
private final ReentrantLock lock = new ReentrantLock();
@Override
public RuntimeScalar set(RuntimeScalar value) {
lock.lock();
try {
return super.set(value);
} finally {
lock.unlock();
}
}
@Override
public String toString() {
lock.lock();
try {
return super.toString();
} finally {
lock.unlock();
}
}
}Virtual Threads (Java 21+): Since PerlOnJava requires Java 22+, we can use
virtual threads (Thread.ofVirtual()) for lightweight thread creation. This is a
significant advantage over native Perl's heavy interpreter-cloning ithreads.
Virtual Thread Pinning Caveat: In Java 21-24, virtual threads become "pinned"
to their carrier OS thread when executing inside a synchronized block or native
method. This means blocking I/O inside synchronized blocks will block the
carrier thread rather than unmounting the virtual thread. JEP 491 ("Synchronize
Virtual Threads without Pinning") addresses this and will be available in future
Java releases. For now, SharedScalar/SharedArray/SharedHash use ReentrantLock
instead of synchronized — this is critical because ReentrantLock uses
LockSupport.park() which properly unmounts virtual threads.
Following JRuby's proven approach:
public static RuntimeScalar fork() {
// Set $! to explain why
WarnDie.warn("fork() is not available on the JVM. " +
"Use threads, system(), or ProcessBuilder instead.");
GlobalVariable.getGlobalVariable("main::!").set("Function not implemented");
return RuntimeScalarCache.scalarUndef; // returns undef
}Rationale: JRuby has shipped NotImplementedError for fork for 20 years.
The community adapted. Most fork usage falls into patterns that have alternatives:
| Perl Pattern | JVM Alternative |
|---|---|
fork + exec |
system(), ProcessBuilder |
fork for parallelism |
threads |
fork for daemon |
Thread + process management |
fork in test harness |
Thread-based test runner |
open(PIPE, "-|") |
ProcessBuilder with piped streams |
For JSR-223 and web server use cases (the most immediately useful):
public class PerlRuntimePool {
private final BlockingQueue<PerlRuntime> pool;
private final PerlRuntime template;
private final int maxPoolSize;
private final int minPoolSize;
private final long maxIdleTimeMs;
public PerlRuntimePool(int size) {
this(size, size, 0); // Default: fixed pool, no idle timeout
}
public PerlRuntimePool(int minSize, int maxSize, long maxIdleTimeMs) {
this.minPoolSize = minSize;
this.maxPoolSize = maxSize;
this.maxIdleTimeMs = maxIdleTimeMs;
this.template = PerlRuntime.createDefault();
this.pool = new ArrayBlockingQueue<>(maxSize);
for (int i = 0; i < minSize; i++) {
pool.add(template.clone(CloneMode.INDEPENDENT));
}
}
public PerlRuntime acquire() throws InterruptedException {
return pool.take();
}
public void release(PerlRuntime rt) {
rt.reset(); // Clear request-scoped state
pool.offer(rt);
}
}Configuration options (exposed via system properties or API):
minPoolSize: Minimum runtimes to keep warm (default: 1)maxPoolSize: Maximum concurrent runtimes (default: CPU cores × 2)maxIdleTimeMs: Time before idle runtimes are destroyed (0 = never)
When threads->create() is called, a new PerlRuntime is created by deep-cloning
the parent. This section details the cloning semantics for each category of
runtime state — this is the most technically challenging part of the implementation.
PerlOnJava has two closure representations, both of which capture lexical variables by Java object reference:
JVM backend: Closures are compiled JVM classes where captured lexicals
become constructor parameters stored as instance fields on the generated class
object (RuntimeCode.codeObject). The RuntimeCode.methodHandle is bound to
this specific object instance.
Interpreter backend: Closures store captures in an explicit array:
// InterpretedCode.java
public final RuntimeBase[] capturedVars; // Closure support (captured from outer scope)Cloning rules:
| What | Clone? | Rationale |
|---|---|---|
| Compiled JVM bytecode / MethodHandle | No — share | Immutable code, safe to share across threads |
InterpretedCode.bytecode / constants / stringPool |
No — share | Immutable arrays, safe to share |
RuntimeCode.codeObject (JVM backend) |
Yes — deep copy | Holds mutable captured variable references |
InterpretedCode.capturedVars (interpreter) |
Yes — deep copy | Holds mutable captured variable references |
Each captured RuntimeScalar / RuntimeArray / RuntimeHash |
Yes — deep copy | New thread gets independent values (ithreads semantics) |
RuntimeCode.stateVariable / stateArray / stateHash |
Yes — deep copy | state variables are per-closure-instance |
Variables marked :shared |
No — share reference | Both threads get a reference to the same SharedScalar wrapper |
Reference fixup challenge: After deep-copying all closures, captured variables that point to globals must be redirected to the child's copies of those globals, not the parent's originals. This requires a two-pass clone:
- Pass 1: Deep-copy all
VariableStorage(globals) and allRuntimeCodeobjects, building an identity map:parentObject → childObject. - Pass 2: Walk all closure captures in the child. For each capture that appears in the identity map, replace it with the child's copy.
This is the same approach Perl 5 uses internally (perl_clone() + pointer fixup
table in sv.c).
Perl 5 calls CLONE($package) on every package that defines it whenever a new
thread is created. This is the mechanism by which modules handle thread safety.
DBI example (Perl 5 behavior):
package DBI;
sub CLONE {
# Called in the NEW thread's context after cloning
# Invalidate all cached database handles — they point to
# the parent's JDBC Connection and are unusable here
# Child must call DBI->connect() again
}Implementation plan:
- After cloning
PerlRuntime, walk all packages in the child's symbol table. - For each package that has a
CLONEsubroutine defined, call it with the package name as argument. - The call happens in the child thread's context (after
PerlRuntime.setCurrent(child)).
External resource handling rules:
| Resource Type | Action in Child | Rationale |
|---|---|---|
JDBC Connection (via DBI) |
Invalidate → undef | Cannot share connections across threads |
File handles (RuntimeIO) |
Clone with independent position | Perl 5 duplicates file descriptors |
| Sockets | Invalidate → undef | OS socket FDs can't be shared |
| Java objects (generic) | Shallow copy (reference) | User's responsibility via CLONE |
| Tied variables | Clone the tie object | CLONE on tie class handles specifics |
Safety net: During clone, any RuntimeScalar whose value wraps a Java object
that implements java.io.Closeable or java.sql.Connection should be set to
undef in the child by default. The module's CLONE method can then reconnect
if needed. This prevents silent use of stale connections.
The InheritanceResolver maintains several static caches that are critical for
performance but dangerous under concurrency:
// InheritanceResolver.java — all static, all mutable
static final Map<String, List<String>> linearizedClassesCache; // MRO linearization
private static final Map<String, RuntimeScalar> methodCache; // method resolution
private static final Map<Integer, OverloadContext> overloadContextCache;
private static final Map<String, List<String>> isaStateCache; // @ISA change detectionAdditionally, RuntimeCode has:
private static final Map<Class<?>, MethodHandle> methodHandleCache; // MethodHandle lookupThread creation policy:
| Cache | Action at Thread Creation | Rationale |
|---|---|---|
methodCache |
Clear in child | Child may modify @ISA independently |
linearizedClassesCache |
Clear in child | Depends on @ISA which is cloned |
overloadContextCache |
Clear in child | Overload resolution depends on stash state |
isaStateCache |
Clear in child | Must detect changes relative to child's @ISA |
methodHandleCache |
Share (read-only) | Maps Class<?> → MethodHandle; immutable after creation |
Clearing is always correct because caches are just performance optimizations. The child pays a cold-cache penalty on first method resolution but is guaranteed correct behavior.
Long-term: Once these caches are moved into PerlRuntime (Phase 5), each
runtime has its own caches and no cross-thread invalidation is needed.
JRuby parallel: JRuby uses volatile writes for method table modifications
and call-site invalidation on structural changes. We follow the same principle:
structural changes (method definition, @ISA modification) invalidate caches.
Complete sequence for threads->create():
| Step | What | How |
|---|---|---|
| 1 | Clone VariableStorage |
Deep-copy all global scalars/arrays/hashes/code refs |
| 2 | Clone closure captures | Deep-copy capturedVars/codeObject in every RuntimeCode |
| 3 | Fixup capture references | Redirect cloned captures to child's globals (identity map) |
| 4 | Handle :shared vars |
Don't clone; wrap in SharedScalar/SharedArray/SharedHash |
| 5 | Invalidate external resources | Set Closeable/Connection-backed scalars to undef |
| 6 | Clear method caches | Clear methodCache, linearizedClassesCache, overloadContextCache, isaStateCache |
| 7 | Clone I/O state | Fresh stdout/stderr/stdin per IOSystem |
| 8 | Clone execution context | Fresh CallerStack, DynamicScope, SpecialBlocks, RegexState |
| 9 | Start JVM thread | Thread.ofVirtual().start(...) with PerlRuntime.setCurrent(child) |
| 10 | Call CLONE($pkg) |
Walk all packages in child, call CLONE where defined |
| 11 | Execute user code | Apply the code reference with provided arguments |
Steps 2-3 (closure capture cloning with reference fixup) are the most technically
challenging part. The identity-map approach from Perl 5's perl_clone() is the
proven solution.
(See Section 5 for detailed thread-creation semantics that inform all phases.)
Before any new features, make the existing single-threaded runtime safe for
the case where multiple JSR-223 eval() calls might overlap:
- Add
synchronizedtoGlobalVariablemap operations - Make
RuntimeIO.stdout/stderr/stdinvolatile - Add
synchronizedtoCallerStackoperations - Document what is and isn't thread-safe
Deliverable: Existing functionality works under a global lock. No new APIs.
Create the PerlRuntime class as a thin shell around existing statics:
- Create
PerlRuntimewithThreadLocal<PerlRuntime> CURRENT - Add
PerlRuntime.current()accessor - Keep all existing static fields — just route through
PerlRuntime.current() - Wire up
PerlLanguageProviderto create and set a defaultPerlRuntime
Key constraint: No behavior change. Every existing test must still pass.
The PerlRuntime is just a new entry point that delegates to existing statics.
Deliverable: PerlRuntime.current() works; JSR-223 can create isolated
contexts (even if they still share state internally).
Move RuntimeIO.stdout/stderr/stdin into PerlRuntime:
// Before: RuntimeIO.stdout
// After: PerlRuntime.current().io.stdout
public class IOSystem {
public RuntimeIO stdout;
public RuntimeIO stderr;
public RuntimeIO stdin;
}This is the smallest subsystem and provides immediate value for JSR-223: each ScriptEngine can redirect I/O independently.
Deliverable: JSR-223 ScriptContext.getWriter() works correctly per-context.
Move CallerStack.callerStack and DynamicVariableManager.variableStack and
RuntimeScalar.dynamicStateStack into PerlRuntime:
public class ExecutionContext {
final List<Object> callerStack = new ArrayList<>();
final Deque<DynamicState> dynamicScope = new ArrayDeque<>();
final Stack<RuntimeScalar> dynamicStateStack = new Stack<>();
}Deliverable: Call stacks and local variables are per-runtime.
Move SpecialBlock.endBlocks/initBlocks/checkBlocks and regex match state
($1, $2, $&, etc.) into PerlRuntime.
Regex State Isolation Detail:
The following regex-related variables must be per-runtime (not per-thread-global):
| Variable | Description | Storage Location |
|---|---|---|
$1, $2, ... $N |
Capture groups | PerlRuntime.regexState.captures |
$& |
Entire matched string | PerlRuntime.regexState.lastMatch |
| `$`` | String before match | PerlRuntime.regexState.priorMatch |
$' |
String after match | PerlRuntime.regexState.postMatch |
$+ |
Last bracket matched | PerlRuntime.regexState.lastBracket |
@- |
Match start positions | PerlRuntime.regexState.matchStarts |
@+ |
Match end positions | PerlRuntime.regexState.matchEnds |
public class RegexState {
final RuntimeArray captures = new RuntimeArray(); // $1, $2, etc.
RuntimeScalar lastMatch; // $&
RuntimeScalar priorMatch; // $`
RuntimeScalar postMatch; // $'
RuntimeScalar lastBracket; // $+
final RuntimeArray matchStarts = new RuntimeArray(); // @-
final RuntimeArray matchEnds = new RuntimeArray(); // @+
}Implementation: Modify RuntimeRegex to store results in PerlRuntime.current().regexState
instead of static fields. The ScalarSpecialVariable class for capture variables must
look up values from the current runtime's regex state.
Deliverable: Regex captures and END blocks don't leak between runtimes.
This is the largest and most invasive change. Move all 13+ maps from
GlobalVariable into PerlRuntime.variables:
public class VariableStorage {
final Map<String, RuntimeScalar> globalVariables = new HashMap<>();
final Map<String, RuntimeArray> globalArrays = new HashMap<>();
final Map<String, RuntimeHash> globalHashes = new HashMap<>();
final Map<String, RuntimeScalar> globalCodeRefs = new HashMap<>();
final Map<String, RuntimeGlob> globalIORefs = new HashMap<>();
final Map<String, RuntimeFormat> globalFormatRefs = new HashMap<>();
// ... etc
}Strategy: Use IDE "Find Usages" on each static field. Replace with
PerlRuntime.current().variables.xxx. Compile, test, repeat.
This is pure mechanical refactoring but touches hundreds of call sites.
Deliverable: Full runtime isolation. Multiple PerlRuntime instances
can run concurrently without interference.
With runtime isolation complete, implement the threads Perl module.
See Section 5 for the full thread-creation cloning protocol.
Core API:
threads->create(\&sub, @args)— clone runtime (Section 5.4) + start JVM threadthreads->join()— wait for completion, return valuethreads->detach()— fire and forgetthreads->list()— list running threadsthreads->self()— current thread objectthreads->tid()— thread ID (unique integer, main thread = 0)threads->yield()— hint to scheduler (maps toThread.yield())threads->exit($status)— exit current thread (see below)
threads::shared API:
9. share($var) / :shared attribute — SharedScalar/SharedArray/SharedHash wrappers
10. lock($var) — mutex on shared variables
11. cond_wait()/cond_signal()/cond_broadcast() — condition variables
12. is_shared($var) — check if variable is shared
Internal callbacks:
13. CLONE($package) callback — called in child after cloning (Section 5.2)
14. Closure capture cloning with reference fixup (Section 5.1)
15. Method cache invalidation in child (Section 5.3)
threads->exit() Semantics:
In Perl 5 ithreads, exit() in a non-main thread exits only that thread, not the
entire process. This must be implemented carefully:
public static void threadsExit(int status) {
PerlThread current = PerlThread.current();
if (current.isMainThread()) {
// Main thread: exit the process
System.exit(status);
} else {
// Worker thread: exit only this thread
current.setExitStatus(status);
throw new ThreadExitException(status);
}
}The ThreadExitException should be caught by the thread wrapper created in
threads->create() and should NOT propagate to the main thread or cause
"Perl exited with active threads" warnings.
Virtual Threads note: Use Thread.ofVirtual() by default. Perl's
ithreads are already "heavy" (full interpreter clone), so making the actual
JVM thread lightweight is a net win.
Timeline Note: This phase is estimated at 4 weeks but may take 6 weeks due to the complexity of closure capture cloning with reference fixup (the most technically challenging part of the entire implementation).
- Implement
PerlRuntimePoolfor web server use cases - Declare
THREADINGparameter as"THREAD-ISOLATED"in ScriptEngineFactory - Update JSR-223 to use pool automatically
- Benchmark: target 1000+ concurrent requests without interference
| Feature | Reason | Alternative |
|---|---|---|
True fork() |
JVM cannot split processes (JRuby proved this) | system(), threads, ProcessBuilder |
| Classloader isolation | Type barriers, memory overhead, debugging pain (Jython lesson) | ThreadLocal + PerlRuntime instances |
| Copy-on-write cloning | JVM has no COW memory; deep copy is the only option | Full clone at thread creation (like Perl 5 ithreads) |
fork() in test harnesses |
Many CPAN test suites use fork | Thread-based test runner (jprove) |
open(PIPE, "-|") |
Requires fork | ProcessBuilder wrapper |
ThreadLocal access is ~2-5ns per call on modern JVMs. With an estimated 10-50 ThreadLocal lookups per Perl statement, overhead is 20-250ns per statement. Perl statement execution is typically 100ns-10µs, so overhead is 0.25-25% — acceptable for the concurrency it enables.
Optimization: Hot paths (like GlobalVariable.getGlobalVariable) can
cache the PerlRuntime reference in a local variable at method entry.
Perl 5 ithreads clone the entire interpreter (~100ms for a complex program). PerlOnJava thread creation involves:
- Deep-copying
VariableStoragemaps (~1-10ms depending on variable count) - Deep-copying closure captures + reference fixup (~0.5-5ms)
- Clearing method caches in child (~0.1ms)
- Calling
CLONEon all packages (~0.1-1ms) - Creating a virtual thread (~0.1ms)
- Linking shared variables (~0.1ms)
Expected: 2-15ms per thread creation — still significantly faster than Perl 5.
Each PerlRuntime holds copies of all global variables. For a typical program
with ~1000 global variables, each runtime adds ~100KB-1MB of memory. This is
comparable to Perl 5 interpreter clones.
- Runtime isolation: create 2 runtimes, modify globals in one, verify other is unchanged
- SharedVariable: concurrent increment from N threads, verify final count
- CallerStack isolation: nested calls in separate runtimes don't interfere
- Closure capture isolation: modify captured lexical in child, verify parent is unchanged
- Method cache isolation: modify
@ISAin child, verify parent's method resolution unchanged
threads->create()with arguments and return valuesthreads::sharedwith lock/cond_wait/cond_signal- JSR-223 concurrent eval from multiple Java threads
- Existing test suite passes unchanged (regression)
- CLONE method called correctly: DBI handle invalidation in child
- Closure with
:sharedvariable: both threads see same value - Nested closures: captured-of-captured variables cloned correctly
- 100 concurrent threads modifying shared variables
- 1000 concurrent JSR-223 eval calls
- Thread creation/destruction churn
- Run
perl5_t/t/op/threads.t(adapted for PerlOnJava) - Run representative CPAN module tests that use threads
- ✅ Multiple
PerlRuntimeinstances run concurrently without interference - ✅
threads->create()andthreads->join()work for basic patterns - ✅
threads::sharedvariables are correctly synchronized - ✅ JSR-223 can handle concurrent
eval()calls - ✅
fork()returns undef with clear error message (not crash) - ✅ No regression in existing single-threaded functionality
- ✅ Thread creation < 10ms for typical programs
- ✅ < 5% performance overhead for single-threaded programs
| Phase | Duration | Deliverable | Risk |
|---|---|---|---|
| 0: Thread-Safety Audit | 2 weeks | Global lock safety | Low |
| 1: PerlRuntime Shell | 4 weeks | ThreadLocal routing | Low |
| 2: De-static I/O | 3 weeks | JSR-223 I/O isolation | Low |
| 3: De-static CallerStack + DynamicScope | 4 weeks | Per-runtime call stacks | Low |
| 4: De-static SpecialBlocks + Regex | 2 weeks | Per-runtime regex/blocks | Low |
| 5: De-static GlobalVariable | 8-12 weeks | Full runtime isolation | High |
| 6: threads Module | 4-6 weeks | Perl use threads works |
Medium |
| 7: Runtime Pool + JSR-223 | 3 weeks | Production web server | Low |
| Total | 30-38 weeks |
Risk Notes:
- Phase 5 is the critical path bottleneck. The estimate of 8 weeks is optimistic; it touches "hundreds of call sites" and requires careful testing. Budget 10-12 weeks.
- Phase 6 closure capture cloning with reference fixup is the most technically challenging part of the implementation. May require additional time for edge cases.
- Phases 0-2 provide immediate value for JSR-223 embedding use cases and should be prioritized.
- Phase 6 is only possible after Phase 5 is complete.
-
Virtual threads vs platform threads: Should
threads->create()default to virtual threads? Virtual threads are cheap but have caveats withsynchronizedblocks (pinning). Test both and choose based on benchmarks. Update: JEP 491 will resolve the pinning issue in future Java releases. Recommend defaulting to virtual threads for I/O-bound Perl code. -
Clone depth: Should thread creation clone
@INC,%INC, loaded modules? Perl 5 clones everything. We could optimize by sharing read-only state. -
Signal handling:
$SIG{INT}etc. — should signals be per-runtime or process-wide? Perl 5 delivers to the main thread. -
Reference fixup performance: The two-pass clone (deep copy + identity map fixup) may be expensive for programs with many closures. Profile Perl 5's
perl_clone()for comparison benchmarks. -
Nested closures: A closure that captures a variable which is itself captured from an outer scope creates chains. The fixup pass must follow these chains transitively. Verify with test cases.
-
Runtime Pool Configuration: For web server scenarios, should expose configurable limits:
maxPoolSize,minPoolSize,maxIdleTime.
- CLONE method priority: Yes, implement in Phase 6. Required for DBI, Storable, and any module that holds external resources. (See Section 5.2)
- Method cache handling: Clear all caches in child at thread creation. Move to per-runtime in Phase 5. (See Section 5.3)
- DBI handles in child: Invalidate automatically for
Closeable/Connectionobjects; module's CLONE method reconnects if needed. (See Section 5.2) exit()semantics: Resolved in Phase 6 design.exit()in a non-main thread exits only that thread viaThreadExitException. (See Section 6)
threads.md— Original threading specification (superseded by this document)fork.md— Fork analysis (superseded by this document)multiplicity.md— Multiplicity design (superseded by this document)jsr223-perlonjava-web.md— Web integration (benefits from Phases 2 and 7)
- JRuby Concurrency Wiki
- Jython Concurrency Guide
- perlthrtut — Perl threading tutorial
- threads — Perl threads module documentation
- threads::shared — Perl shared variable documentation
- JEP 444: Virtual Threads — Java virtual threads specification
- JEP 491: Synchronize Virtual Threads without Pinning — Future fix for synchronized pinning
- Oracle Virtual Threads Guide — Java 21 virtual threads documentation
Review Date: 2026-03-26
Status: APPROVED with minor recommendations
- Strategy is sound: The JRuby-inspired ThreadLocal + PerlRuntime model is correct.
- Global state inventory verified: All static mutable state correctly identified.
- Perl 5 compatibility achievable: ithreads semantics can be faithfully implemented.
- Virtual threads recommended: Use
Thread.ofVirtual()withReentrantLock(notsynchronized). - fork() approach proven: JRuby's 20-year track record validates returning undef.
- Thread creation: 2-15ms (5-50x faster than Perl 5)
- Single-threaded overhead: 0.25-25% from ThreadLocal routing (acceptable)
- Memory per runtime: ~100KB-1MB (comparable to Perl 5)
- Added
threads->exit()semantics (Section 6) - Added
threads->yield()andthreads->tid()to API (Section 6) - Added regex state isolation detail (Section 4, Phase 4)
- Updated virtual thread pinning caveat with JEP 491 reference
- Updated timeline with risk assessment
- Moved resolved questions out of open questions