This document explores implementation approaches for perl -d style debugging in PerlOnJava, supporting traditional Perl debugger semantics including breakpoints, single-stepping, variable inspection, and the standard perl5db.pl debugger.
When Perl is invoked with `-d`, the interpreter enables special debugging hooks:
- `$ENV{PERL5DB}` injection - Before the first line, Perl inserts: `BEGIN { require 'perl5db.pl' }`
- Source line storage - `@{"_<$filename"}` holds source lines; values are magical (non-zero = breakable)
- Breakpoint hash - `%{"_<$filename"}` stores breakpoints/actions keyed by line number
- Subroutine tracking - `%DB::sub` maps subname → "filename:startline-endline"
- DB::DB() hook - Called before each executable statement when `$DB::trace`, `$DB::single`, or `$DB::signal` is true
- DB::sub() hook - All subroutine calls are routed through `&DB::sub` with `$DB::sub` identifying the target
- @DB::args - When `caller()` is called from package DB, args are copied here
| Bit | Value | Meaning |
|---|---|---|
| 0x01 | 1 | Debug subroutine enter/exit |
| 0x02 | 2 | Line-by-line debugging (call DB::DB) |
| 0x04 | 4 | Switch off optimizations |
| 0x08 | 8 | Preserve data for inspection |
| 0x10 | 16 | Keep sub definition line info |
| 0x20 | 32 | Start with single-step on |
| 0x80 | 128 | Report goto &sub |
| 0x100 | 256 | Informative eval "file" names |
| 0x200 | 512 | Informative anonymous sub names |
| 0x400 | 1024 | Save source lines |
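As a rough illustration of how the Java side might test these `$^P` bits, here is a minimal sketch. The class and method names (`PerlDebugFlags`, `wantsLineHook`, `wantsSubHook`) are hypothetical and not part of PerlOnJava; only the bit values come from the table above.

```java
// Hypothetical helper: the names are illustrative, only the bit values
// are taken from the $^P table in this document.
final class PerlDebugFlags {
    static final int SUB      = 0x01;  // debug subroutine enter/exit
    static final int LINE     = 0x02;  // line-by-line debugging (call DB::DB)
    static final int NOOPT    = 0x04;  // switch off optimizations
    static final int INTER    = 0x08;  // preserve data for inspection
    static final int SUBLINE  = 0x10;  // keep sub definition line info
    static final int SINGLE   = 0x20;  // start with single-step on
    static final int GOTO     = 0x80;  // report goto &sub
    static final int NAMEEVAL = 0x100; // informative eval "file" names
    static final int NAMEANON = 0x200; // informative anonymous sub names
    static final int SAVESRC  = 0x400; // save source lines

    /** True when the given bit is set in the $^P value. */
    static boolean isSet(int perlDb, int flag) {
        return (perlDb & flag) != 0;
    }

    /** Whether statement-level hooks (DB::DB) are requested. */
    static boolean wantsLineHook(int perlDb) {
        return isSet(perlDb, LINE);
    }

    /** Whether subroutine calls should be routed through DB::sub. */
    static boolean wantsSubHook(int perlDb) {
        return isSet(perlDb, SUB);
    }
}
```

With `$^P` set to `LINE | SINGLE`, the statement hook is requested while subroutine routing is not.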
Concept: When -d is used, force interpreter mode and inject DEBUG opcodes at statement boundaries. All debugger logic lives in the DEBUG opcode handler - no changes to the interpreter loop itself.
    ┌─────────────────────────────────────────────────────────────────┐
    │                             -d flag                             │
    │                                │                                │
    │                                ▼                                │
    │                 ┌─────────────────────────────┐                 │
    │                 │  Set global debugMode flag  │                 │
    │                 │  Force --interpreter mode   │                 │
    │                 └─────────────────────────────┘                 │
    │                                │                                │
    │                                ▼                                │
    │                 ┌─────────────────────────────┐                 │
    │                 │  Parser                     │                 │
    │                 │  - Store source lines       │                 │
    │                 │  - Mark breakable lines     │                 │
    │                 │  - Add debug info to AST    │                 │
    │                 └─────────────────────────────┘                 │
    │                                │                                │
    │                                ▼                                │
    │                 ┌─────────────────────────────┐                 │
    │                 │  BytecodeCompiler           │                 │
    │                 │  - Emit DEBUG opcode at     │                 │
    │                 │    each statement boundary  │                 │
    │                 └─────────────────────────────┘                 │
    │                                │                                │
    │                                ▼                                │
    │                 ┌─────────────────────────────┐                 │
    │                 │  BytecodeInterpreter        │                 │
    │                 │  (NO changes to loop)       │                 │
    │                 │                             │                 │
    │                 │  case DEBUG:                │                 │
    │                 │    DebugHooks.debug(...)    │                 │
    │                 └─────────────────────────────┘                 │
    │                                │                                │
    │                                ▼                                │
    │                 ┌─────────────────────────────┐                 │
    │                 │  DebugHooks.java            │                 │
    │                 │  - Check breakpoint table   │                 │
    │                 │  - Check $DB::single/trace  │                 │
    │                 │  - Call DB::DB() if needed  │                 │
    │                 │  - Handle step/next/cont    │                 │
    │                 └─────────────────────────────┘                 │
    └─────────────────────────────────────────────────────────────────┘
- Global static flag: `DebugState.debugMode` - checked only at startup to decide compilation mode
- Force interpreter mode: All code runs through the interpreter when debugging (simpler, single debug implementation)
- DEBUG opcode: Emitted by `BytecodeCompiler` at statement boundaries when `debugMode` is true
- No interpreter loop changes: DEBUG is just another opcode - all logic lives in its handler
- Breakpoint table: `DebugState.breakpoints` - a `Set<String>` of `"file:line"` keys checked inside the DEBUG handler
- Zero overhead when not debugging: No DEBUG opcodes emitted, no checks in hot path
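To make the DEBUG instruction more concrete, the sketch below writes and reads back an instruction that carries a 2-byte file index and a 4-byte line number. The big-endian byte order, the class name `DebugOperandSketch`, and the helper names are assumptions for illustration; the real `BytecodeCompiler` and `BytecodeInterpreter` helpers may differ.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: the operand widths follow this document's design,
// the byte order and helper names are assumptions.
final class DebugOperandSketch {
    static final byte DEBUG = (byte) 0xFE;

    /** Encodes one DEBUG instruction: opcode, 2-byte file index, 4-byte line. */
    static byte[] encode(int fileIndex, int line) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(DEBUG);                    // write(int) keeps only the low 8 bits
        out.write((fileIndex >>> 8) & 0xFF);
        out.write(fileIndex & 0xFF);
        out.write((line >>> 24) & 0xFF);
        out.write((line >>> 16) & 0xFF);
        out.write((line >>> 8) & 0xFF);
        out.write(line & 0xFF);
        return out.toByteArray();
    }

    /** Reads an unsigned 16-bit value starting at pc. */
    static int readShort(byte[] code, int pc) {
        return ((code[pc] & 0xFF) << 8) | (code[pc + 1] & 0xFF);
    }

    /** Reads a 32-bit value starting at pc. */
    static int readInt(byte[] code, int pc) {
        return ((code[pc] & 0xFF) << 24)
             | ((code[pc + 1] & 0xFF) << 16)
             | ((code[pc + 2] & 0xFF) << 8)
             | (code[pc + 3] & 0xFF);
    }
}
```

Masking each byte with `0xFF` matters because Java bytes are signed; without it, a line number above 127 would decode incorrectly.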
Opcodes.java:

    public class Opcodes {
        // ... existing opcodes ...
        public static final byte DEBUG = (byte) 0xFE; // High value, rare collision
        // Format: DEBUG [fileIndex:2] [line:4]
    }

BytecodeCompiler.java (AST → interpreter bytecode):

    public void visitStatement(StatementNode node) {
        if (DebugState.debugMode) {
            // Emit DEBUG opcode before each statement
            emit(Opcodes.DEBUG);
            emitShort(getFileIndex(node.getFilename()));
            emitInt(node.getLine());
        }
        // ... emit normal statement bytecode ...
    }

BytecodeInterpreter.java (just add case, no loop changes):

    switch (opcode) {
        // ... existing cases ...
        case Opcodes.DEBUG:
            int fileIdx = readShort(bytecode, pc); pc += 2;
            int line = readInt(bytecode, pc); pc += 4;
            DebugHooks.debug(code, registers, fileIdx, line);
            break;
    }

DebugHooks.java (all debugger logic here):

    public class DebugHooks {
        public static void debug(InterpretedCode code, RuntimeBase[] registers,
                                 int fileIdx, int line) {
            String file = code.stringPool[fileIdx];
            String key = file + ":" + line;

            // Fast path: no debugging active
            boolean stepping = DebugState.single || DebugState.trace || DebugState.signal;
            boolean atBreakpoint = DebugState.breakpoints.contains(key);
            if (!stepping && !atBreakpoint) {
                return;
            }

            // Set up DB:: variables
            GlobalVariable.getGlobalScalar("DB::filename").set(file);
            GlobalVariable.getGlobalScalar("DB::line").set(line);

            // A false breakpoint condition suppresses the stop only when
            // nothing else (single/trace/signal) asked for one.
            // evalCondition() is a placeholder for evaluating the Perl expression.
            if (atBreakpoint && !stepping) {
                String condition = DebugState.breakpointConditions.get(key);
                if (condition != null && !evalCondition(condition)) {
                    return;
                }
            }

            // Call DB::DB()
            callDbDb();
        }

        private static void callDbDb() {
            // Look up &DB::DB and call it
            RuntimeScalar dbdb = GlobalVariable.getGlobalCodeRef("DB::DB");
            if (dbdb.getDefinedBoolean()) {
                RuntimeCode.apply(dbdb, "", new RuntimeArray(), RuntimeContextType.VOID);
            }
        }
    }

DebugState.java:

    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class DebugState {
        // Set at startup, controls compilation mode
        public static boolean debugMode = false;

        // Runtime debug flags (set by Perl code via $DB::single etc.)
        public static volatile boolean single = false;
        public static volatile boolean trace = false;
        public static volatile boolean signal = false;

        // Breakpoint table: set of "file:line" keys
        public static final Set<String> breakpoints = ConcurrentHashMap.newKeySet();

        // Conditional breakpoints: "file:line" -> "condition_expr"
        public static final Map<String, String> breakpointConditions = new ConcurrentHashMap<>();

        // Source lines: filename -> String[]
        public static final Map<String, String[]> sourceLines = new ConcurrentHashMap<>();

        // Breakable lines: filename -> Set<Integer>
        public static final Map<String, Set<Integer>> breakableLines = new ConcurrentHashMap<>();
    }

Recommended: `org.perlonjava.runtime.debugger`

    src/main/java/org/perlonjava/runtime/debugger/
    ├── DebugState.java        # Flags, breakpoints, source storage
    ├── DebugHooks.java        # debug() method, DB::DB() calls
    └── SourceLineArray.java   # Magical @{"_<$filename"} implementation

- Clean separation: All debug logic in `DebugHooks`, interpreter stays simple
- Zero overhead: When not debugging, no DEBUG opcodes exist
- Easy breakpoints: Just check `breakpoints.contains(key)` in DEBUG handler
- Reusable for JVM: Later, JVM backend can call same `DebugHooks.debug()` method
- Single implementation: Only interpreter needs debugging, one code path to maintain
When adding debug support to the JVM backend, emit calls to the same `DebugHooks.debug()`:

    // In EmitStatement.java (JVM codegen)
    if (DebugState.debugMode) {
        // Push arguments in the order of the method descriptor:
        // (InterpretedCode, RuntimeBase[], int, int)
        mv.visitVarInsn(ALOAD, codeRef);
        mv.visitVarInsn(ALOAD, registers);
        mv.visitLdcInsn(fileIndex);
        mv.visitLdcInsn(line);
        mv.visitMethodInsn(INVOKESTATIC, "org/perlonjava/runtime/debugger/DebugHooks",
                "debug", "(Lorg/perlonjava/backend/bytecode/InterpretedCode;[Lorg/perlonjava/runtime/runtimetypes/RuntimeBase;II)V", false);
    }

Same debugger, both backends.
Concept: At compile time, inject Perl code that implements the debugging hooks without modifying the Java runtime.
How it works:
- Source line storage: During parsing, populate `@{"_<$filename"}` with source lines
- Statement instrumentation: Before each statement, conditionally call `DB::DB()`, matching Perl's rule that the hook runs before the statement executes:

      # Original:
      $x = foo();

      # Instrumented:
      DB::DB() if $DB::single || $DB::trace || $DB::signal;
      $x = foo();

- Subroutine wrapping: Replace all sub calls with routing through `DB::sub`:

      # Original:
      foo($arg);

      # Instrumented:
      do { local $DB::sub = 'main::foo'; &DB::sub($arg) };

Advantages:
- No Java changes required
- Works with existing `perl5db.pl`
Disadvantages:
- Performance overhead (extra code per statement)
- Complex AST transformation
Concept: Leverage the Java Debug Wire Protocol (JDWP) for IDE integration.
Current state: Documented in dev/design/jdwp_debugger.md - requires line number tables in generated bytecode.
Limitations:
- Debugs at Java level, not Perl level
- Variable inspection shows Java objects, not Perl semantics
- No Perl debugger command language
Concept: Create a clean Java API for debugging that can be consumed by multiple frontends.

    public interface DebugEventListener {
        void onStatementBegin(String file, int line, RuntimeScalar[] locals);
        void onSubroutineEnter(String name, RuntimeArray args);
        void onSubroutineExit(String name, RuntimeList result);
        void onBreakpoint(String file, int line);
        void onException(RuntimeScalar error);
    }

    // Declared as an interface so the body-less signatures are valid Java
    public interface DebugController {
        void setBreakpoint(String file, int line, String condition);
        void removeBreakpoint(String file, int line);
        void stepInto();
        void stepOver();
        void stepOut();
        void resume();
        RuntimeScalar evaluate(String expr);
        Map<String, RuntimeScalar> getLocals();
        List<StackFrame> getCallStack();
    }

Frontends:
- `perl5db.pl` compatibility layer
- Custom CLI debugger
- IDE plugins via Debug Adapter Protocol (DAP)
- Web-based debugger UI
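To show how several frontends could consume one event stream, here is a minimal self-contained sketch. It is deliberately trimmed down: `StatementListener` and `DebugEventBus` are hypothetical stand-ins for the `DebugEventListener` / `DebugController` idea, and plain `String`/`int` values replace the PerlOnJava runtime types so the example compiles on its own.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, reduced listener: only statement and breakpoint events.
interface StatementListener {
    void onStatementBegin(String file, int line);
    void onBreakpoint(String file, int line);
}

// Hypothetical dispatcher that fans events out to every registered frontend.
final class DebugEventBus {
    private final List<StatementListener> listeners = new CopyOnWriteArrayList<>();
    private final Set<String> breakpoints = new HashSet<>();

    void addListener(StatementListener listener) {
        listeners.add(listener);
    }

    void setBreakpoint(String file, int line) {
        breakpoints.add(file + ":" + line);
    }

    /** Called once per executed statement; notifies all listeners. */
    void statement(String file, int line) {
        boolean hit = breakpoints.contains(file + ":" + line);
        for (StatementListener listener : listeners) {
            listener.onStatementBegin(file, line);
            if (hit) {
                listener.onBreakpoint(file, line);
            }
        }
    }
}
```

A CLI debugger and a DAP adapter could each register their own listener on the same bus, which is the main appeal of this option.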
- Parse `-d` flag in `ArgumentParser.java`
  - Set `DebugState.debugMode = true`
  - Force `--interpreter` mode
  - Set `$^P` appropriately
- Create `DebugState.java` with debug flags and breakpoint tables
- Create `DebugHooks.java` with minimal `debug()` method
- Add DEBUG opcode to `Opcodes.java`
- Emit DEBUG opcode in `BytecodeCompiler.java` at statement boundaries
- Handle DEBUG opcode in `BytecodeInterpreter.java` (single case statement)
- Store source lines during parsing into `DebugState.sourceLines`
- Track breakable lines (statements vs. comments/blank lines)
- Implement `@{"_<$filename"}` as magical array backed by `DebugState`
- Implement `%{"_<$filename"}` for breakpoint storage
- Implement `$DB::single`, `$DB::trace`, `$DB::signal` as special tied variables
- Implement `$DB::filename`, `$DB::line` (set by DEBUG opcode)
- Implement `@DB::args` support in `caller()`
- Implement `%DB::sub` for subroutine location tracking
- Test with minimal debugger:

      sub DB::DB {
          print "At $DB::filename:$DB::line\n";
          my $cmd = <STDIN>;
          $DB::single = 1 if $cmd =~ /^s/;  # step
      }

- Test breakpoints: Setting via `%{"_<$filename"}`
- Test stepping: `$DB::single` control flow
- Inject `BEGIN { require 'perl5db.pl' }` when `-d` is used
- Implement `DB::sub()` routing for subroutine tracing
- Test with actual perl5db.pl
- Fix any missing features
- Document differences/limitations
- Emit `DebugHooks.debug()` calls in `EmitStatement.java`
- Share same `DebugState` and `DebugHooks` classes
- Allow mixed interpreted/compiled debugging
Global Java statics in `DebugState.java`:
- `debugMode` - set once at startup, controls opcode emission
- `single`, `trace`, `signal` - volatile, modified by Perl code at runtime
- `breakpoints` - `ConcurrentHashMap.newKeySet()` for thread-safety
Inside DEBUG opcode handler:

    // Fast path: single check for common case
    if (!DebugState.single && !DebugState.trace && !DebugState.signal
            && !DebugState.breakpoints.contains(key)) {
        return; // No-op, minimal overhead
    }

When debugging is active but there is no breakpoint at this line, the check is still very fast (one hash lookup).
Implement as special tied array:
- String value: source line text
- Numeric context: 0 for non-breakable, non-zero (line number or address) for breakable
- Populated by parser during lexing

    public class SourceLineArray extends RuntimeArray {
        private String filename;

        @Override
        public RuntimeScalar get(int index) {
            String text = DebugState.sourceLines.get(filename)[index];
            boolean breakable = DebugState.breakableLines.get(filename).contains(index);
            return new MagicalSourceLine(text, breakable ? index : 0);
        }
    }

In `BytecodeCompiler`, emit DEBUG opcode when visiting:
- `StatementNode` (most statements)
- `BlockNode` (entering blocks)
- NOT inside expressions (only at statement level)
This matches Perl's behavior where breakpoints are only valid on statement-starting lines.
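The link between statement-starting lines and the dual-valued entries of `@{"_<$filename"}` can be sketched as follows. This is an illustration only: `SourceLineSketch` is a hypothetical stand-in for the `MagicalSourceLine` idea, it assumes 1-based line indexing with slot 0 unused, and it uses the line number itself as the non-zero "breakable" value.

```java
import java.util.List;
import java.util.Set;

// Hypothetical dual-valued entry: text for string context,
// a number for numeric context (0 means "not breakable").
final class SourceLineSketch {
    final String text;
    final int breakableValue;

    SourceLineSketch(String text, int breakableValue) {
        this.text = text;
        this.breakableValue = breakableValue;
    }

    boolean isBreakable() {
        return breakableValue != 0;
    }

    /**
     * Builds the per-file table. statementLines holds the 1-based line
     * numbers where the compiler emitted a DEBUG opcode.
     */
    static SourceLineSketch[] build(List<String> lines, Set<Integer> statementLines) {
        SourceLineSketch[] table = new SourceLineSketch[lines.size() + 1];
        table[0] = new SourceLineSketch("", 0); // slot 0 unused (assumption)
        for (int lineNo = 1; lineNo <= lines.size(); lineNo++) {
            int value = statementLines.contains(lineNo) ? lineNo : 0;
            table[lineNo] = new SourceLineSketch(lines.get(lineNo - 1), value);
        }
        return table;
    }
}
```

A debugger frontend could then refuse to set a breakpoint on any line whose entry is not breakable, such as a comment or a blank line.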
- Unit tests for debug state management
- Integration tests with minimal DB::DB
- Compatibility tests with perl5db.pl commands:
  - `n` (next), `s` (step), `c` (continue)
  - `b` (breakpoint), `B` (delete breakpoint)
  - `p` (print), `x` (dump)
  - `l` (list), `v` (view)
  - `T` (stack trace)
- Regression tests ensuring debug mode doesn't break normal execution
- Eval debugging: `eval "string"` creates dynamic source - need to store in `@{"_<(eval N)"}` and emit DEBUG opcodes. The interpreter already handles eval, so this should work naturally.
- DB::sub routing: Should all subroutine calls go through `DB::sub()` when debugging? This is needed for subroutine enter/exit tracing but adds overhead. Could be a separate `$^P` bit.
- Profiling integration: The DEBUG opcode infrastructure could support profiling (like Devel::NYTProf) by tracking time between DEBUG calls. Consider adding optional timing hooks.
- Remote debugging: perl5db.pl's `RemotePort` option - may need socket I/O support in the debugger. Lower priority.
- Step over/out: Implementing `n` (next) and `r` (return) requires tracking call depth. Add `DebugState.stepOverDepth` to skip DEBUG calls until returning to target depth.
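The call-depth idea behind step over/out can be sketched like this. It is a minimal illustration, not the actual implementation: `StepControllerSketch`, its `Mode` enum, and the `targetDepth` field (playing the role of the proposed `stepOverDepth`) are hypothetical, and the caller is assumed to know the current call depth, for example from the size of a subroutine stack.

```java
// Hypothetical step controller: decides whether the DEBUG handler should
// stop at the current statement, based on call depth. Breakpoints would
// be checked separately.
final class StepControllerSketch {
    enum Mode { RUN, STEP_INTO, STEP_OVER, STEP_OUT }

    private Mode mode = Mode.RUN;
    private int targetDepth;

    /** 's': stop at the very next statement, at any depth. */
    void stepInto() {
        mode = Mode.STEP_INTO;
    }

    /** 'n': stop at the next statement at the same depth or shallower. */
    void stepOver(int currentDepth) {
        mode = Mode.STEP_OVER;
        targetDepth = currentDepth;
    }

    /** 'r': stop once the current subroutine has returned. */
    void stepOut(int currentDepth) {
        mode = Mode.STEP_OUT;
        targetDepth = currentDepth - 1;
    }

    /** 'c': run until a breakpoint. */
    void resume() {
        mode = Mode.RUN;
    }

    /** Called from the DEBUG handler with the current call depth. */
    boolean shouldStop(int currentDepth) {
        switch (mode) {
            case STEP_INTO:
                return true;
            case STEP_OVER:
            case STEP_OUT:
                return currentDepth <= targetDepth;
            default:
                return false;
        }
    }
}
```

Statements executed inside a callee have a depth greater than `targetDepth`, so they are skipped until control returns to the frame where the command was issued.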
- Phase 1: Infrastructure (complete)
  - DEBUG opcode (376) in `Opcodes.java`
  - `-d` flag in `ArgumentParser.java` sets `debugMode=true`, forces interpreter
  - `BytecodeCompiler` emits DEBUG at statement boundaries when `debugMode=true`
  - `BytecodeInterpreter` handles DEBUG opcode, calls `DebugHooks.debug()`
  - `DebugState.java` - global debug flags, breakpoints, source storage
  - `DebugHooks.java` - command loop with n/s/c/q/l/b/B/L/h commands
  - Source line extraction from tokens (`ErrorMessageUtil.extractSourceLines()`)
  - `l` command shows source with `==>` current line marker
  - Compile-time statements (`use`/`no`) correctly skipped via `compileTimeOnly` annotation
  - Infrastructure nodes in BEGIN blocks skipped via `skipDebug` annotation
- Phase 2: Source Line Support (partially complete, 2024-03-10)
  - Store source lines during parsing
  - Skip compile-time statements (use/no)
  - Display subroutine names when stepping into code (e.g., `main::foo(/file:line)`)
    - Added `subNameStack` in `DebugState` for tracking current subroutine
    - Modified `RuntimeCode.apply()` to track subroutine entry/exit (zero overhead when not debugging)
    - Uses `NameNormalizer.normalizeVariableName()` for consistent name formatting
  - Track breakable lines (statements vs comments)
  - Implement `@{"_<$filename"}` magical array
  - Implement `%{"_<$filename"}` for breakpoint storage
| Command | Description |
|---|---|
| `n` | Next (step over) |
| `s` | Step into |
| `r` | Return (step out) |
| `c [line]` | Continue (optionally to line) |
| `q` | Quit |
| `l [range]` | List source (`l 10-20` or `l 15`) |
| `.` | Show current line |
| `b [line]` | Set breakpoint |
| `B [line]` | Delete breakpoint (`B *` = all) |
| `L` | List breakpoints |
| `T` | Stack trace |
| `p expr` | Print expression |
| `x expr` | Dump expression |
| `h` | Help |
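As one example of how a command from this table might be parsed, the sketch below handles the explicit-argument forms of `l` (`l 10-20` and `l 15`). The class name `ListCommandSketch` and the choice to return `null` for anything else (including a bare `l`, which a real command loop would handle separately) are assumptions, not the behavior of the actual `DebugHooks` command loop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "l N" and "l N-M".
final class ListCommandSketch {
    private static final Pattern RANGE =
            Pattern.compile("l\\s+(\\d{1,9})(?:\\s*-\\s*(\\d{1,9}))?");

    /**
     * Returns {first, last} as 1-based line numbers, or null when the
     * input is not an "l" command with an explicit line or range.
     */
    static int[] parse(String input) {
        Matcher m = RANGE.matcher(input.trim());
        if (!m.matches()) {
            return null;
        }
        int first = Integer.parseInt(m.group(1));
        int last = (m.group(2) != null) ? Integer.parseInt(m.group(2)) : first;
        if (last < first) {
            return null; // reject reversed ranges such as "l 20-10"
        }
        return new int[] { first, last };
    }
}
```

Limiting each number to nine digits keeps `Integer.parseInt` from overflowing on malformed input.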
- Phase 2 completion:
  - Implement `@{"_<$filename"}` magical array
  - Track breakable lines
- Phase 3: Debug Variables
  - `$DB::single`, `$DB::trace`, `$DB::signal` as tied variables (partially done - synced from Java)
  - `@DB::args` support in `caller()` (done)
  - `%DB::sub` for subroutine location tracking (done)
- Phase 5: perl5db.pl Compatibility
  - Test with actual perl5db.pl

- Package scoping after block-local packages: After a `{ package Foo; ... }` block ends, the debugger display may show `Foo::` instead of `main::` for subsequent statements. This is a package scoping issue in the interpreter, not debugger-specific.

- `perldoc perldebug` - User documentation
- `perldoc perldebguts` - Implementation details
- `perldoc perldebtut` - Tutorial
- `perl5/lib/perl5db.pl` - Reference implementation (~10,000 lines)
- `dev/design/jdwp_debugger.md` - Existing JDWP notes
- `dev/design/interpreter.md` - Interpreter architecture