The || operator in Perl propagates context differently from most other operators: its left operand is always evaluated in scalar context, while its right operand is evaluated in the caller's context. In list context this asymmetry is challenging to implement correctly in bytecode.
# From t/op/list.t line 119
$x = 666;
@a = ($x == 12345 || (1,2,3));
# Expected: @a = (1, 2, 3)
# Got: @a = (3) # Only last element

@a = (0 || (1,2,3)); # Returns (1,2,3) - list from RHS
@a = (1 || (1,2,3)); # Returns (1) - scalar from LHS

- Left side: Always evaluated in scalar context (needed for boolean test)
- Right side: Evaluated in caller's context (list or scalar)
- Result: Depends on which branch is taken
  - If LHS is true: returns LHS value (scalar)
  - If LHS is false: returns RHS value (in caller's context)
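The context rules listed above can be modeled in a few lines of plain Java. This is only an illustrative sketch: `Expr`, `scalar`, `commaList`, `isTrue` and `or` are hypothetical names invented for the example, not part of the compiler or its runtime, and the truthiness check covers only the value kinds used here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrContextDemo {
    enum Context { SCALAR, LIST }

    // Hypothetical expression model: evaluating in a context yields a list of values.
    interface Expr {
        List<Object> eval(Context ctx);
    }

    // A single scalar value: one element regardless of context.
    static Expr scalar(Object v) {
        return ctx -> Collections.singletonList(v);
    }

    // A parenthesized comma list: every element in LIST context,
    // only the last element in SCALAR context (undef, modeled as null, if empty).
    static Expr commaList(Object... vs) {
        return ctx -> {
            if (ctx == Context.LIST) return Arrays.asList(vs);
            Object last = vs.length == 0 ? null : vs[vs.length - 1];
            return Collections.singletonList(last);
        };
    }

    // Perl-style truthiness for the value kinds used here:
    // undef (null), numeric zero, "0" and "" are false.
    static boolean isTrue(Object v) {
        if (v == null) return false;
        if (v instanceof Number) return ((Number) v).doubleValue() != 0.0;
        String s = v.toString();
        return !s.isEmpty() && !s.equals("0");
    }

    static List<Object> or(Expr lhs, Expr rhs, Context caller) {
        Object l = lhs.eval(Context.SCALAR).get(0); // LHS: always scalar context
        if (isTrue(l)) {
            return Collections.singletonList(l);    // LHS true: the scalar LHS value
        }
        return rhs.eval(caller);                    // LHS false: RHS in caller's context
    }

    public static void main(String[] args) {
        Expr rhs = commaList(1, 2, 3);
        System.out.println(or(scalar(0), rhs, Context.LIST));   // [1, 2, 3]
        System.out.println(or(scalar(1), rhs, Context.LIST));   // [1]
        System.out.println(or(scalar(0), rhs, Context.SCALAR)); // [3]
    }
}
```

The last call shows where the observed `(3)` comes from: when the RHS is evaluated in scalar context, the comma list collapses to its final element.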
At the merge point (endLabel), the stack must have consistent types:
// Branch 1 (LHS true):
node.left.accept(SCALAR); // Stack: [RuntimeScalar]
goto endLabel;
// Branch 2 (LHS false):
node.right.accept(LIST); // Stack: [RuntimeList] ❌ TYPE MISMATCH!
endLabel:
// Stack type must be consistent!

When we tried to fix this by using caller's context for RHS:
node.right.accept(emitterVisitor); // Use caller's context

Result: JVM VerifyError - "Expecting a stackmap frame at branch target"
- The JVM requires both branches to produce the same stack type
- RuntimeScalar vs RuntimeList causes verification failure

node.right.accept(emitterVisitor); // Use caller's context
Problem: Stack type mismatch at merge point causes VerifyError
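The merge-point constraint can be illustrated with a toy model that compares the simulated stack each branch leaves behind. This is a deliberate simplification under stated assumptions: it demands exact equality, as the description above does, whereas real JVM verification works with stack map frames and assignability. `StackType`, `stackAfter` and `consistentAtMerge` are hypothetical names used only for this sketch.

```java
import java.util.Arrays;
import java.util.List;

class MergePointDemo {
    // Simulated operand-stack entry types, named after the runtime classes in the text.
    enum StackType { RUNTIME_SCALAR, RUNTIME_LIST }

    // The stack a branch leaves behind when it reaches the merge label.
    static List<StackType> stackAfter(StackType... pushed) {
        return Arrays.asList(pushed);
    }

    // Simplified rule: both incoming stacks must be identical at the merge point.
    static boolean consistentAtMerge(List<StackType> a, List<StackType> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<StackType> lhsTrue = stackAfter(StackType.RUNTIME_SCALAR);
        List<StackType> rhsList = stackAfter(StackType.RUNTIME_LIST);
        // LHS left as a scalar, RHS emitted in list context: the stacks differ.
        System.out.println(consistentAtMerge(lhsTrue, rhsList)); // false

        // If the LHS is converted to a list first, both branches agree.
        List<StackType> lhsConverted = stackAfter(StackType.RUNTIME_LIST);
        System.out.println(consistentAtMerge(lhsConverted, rhsList)); // true
    }
}
```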
// Pseudo-code:
if (context == LIST) {
    // Convert scalar to list if needed
    mv.visitMethodInsn("scalarToList");
}
Problem: Complex, requires context-aware conversion logic
// Always return RuntimeBase, let caller convert
node.left.accept(emitterVisitor); // Returns RuntimeBase
node.right.accept(emitterVisitor); // Returns RuntimeBase
// Caller handles getList() or getScalar()
Problem: Requires refactoring entire operator emission system
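To make the common-supertype idea concrete, here is a minimal plain-Java sketch. `Base`, `Scalar` and `ListValue` are hypothetical stand-ins, not the real RuntimeBase, RuntimeScalar or RuntimeList, and the scalar view of a list (its last element) is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class CommonBaseDemo {
    // Hypothetical stand-in for a shared supertype; not the real RuntimeBase.
    static abstract class Base {
        abstract List<Object> getList();
        abstract Object getScalar();
    }

    static final class Scalar extends Base {
        final Object value;
        Scalar(Object value) { this.value = value; }
        @Override List<Object> getList() { return Collections.singletonList(value); }
        @Override Object getScalar() { return value; }
    }

    static final class ListValue extends Base {
        final List<Object> elements;
        ListValue(Object... elements) { this.elements = Arrays.asList(elements); }
        @Override List<Object> getList() { return elements; }
        // Assumption for this sketch: the scalar view of a list is its last element.
        @Override Object getScalar() {
            return elements.isEmpty() ? null : elements.get(elements.size() - 1);
        }
    }

    // Perl-style truthiness for the value kinds used here.
    static boolean isTrue(Object v) {
        if (v == null) return false;
        if (v instanceof Number) return ((Number) v).doubleValue() != 0.0;
        String s = v.toString();
        return !s.isEmpty() && !s.equals("0");
    }

    // Both branches are typed as Base, so there is one static type at the merge.
    // The Supplier keeps the RHS lazy, preserving short-circuit evaluation.
    static Base or(Scalar lhs, Supplier<Base> rhs) {
        return isTrue(lhs.value) ? lhs : rhs.get();
    }

    public static void main(String[] args) {
        Base falseLhs = or(new Scalar(0), () -> new ListValue(1, 2, 3));
        Base trueLhs = or(new Scalar(1), () -> new ListValue(1, 2, 3));
        // The caller picks the view it needs.
        System.out.println(falseLhs.getList()); // [1, 2, 3]
        System.out.println(trueLhs.getList());  // [1]
    }
}
```

The cost noted above is visible even in this small model: every consumer of the result has to call `getList()` or `getScalar()` itself, which is why the approach would touch the whole emission path rather than one operator.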
The fundamental issue is that Perl's || operator violates the principle of uniform context propagation:
- Most operators propagate context uniformly to all operands
- || has asymmetric context: LHS always scalar, RHS depends on caller
This asymmetry is difficult to express in statically-typed bytecode where branch merge points require type consistency.
Label endLabel = new Label();
Label convertLabel = new Label();
// Evaluate LHS in scalar context
node.left.accept(emitterVisitor.with(SCALAR));
mv.visitInsn(DUP);
mv.visitMethodInsn("getBoolean");
mv.visitJumpInsn(compareOpcode, convertLabel);
// LHS false: evaluate RHS in caller's context
mv.visitInsn(POP);
node.right.accept(emitterVisitor); // Caller's context
mv.visitJumpInsn(GOTO, endLabel);
// LHS true: convert scalar to caller's context if needed
convertLabel:
if (emitterVisitor.context == LIST) {
    // Convert RuntimeScalar to RuntimeList
    mv.visitMethodInsn("scalarToList");
}
endLabel:
// Stack now has consistent type

// Return RuntimeBase (superclass of both Scalar and List)
// Let the caller extract the appropriate type
node.left.accept(emitterVisitor.with(SCALAR));
// ... boolean test ...
node.right.accept(emitterVisitor.with(SCALAR)); // Always scalar
endLabel:
// Stack: [RuntimeBase]
// Caller converts: result.getList() or result.getScalar()

// Transform at AST level:
// @a = (COND || LIST)
// becomes:
// @a = COND ? (COND_VALUE) : LIST
// This makes the context explicit in the AST

Option A (Context Conversion at Merge), the first of the three sketches above, is the most correct approach:
- Pros:
  - Matches Perl semantics exactly
  - No AST transformation needed
  - Localized to EmitLogicalOperator
- Cons:
  - Requires implementing scalarToList() conversion
  - Slightly more complex bytecode
- Implementation:
static void emitLogicalOperator(EmitterVisitor emitterVisitor, BinaryOperatorNode node, int compareOpcode, String getBoolean) {
    MethodVisitor mv = emitterVisitor.ctx.mv;
    Label endLabel = new Label();
    Label convertLabel = new Label();
    RuntimeContextType callerContext = emitterVisitor.context;

    // LHS always scalar (for boolean test)
    node.left.accept(emitterVisitor.with(SCALAR));
    mv.visitInsn(DUP);
    mv.visitMethodInsn(INVOKEVIRTUAL, "org/perlonjava/runtimetypes/RuntimeBase", getBoolean, "()Z", false);
    mv.visitJumpInsn(compareOpcode, convertLabel);

    // LHS false: evaluate RHS in caller's context
    mv.visitInsn(POP);
    node.right.accept(emitterVisitor.with(callerContext));
    mv.visitJumpInsn(GOTO, endLabel);

    // LHS true: convert to caller's context if needed
    mv.visitLabel(convertLabel);
    if (callerContext == RuntimeContextType.LIST) {
        // Convert scalar to single-element list
        mv.visitMethodInsn(INVOKEVIRTUAL, "org/perlonjava/runtimetypes/RuntimeScalar", "scalarToList", "()Lorg/perlonjava/runtimetypes/RuntimeList;", false);
    }
    mv.visitLabel(endLabel);

    EmitOperator.handleVoidContext(emitterVisitor);
}
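The control flow this emitter is meant to produce in list context can be modeled in plain Java, which may help when reasoning about the merge point without reading bytecode. This is a sketch under assumptions: `Scalar` and `ListValue` are hypothetical stand-ins for the runtime classes, `orInListContext` is an invented name, and the truthiness check is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class MergeConversionDemo {
    // Hypothetical stand-ins; the real runtime classes have far more behavior.
    static final class ListValue {
        final List<Object> elements = new ArrayList<>();
        ListValue(Object... items) { elements.addAll(Arrays.asList(items)); }
    }

    static final class Scalar {
        final Object value;
        Scalar(Object value) { this.value = value; }

        // Mirrors the scalarToList() idea: wrap this scalar's value in a one-element list.
        ListValue scalarToList() { return new ListValue(value); }
    }

    // Perl-style truthiness for the value kinds used here.
    static boolean isTrue(Object v) {
        if (v == null) return false;
        if (v instanceof Number) return ((Number) v).doubleValue() != 0.0;
        String s = v.toString();
        return !s.isEmpty() && !s.equals("0");
    }

    // List-context form of ||: both branches produce a ListValue,
    // so the value at the merge point has a single static type.
    static ListValue orInListContext(Scalar lhs, Supplier<ListValue> rhs) {
        if (isTrue(lhs.value)) {
            return lhs.scalarToList(); // LHS true: convert scalar to a list
        }
        return rhs.get();              // LHS false: RHS evaluated in list context
    }

    public static void main(String[] args) {
        System.out.println(orInListContext(new Scalar(0), () -> new ListValue(1, 2, 3)).elements); // [1, 2, 3]
        System.out.println(orInListContext(new Scalar(1), () -> new ListValue(1, 2, 3)).elements); // [1]
    }
}
```

Because the RHS is passed as a `Supplier`, it is only evaluated when the LHS is false, matching the short-circuit jump in the emitter.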
Add to RuntimeScalar.java:
/**
* Convert a scalar to a single-element list.
* Used for context conversion in logical operators.
*/
public RuntimeList scalarToList() {
    RuntimeList list = new RuntimeList();
    list.elements.add(this);
    return list;
}

# Test cases to verify:
@a = (0 || (1,2,3)); # Should be (1,2,3)
@a = (1 || (1,2,3)); # Should be (1)
@a = ("" || (1,2,3)); # Should be (1,2,3)
@a = ("x" || (1,2,3)); # Should be ("x")
# Edge cases:
@a = (0 || ()); # Should be ()
@a = (1 || ()); # Should be (1)
@a = (0 || (1)); # Should be (1)

- Affected operators: ||, &&, //, or, and (all use emitLogicalOperator)
- Test files: t/op/list.t (test 39), potentially others
- Risk: Medium - requires careful bytecode generation and testing

OPEN - Requires implementation of Option A (Context Conversion at Merge)

- Similar issue may exist in ternary operator ? :
- May affect other short-circuit operators

- Test file: t/op/list.t line 119
- Code: EmitLogicalOperator.java line 98
- Perl docs: perlop - "Logical Or"