Skip to content

Rework the position handling in the Crucible code.#3291

Draft
sauclovian-g wants to merge 14 commits into
masterfrom
3116-crucible-positions
Draft

Rework the position handling in the Crucible code.#3291
sauclovian-g wants to merge 14 commits into
masterfrom
3116-crucible-positions

Conversation

@sauclovian-g

Copy link
Copy Markdown
Contributor

See the individual commit messages (some of which are rather lengthy) for details.

Roughly speaking: nearly all the positions the Crucible code was carrying around were the position of the llvm_verify call, which isn't super helpful. This fixes up many of them.

Part of #3116; alas, only part.

This is not done yet; I'm pushing this much so people can start looking at it. Most of what's here should be pretty much stable at this point. I've been considering maybe splitting the "interpreter: Add hacks for overall crucible-setup source positions" commit in two, one part to provide the positions and the other to apply them into the Crucible code, but unless anyone thinks it's specifically worthwhile it's probably not going to happen.

You'll almost certainly want to read it one commit at a time.

Note that we have almost zero test coverage of the messages these positions flow into. Something should be done about that too...

Comment on lines +233 to +236
-- | The caller when working in an override spec.
-- Note: because we don't apply overrides when _in_ overrides,
-- this should never need to have more than one thing in it.
conditionContext :: Maybe Text

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that this is Nothing if you're not in an override spec? If so, worth stating that here in the Haddocks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that would follow from the text, but ok...

deriving (Show, Eq, Ord)
deriving (Show, Eq)

-- | Insert a `conditionContext` value. There shouuld not already be one.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- | Insert a `conditionContext` value. There shouuld not already be one.
-- | Insert a `conditionContext` value. There should not already be one.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

let proofgoal = ProofGoal
{ goalNum = n
, goalType = MS.conditionType md
, goalType = Text.unpack $ MS.conditionType md

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since conditionType is now a Text value, goalType should probably be Text as well...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not in this branch, which is already too big... will get to it eventually

Comment on lines -900 to +901
liftIO $ W4.setCurrentProgramLoc sym loc
liftIO $ W4.setCurrentProgramLoc sym saveW4Loc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we are no longer using loc here, do we need to pass it as an argument to this function? I see that loc is used further down, but I wonder if it would be more consistent to use saveW4Loc there instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not at all sure it's the same position. I don't have a good idea of what gets stuffed into What4's position global when, and there's a reasonably good chance it's not the one we want, so I've been trying to avoid relying on it.

...or actually I'm pretty sure it is not going to be the one we want. srcPos is the interpreter-provided position of the setup do-block, and there is no point where that's been knowingly inserted into the W4 global.


ghost_value ::
Pos.Pos ->
Text ->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth documenting here what purpose this Text argument serves.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

[Lemma] {- ^ overrides -} ->
Bool {- ^ path sat checking -} ->
JVMSetupM () ->
(Pos.WithPos (JVMSetupM ())) ->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parentheses around Pos.WithPos ... here (and elsewhere in this commit) are redundant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what triggered that. Fixed

Comment on lines +533 to 534
Pos {- ^ Source position for the spec -} ->
LLVMCrucibleSetupM () {- ^ Boundary specification -} ->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use WithPos (LLVMCrucibleSetupM ()) here like llvm_verify and friends do?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really; I guess because it's an interpreter-induced hack I wanted to eliminate it right at all the entry points from the interpreter. I guess I could change it to pass setupWithPos through and unpack it once. Do you feel strongly about it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly about it. My main motivation was to consolidate the number of arguments that we pass to this function, but we could just as well pass two arguments instead of one.

-- | Get `csSourcePos` with `csExecFunc` as a `W4.ProgramLoc`.
--
-- This is not a lens because it's not an invertible transformation.
csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor style nit pick:

-csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc
+csSourceLoc :: CrucibleMethodSpecIR ext -> W4.ProgramLoc

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops

enforceDisjointness ::
MIRCrucibleContext -> W4.ProgramLoc -> StateSpec -> OverrideMatcher MIR w ()
enforceDisjointness cc loc ss =
MIRCrucibleContext -> Pos.Pos -> Text -> StateSpec -> OverrideMatcher MIR w ()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just pass the CrucibleMethodSpec as an argument (and extract its csSourcePos/csExecFunc fields in the body of the function) rather than passing Pos/Text as separate arguments. Generally, I'm nervous about passing random Text values as arguments with no explanation as to what they do, as it requires some staring at the surrounding code to figure out what purpose they serve.

Similar comments for enforeCompleteSubstitution.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The names of the arguments are at least some explanation... but then it'll be clearer which position we're using, so maybe it's better that way... will think about it.

SetupValue {- ^ object -} ->
Text {- ^ field name -} ->
Maybe SetupValue {- ^ field value -} ->
Text ->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some Haddocks describing what this Text argument is (and elsewhere), especially since it's possible to conflate this with another Text argument that this function takes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, speak up if you see more

By conventions, it should be called "ppPrePost" (not "stateCond", not
even sure what that meant tbh), and return Text rather than String.

As often happens, this led to a bunch of old ratty error messages
using it so I fixed a bunch of them up.
Add haddocks, remove unused Ord instance. NFCI
The conditionContext element of ConditionMetadata holds a string
describing the caller when working in an override spec. Change
it to Maybe Text instead of being a String that's usually "".

I believe it to be impossible to ever get more than one caller,
because we should never be trying to apply an override while already
in one. If that turns out to be possible somehow, it'll now panic.

This also fixes an accidental loop in prettyConditionMetadata, which
at least in principle could have caused crux-mir-comp to issue
infinitely many spaces when printing a method spec.
When adding assumptions from the postcondition of an override, use the
position from the ConditionMetadata of each entry, not the position
stored in the OverrideMatcher.

The latter is, under the best of circumstances, the position of the
entire override spec, or (worse) possibly the position of the spec
whose function called the override; however, in practice it's just the
current llvm/jvm/mir_verify call site, which isn't useful at all.

The former is probably also that in practice currently, but at least
in principle it could be more accurate.
In addition to not containing particularly useful information (the
location of the current llvm/jvm/mir_verify call), after the previous
fix it is not actually used for anything any more.
While passing through.

(It's a freeform string that reflects something about what kind of
assertion we have. Ideally it would be more structured, but that will
require a pile of machinery and doesn't seem worthwhile at the
moment.)
When checking that allocations are disjoint in the LLVM code, we
temporarily set the What4 library's idea of current position to the
position of the condition we're looking at, then afterwards set it to
some random position passed in. Set it back to what it was instead.

This probably makes no difference for the moment, but I want to be
able to change what the passed-in position is without affecting
unspecified other things happening under the covers in What4.
Now it comes through as "llvm_ghost_value" instead of just
"ghost_value". Update test2049.
Digging through the Crucible code reveals that it uses the (runtime)
position of the llvm/jvm/mir_verify call for all sorts of things that
should be source positions. Some of these can be gotten, approximately
or closely enough, by calling "getPosition" in the implementation of
SAWScript builtins in the various setup monads, saving the results,
and then actually using them instead of just reaching for the closest
available thing.

However, quite a few aren't really specific to any one thing in the
setup block, for example the checks when matching overrides that
allocations are disjoint in the pattern expected. For these we really
want the source position of the whole setup do-block.

So add some hacks to the SAWScript interpreter to make it provide such
positions. The core of this consists of three extra FromValue
instances, one for each kind of setup block, that extract the position
as well as the monadic SetupValue. However, there are some headaches:

   - Those instances can't just be for a pair of position and
     Haskell-level monadic value, because there's already an instance
     for ordinary pairs. It needs to be its own type. It turns out
     we have an otherwise unused WithPos type that does this well
     enough.

   - It turns out the Crucible code requires the position _before_ it
     evaluates the setup block. This means that the position that we
     produce into the V*Setup value when we evaluate setup blocks
     isn't any use. Instead we have to extract a position from an
     unevaluated monadic value. This irritatingly requires adding
     positions to a couple value constructors that didn't previously
     need it.

It's possible that eventually the Crucible code can be improved to not
need an overall source position for whole setup blocks. Alternatively,
it's possible that it can be fixed to not demand the source position
before evaluating the setup block. In either case we shouldn't
hesitate to simplify or remove these hacks.

With a mismatched return value we now get the position of the
LLVMSetup block instead of the position of the llvm_verify call, which
is not really what we want (it should not be using the overall source
position for this) but a step forward of sorts. Update the test output
in test_llvm_errors.

Also, adding this logic revealed that the remote API server just sets
all positions to 'PosInternal "SAWServer"' and thus all errors coming
back from inside the Crucible code (and many other places) will just
say that instead of reporting anything useful. This is not ideal.
(note that these position types need maintenance now that we have
SAWSupport.Position, but I don't want to get into it just now)
Carry around both a Position and a current function name, that being
loosely equivalent to the previous W4.ProgramLoc. Lower to
W4.ProgramLoc when needed. This preserves the additional info in our
positions for some error paths.

Also use the SAWScript builtin name systematically as the function
name, instead of making things up or using internal function names
that don't mean anything to the user.

With this and the other recent changes, sometimes we can now get a
real source position (e.g. "foo.saw:6:10-15") and a real execution
function (e.g. "llvm_verify") and they'll end up printed together
("foo.saw:6:10-15 in llvm_verify") even though the llvm_verify code is
probably nowhere near line 6. This is not ideal. However, it beats
printing just the location of the llvm_verify call and losing the
original source position, or printing wrong or made-up information
instead of the real positions.

In the long run we shouldn't combine source positions with current
execution positions, and probably this should be handled by teaching
What4 to not insist on having a function name. But we should
systematize the error handling a lot more before taking such steps.
- Fix a few places that were using the whole MethodSpec position
  (csSourcePos/csSourceLoc) instead of the position of a single
  condition from an available ConditionMetadata (conditionLoc).

- Leave notes about other places that are using the whole MethodSpec
  position and probably shouldn't be in the long run. If fixing one
  of these, please be sure to fix all three instances of it...

- Also fix some places that should have been using csSourceLoc but had
  instead inlined its definition. It hadn't been written in the first
  iteration of all this and some of the places got missed in the
  initial sweep.
(related to or adjacent to; basically this is stuff that got fixed up
while passing through doing the rest of the changes)
@sauclovian-g sauclovian-g force-pushed the 3116-crucible-positions branch from ad08537 to f46b78c Compare June 4, 2026 01:48
@sauclovian-g

Copy link
Copy Markdown
Contributor Author

That force-push rebases on master

@sauclovian-g sauclovian-g force-pushed the 3116-crucible-positions branch from f46b78c to 3a42c62 Compare June 4, 2026 01:49
@sauclovian-g

Copy link
Copy Markdown
Contributor Author

...and this one addresses most of Ryan's first-round comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants