Rework the position handling in the Crucible code. by sauclovian-g · Pull Request #3291 · GaloisInc/saw-script

sauclovian-g · 2026-06-03T06:29:21Z

See the individual commit messages (some of which are rather lengthy) for details.

Roughly speaking: nearly all the positions the Crucible code was carrying around were the position of the llvm_verify call, which isn't super helpful. This fixes up many of them.

Part of #3116; alas, only part.

This is not done yet; I'm pushing this much so people can start looking at it. Most of what's here should be pretty much stable at this point. I've been considering maybe splitting the "interpreter: Add hacks for overall crucible-setup source positions" commit in two, one part to provide the positions and the other to apply them into the Crucible code, but unless anyone thinks it's specifically worthwhile it's probably not going to happen.

You'll almost certainly want to read it one commit at a time.

Note that we have almost zero test coverage of the messages these positions flow into. Something should be done about that too...

RyanGlScott · 2026-06-03T20:07:24Z

+    -- | The caller when working in an override spec.
+    --   Note: because we don't apply overrides when _in_ overrides,
+    --   this should never need to have more than one thing in it.
+    conditionContext :: Maybe Text


I assume that this is Nothing if you're not in an override spec? If so, worth stating that here in the Haddocks.

I thought that would follow from the text, but ok...

RyanGlScott · 2026-06-03T20:07:46Z

- deriving (Show, Eq, Ord)
+ deriving (Show, Eq)
+
+-- | Insert a `conditionContext` value. There shouuld not already be one.


Suggested change

-- | Insert a `conditionContext` value. There shouuld not already be one.

-- | Insert a `conditionContext` value. There should not already be one.

RyanGlScott · 2026-06-03T20:11:52Z

       let proofgoal = ProofGoal
                       { goalNum  = n
-                       , goalType = MS.conditionType md
+                       , goalType = Text.unpack $ MS.conditionType md


Since conditionType is now a Text value, goalType should probably be Text as well...

Not in this branch, which is already too big... will get to it eventually

RyanGlScott · 2026-06-03T20:14:57Z

-     liftIO $ W4.setCurrentProgramLoc sym loc
+     liftIO $ W4.setCurrentProgramLoc sym saveW4Loc


Now that we are no longer using loc here, do we need to pass it as an argument to this function? I see that loc is used further down, but I wonder if it would be more consistent to use saveW4Loc there instead.

I am not at all sure it's the same position. I don't have a good idea of what gets stuffed into What4's position global when, and there's a reasonably good chance it's not the one we want, so I've been trying to avoid relying on it.

...or actually I'm pretty sure it is not going to be the one we want. srcPos is the interpreter-provided position of the setup do-block, and there is no point where that's been knowingly inserted into the W4 global.

RyanGlScott · 2026-06-03T20:16:10Z


 ghost_value ::
+  Pos.Pos ->
+  Text ->


Worth documenting here what purpose this Text argument serves.

RyanGlScott · 2026-06-03T20:20:25Z

  [Lemma] {- ^ overrides -} ->
  Bool {- ^ path sat checking -} ->
-  JVMSetupM () ->
+  (Pos.WithPos (JVMSetupM ())) ->


The parentheses around Pos.WithPos ... here (and elsewhere in this commit) are redundant.

Not sure what triggered that. Fixed

RyanGlScott · 2026-06-03T20:32:08Z

+  Pos                   {- ^ Source position for the spec -} ->
  LLVMCrucibleSetupM () {- ^ Boundary specification -} ->


Is there a reason not to use WithPos (LLVMCrucibleSetupM ()) here like llvm_verify and friends do?

Not really; I guess because it's an interpreter-induced hack I wanted to eliminate it right at all the entry points from the interpreter. I guess I could change it to pass setupWithPos through and unpack it once. Do you feel strongly about it?

I don't feel strongly about it. My main motivation was to consolidate the number of arguments that we pass to this function, but we could just as well pass two arguments instead of one.

RyanGlScott · 2026-06-03T20:34:54Z

+-- | Get `csSourcePos` with `csExecFunc` as a `W4.ProgramLoc`.
+--
+--   This is not a lens because it's not an invertible transformation.
+csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc


Minor style nit pick:

-csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc +csSourceLoc :: CrucibleMethodSpecIR ext -> W4.ProgramLoc

RyanGlScott · 2026-06-03T20:40:42Z

 enforceDisjointness ::
-  MIRCrucibleContext -> W4.ProgramLoc -> StateSpec -> OverrideMatcher MIR w ()
-enforceDisjointness cc loc ss =
+  MIRCrucibleContext -> Pos.Pos -> Text -> StateSpec -> OverrideMatcher MIR w ()


I wonder if we should just pass the CrucibleMethodSpec as an argument (and extract its csSourcePos/csExecFunc fields in the body of the function) rather than passing Pos/Text as separate arguments. Generally, I'm nervous about passing random Text values as arguments with no explanation as to what they do, as it requires some staring at the surrounding code to figure out what purpose they serve.

Similar comments for enforeCompleteSubstitution.

The names of the arguments are at least some explanation... but then it'll be clearer which position we're using, so maybe it's better that way... will think about it.

RyanGlScott · 2026-06-03T20:43:20Z

  SetupValue {- ^ object -} ->
  Text {- ^ field name -} ->
  Maybe SetupValue {- ^ field value -} ->
+  Text ->


Please add some Haddocks describing what this Text argument is (and elsewhere), especially since it's possible to conflate this with another Text argument that this function takes.

Fixed, speak up if you see more

By conventions, it should be called "ppPrePost" (not "stateCond", not even sure what that meant tbh), and return Text rather than String. As often happens, this led to a bunch of old ratty error messages using it so I fixed a bunch of them up.

Add haddocks, remove unused Ord instance. NFCI

The conditionContext element of ConditionMetadata holds a string describing the caller when working in an override spec. Change it to Maybe Text instead of being a String that's usually "". I believe it to be impossible to ever get more than one caller, because we should never be trying to apply an override while already in one. If that turns out to be possible somehow, it'll now panic. This also fixes an accidental loop in prettyConditionMetadata, which at least in principle could have caused crux-mir-comp to issue infinitely many spaces when printing a method spec.

When adding assumptions from the postcondition of an override, use the position from the ConditionMetadata of each entry, not the position stored in the OverrideMatcher. The latter is, under the best of circumstances, the position of the entire override spec, or (worse) possibly the position of the spec whose function called the override; however, in practice it's just the current llvm/jvm/mir_verify call site, which isn't useful at all. The former is probably also that in practice currently, but at least in principle it could be more accurate.

In addition to not containing particularly useful information (the location of the current llvm/jvm/mir_verify call), after the previous fix it is not actually used for anything any more.

While passing through. (It's a freeform string that reflects something about what kind of assertion we have. Ideally it would be more structured, but that will require a pile of machinery and doesn't seem worthwhile at the moment.)

When checking that allocations are disjoint in the LLVM code, we temporarily set the What4 library's idea of current position to the position of the condition we're looking at, then afterwards set it to some random position passed in. Set it back to what it was instead. This probably makes no difference for the moment, but I want to be able to change what the passed-in position is without affecting unspecified other things happening under the covers in What4.

Now it comes through as "llvm_ghost_value" instead of just "ghost_value". Update test2049.

Digging through the Crucible code reveals that it uses the (runtime) position of the llvm/jvm/mir_verify call for all sorts of things that should be source positions. Some of these can be gotten, approximately or closely enough, by calling "getPosition" in the implementation of SAWScript builtins in the various setup monads, saving the results, and then actually using them instead of just reaching for the closest available thing. However, quite a few aren't really specific to any one thing in the setup block, for example the checks when matching overrides that allocations are disjoint in the pattern expected. For these we really want the source position of the whole setup do-block. So add some hacks to the SAWScript interpreter to make it provide such positions. The core of this consists of three extra FromValue instances, one for each kind of setup block, that extract the position as well as the monadic SetupValue. However, there are some headaches: - Those instances can't just be for a pair of position and Haskell-level monadic value, because there's already an instance for ordinary pairs. It needs to be its own type. It turns out we have an otherwise unused WithPos type that does this well enough. - It turns out the Crucible code requires the position _before_ it evaluates the setup block. This means that the position that we produce into the V*Setup value when we evaluate setup blocks isn't any use. Instead we have to extract a position from an unevaluated monadic value. This irritatingly requires adding positions to a couple value constructors that didn't previously need it. It's possible that eventually the Crucible code can be improved to not need an overall source position for whole setup blocks. Alternatively, it's possible that it can be fixed to not demand the source position before evaluating the setup block. In either case we shouldn't hesitate to simplify or remove these hacks. With a mismatched return value we now get the position of the LLVMSetup block instead of the position of the llvm_verify call, which is not really what we want (it should not be using the overall source position for this) but a step forward of sorts. Update the test output in test_llvm_errors. Also, adding this logic revealed that the remote API server just sets all positions to 'PosInternal "SAWServer"' and thus all errors coming back from inside the Crucible code (and many other places) will just say that instead of reporting anything useful. This is not ideal.

(note that these position types need maintenance now that we have SAWSupport.Position, but I don't want to get into it just now)

Carry around both a Position and a current function name, that being loosely equivalent to the previous W4.ProgramLoc. Lower to W4.ProgramLoc when needed. This preserves the additional info in our positions for some error paths. Also use the SAWScript builtin name systematically as the function name, instead of making things up or using internal function names that don't mean anything to the user. With this and the other recent changes, sometimes we can now get a real source position (e.g. "foo.saw:6:10-15") and a real execution function (e.g. "llvm_verify") and they'll end up printed together ("foo.saw:6:10-15 in llvm_verify") even though the llvm_verify code is probably nowhere near line 6. This is not ideal. However, it beats printing just the location of the llvm_verify call and losing the original source position, or printing wrong or made-up information instead of the real positions. In the long run we shouldn't combine source positions with current execution positions, and probably this should be handled by teaching What4 to not insist on having a function name. But we should systematize the error handling a lot more before taking such steps.

- Fix a few places that were using the whole MethodSpec position (csSourcePos/csSourceLoc) instead of the position of a single condition from an available ConditionMetadata (conditionLoc). - Leave notes about other places that are using the whole MethodSpec position and probably shouldn't be in the long run. If fixing one of these, please be sure to fix all three instances of it... - Also fix some places that should have been using csSourceLoc but had instead inlined its definition. It hadn't been written in the first iteration of all this and some of the places got missed in the initial sweep.

(related to or adjacent to; basically this is stuff that got fixed up while passing through doing the rest of the changes)

sauclovian-g · 2026-06-04T01:49:00Z

That force-push rebases on master

sauclovian-g · 2026-06-04T01:49:44Z

...and this one addresses most of Ryan's first-round comments

sauclovian-g requested review from RyanGlScott, brianhuffman, chathhorn-galois, danmatichuk and qsctr June 3, 2026 06:29

sauclovian-g marked this pull request as draft June 3, 2026 06:51

sauclovian-g mentioned this pull request Jun 3, 2026

crucible-mir-comp uses what4 getCurrentProgramLoc but not setCurrentProgramLoc #3286

Closed

RyanGlScott reviewed Jun 3, 2026

View reviewed changes

sauclovian-g added 14 commits June 3, 2026 21:46

crucible: Tidy ConditionMetadata type.

f5e783a

Add haddocks, remove unused Ord instance. NFCI

crucible: Remove osLocation from OverrideMatcher state.

ad60531

In addition to not containing particularly useful information (the location of the current llvm/jvm/mir_verify call), after the previous fix it is not actually used for anything any more.

crucible: Fix the function names attached to ghost value assertions.

954c187

Now it comes through as "llvm_ghost_value" instead of just "ghost_value". Update test2049.

Add a FunctionOnlyPos to saw-script/saw-central positions.

8866e15

(note that these position types need maintenance now that we have SAWSupport.Position, but I don't want to get into it just now)

CHANGES: Position reporting in the verification summaries has improved.

3773ebd

crucible: Assorted other minor tidyups related to position handling.

3a42c62

(related to or adjacent to; basically this is stuff that got fixed up while passing through doing the rest of the changes)

sauclovian-g force-pushed the 3116-crucible-positions branch from ad08537 to f46b78c Compare June 4, 2026 01:48

sauclovian-g force-pushed the 3116-crucible-positions branch from f46b78c to 3a42c62 Compare June 4, 2026 01:49

	-- \| Insert a `conditionContext` value. There shouuld not already be one.
	-- \| Insert a `conditionContext` value. There should not already be one.

		liftIO $ W4.setCurrentProgramLoc sym loc
		liftIO $ W4.setCurrentProgramLoc sym saveW4Loc

		Pos {- ^ Source position for the spec -} ->
		LLVMCrucibleSetupM () {- ^ Boundary specification -} ->

Uh oh!

Conversation

sauclovian-g commented Jun 3, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sauclovian-g commented Jun 4, 2026

Uh oh!

sauclovian-g commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants