Rework the position handling in the Crucible code.#3291
Conversation
| -- | The caller when working in an override spec. | ||
| -- Note: because we don't apply overrides when _in_ overrides, | ||
| -- this should never need to have more than one thing in it. | ||
| conditionContext :: Maybe Text |
There was a problem hiding this comment.
I assume that this is Nothing if you're not in an override spec? If so, worth stating that here in the Haddocks.
There was a problem hiding this comment.
I thought that would follow from the text, but ok...
| deriving (Show, Eq, Ord) | ||
| deriving (Show, Eq) | ||
|
|
||
| -- | Insert a `conditionContext` value. There shouuld not already be one. |
There was a problem hiding this comment.
| -- | Insert a `conditionContext` value. There shouuld not already be one. | |
| -- | Insert a `conditionContext` value. There should not already be one. |
| let proofgoal = ProofGoal | ||
| { goalNum = n | ||
| , goalType = MS.conditionType md | ||
| , goalType = Text.unpack $ MS.conditionType md |
There was a problem hiding this comment.
Since conditionType is now a Text value, goalType should probably be Text as well...
There was a problem hiding this comment.
Not in this branch, which is already too big... will get to it eventually
| liftIO $ W4.setCurrentProgramLoc sym loc | ||
| liftIO $ W4.setCurrentProgramLoc sym saveW4Loc |
There was a problem hiding this comment.
Now that we are no longer using loc here, do we need to pass it as an argument to this function? I see that loc is used further down, but I wonder if it would be more consistent to use saveW4Loc there instead.
There was a problem hiding this comment.
I am not at all sure it's the same position. I don't have a good idea of what gets stuffed into What4's position global when, and there's a reasonably good chance it's not the one we want, so I've been trying to avoid relying on it.
...or actually I'm pretty sure it is not going to be the one we want. srcPos is the interpreter-provided position of the setup do-block, and there is no point where that's been knowingly inserted into the W4 global.
|
|
||
| ghost_value :: | ||
| Pos.Pos -> | ||
| Text -> |
There was a problem hiding this comment.
Worth documenting here what purpose this Text argument serves.
| [Lemma] {- ^ overrides -} -> | ||
| Bool {- ^ path sat checking -} -> | ||
| JVMSetupM () -> | ||
| (Pos.WithPos (JVMSetupM ())) -> |
There was a problem hiding this comment.
The parentheses around Pos.WithPos ... here (and elsewhere in this commit) are redundant.
There was a problem hiding this comment.
Not sure what triggered that. Fixed
| Pos {- ^ Source position for the spec -} -> | ||
| LLVMCrucibleSetupM () {- ^ Boundary specification -} -> |
There was a problem hiding this comment.
Is there a reason not to use WithPos (LLVMCrucibleSetupM ()) here like llvm_verify and friends do?
There was a problem hiding this comment.
Not really; I guess because it's an interpreter-induced hack I wanted to eliminate it right at all the entry points from the interpreter. I guess I could change it to pass setupWithPos through and unpack it once. Do you feel strongly about it?
There was a problem hiding this comment.
I don't feel strongly about it. My main motivation was to consolidate the number of arguments that we pass to this function, but we could just as well pass two arguments instead of one.
| -- | Get `csSourcePos` with `csExecFunc` as a `W4.ProgramLoc`. | ||
| -- | ||
| -- This is not a lens because it's not an invertible transformation. | ||
| csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc |
There was a problem hiding this comment.
Minor style nit pick:
-csSourceLoc:: CrucibleMethodSpecIR ext -> W4.ProgramLoc
+csSourceLoc :: CrucibleMethodSpecIR ext -> W4.ProgramLoc| enforceDisjointness :: | ||
| MIRCrucibleContext -> W4.ProgramLoc -> StateSpec -> OverrideMatcher MIR w () | ||
| enforceDisjointness cc loc ss = | ||
| MIRCrucibleContext -> Pos.Pos -> Text -> StateSpec -> OverrideMatcher MIR w () |
There was a problem hiding this comment.
I wonder if we should just pass the CrucibleMethodSpec as an argument (and extract its csSourcePos/csExecFunc fields in the body of the function) rather than passing Pos/Text as separate arguments. Generally, I'm nervous about passing random Text values as arguments with no explanation as to what they do, as it requires some staring at the surrounding code to figure out what purpose they serve.
Similar comments for enforeCompleteSubstitution.
There was a problem hiding this comment.
The names of the arguments are at least some explanation... but then it'll be clearer which position we're using, so maybe it's better that way... will think about it.
| SetupValue {- ^ object -} -> | ||
| Text {- ^ field name -} -> | ||
| Maybe SetupValue {- ^ field value -} -> | ||
| Text -> |
There was a problem hiding this comment.
Please add some Haddocks describing what this Text argument is (and elsewhere), especially since it's possible to conflate this with another Text argument that this function takes.
There was a problem hiding this comment.
Fixed, speak up if you see more
By conventions, it should be called "ppPrePost" (not "stateCond", not even sure what that meant tbh), and return Text rather than String. As often happens, this led to a bunch of old ratty error messages using it so I fixed a bunch of them up.
Add haddocks, remove unused Ord instance. NFCI
The conditionContext element of ConditionMetadata holds a string describing the caller when working in an override spec. Change it to Maybe Text instead of being a String that's usually "". I believe it to be impossible to ever get more than one caller, because we should never be trying to apply an override while already in one. If that turns out to be possible somehow, it'll now panic. This also fixes an accidental loop in prettyConditionMetadata, which at least in principle could have caused crux-mir-comp to issue infinitely many spaces when printing a method spec.
When adding assumptions from the postcondition of an override, use the position from the ConditionMetadata of each entry, not the position stored in the OverrideMatcher. The latter is, under the best of circumstances, the position of the entire override spec, or (worse) possibly the position of the spec whose function called the override; however, in practice it's just the current llvm/jvm/mir_verify call site, which isn't useful at all. The former is probably also that in practice currently, but at least in principle it could be more accurate.
In addition to not containing particularly useful information (the location of the current llvm/jvm/mir_verify call), after the previous fix it is not actually used for anything any more.
While passing through. (It's a freeform string that reflects something about what kind of assertion we have. Ideally it would be more structured, but that will require a pile of machinery and doesn't seem worthwhile at the moment.)
When checking that allocations are disjoint in the LLVM code, we temporarily set the What4 library's idea of current position to the position of the condition we're looking at, then afterwards set it to some random position passed in. Set it back to what it was instead. This probably makes no difference for the moment, but I want to be able to change what the passed-in position is without affecting unspecified other things happening under the covers in What4.
Now it comes through as "llvm_ghost_value" instead of just "ghost_value". Update test2049.
Digging through the Crucible code reveals that it uses the (runtime)
position of the llvm/jvm/mir_verify call for all sorts of things that
should be source positions. Some of these can be gotten, approximately
or closely enough, by calling "getPosition" in the implementation of
SAWScript builtins in the various setup monads, saving the results,
and then actually using them instead of just reaching for the closest
available thing.
However, quite a few aren't really specific to any one thing in the
setup block, for example the checks when matching overrides that
allocations are disjoint in the pattern expected. For these we really
want the source position of the whole setup do-block.
So add some hacks to the SAWScript interpreter to make it provide such
positions. The core of this consists of three extra FromValue
instances, one for each kind of setup block, that extract the position
as well as the monadic SetupValue. However, there are some headaches:
- Those instances can't just be for a pair of position and
Haskell-level monadic value, because there's already an instance
for ordinary pairs. It needs to be its own type. It turns out
we have an otherwise unused WithPos type that does this well
enough.
- It turns out the Crucible code requires the position _before_ it
evaluates the setup block. This means that the position that we
produce into the V*Setup value when we evaluate setup blocks
isn't any use. Instead we have to extract a position from an
unevaluated monadic value. This irritatingly requires adding
positions to a couple value constructors that didn't previously
need it.
It's possible that eventually the Crucible code can be improved to not
need an overall source position for whole setup blocks. Alternatively,
it's possible that it can be fixed to not demand the source position
before evaluating the setup block. In either case we shouldn't
hesitate to simplify or remove these hacks.
With a mismatched return value we now get the position of the
LLVMSetup block instead of the position of the llvm_verify call, which
is not really what we want (it should not be using the overall source
position for this) but a step forward of sorts. Update the test output
in test_llvm_errors.
Also, adding this logic revealed that the remote API server just sets
all positions to 'PosInternal "SAWServer"' and thus all errors coming
back from inside the Crucible code (and many other places) will just
say that instead of reporting anything useful. This is not ideal.
(note that these position types need maintenance now that we have SAWSupport.Position, but I don't want to get into it just now)
Carry around both a Position and a current function name, that being
loosely equivalent to the previous W4.ProgramLoc. Lower to
W4.ProgramLoc when needed. This preserves the additional info in our
positions for some error paths.
Also use the SAWScript builtin name systematically as the function
name, instead of making things up or using internal function names
that don't mean anything to the user.
With this and the other recent changes, sometimes we can now get a
real source position (e.g. "foo.saw:6:10-15") and a real execution
function (e.g. "llvm_verify") and they'll end up printed together
("foo.saw:6:10-15 in llvm_verify") even though the llvm_verify code is
probably nowhere near line 6. This is not ideal. However, it beats
printing just the location of the llvm_verify call and losing the
original source position, or printing wrong or made-up information
instead of the real positions.
In the long run we shouldn't combine source positions with current
execution positions, and probably this should be handled by teaching
What4 to not insist on having a function name. But we should
systematize the error handling a lot more before taking such steps.
- Fix a few places that were using the whole MethodSpec position (csSourcePos/csSourceLoc) instead of the position of a single condition from an available ConditionMetadata (conditionLoc). - Leave notes about other places that are using the whole MethodSpec position and probably shouldn't be in the long run. If fixing one of these, please be sure to fix all three instances of it... - Also fix some places that should have been using csSourceLoc but had instead inlined its definition. It hadn't been written in the first iteration of all this and some of the places got missed in the initial sweep.
(related to or adjacent to; basically this is stuff that got fixed up while passing through doing the rest of the changes)
ad08537 to
f46b78c
Compare
|
That force-push rebases on master |
f46b78c to
3a42c62
Compare
|
...and this one addresses most of Ryan's first-round comments |
See the individual commit messages (some of which are rather lengthy) for details.
Roughly speaking: nearly all the positions the Crucible code was carrying around were the position of the
llvm_verifycall, which isn't super helpful. This fixes up many of them.Part of #3116; alas, only part.
This is not done yet; I'm pushing this much so people can start looking at it. Most of what's here should be pretty much stable at this point. I've been considering maybe splitting the "interpreter: Add hacks for overall crucible-setup source positions" commit in two, one part to provide the positions and the other to apply them into the Crucible code, but unless anyone thinks it's specifically worthwhile it's probably not going to happen.
You'll almost certainly want to read it one commit at a time.
Note that we have almost zero test coverage of the messages these positions flow into. Something should be done about that too...