Preserve cached mutation results on rerun of unchanged source by percy-raskova · Pull Request #471 · boxed/mutmut

percy-raskova · 2026-02-25T00:39:07Z

Summary

create_mutants_for_file() correctly detects unchanged source files (source_mtime < mutant_mtime) and skips regenerating mutations, but also resets all exit_code_by_key values to None. This causes every subsequent mutmut run to re-test all mutants from scratch, defeating the incremental caching introduced in bf3a2e8.

Root cause: Lines 296-301 in __main__.py load the .meta file and set every exit code to None before returning unmodified=True. The skip logic at line 1273 (if result is not None: continue) then sees all-None results and re-tests everything.

Fix: Remove the reset loop so cached results survive across runs.

Changes

Bug fix: Remove the exit code reset in the source_mtime < mutant_mtime branch of create_mutants_for_file()
Test: Add e2e test that injects sentinel exit codes into .meta files and verifies they survive a second run on unchanged source
Lint: Fix pre-existing ruff errors (unused imports Iterable/libcst.matchers, unused variable tests_dir, f-strings without placeholders)

Test plan

New e2e test test_rerun_preserves_cached_results — RED without fix, GREEN with fix
All 179 existing tests pass
All 6 e2e tests pass (including existing snapshot tests)
ruff check clean on changed files

create_mutants_for_file() correctly detected unchanged source files (source_mtime < mutant_mtime) and skipped regenerating mutations, but also reset all exit_code_by_key values to None. This caused every subsequent run to re-test all mutants from scratch, defeating the incremental caching introduced in bf3a2e8. Remove the reset loop so cached results survive across runs. The skip logic at line 1273 (`if result is not None: continue`) can now actually skip already-tested mutants. Also fix pre-existing ruff lint errors (unused imports, unused variable, f-strings without placeholders).

Copilot

Pull request overview

Fixes incremental mutation-test caching so reruns on unchanged source keep previously recorded mutant exit codes, preventing unnecessary re-testing and restoring the intended behavior of the cache.

Changes:

Stop resetting exit_code_by_key to None when the source file is unchanged (mtime comparison) so cached results persist.
Add an e2e regression test that injects sentinel exit codes into .meta files and verifies they survive a second run.
Clean up a few ruff issues (unused imports/vars, f-strings without placeholders).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`src/mutmut/__main__.py`	Preserves cached `.meta` results when skipping mutant regeneration for unchanged files; includes minor lint cleanups.
`tests/e2e/test_e2e_incremental_cache.py`	Adds an e2e test verifying cached exit codes are preserved across reruns on unchanged source.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/e2e/test_e2e_incremental_cache.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Otto-AA · 2026-02-25T06:13:07Z

Hi, @percy-raskova . This was an explicit decision as stated in the #470 PR:

When running mutmut run a second time, it will now only mutate modified source files. It will run all mutants nonetheless, because a change in some source file A can influence if a mutant of source file B would survive.

The purpose of this PR was only to reduce the time spent in generating mutants, not (yet) to reduce the time testing these mutants.

I do think such a functionality could be useful, but I would tend towards making it opt-in as it can "cache" mutant results that would change in a second run. I haven't thought a lot about it yet, open to discussions. Also somehow related to #418

We could also try to make it sound, by doing something like testmon for each mutant (e.g. collecting coverage for each test separately; then check which tests are affected by file changes; then only rerun mutants that are touched in those tests). Would need some thinking how sound this actually is, and how we'd like to implement this. This seems like some interesting, but non-trivial work..

percy-raskova · 2026-02-25T06:30:36Z

Thank you for the explanation! I submitted this because on a project I am working on, I have a lot of tests and noticed mutmut was mutating all of the source files every time, which was inconvenient for my specific use case. I wasn't aware of the other PRs so my thought process to try submitting an upstream PR in case others had the same issue. But what you said makes sense as well. Thanks for taking the time to spell all that out!

boxed · 2026-02-25T07:15:10Z

@percy-raskova maybe you closed this a bit prematurely. There could be a third option here:

We could mark these potentially stale mutants as potentially stale and put them last in the queue. The odds of already tested mutants being killed is probably high, but strictly speaking they should be retested as explained above.

What do you think about that?

percy-raskova · 2026-02-25T08:15:13Z

Oh yes I like this idea and would like to be involved in this process! This is my first attempt at a PR to an open source project, so I am a bit new to the usual practices and customs 😅

boxed · 2026-02-25T08:27:15Z

No problem at all. This project has a lot of history too, and it's understood that new contributors won't have that background knowledge.

If you want to work on an idea like this it would be very welcome. I think it makes a lot of sense. I've been thinking also about having priority for different types of mutants, which would fit nicely into this idea. For example, a deletion of an entire line is obviously a more severe mutant than changing < to <= so logically all those mutants should be run first.

Even though I'm the original author of mutmut I seem to be more of an vision/idea man at this point, with Otto-AA doing the actual work on the system and he has a much better living understanding of the tradeoffs and implementation than me now :P

nicklafleur · 2026-02-27T18:40:51Z

Hi there, just catching up on this thread, but I've actually been working on some very similar optimizations as in this PR. We're at a spot internally where we are ready to start merging to the upstream (first PR can be found here #473 with some housekeeping items). We're seeing something in the neighbourhood of 95-99% runtime improvements with our caching strategy so just wanted to reach out in this thread to see if there was any interest in holding off on this PR a bit while I start to open source our work and open the PRs for us all to discuss.

percy-raskova · 2026-02-28T00:03:11Z

No problem!

Otto-AA · 2026-02-28T08:58:20Z

Hey @percy-raskova , I did not intend to say that your wokr is not useful, sorry if it came across this way. My main point was that I think it should be opt-in (as this type of incremental testing can skip relevant mutants) and that I frankly haven't thought through this problem yet and don't know what's a reasonable approach. I think now it makes sense to wait for how @nicklafleur approached it and decide which way we want to go. It's also interesting to compare to other tools for inspiration (e.g. https://stryker-mutator.io/docs/stryker-js/incremental/ and https://infection.github.io/guide/command-line-options.html#git-diff-filter).

If you want to work on something else in the meantime, feel free to take a look at the issue tracker and ask for pointers or explanations on how the code base works. For instance, #414 would be a rather small change (renaming paths_to_mutate because the name can be confusing), or #47 if you want to go deeper into how we use libCST to parse code. Or as @boxed explained, you could also take a look at shadowing in #367.

If you have any questions, just ask :) (but it can take some days until I reply)

Otto-AA · 2026-02-28T09:03:18Z

Even though I'm the original author of mutmut I seem to be more of an vision/idea man at this point, with Otto-AA doing the actual work on the system and he has a much better living understanding of the tradeoffs and implementation than me now :P

I have only used mutmut on side projects and there's not much user feedback, so my UX tradeoffs are probably not the best 🙃 I also feel like string mutations got a bit much, we probably want to enable/disable some of them via settings and per default. But I enjoy thinking about the problems and how things could be improved, and want to get it working on some production code bases at some point.

percy-raskova · 2026-03-01T00:01:57Z

@Otto-AA no problem! if other people have been working on this problem for longer and have a more robust and performant solution id rather defer to that. my PR was just "Hey I fixed this for myself on my own project, why not submit an upstream PR?" 😛 Ill take a look at some of the other issues you brought up and try one out

nicklafleur · 2026-03-01T15:21:32Z

I've pushed my current work to https://github.com/lyft/mutmut/tree/nicklafleur/oss-complete, the commit with this solution specifically is lyft@92a5d62, though I will be rebasing on top of some review changes so if/when that happens you can look for the feat: add function hashing for incremental mutation testing commit (tl;dr: it hashes the AST to get a syntactically-stable representation of the source and uses that for change detection). The one after also handles a better cache invalidation strategy so looking at them together isn't a bad idea (they will end up in a pr together anyways).

I did end up folding in the mtime solution as that is a nice local optimization, but for our uses (running this across large teams/large codebases) we needed something that could share the gains across many machines.

Also worth noting, while the totality of those changes are a pretty significant rewrite of the application, the current behaviour is still the default (I'm pretty sure at least), and it's all very negotiable so I'm happy to refactor and make changes based on feedback.

Copilot AI review requested due to automatic review settings February 25, 2026 00:39

Copilot started reviewing on behalf of percy-raskova February 25, 2026 00:39 View session

Copilot AI reviewed Feb 25, 2026

View reviewed changes

tests/e2e/test_e2e_incremental_cache.py Outdated Show resolved Hide resolved

Update tests/e2e/test_e2e_incremental_cache.py

113d4c7

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

percy-raskova closed this Feb 25, 2026

percy-raskova reopened this Feb 25, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Preserve cached mutation results on rerun of unchanged source#471

Preserve cached mutation results on rerun of unchanged source#471
percy-raskova wants to merge 2 commits intoboxed:mainfrom
percy-raskova:fix/preserve-cached-mutation-results

percy-raskova commented Feb 25, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Otto-AA commented Feb 25, 2026

Uh oh!

percy-raskova commented Feb 25, 2026

Uh oh!

boxed commented Feb 25, 2026

Uh oh!

percy-raskova commented Feb 25, 2026 •

edited

Loading

Uh oh!

boxed commented Feb 25, 2026

Uh oh!

nicklafleur commented Feb 27, 2026 •

edited

Loading

Uh oh!

percy-raskova commented Feb 28, 2026

Uh oh!

Otto-AA commented Feb 28, 2026

Uh oh!

Otto-AA commented Feb 28, 2026

Uh oh!

percy-raskova commented Mar 1, 2026

Uh oh!

nicklafleur commented Mar 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

percy-raskova commented Feb 25, 2026

Summary

Changes

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Otto-AA commented Feb 25, 2026

Uh oh!

percy-raskova commented Feb 25, 2026

Uh oh!

boxed commented Feb 25, 2026

Uh oh!

percy-raskova commented Feb 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

boxed commented Feb 25, 2026

Uh oh!

nicklafleur commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

percy-raskova commented Feb 28, 2026

Uh oh!

Otto-AA commented Feb 28, 2026

Uh oh!

Otto-AA commented Feb 28, 2026

Uh oh!

percy-raskova commented Mar 1, 2026

Uh oh!

nicklafleur commented Mar 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

percy-raskova commented Feb 25, 2026 •

edited

Loading

nicklafleur commented Feb 27, 2026 •

edited

Loading

nicklafleur commented Mar 1, 2026 •

edited

Loading