KVM dirty log ring interface #344
Conversation
roypat
left a comment
please squash your commits so that we do not have later commits be fixups for previous commits.
Also, do we need to do anything about `KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP`?
As for this: whether `Send` and `Sync` will actually be useful is a different matter. I'm assuming you want them so that the dirty ring can be harvested while the vcpu is still running, but with the current API in this PR you can only get a `&mut` reference to the ring buffer structure, which you cannot send to another thread anyway. So you'd need some API that lets you take ownership of the `KvmDirtyRingLog` structure (maybe store the one that gets created at vcpu creation time in an `Option`, and then have a function that just wraps `.take()` on that `Option`, keeping in mind that safe code must never be able to get two owned versions of it for the same vcpu).
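The ownership-transfer idea above could look roughly like the following. This is a hedged sketch, not the PR's actual API: the struct contents, `Vcpu`, and `take_dirty_ring` are all illustrative stand-ins.

```rust
/// Stand-in for the real KvmDirtyRingLog, which would hold the ring
/// indices and the base pointer of the mmap'ed shared buffer.
pub struct KvmDirtyRingLog {
    pub size: usize,
}

/// Illustrative vcpu wrapper; the ring created at vcpu creation time
/// is stored in an Option so ownership can be handed out exactly once.
pub struct Vcpu {
    dirty_ring: Option<KvmDirtyRingLog>,
}

impl Vcpu {
    pub fn new(ring_size: usize) -> Self {
        Vcpu {
            dirty_ring: Some(KvmDirtyRingLog { size: ring_size }),
        }
    }

    /// Transfers ownership of the ring to the caller. Every call after
    /// the first returns None, so safe code can never obtain two owned
    /// rings for the same vcpu.
    pub fn take_dirty_ring(&mut self) -> Option<KvmDirtyRingLog> {
        self.dirty_ring.take()
    }
}

fn main() {
    let mut vcpu = Vcpu::new(4096);
    // First take succeeds and can be moved to a harvester thread.
    assert!(vcpu.take_dirty_ring().is_some());
    // Second take yields None: ownership was already handed out.
    assert!(vcpu.take_dirty_ring().is_none());
}
```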
Also sorry for the late response, we were all preparing for / traveling to KVM Forum last week!
Turns out we do, and if it is supported, we also need to enable `KVM_CAP_DIRTY_LOG_RING_ACQ_REL` prior to enabling `_WITH_BITMAP`. Correct me if I'm wrong.
Why is `enable_cap` not available on aarch64?
The `enable_cap` doctest uses `KVM_CAP_SPLIT_IRQCHIP` as an example, which is x86-only. This broke the test on other architectures, so I disabled the `enable_cap` call in the test there.
I needed `Send` and `Sync` to move the vcpus to separate threads, but I only implemented what the Rust compiler suggested and am still looking for a better solution to this unrelated problem.
When will this be merged?
I've just triggered Dependabot, which should give us a PR fixing the CI failure, and then we can merge this immediately after! Sorry for the long wait :(
Thanks. No worries, just making sure it gets merged eventually :)
ludfjig
left a comment
lgtm (the latest main branch merge commit should maybe have been a rebase instead?)
Implement dirty log ring interface with `enable_dirty_log_ring` and `dirty_log_ring_iter` methods. Enable `VmFd` `enable_cap` and ioctl imports on all architectures. Add memory fences in the iterator for proper synchronization on architectures with weak memory consistency.

Signed-off-by: David Kleymann <david@asymmetric.re>
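The acquire/release ordering the commit message refers to can be sketched roughly as follows. The kernel publishes a ring entry by writing the slot/offset payload first and then release-storing a dirty flag; userspace must load-acquire the flag before reading the payload, and release-store a reset flag afterwards. Flag and field names are modeled on the kernel's `kvm_dirty_gfn`, but this ring is simulated in memory, not mmap'ed from KVM, and the harvest function is illustrative, not the PR's iterator.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Flag values modeled on the kernel's kvm_dirty_gfn flags.
const KVM_DIRTY_GFN_F_DIRTY: u32 = 1 << 0;
const KVM_DIRTY_GFN_F_RESET: u32 = 1 << 1;

/// Simulated ring entry: the flags word is the synchronization point.
struct DirtyGfn {
    flags: AtomicU32,
    slot: u32,
    offset: u64,
}

/// Harvests all dirty entries: acquire-load the flag, read the
/// payload, then mark the entry for recycling with a release store.
fn harvest(ring: &[DirtyGfn]) -> Vec<(u32, u64)> {
    let mut dirty = Vec::new();
    for gfn in ring {
        // Acquire pairs with the producer's release store, so the
        // earlier slot/offset writes are visible before we read them.
        if gfn.flags.load(Ordering::Acquire) & KVM_DIRTY_GFN_F_DIRTY != 0 {
            dirty.push((gfn.slot, gfn.offset));
            // Release ensures our payload reads complete before the
            // entry is handed back for recycling (cf. KVM_RESET_DIRTY_RINGS).
            gfn.flags.store(KVM_DIRTY_GFN_F_RESET, Ordering::Release);
        }
    }
    dirty
}

fn main() {
    let ring = [
        DirtyGfn { flags: AtomicU32::new(KVM_DIRTY_GFN_F_DIRTY), slot: 0, offset: 42 },
        DirtyGfn { flags: AtomicU32::new(0), slot: 0, offset: 7 },
    ];
    // Only the entry published as dirty is harvested.
    assert_eq!(harvest(&ring), vec![(0, 42)]);
    // The harvested entry is now flagged for reset.
    assert_eq!(ring[0].flags.load(Ordering::Acquire), KVM_DIRTY_GFN_F_RESET);
}
```

On x86 plain loads and stores already have acquire/release semantics, which is why the missing fences only bite on weakly ordered architectures such as aarch64 (and why `KVM_CAP_DIRTY_LOG_RING_ACQ_REL` exists).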
1b25571
Rebased and removed an unnecessary `#[allow(unused)]`.
This adds the basic plumbing for the DirtyLogWorker, which will fetch the dirty log asynchronously in the background and aggregate the effective MemoryRangeTable of dirtied memory.

# Motivation

- Performance: Fetching the dirty log, parsing the dirty bitmap, and aggregating the corresponding data structures is fairly costly. I just ran a VM with an active working set of 5 GiB (with 4 workers), and the measured overhead per iteration was 10-20 ms. Given that we want downtimes to be as small as possible, we want that overhead to be close to zero for the final iteration.
- Accurate dirty rate: This way we get a more fine-grained sampling of the dirty rate (dirtied 4k pages per second), which is an interesting metric for observing the current workload's memory-write behavior.

# Bigger Picture / Outlook on KVM's Dirty Ring Interface

The most robust and performant mechanism that Cloud Hypervisor should use to get dirtied pages in the future is KVM's dirty ring interface [0]. This requires [1] to be merged first in rust-vmm/kvm. Experience has shown that bumping any of the rust-vmm crates is a major challenge, as all of them are highly interdependent and developed in individual repositories. So it will take some time before we can even consider starting work on that feature in CHV. That being said: this design improves the current situation significantly without blocking any future refactorings or a replacement with KVM's dirty ring interface.

# Design

I actively decided against Arc<Mutex<Vm>> in the DirtyLogWorker, as this would be very invasive, make the migration code overly complicated (many locks and unlocks at the right times), and lastly be a very big change just to call `vm.dirty_log()` in the thread. Note that the latter is only a thin wrapper around calling `cpu_manager.dirty_log()` and `memory_manager.dirty_log()`.
[0] https://lwn.net/Articles/833206/
[1] rust-vmm/kvm#344

On-behalf-of: SAP philipp.schuster@sap.com
Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>
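The dirty-rate metric from the motivation above (dirtied 4k pages per second) is just the page count harvested in one sampling interval divided by the interval length. A minimal sketch, with names that are illustrative rather than Cloud Hypervisor's actual code:

```rust
use std::time::Duration;

const PAGE_SIZE: u64 = 4096;

/// Pages-per-second dirty rate for one sampling interval.
fn dirty_rate(pages_dirtied: u64, interval: Duration) -> f64 {
    pages_dirtied as f64 / interval.as_secs_f64()
}

fn main() {
    // 2560 dirtied 4 KiB pages (10 MiB) observed over a 500 ms
    // interval correspond to 5120 pages/s, i.e. 20 MiB/s of dirtied
    // memory — a direct measure of the workload's write pressure.
    let rate = dirty_rate(2560, Duration::from_millis(500));
    assert_eq!(rate, 5120.0);
    let mib_per_s = rate * PAGE_SIZE as f64 / (1024.0 * 1024.0);
    assert_eq!(mib_per_s, 20.0);
}
```

Sampling per interval (rather than once per migration iteration) is what gives the finer-grained view of the workload mentioned above.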
Summary of the PR
Add support for dirty page tracking via the dirty ring interface. Adds a `KvmDirtyLogRing` structure for keeping track of the indices and the base pointer of the shared memory buffer. Implements iterating over dirty pages, thereby harvesting them. Implements `reset_dirty_rings` on `VmFd` to trigger recycling of dirty ring buffer elements by the kernel after processing. Adds the `dirty_log_ring` field to `VcpuFd`.
This is a draft that needs review and improvement; I'm asking for suggestions on the remaining weaknesses.
More info on the interface:
https://www.kernel.org/doc/html/latest/virt/kvm/api.html#kvm-cap-dirty-log-ring-kvm-cap-dirty-log-ring-acq-rel
Requirements
Before submitting your PR, please make sure you addressed the following requirements:

- All commits in this PR are signed (with `git commit -s`), and the commit message has max 60 characters for the summary and max 75 characters for each description line.
- All added/changed functionality has a corresponding unit/integration test.
- Any user-facing changes are mentioned under the "Upcoming Release" section of CHANGELOG.md (if no such section exists, please create one).
- Any newly added `unsafe` code is properly documented.