[PM-29717] Fix event duplication by replacing KVStore checkpoint with solnlib#189
Draft
AlexRubik wants to merge 2 commits into
The custom KVStore checkpoint code fails in Splunk Cloud environments because the collection may not initialize and nested object serialization doesn't round-trip correctly. Replaces it with solnlib's KVStoreCheckpointer, which handles initialization, error recovery, and Cloud compatibility. Flattens the checkpoint data structure to avoid nested object issues. Removes key_id from the EventLogsCheckpoint model (now managed by the library). Includes migration from the legacy 'eventsapi' KVStore collection. Adds checkpoint serialization round-trip tests. [PM-29717]
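The flattening and round-trip idea described in this commit can be illustrated with a minimal sketch. The `Checkpoint` fields and the `InMemoryCheckpointer` class below are hypothetical stand-ins, not the app's actual `EventLogsCheckpoint` model or solnlib itself; the stand-in only mimics a key/value store with `update` and `get` methods that serializes state as JSON.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Checkpoint:
    """Hypothetical checkpoint model; the field names are illustrative only."""

    start: datetime
    end: datetime
    continuation_token: Optional[str] = None

    def to_flat_dict(self) -> dict:
        # Only top-level scalar values (strings or None), no nested objects.
        return {
            "start": self.start.isoformat(),
            "end": self.end.isoformat(),
            "continuation_token": self.continuation_token,
        }

    @classmethod
    def from_flat_dict(cls, data: dict) -> "Checkpoint":
        return cls(
            start=datetime.fromisoformat(data["start"]),
            end=datetime.fromisoformat(data["end"]),
            continuation_token=data.get("continuation_token"),
        )


class InMemoryCheckpointer:
    """Stand-in key/value store (an assumption for this sketch).

    It stores JSON text so that anything that is not JSON-serializable
    fails here the same way it would against a real backend.
    """

    def __init__(self) -> None:
        self._store: dict = {}

    def update(self, key: str, state: dict) -> None:
        self._store[key] = json.dumps(state)

    def get(self, key: str) -> Optional[dict]:
        raw = self._store.get(key)
        return None if raw is None else json.loads(raw)
```

A round-trip test then saves `to_flat_dict()` under a key, reads it back, and checks that `from_flat_dict()` rebuilds an equal object. A missing key returns `None`, which a caller can treat as "no checkpoint yet".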
The read_events loop previously made API calls as fast as the network allowed (~10/sec), triggering Bitwarden's 429 rate limit. Adds a 200ms delay between paginated calls (at most 5 calls/sec, or 300/min, which stays under the 400/min limit). [PM-29717]
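As a rough sketch of the pacing described in this commit, the loop below sleeps a fixed 200 ms between paginated calls. The `fetch_page` callable, its continuation-token signature, and the injectable `sleep` parameter are assumptions made for illustration; this is not the actual read_events implementation.

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of events plus an optional continuation token.
Page = Tuple[List[dict], Optional[str]]


def read_all_pages(
    fetch_page: Callable[[Optional[str]], Page],
    delay_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield events from every page, pausing between consecutive API calls."""
    token: Optional[str] = None
    while True:
        events, token = fetch_page(token)
        yield from events
        if token is None:
            break  # last page reached, no further call and no further sleep
        sleep(delay_seconds)


def max_calls_per_minute(delay_seconds: float) -> float:
    """Upper bound on the call rate when each call is followed by a fixed delay."""
    return 60 / delay_seconds
```

With a 0.2 s delay the upper bound is about 300 calls per minute, and real network latency only lowers the actual rate further. Passing `sleep` as a parameter is a design choice of this sketch so that tests can run without waiting.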
New Issues (9): Checkmarx found the following issues in this Pull Request


🎟️ Tracking
https://bitwarden.atlassian.net/browse/PM-29717
📔 Objective
The Bitwarden Splunk app duplicates event data on every scheduled run because the KVStore-based checkpoint mechanism fails to persist in Splunk Cloud environments. The checkpoint either doesn't initialize (collection not created) or doesn't persist data (nested object serialization issues), causing every run to re-fetch and re-index the same events from the original time window.
This PR:
- Replaces the custom KVStore checkpoint code with solnlib.modular_input.checkpointer.KVStoreCheckpointer, a battle-tested library (already bundled via splunktaucclib) that handles collection initialization, error recovery, and Splunk Cloud compatibility
What changed
- src/config.py: replaced the custom KVStore checkpoint code with KVStoreCheckpointer; flattened checkpoint data (no nested objects); added legacy migration from old eventsapi collection
- src/models.py: removed key_id field from EventLogsCheckpoint (now managed by solnlib)
- src/event_logs.py: added time.sleep(0.2) between paginated API calls for rate limiting
- tests/test_config.py: added checkpoint serialization round-trip tests
Related issues
Out of scope — follow-up items
The following improvements are planned but not included in this PR to keep the change focused:
- App.run(): If the checkpoint update fails after events are already indexed, stop immediately instead of continuing to write uncheckpointed events.
- Remove the legacy eventsapi KVStore collection from collections.conf once migration is confirmed working.
Completed
- Migration from the legacy eventsapi collection
Still needed
- Verify that solnlib.KVStoreCheckpointer correctly auto-creates the collection and persists data in an actual Splunk Cloud deployment.
📸 Screenshots
Not applicable — no UI changes.