[PM-34909] Serve only validated report files from get-latest and self-hosted file endpoints#7465
[PM-34909] Serve only validated report files from get-latest and self-hosted file endpoints#7465AlexRubik wants to merge 9 commits into
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #7465 +/- ##
==========================================
+ Coverage 59.98% 64.87% +4.88%
==========================================
Files 2133 2140 +7
Lines 93731 94665 +934
Branches 8311 8448 +137
==========================================
+ Hits 56226 61412 +5186
+ Misses 35527 31157 -4370
- Partials 1978 2096 +118 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
New Issues (9)Checkmarx found the following issues in this Pull Request
Fixed Issues (5)Great job! The following issues were fixed in this Pull Request
|
efa1321 to
a8cac6a
Compare
|
| * | ||
| FROM [dbo].[OrganizationReportView] | ||
| WHERE [OrganizationId] = @OrganizationId | ||
| AND ( |
There was a problem hiding this comment.
This is an anti-pattern in SQL and could cause performance issues and won't scale well. There are a couple better options available:
- Create a new stored proc without the
@FilterByValidatedand have it do the filtering logic all of time (then call the appropriate one in code) - Re-write the logic to perform separate queries based on the value of
@FilterByValidated. Something like this:
CREATE OR ALTER PROCEDURE [dbo].[OrganizationReport_GetLatestByOrganizationId]
@OrganizationId UNIQUEIDENTIFIER,
@FilterByValidated BIT = 0
AS
BEGIN
SET NOCOUNT ON;
IF @FilterByValidated = 1
BEGIN
SELECT TOP 1
*
FROM
[dbo].[OrganizationReportView]
WHERE
[OrganizationId] = @OrganizationId
AND JSON_VALUE([ReportFile], '$.Validated') = 'true'
ORDER BY
[RevisionDate] DESC;
END
ELSE
BEGIN
SELECT TOP 1
*
FROM
[dbo].[OrganizationReportView]
WHERE
[OrganizationId] = @OrganizationId
ORDER BY
[RevisionDate] DESC;
END
END
GOThis will allow SQL to create separate execution plans for each SELECT statement and will perform better overall.
NOTE: This stored procedure is named incorrectly, and should be named something like OrganizationReport_ReadLatestByOrganizationId. However, since you are modifying an existing proc, it's fine if you don't want to rename it. If you opt for creating a new stored procedure, it should be named something like OrganizationReport_ReadLatestByOrganizationIdFilterByValidated
There was a problem hiding this comment.
Conditional WHERE is gone: OrganizationReport_ReadLatestByOrganizationId (OrganizationReport_ReadLatestByOrganizationId.sql) always filters by validated file, and OrganizationReport_GetLatestByOrganizationId (OrganizationReport_GetLatestByOrganizationId.sql) now filters [ReportData] <> '' to skip V2 file-storage rows. Each SP gets its own execution plan. Thanks for flagging! Shipped in d69fab54a.
| if (latestReport == null) | ||
| { | ||
| throw new NotFoundException(); | ||
| } |
There was a problem hiding this comment.
There was a problem hiding this comment.
Null check moved down a layer. Both GetLatestOrganizationReportAsync and ReadLatestOrganizationReportAsync in GetOrganizationReportQuery (.../Reports/ReportFeatures/GetOrganizationReportQuery.cs) throw NotFoundException on null, covered by _NullResult_ThrowsNotFoundException tests for each in GetOrganizationReportQueryTests.cs. So the controller never sees a null latestReport. Shipped in c50e572d8.
| var response = new OrganizationReportResponseModel(latestReport); | ||
|
|
||
| if (_featureService.IsEnabled(FeatureFlagKeys.AccessIntelligenceVersion2)) | ||
| if (filterByValidated) |
There was a problem hiding this comment.
filterByValidated can be confusing when reviewing why this if else statement exists. The main logic on this branch is to handle report files differently than reports stored in the database. I suggest reverting this or creating a boolean variables for the feature flag to drive this branch.
There was a problem hiding this comment.
No longer applies after the scope change. No shared bool threaded through anymore: the controller (OrganizationReportsController.cs) dispatches by method name based on IsEnabled(AccessIntelligenceVersion2). Shipped in c50e572d8.
| if (filterByValidated) | ||
| { | ||
| var fileData = latestReport.GetReportFile(); | ||
| if (fileData is { Validated: true }) |
There was a problem hiding this comment.
⛏️ We can remove the check for validation since the query now returns only validated files.
There was a problem hiding this comment.
Dropped this and kept fileData != null in OrganizationReportsController.cs . Shipped in c50e572d8.
| throw new NotFoundException(); | ||
| } | ||
|
|
||
| if (_featureService.IsEnabled(FeatureFlagKeys.AccessIntelligenceVersion2) && !fileData.Validated) |
There was a problem hiding this comment.
There was a problem hiding this comment.
We look the report row up by report id via GetAuthorizedReportAsync which doesn't filter by validated (only the new ReadLatestByOrganizationIdAsync in the repo layer does, and that's the latest-by-org path, not by-id). The route (DownloadReportFileAsync) also has no [RequireFeature] attribute, just [SelfHosted(SelfHostedOnly = true)]. So with the flag ON, this guard is the only block against streaming an unvalidated file.
Guard shipped in 091d50362.
| if (filterByValidated) | ||
| { | ||
| query = query.Where(p => p.ReportFile != null | ||
| && p.ReportFile.Contains("\"Validated\":true")); | ||
| } |
There was a problem hiding this comment.
OrganizationReport_GetLatestByOrganizationId.sql, it would be more clear to have two separate query statements rather than modifying one. Not certain that it's an anti-pattern in this situation, but it's not easy to read.
There was a problem hiding this comment.
Each method in OrganizationReportRepository.cs is now a single query mirroring its sproc (GetLatestByOrganizationIdAsync and ReadLatestByOrganizationIdAsync). No more conditional Where.
Adds two feature-flag-gated guards to the self-hosted report-file endpoints under pm-31920-access-intelligence-azure-file-storage, implementing the Download and Upload endpoint sections of the PM-34909 acceptance criteria. The Get-latest read-path work from the same AC is deferred to the follow-up commits on this branch. Download guard -------------- In DownloadReportFileAsync, after the existing fileData == null check, throw NotFoundException when the V2 flag is ON and the ReportFile's Validated flag is false. The endpoint is reachable by report ID, independent of the filtered GetLatest read path introduced in later commits, so this guard is load-bearing: a caller cannot bypass it by hitting download directly. Upload guard ------------ In UploadReportFileAsync, extend the existing nullable / ID- mismatch guard with || fileData.Validated. Once the size- validation pass flips Validated to true, the report's file slot is locked. A second upload attempt to the same report ID throws NotFoundException, preventing overwrite of a confirmed file. Mirrors RenewFileUploadUrlAsync, which already refuses to mint a fresh upload URL for a validated file. Note: the guard checks the per-report Validated flag, not file content. Uploads to a distinct report ID are unaffected. Flag-off behavior ----------------- When pm-31920-access-intelligence-azure-file-storage is OFF, both endpoints behave exactly as on main: the download endpoint serves whatever file is on the row, and the upload endpoint accepts the upload regardless of the Validated flag. Rollout is revert-safe. Test coverage ------------- Four xUnit tests added to OrganizationReportsControllerTests, mapping 1:1 to the AC bullets: - DownloadReportFileAsync_FlagOn_UnvalidatedFile_ThrowsNotFoundException - DownloadReportFileAsync_FlagOn_ValidatedFile_ReturnsFile - DownloadReportFileAsync_FlagOff_UnvalidatedFile_ReturnsFile - UploadReportFileAsync_AlreadyValidated_ThrowsNotFoundException All four use NSubstitute + SutProvider per the existing patterns in the test file. Microsoft.AspNetCore.Http is added as a using directive for DefaultHttpContext, which the upload test needs to set ContentType = "multipart/form-data" on the request. References ---------- PR review thread on whether the download guard is needed (this commit keeps it; reasoning above): #7465 (comment) Feature flag: pm-31920-access-intelligence-azure-file-storage Prerequisite (already merged): #7228 [PM-34909]
Introduces a two-SP design for serving the latest organization report
under the V2 file-storage feature flag. Implements the Get-latest
endpoint section of the PM-34909 acceptance criteria. The controller
dispatch that wires the V2 flag to the new SP is deferred to the next
commit on this branch.
New SP: OrganizationReport_ReadLatestByOrganizationId
-----------------------------------------------------
Always filters by JSON_VALUE([ReportFile], '$.Validated') = 'true'.
No parameter switch. Single execution plan. Used by the V2 read path
(once commit 3 wires the controller dispatch to call it).
This addresses mkincaid-bw's PR comment on the SP conditional-WHERE
anti-pattern by structure: each SP has one shape, one execution plan.
mkincaid-bw's own comment predicted this resolution.
V1 SP: OrganizationReport_GetLatestByOrganizationId
---------------------------------------------------
Adds [ReportData] <> '' to the existing WHERE clause. The
OrganizationReport.ReportData column is NVARCHAR(MAX) NOT NULL with
entity default = string.Empty; V2 CreateOrganizationReportCommand
doesn't set ReportData, so V2 file-storage rows are written with
ReportData = '' (empty string, not NULL). The new predicate skips
those rows so the V1 read path can never return an empty-payload
row to a V1 caller after a flag-flip from V2 ON to V2 OFF.
Note: the Jira AC text says "WHERE ReportData IS NOT NULL" but
that's a no-op against a NOT NULL column. Shipping predicate is
<> ''. AC text will be updated post-merge.
DbUp migrations
---------------
Twin scripts mirror each SP body:
- util/Migrator/DbScripts/2026-05-19_00_AddOrganizationReportReadLatestSp.sql
- util/Migrator/DbScripts/2026-05-19_01_UpdateOrganizationReportGetLatestSkipEmptyReportData.sql
Both use CREATE OR ALTER PROCEDURE for idempotency. Rename the date
prefix at push time if the merge date shifts.
Repositories
------------
Both repository implementations get a new ReadLatestByOrganizationIdAsync
method calling the new SP / equivalent LINQ query. The EF Core
implementation also gets the empty-ReportData filter on the existing
GetLatestByOrganizationIdAsync method so Postgres / MySQL / SQLite
deployments share the SQL Server guarantee.
This addresses Banrion's PR comment on the EF Core conditional Where
by structure: each method is a single top-to-bottom query, no shared
IQueryable with conditional chaining.
- src/Infrastructure.Dapper/Dirt/OrganizationReportRepository.cs
- src/Infrastructure.EntityFramework/Dirt/Repositories/OrganizationReportRepository.cs
- src/Core/Dirt/Repositories/IOrganizationReportRepository.cs
What this commit does NOT do
----------------------------
- Does not change the controller dispatch. GetLatestOrganizationReportAsync
still routes through the single existing query method on this
branch; the V2 ternary lands in commit 3 along with the new query
interface member.
- Does not change upload or download endpoints (commit 1 owned those).
- Does not add tests. Controller dispatch tests and query unit
tests land in commit 3. Per the locked test-depth decision, no
repo integration tests are added; the flag-flip regression is
covered by the manual recipe in pr-comments-7465's parent
directory at local-testing-recipe.md section 14.
References
----------
PR review threads addressed by this commit:
- SP conditional WHERE anti-pattern (mkincaid-bw):
#7465 (comment)
- EF Core conditional Where readability (Banrion):
#7465 (comment)
Scope-change rationale (Leslie, 2026-05-14): the flag-flip regression
that motivated the two-SP design rather than a single conditional SP.
Feature flag: pm-31920-access-intelligence-azure-file-storage
Prerequisite (already merged): #7228
[PM-34909]
Wires the controller to the two-SP foundation introduced in the previous commit. When pm-31920-access-intelligence-azure-file-storage is ON, GetLatestOrganizationReportAsync dispatches to a new ReadLatestOrganizationReportAsync query method that calls the validated-file SP. When the flag is OFF, the existing GetLatestOrganizationReportAsync query method continues to call the V1 SP, which (per the previous commit) now skips empty-ReportData rows. Completes the Get-latest endpoint section of the PM-34909 acceptance criteria. Three controller deltas ----------------------- 1. Adds an isAccessIntelligenceV2 local + ternary dispatch. The local is named after the gate, not after a parameter shape, addressing Banrion's review comment on filterByValidated naming: #7465 (comment) 2. Replaces the inner `if (fileData is { Validated: true })` with `if (fileData != null)`. The enclosing block is already gated by `if (isAccessIntelligenceV2)`, so it only runs on the V2 path, and the V2 path always reaches the controller via ReadLatest's SP which has already filtered to Validated = true. Addresses Banrion's redundant-Validated-check comment: #7465 (comment) 3. Removes the controller-level `if (latestReport == null) throw new NotFoundException();` after the query call. The query layer owns the null-to-NotFoundException contract for both methods; double-checking obscures ownership. Addresses Banrion's null- check comment: #7465 (comment) Query layer ----------- IGetOrganizationReportQuery gains a ReadLatestOrganizationReportAsync method declaration. GetOrganizationReportQuery provides the implementation: logs the read with the BypassFiltersEventId marker, calls the new repo method, throws NotFoundException on null. Mirrors the existing GetLatestOrganizationReportAsync exactly except for the repo method called and the log/exception message wording. Test coverage ------------- OrganizationReportsControllerTests: - Existing _WithValidatedFile_ReturnsOkWithDownloadUrl: mock swapped from GetLatestOrganizationReportAsync to ReadLatestOrganizationReportAsync because the test enables V2 and the controller now dispatches there. - NEW _FlagOn_CallsReadLatest: V2 ON, asserts ReadLatest was called once and GetLatest was not called. - NEW _FlagOff_CallsGetLatest: V2 OFF, mirror of the above. GetOrganizationReportQueryTests (NEW file): - ReadLatestOrganizationReportAsync_ReturnsRepoResult - ReadLatestOrganizationReportAsync_NullResult_ThrowsNotFoundException - GetLatestOrganizationReportAsync_ReturnsRepoResult - GetLatestOrganizationReportAsync_NullResult_ThrowsNotFoundException The V1 null-throw test is what backs the response to Banrion's null- check comment: "the query layer owns this contract" is now a provable claim with a test that fails if the throw is removed. No repo integration tests are added per the locked test-depth decision; the flag-flip regression is covered by the manual recipe in local-testing-recipe.md section 14. What this commit does NOT do ---------------------------- - Does not add or modify SPs, migrations, or repos (previous commit owned those). - Does not touch the upload or download endpoints (first commit on this branch owned those). - Does not touch any other controller method. References ---------- Feature flag: pm-31920-access-intelligence-azure-file-storage Prerequisite (already merged): #7228 After this commit the branch is feature-complete: git log --oneline origin/main..HEAD d69fab5 feat(dirt): add ReadLatest SP and skip empty ReportData in V1 SP 091d503 feat(dirt): enforce validated state in download and upload endpoints The new commit will be the third entry above it. [PM-34909]
a8cac6a to
c50e572
Compare
🤖 Bitwarden Claude Code ReviewOverall Assessment: APPROVE This PR closes correctness gaps in the Access Intelligence report file storage endpoints behind the Code Review DetailsNo new findings. All concerns raised in the existing PR threads have been addressed by the author:
Pre-existing V1 |
| .Where(p => p.OrganizationId == organizationId | ||
| && p.ReportFile != null | ||
| && p.ReportFile.Contains("\"Validated\":true")) |
There was a problem hiding this comment.
🎨 SUGGESTED: EF filter relies on a raw JSON substring match, which drifts from the Dapper JSON_VALUE filter and is fragile if serialization options ever change.
Details and fix
The Dapper/MSSQL stored procedure filters with JSON_VALUE([ReportFile], '$.Validated') = 'true', while the EF version uses p.ReportFile.Contains("\"Validated\":true"). The substring match works today only because OrganizationReport.SetReportFile uses JsonHelpers.IgnoreWritingNull (no whitespace, default PascalCase) and ReportFile is the only writer. Any future change to the serializer (e.g., adding WriteIndented, switching to camelCase, or routing through Newtonsoft) would silently make this filter return zero rows — and the dual-ORM bug would only show up on Postgres/MySQL/SQLite.
Consider using EF.Functions.JsonExtract (provider-specific) or pulling the candidates and filtering in memory by deserializing with GetReportFile(). At minimum, anchor the substring with a brace (e.g., "\"Validated\":true," plus "\"Validated\":true}") or add a comment documenting the coupling to SetReportFile's serializer options so future maintainers don't break this implicitly.
There was a problem hiding this comment.
The substring match is our repo's convention for these cases currently. Introducing this new json extract convention is out of scope for this pr but I still made a comment here 8561e00
The EF Core ReadLatestByOrganizationIdAsync method substring-matches "Validated":true against the ReportFile JSON column, while the Dapper/MSSQL counterpart uses JSON_VALUE. The substring approach works because SetReportFile (OrganizationReport.cs) is the sole writer and uses JsonHelpers.IgnoreWritingNull (PascalCase, no whitespace); changing those serializer options would silently return zero rows on non-MSSQL providers without affecting the MSSQL path. Add a brief inline comment near the predicate documenting the coupling and noting the format-agnostic Dapper counterpart. No functional change.
| @@ -0,0 +1,13 @@ | |||
| CREATE PROCEDURE [dbo].[OrganizationReport_ReadLatestByOrganizationId] | |||
There was a problem hiding this comment.
Since OrganizationReport_GetLatestByOrganizationId should have been named OrganizationReport_ReadLatestByOrganizationId, the naming of this new procedure insinuates these two procs return the same data. If this new proc will eventually replace OrganizationReport_GetLatestByOrganizationId, then I'm okay with it; otherwise I think it should be named something else to differentiate that it returns different data, maybe OrganizationReport_ReadLatestValidatedByOrganizationId?
There was a problem hiding this comment.
If this new proc will eventually replace OrganizationReport_GetLatestByOrganizationId, then I'm okay with it
Yes that's the plan, once pm-31920-access-intelligence-azure-file-storage flag is permanently ON, the V1 OrganizationReport_GetLatestByOrganizationId SP becomes dead code and we remove it, leaving OrganizationReport_ReadLatestByOrganizationId as the only latest-by-org SP.
| SELECT TOP 1 | ||
| * | ||
| FROM [dbo].[OrganizationReportView] | ||
| WHERE [OrganizationId] = @OrganizationId | ||
| AND JSON_VALUE([ReportFile], '$.Validated') = 'true' | ||
| ORDER BY [RevisionDate] DESC |
There was a problem hiding this comment.
SQL Keywords should be on their own line
https://contributing.bitwarden.com/contributing/code-style/sql/#select-statements
| SELECT TOP 1 | |
| * | |
| FROM [dbo].[OrganizationReportView] | |
| WHERE [OrganizationId] = @OrganizationId | |
| AND JSON_VALUE([ReportFile], '$.Validated') = 'true' | |
| ORDER BY [RevisionDate] DESC | |
| SELECT | |
| TOP 1 * | |
| FROM | |
| [dbo].[OrganizationReportView] | |
| WHERE | |
| [OrganizationId] = @OrganizationId | |
| AND JSON_VALUE([ReportFile], '$.Validated') = 'true' | |
| ORDER BY | |
| [RevisionDate] DESC |
Place each top-level SQL keyword (SELECT, FROM, WHERE, ORDER BY) on its own line with arguments indented, per the Bitwarden SQL style guide (contributing.bitwarden.com/contributing/code-style/sql/). Reformats all four files this PR touches in the SP/migration pair: - src/Sql/dbo/Dirt/Stored Procedures/OrganizationReport_ReadLatestByOrganizationId.sql - src/Sql/dbo/Dirt/Stored Procedures/OrganizationReport_GetLatestByOrganizationId.sql - util/Migrator/DbScripts/2026-05-19_00_AddOrganizationReportReadLatestSp.sql - util/Migrator/DbScripts/2026-05-19_01_UpdateOrganizationReportGetLatestSkipEmptyReportData.sql No functional change.
|
…tByOrganizationId.sql
…stByOrganizationId.sql
mkincaid-bw
left a comment
There was a problem hiding this comment.
I committed a couple minor formatting changes to correct my incorrect suggestions. Everything else LGTM.
…GetLatestSkipEmptyReportData.sql
mkincaid-bw
left a comment
There was a problem hiding this comment.
Forgot to update the migration files. Should be good now.





🎟️ Tracking
https://bitwarden.atlassian.net/browse/PM-34909
📔 Objective
Closes correctness gaps in the Access Intelligence report file storage endpoints introduced by #7228, gated behind
pm-31920-access-intelligence-azure-file-storage:@FilterByValidatedparameter) was reworked into two separate SPs with controller-level dispatch after review feedback from@mkincaid-bwand a flag-flip regression@Banrionidentified.|| fileData.Validated) to prevent overwriting an already-validated file, matching the existing pattern inRenewFileUploadUrlAsync.When the flag is OFF, all three endpoints behave exactly as they do today.
Scope change (2026-05-14)
Two pieces of review feedback during iteration 1 prompted a redesign of the GetLatest read path:
@mkincaid-bwflagged the SP conditional WHERE as a parameter-sniffing anti-pattern and proposed creating a second SP that does the filtering logic all the time.@Banrionmade the parallel observation about the EF Core conditionalWhere.@Banrionseparately identified a flag-flip regression: with V2 ON,CreateOrganizationReportCommandsaves a row with the file blob attached but no inlineReportData. If the flag is later turned OFF, the existing V1 SP would return that empty-payload row to a V1 caller that expects an encrypted payload.The revised design:
OrganizationReport_GetLatestByOrganizationIdis updated to skip empty-ReportDatarows. Used when the flag is OFF.OrganizationReport_ReadLatestByOrganizationIdalways filters byJSON_VALUE([ReportFile], '$.Validated') = 'true'. Used when the flag is ON.boolparameter on the existing method. EF Core's V1 method also gets the empty-ReportDatafilter so Postgres / MySQL / SQLite deployments share the SQL Server guarantee.isAccessIntelligenceV2).is { Validated: true }pattern dropped: the new SP guarantees the validated state, so the inner check is redundant.Predicate discrepancy worth calling out
The Jira AC text says "WHERE ReportData IS NOT NULL", but the
OrganizationReport.ReportDatacolumn isNVARCHAR(MAX) NOT NULLwith the entity defaulting tostring.Empty. V2 rows are written withReportData = '', notNULL. The shipping predicate is<> ''. Jira AC wording will be updated post-merge to match.Commits
feat(dirt): enforce validated state in download and upload endpoints: adds the two endpoint guards + 4 xUnit tests.feat(dirt): add ReadLatest SP and skip empty ReportData in V1 SP: new SP, V1 SP filter, 2 DbUp migrations, both repos + interface, EF Core V1 parity.feat(dirt): dispatch GetLatest endpoint by V2 feature flag: query interface + impl, controller refactor, controller + query unit tests.Manual flag-flip regression test verified locally. No repo integration tests added.