[DAPS-1522] Data Router Logging Improvements #1789

megatnt1122 · 2025-11-25T14:03:54Z

Ticket

#1522

Description

Logging improvements to data_router

Tasks

- A description of the PR has been provided, and a diagram included if it is a new feature.
- Formatter has been run
- CHANGELOG comment has been added
- Labels have been assigned to the pr
- A reviwer has been added
- A user has been assigned to work on the pr
- If new feature a unit test has been added

Summary by Sourcery

Improve observability and permission handling in the data router API endpoints.

New Features:

Add structured request logging for all key data router endpoints, including correlation IDs and request status.

Enhancements:

Centralize permission checks by replacing direct usage of the permissions module with equivalent helpers on the shared support library.
Capture and log success and failure details (including results and errors where appropriate) for create, update, export, dependency graph, locking, path resolution, transfer, allocation/owner change, and delete operations to aid debugging and auditing.

sourcery-ai · 2025-11-25T14:04:01Z

Reviewer's Guide

Refactors data_router to use centralized permission helpers and introduces structured request lifecycle logging (start/success/failure) across all data routes for better observability and correlation.

Sequence diagram for POST data/get with task initialization and logging

sequenceDiagram
    actor User
    participant ClientApp
    participant DataRouter
    participant GLib
    participant Logger
    participant ArangoDB
    participant GTasks

    User->>ClientApp: Request data download
    ClientApp->>DataRouter: POST /data/get?client=clientId

    DataRouter->>GLib: getUserFromClientID(clientId)
    GLib-->>DataRouter: client

    DataRouter->>Logger: logRequestStarted(client, correlationId, POST, data/get, Started)

    DataRouter->>ArangoDB: _executeTransaction(read metadata, validate ids)
    activate ArangoDB
    ArangoDB->>GLib: resolveDataID for each id
    GLib-->>ArangoDB: data ids
    ArangoDB-->>DataRouter: validated ids
    deactivate ArangoDB

    DataRouter->>GTasks: taskInitDataGet(client, path, encrypt, ids)
    GTasks-->>DataRouter: task descriptor result

    DataRouter-->>ClientApp: HTTP 200 result (task info)
    DataRouter->>Logger: logRequestSuccess(client, correlationId, POST, data/get, Success, result)

    alt Failure during transaction or task init
        DataRouter->>Logger: logRequestFailure(client, correlationId, POST, data/get, Failure, result, error)
        DataRouter->>GLib: handleException(error, res)
        GLib-->>ClientApp: Error response
    end

Class diagram for updated data_router modules and logging helpers

classDiagram
    class DataRouter {
        +post_create(req, res)
        +post_create_batch(req, res)
        +post_update(req, res)
        +post_update_batch(req, res)
        +post_update_md_err_msg(req, res)
        +post_update_size(req, res)
        +get_view(req, res)
        +post_export(req, res)
        +get_dep_graph_get(req, res)
        +get_lock(req, res)
        +get_path(req, res)
        +get_list_by_alloc(req, res)
        +post_get(req, res)
        +post_put(req, res)
        +post_alloc_chg(req, res)
        +post_owner_chg(req, res)
        +post_delete(req, res)
        -recordCreate(client, record, result)
        -recordUpdate(client, record, result)
    }

    class GLib {
        +getUserFromClientID(clientId)
        +getUserFromClientID_noexcept(clientId)
        +hasManagerPermProj(client, ownerId)
        +hasPermissions(client, resource, perms)
        +hasAdminPermObject(client, objectId)
        +getPermissions(client, resource, perms)
        +ensureAdminPermUser(client, userId)
        +ensureManagerPermProj(client, projId)
        +resolveDataID(id, client)
        +resolveDataCollID(id, client)
        +getObject(id, client)
        +hasPublicRead(dataId)
        +computeDataPath(loc, localOnly)
        +saveRecentGlobusPath(client, path, transferType)
        +handleException(error, res)
        <<constants>> PERM_CREATE
        <<constants>> PERM_WR_META
        <<constants>> PERM_WR_REC
        <<constants>> PERM_RD_REC
        <<constants>> PERM_RD_META
        <<constants>> PERM_RD_DATA
        <<constants>> TT_DATA_EXPORT
        <<constants>> TT_DATA_GET
        <<constants>> TT_DATA_PUT
    }

    class Logger {
        +logRequestStarted(client, correlationId, httpVerb, routePath, status, description)
        +logRequestSuccess(client, correlationId, httpVerb, routePath, status, description, extra)
        +logRequestFailure(client, correlationId, httpVerb, routePath, status, description, extra, error)
    }

    class ArangoDB {
        +_executeTransaction(config)
        +c_document(id)
        +d_document(id)
        +owner_firstExample(query)
        +loc_firstExample(query)
        +_update(id, patch)
        +_remove(id)
        +_query(query, bindVars)
    }

    class GTasks {
        +taskInitDataGet(client, path, encrypt, ids)
        +taskInitDataPut(client, path, encrypt, ids, check, srcRepoId)
        +taskInitRecAllocChg(client, projId, ids, opts)
        +taskInitRecOwnerChg(client, ids, collId, opts)
        +taskInitRecCollDelete(client, ids)
    }

    DataRouter --> GLib : uses
    DataRouter --> Logger : logs via
    DataRouter --> ArangoDB : executes transactions
    DataRouter --> GTasks : initializes tasks
    GLib --> ArangoDB : helper DB access
    GLib <.. DataRouter : centralized permissions and utilities
    Logger <.. DataRouter : request lifecycle logging

Flow diagram for generic data_router request lifecycle logging

flowchart TD
    A[Receive HTTP request
client, headers, body] --> B[Resolve client
getUserFromClientID or getUserFromClientID_noexcept]
    B --> C[logRequestStarted
client, correlationId,
httpVerb, routePath,
status Started,
description]
    C --> D{Handler uses
ArangoDB and helpers}

    D --> E[Build result data
or side effects only]
    E --> F[Send success response
res.send result or empty]
    F --> G[logRequestSuccess
client, correlationId,
httpVerb, routePath,
status Success,
description, extra]

    D --> H[Exception thrown
validation, permissions,
DB error, task error]
    H --> I[logRequestFailure
client, correlationId,
httpVerb, routePath,
status Failure,
description, extra,
error]
    I --> J[handleException
via g_lib.handleException]
    J --> K[Error response
status code and body]

    subgraph RetryLoop[Optional retry loop
for write conflict 1200]
        D --> L{Arango errorNum 1200
and retries left?}
        L -- yes --> D
        L -- no --> H
    end

File-Level Changes

Change	Details	Files
Add structured request lifecycle logging to all major data_router endpoints using a shared logger with correlation IDs and route metadata.	Import shared logger module and define a base path for data routes For each endpoint, capture the client once at the top-level and log request start with HTTP verb, route path, status, and human-readable description On successful completion, log request success including result or key output data where appropriate On exceptions, log request failures with error details and any partial result data before delegating to existing exception handling Ensure variables like client/result are declared in outer scope so they are available in both success and failure logging blocks	`core/database/foxx/api/data_router.js`
Unify permission checks to use g_lib helpers instead of the separate permissions module throughout data_router.	Replace permissions.hasManagerPermProj and permissions.hasPermissions usages with equivalent g_lib helper functions and constants Update record update, view, lock, path, and listing logic to use g_lib.hasAdminPermObject, g_lib.getPermissions, and g_lib permission flags for consistency Remove redundant permissions import now that all permission checks are routed through g_lib Keep overall permission semantics and error handling the same while centralizing implementation	`core/database/foxx/api/data_router.js`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

In the /update/batch handler you now set client = req.queryParams.client instead of using g_lib.getUserFromClientID, which changes client from a user object to a raw id string and will impact downstream permission checks and task initialization that expect a full client object.
In the /export transaction you replaced results = [] with result = [] but still appear to use results later (for pushes, res.send(results), and logging), which will either reference an undefined variable or leave the logged/sent value empty; align the variable name consistently (result vs results).
Some of the new logging calls have inconsistencies or likely mistakes (e.g., /dep/graph/get logs httpVerb: "POST" for a GET route, and /update/size logs before client is set and uses extra: "undefined"), so it would be good to standardize these fields and ensure client and extra carry meaningful values across all handlers.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In the /update/batch handler you now set `client = req.queryParams.client` instead of using `g_lib.getUserFromClientID`, which changes `client` from a user object to a raw id string and will impact downstream permission checks and task initialization that expect a full client object.
- In the /export transaction you replaced `results = []` with `result = []` but still appear to use `results` later (for pushes, `res.send(results)`, and logging), which will either reference an undefined variable or leave the logged/sent value empty; align the variable name consistently (`result` vs `results`).
- Some of the new logging calls have inconsistencies or likely mistakes (e.g., /dep/graph/get logs `httpVerb: "POST"` for a GET route, and /update/size logs before `client` is set and uses `extra: "undefined"`), so it would be good to standardize these fields and ensure `client` and `extra` carry meaningful values across all handlers.

## Individual Comments

### Comment 1
<location> `core/database/foxx/api/data_router.js:1279-1288` </location>
<code_context>

 router
     .get("/view", function (req, res) {
+        let results = null;
+        let client = null;
         try {
</code_context>

<issue_to_address>
**issue (bug_risk):** The /view handler logs `results`, but that variable is never populated and the actual response uses `data`.

In this route, `results` is never updated, so the success log will always show `extra: null` instead of the actual payload. To log the returned data, either assign `results = [data];` before `res.send`, or update the log call to use the real payload, e.g. `extra: { results: [data] }` or `extra: data` directly.
</issue_to_address>

### Comment 2
<location> `core/database/foxx/api/data_router.js:264-273` </location>
<code_context>
             try {
-                var result = {
+                client = g_lib.getUserFromClientID(req.queryParams.client);
+                logger.logRequestStarted({
+                    client: client,
+                    correlationId: req.headers["x-correlation-id"],
</code_context>

<issue_to_address>
**suggestion:** Logging for /dep/graph/get uses HTTP verb POST even though this is a GET route.

Please update the logger calls in this handler to use `httpVerb: "GET"` so the logs match the route definition (`router.get("/dep/graph/get", ...)`) and remain accurate for method-based filtering and monitoring.

Suggested implementation:

```javascript
            logger.logRequestStarted({
                client: client,
                correlationId: req.headers["x-correlation-id"],
                httpVerb: "GET",
                routePath: basePath + "/dep/graph/get",
                status: "Started",
                description: "Get dependency graph",
            });

```

```javascript
            logger.logRequestCompleted({
                client: client,
                correlationId: req.headers["x-correlation-id"],
                httpVerb: "GET",
                routePath: basePath + "/dep/graph/get",
                status: "Completed",
                description: "Get dependency graph",
            });

```

```javascript
            logger.logRequestFailed({
                client: client,
                correlationId: req.headers["x-correlation-id"],
                httpVerb: "GET",
                routePath: basePath + "/dep/graph/get",
                status: "Failed",
                description: "Get dependency graph",
                error: err && err.stack ? err.stack : err,
            });

```

- Ensure these logger calls are indeed within the `router.get("/dep/graph/get", ...)` handler.
- Do NOT change `httpVerb` for other routes, such as the `/create` POST route you showed; those should remain `"POST"`.
- If there are any other logger calls (e.g., `logRequestStarted`, `logRequestCompleted`, `logRequestFailed`) associated with `/dep/graph/get` that are not shown here, update their `httpVerb` field to `"GET"` in the same way.
</issue_to_address>

### Comment 3
<location> `core/database/foxx/api/data_router.js:30-36` </location>
<code_context>
        if (owner_id != client._id) {
            if (!g_lib.hasManagerPermProj(client, owner_id)) {
                var parent_coll = g_db.c.document(parent_id);
                if (!g_lib.hasPermissions(client, parent_coll, g_lib.PERM_CREATE)) {
                    throw error.ERR_PERM_DENIED;
                }
            }
        }

</code_context>

<issue_to_address>
**suggestion (code-quality):** Merge nested if conditions ([`merge-nested-ifs`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/merge-nested-ifs))

```suggestion
        if (owner_id != client._id && !g_lib.hasManagerPermProj(client, owner_id)) {
              var parent_coll = g_db.c.document(parent_id);
              if (!g_lib.hasPermissions(client, parent_coll, g_lib.PERM_CREATE)) {
                  throw error.ERR_PERM_DENIED;
              }
        }

```

<br/><details><summary>Explanation</summary>Reading deeply nested conditional code is confusing, since you have to keep track of which
conditions relate to which levels. We therefore strive to reduce nesting where
possible, and the situation where two `if` conditions can be combined using
`and` is an easy win.
</details>
</issue_to_address>

### Comment 4
<location> `core/database/foxx/api/data_router.js:1304-1308` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

### Comment 5
<location> `core/database/foxx/api/data_router.js:1802` </location>
<code_context>

</code_context>

<issue_to_address>
**issue (code-quality):** Use `const` or `let` instead of `var`. ([`avoid-using-var`](https://docs.sourcery.ai/Reference/Rules-and-In-Line-Suggestions/JavaScript/Default-Rules/avoid-using-var))

<details><summary>Explanation</summary>`const` is preferred as it ensures you cannot reassign references (which can lead to buggy and confusing code).
`let` may be used if you need to reassign references - it's preferred to `var` because it is block- rather than
function-scoped.

From the [Airbnb JavaScript Style Guide](https://airbnb.io/javascript/#references--prefer-const)
</details>
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-11-25T14:05:27Z

core/database/foxx/api/data_router.js

+        let results = null;
+        let client = null;
        try {
-            const client = g_lib.getUserFromClientID_noexcept(req.queryParams.client);
+            client = g_lib.getUserFromClientID_noexcept(req.queryParams.client);
+            logger.logRequestStarted({
+                client: client,
+                correlationId: req.headers["x-correlation-id"],
+                httpVerb: "GET",
+                routePath: basePath + "/view",
+                status: "Started",


issue (bug_risk): The /view handler logs results, but that variable is never populated and the actual response uses data.

In this route, results is never updated, so the success log will always show extra: null instead of the actual payload. To log the returned data, either assign results = [data]; before res.send, or update the log call to use the real payload, e.g. extra: { results: [data] } or extra: data directly.

sourcery-ai · 2025-11-25T14:05:27Z

core/database/foxx/api/data_router.js

+                logger.logRequestStarted({
+                    client: client,
+                    correlationId: req.headers["x-correlation-id"],
+                    httpVerb: "POST",
+                    routePath: basePath + "/create",
+                    status: "Started",
+                    description: "Create a new data record",
+                });
+                result = {
                    results: [],


suggestion: Logging for /dep/graph/get uses HTTP verb POST even though this is a GET route.

Please update the logger calls in this handler to use httpVerb: "GET" so the logs match the route definition (router.get("/dep/graph/get", ...)) and remain accurate for method-based filtering and monitoring.

Suggested implementation:

logger.logRequestStarted({ client: client, correlationId: req.headers["x-correlation-id"], httpVerb: "GET", routePath: basePath + "/dep/graph/get", status: "Started", description: "Get dependency graph", });

logger.logRequestCompleted({ client: client, correlationId: req.headers["x-correlation-id"], httpVerb: "GET", routePath: basePath + "/dep/graph/get", status: "Completed", description: "Get dependency graph", });

logger.logRequestFailed({ client: client, correlationId: req.headers["x-correlation-id"], httpVerb: "GET", routePath: basePath + "/dep/graph/get", status: "Failed", description: "Get dependency graph", error: err && err.stack ? err.stack : err, });

Ensure these logger calls are indeed within the router.get("/dep/graph/get", ...) handler.

Do NOT change httpVerb for other routes, such as the /create POST route you showed; those should remain "POST".

If there are any other logger calls (e.g., logRequestStarted, logRequestCompleted, logRequestFailed) associated with /dep/graph/get that are not shown here, update their httpVerb field to "GET" in the same way.

megatnt1122 added 2 commits November 25, 2025 08:03

refactor: data_router checkpoint (TESTS NOT IMPLEMENTED)

88dda53

chore: Auto-format JavaScript files with Prettier

7ab8555

sourcery-ai bot reviewed Nov 25, 2025

View reviewed changes

megatnt1122 self-assigned this Nov 25, 2025

megatnt1122 added Component: Database Relates to database microservice / data model Type: Refactor Imlplementation change, same functionality Priority: Low Lower priority work. labels Nov 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[DAPS-1522] Data Router Logging Improvements #1789

[DAPS-1522] Data Router Logging Improvements #1789

Uh oh!

megatnt1122 commented Nov 25, 2025 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot commented Nov 25, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Uh oh!

sourcery-ai bot Nov 25, 2025

Uh oh!

sourcery-ai bot Nov 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

[DAPS-1522] Data Router Logging Improvements #1789

Are you sure you want to change the base?

[DAPS-1522] Data Router Logging Improvements #1789

Uh oh!

Conversation

megatnt1122 commented Nov 25, 2025 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Ticket

Description

Tasks

Summary by Sourcery

Uh oh!

sourcery-ai bot commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for POST data/get with task initialization and logging

Class diagram for updated data_router modules and logging helpers

Flow diagram for generic data_router request lifecycle logging

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

megatnt1122 commented Nov 25, 2025 •

edited by sourcery-ai bot

Loading

sourcery-ai bot commented Nov 25, 2025 •

edited

Loading