Skip to content

[Draft] Design Proposal: Scheduler Management, Observability & Log Explorer UI#267

Closed
aryunewaskar77-art wants to merge 1 commit into
volcano-sh:mainfrom
aryunewaskar77-art:proposal/scheduler-design-doc-main
Closed

[Draft] Design Proposal: Scheduler Management, Observability & Log Explorer UI#267
aryunewaskar77-art wants to merge 1 commit into
volcano-sh:mainfrom
aryunewaskar77-art:proposal/scheduler-design-doc-main

Conversation

@aryunewaskar77-art
Copy link
Copy Markdown

This draft PR contains a design proposal and implementation discussion for issue #197.

Goals:

validate design assumptions
gather maintainer feedback
discuss incremental implementation strategy
Related:

Issue #197
Exploratory work: #262
Prerequisites Task: #254
No production implementation included.

Signed-off-by: aryunewaskar77-art <aaryaanewaskar@gmail.com>
@volcano-sh-bot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign william-wang for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a detailed proposal for adding a /scheduler section to the Volcano Dashboard, encompassing configuration management, metrics observability, and log exploration. The proposal outlines a 12-week implementation plan targeting the new-ui branch using tRPC and Server-Sent Events. The review feedback highlights several technical improvements for the proposed implementation, specifically regarding the robustness of the log streaming logic (handling UTF-8 and partial lines), aligning default values for log tailing, adding necessary input validation for pod names, and ensuring RBAC rules are flexible enough for non-standard installations.


export const getRecentLogsSchema = z.object({
podName: z.string(),
tailLines: z.number().min(1).max(10000).default(500),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

A maximum of 10,000 lines for tailLines is quite high and could put significant pressure on the Kubernetes API server. Additionally, there is a discrepancy between the default value in the schema (500) and the Route Handler (100 on line 226). Consider lowering the maximum and aligning the defaults for consistency.

Path: apps/web/src/app/api/scheduler/logs/stream/route.ts
export async function GET(req: Request) {
const { searchParams } = new URL(req.url);
const podName = searchParams.get("pod");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The podName parameter is not validated. If it is missing from the request, the subsequent call to k8sLog.log will receive a null value, leading to a runtime error. Consider adding a check to ensure podName is present and returning a 400 Bad Request response if it is missing.

Comment on lines +233 to +237
passThrough.on("data", (chunk) => {
for (const line of chunk.toString("utf8").split("\n")) {
if (line) controller.enqueue(`data: ${line}\n\n`);
}
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current SSE implementation snippet enqueues partial lines if a log line is split across chunks. It also risks corrupting multi-byte UTF-8 characters. Data should be buffered until a newline is encountered, and StringDecoder should be used for safe decoding.

Suggested change
passThrough.on("data", (chunk) => {
for (const line of chunk.toString("utf8").split("\n")) {
if (line) controller.enqueue(`data: ${line}\n\n`);
}
});
const { StringDecoder } = await import("node:string_decoder");
const decoder = new StringDecoder("utf8");
let buffer = "";
passThrough.on("data", (chunk) => {
buffer += decoder.write(chunk);
const lines = buffer.split("\n");
buffer = lines.pop() ?? "";
for (const line of lines) {
controller.enqueue(`data: ${line}\n\n`);
}
});

Add to ClusterRole rules:
- apiGroups: [""]
resources: ["configmaps"]
resourceNames: ["volcano-scheduler-configmap"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Hardcoding volcano-scheduler-configmap in the RBAC rules assumes the default installation name. If the ConfigMap name is customized, the dashboard will lose access. It is recommended to document this limitation or use a more flexible approach (e.g., allowing access to all ConfigMaps in the namespace if scoped correctly).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants