Filter debug messages from console output#1068
Conversation
a8b1a37 to
a29838f
Compare
d26ce7c to
11ad63e
Compare
DavisVaughan
left a comment
There was a problem hiding this comment.
I didn't look at the tests, and only feel like I 60% understand this, but I understand the general idea
crates/ark/src/console_debug.rs
Outdated
| pub(crate) fn debug_update_call_text(&mut self, update: DebugCallTextUpdate) { | ||
| match update { | ||
| DebugCallTextUpdate::Finalized(text, kind) => { | ||
| self.debug_call_text = DebugCallText::Finalized(text, kind); | ||
| }, |
There was a problem hiding this comment.
Something seems outdated here.
DebugCallText seems like it should really just be None and Some now, or something like that.
There is no need for a Finalized state if there is no longer a Capturing state?
There was a problem hiding this comment.
And why are DebugCallTextUpdate and DebugCallText different? It seems like we only need 1 type here.
There was a problem hiding this comment.
Good point, this is now
pub enum DebugCallText {
/// `debug: <expr>` - emitted when stepping without srcrefs
Debug(String),
/// `debug at <path>#<line>: <expr>` - emitted when stepping with srcrefs
DebugAt(String),
}
crates/ark/src/console_filter.rs
Outdated
|
|
||
| fn get_timeout() -> Duration { | ||
| if stdext::IS_TESTING { | ||
| Duration::from_millis(500) |
There was a problem hiding this comment.
why do you need this and new_with_timeout?
There was a problem hiding this comment.
new_with_timeout() is used for unit tests so they don't wait as much.
| self.drain_on_timeout() | ||
| } | ||
|
|
||
| fn drain_on_timeout(&mut self) -> Option<String> { |
There was a problem hiding this comment.
I'm not sure I understand the timeout. Is it really needed?
There was a problem hiding this comment.
This is part of our guarantee that output is never swallowed if the user doesn't actually drop in the debugger. We only hold onto filtered output until a small timeout of 50ms. This way if the user has a cat("debug: ") call in their code, we won't hold all their subsequent output until the next ReadConsole (which might never come in a server-like situation), they'll just see a small delay.
It's still possible that we swallow user output in this way, but only if we eventually hit the debugger within the timeout delay, which makes the risk acceptable.
| /// | ||
| /// Must only be called at browser prompts: at top level, user print | ||
| /// methods could legitimately produce output matching these prefixes. | ||
| pub fn strip_step_lines(text: &mut String) { |
There was a problem hiding this comment.
Is the prefix not always in the first line?
There was a problem hiding this comment.
This is called for top-level filtering of execute results, so there could be user output before.
[1] 1
debug at #3: 2
crates/ark/src/console.rs
Outdated
|
|
||
| /// Filter for debug console output. Removes R's internal debug messages | ||
| /// from user-visible console output. | ||
| pub(crate) filter: ConsoleFilter, |
There was a problem hiding this comment.
| pub(crate) filter: ConsoleFilter, | |
| pub(crate) debug_filter: ConsoleDebugFilter, |
This feels like it could use a very debug specific name?
Otherwise I think it could be invoked at any time to filter something
There was a problem hiding this comment.
hmm the filter is always active, including in top-level repls. The initial debug output comes before dropping in a nested debug repl.
There was a problem hiding this comment.
I let Claude settle it :)
debug_filter. Davis is right thatfilteris too generic on a struct this large. And the fact that it's always active doesn't make it less debug-specific in purpose.
This reverts commit 2790eff7e2eab7316374bb8f2076164aee38db57. Feels like premature optimisation
`exiting from:` is unique among debug prefixes: it is NOT followed by `do_browser()`. After the prefix and its call text, user code resumes in the calling function. When the inner function exits to an outer function (n_frame > 0), the stream filter enters Filtering state and accumulates everything until the next ReadConsole — including user output from the outer function. This causes user output to be silently swallowed. The other filtered prefixes (`debug at`, `debug:`, `Called from:`) are always immediately followed by `do_browser()`, which makes them safe to accumulate and suppress at the browser prompt. There is no such delimiter after `exiting from:`, so there is no reliable way to separate it from subsequent user output in the stream. For symmetry, `debugging in:` receives the same treatment — both are function transition messages (entering/leaving a debugged closure) and neither is critical to suppress.
No longer useful now that we preserve `exiting from` output
0a06887 to
6e11755
Compare
Branched from #1065.
Addresses posit-dev/positron#12230
R's debug repl emits messages like
debug at file.R#10:,Called from:,debugging in:, andexiting from:to the console. In a terminal these help the user stay oriented through debug sessions, and IDEs can parse them to extract stepping information. In Positron, however, the GUI already surfaces this information through the call stack pane and step highlighting, making the console messages redundant and distracting.With the injected breakpoints we use for auto-stepping, the noise unfortunately increased quite a bit. This makes the console quite a hostile place for users during debugging sessions, when they really should feel at home and be able to easily find output of interest.
This PR introduces a
ConsoleFilterstate machine that intercepts stdout atWriteConsoletime, detects debug message prefixes, and suppresses the stepping noise atReadConsoletime (debug at,debug:,Called from:) while leaving function entry/exit messages (debugging in:,exiting from:) visible.What R emits
R's debug handler emits four families of messages, all from C code in
eval.c.debug at file.R#N:ordebug:(the latter when no srcref is available). Emitted on each step through braces, control flow, and functions. Followed bydo_browser():https://github.com/r-devel/r-svn/blob/35ef64e6e7f7ac5423747d9e7be70bbde47f0b96/src/main/eval.c#L2988-L2990
Called from:. Emitted bybrowser()itself before entering the REPL. Followed bydo_browser().debugging in:. Emitted when entering a closure viadebug()ors. Followed bydo_browser():https://github.com/r-devel/r-svn/blob/35ef64e6e7f7ac5423747d9e7be70bbde47f0b96/src/main/eval.c#L2369C1-L2371C34
exiting from:. Emitted when leaving a debugged closure (inR_execClosure). Unlike the other three, this is NOT followed bydo_browser()— user code resumes immediately after:Each prefix is followed by a
PrintValue()orPrintCall()that can produce arbitrary multi-line output (long calls wrapping across lines, literal vectors, even large objects injected by metaprogramming). There's no syntactic delimiter marking the end of a debug message, so we can't use parse-based detection.The key insight that makes reliable filtering possible is that three of the four prefixes are immediately followed by
do_browser(), which entersReadConsole. No user code runs between the prefix and thatReadConsole, so everything in between is debug noise that can be (mostly) safely accumulated and suppressed.exiting from:is the exception. It is emitted after the browser'sReadConsolehas already returned (the user typedc,n, etc.) but before R reaches the nextReadConsole. User code runs after the prefix: the function returns a value and that value may get printed. There is no delimiter between thePrintCalloutput and subsequent user code output, making it impossible to safely stripexiting from:without risking swallowed user output.Filter mechanism
We filter only the stepping noise, i.e. the prefixes that fire on every step and are always immediately followed by
do_browser():debug at/debug:Called from:We do not filter function entry/exit transitions:
exiting from:can't be filtered because user code runs after itdebugging in:not filtered for symmetry withexiting from:This is a good trade-off. The stepping noise is by far the most frequent and distracting — especially with injected breakpoints producing a flurry of
Called from:+debug atpairs on every auto-step. The entry/exit messages are comparatively rare (once per function boundary) and useful for orientation.The filter is a state machine with two states:
Passthrough: Default. Content is emitted to IOPub immediately. At line boundaries, checks if new content matches a full filtered debug prefix. On match, transitions directly to Filtering. We don't need to worry about partial matching since R writes each prefix with a single
Rprintfcall so it always arrives as one complete chunk to WriteConsole.Filtering: Full prefix match confirmed. Accumulates all subsequent content until
ReadConsoleresolves whether this is real debug output or adversarial user output.Resolution happens at
ReadConsoletime:was_debugging || is_browser, we either just left a browser session or are entering one): content is suppressed as real debug output, and the captured expression is extracted for auto-stepping.A timeout mechanism (50ms, 500ms in tests) ensures content is never held indefinitely if
ReadConsoledoesn't arrive promptly.Also noteworthy:
Debug messages also reach autoprint output (the
execute_resultpath) when stepping through top-level braced expressions.strip_step_lines()cleans autoprint at browser prompts by truncating at the first line matchingCalled from:,debug at, ordebug:. This is only applied at browser prompts — at top-level prompts these strings could be legitimate user output from print methods.The old
debug_handle_write_console/debug_handle_read_consolepair that captureddebug:anddebug atfor auto-stepping is replaced by the filter'sfinalize_capture. TheDebugCallText::Capturingvariant is removed since the filter now handles accumulation. The filter producesDebugCallTextUpdatevalues that the console applies atReadConsoletime.Safety guarantees
The filter only suppresses output when it can confirm a debug context: either a browser prompt is reached, or the prefix was matched while already in a debug session. At top-level prompts outside of debug sessions, any prefix-matching content is emitted back to the user. Combined with the timeout fallback, this guarantees the filter has no impact on regular (non-debug) usage.
User
cat()output that matches a filtered debug prefix (Called from:,debug at,debug:) inside a browser session IS suppressed. We can't distinguish it from real debug output because both are followed by a browserReadConsoleprompt. This is documented in the adversarial tests.Filtering can be disabled with
options(ark.debugger.filter_debug_output = FALSE). WhenNULL(the default), filtering is enabled. When disabled, the filter still runs (extracting debug info for auto-stepping) but emits all content instead of suppressing it.QA Notes
Comprehensively tested with both unit tests (state machine) and integration tests (full kernel), including adversarial scenarios where user code produces output matching debug prefixes, and nested debug scenarios (debugged function calling another debugged function, debugged function returning to non-debugged caller).
Debug sessions should no longer show
Called from:,debug at, ordebug:messages in the console.debugging in:andexiting from:messages remain visible. Normal output during debugging (e.g.cat(),print()) should be unaffected.