Add KAPE/Hindsight insider-threat parsers + dashboards #401

Open
RandyRandleman wants to merge 2 commits into philhagen:main from RandyRandleman:feature/kape-insider-threat-parsers
@RandyRandleman

NEW PARSERS
  6505-kape_sbecmd.conf       SBECmd shellbag output (UsrClass/NTUSER)
  6506-kape_pecmd.conf        PECmd prefetch — explodes each row into
                              one ES doc per recorded LastRun/PreviousRun0-6
  6507-kape_recmd.conf        RECmd batch output (Kroll_Batch, DFIRBatch);
                              tags by Category for fast filtering
  6508-browser_hindsight.conf Hindsight (obsidianforensics) Chromium
                              internet history; downloads + URL categorization
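
The per-run explosion described for 6506-kape_pecmd.conf can be sketched outside Logstash as follows. This is a minimal Python illustration, not the parser itself: the `run_time` and `run_source` output keys are hypothetical names chosen for the example, and the input column names (`LastRun`, `PreviousRun0` through `PreviousRun6`) are assumed to match PECmd's CSV headers.

```python
from typing import Dict, Iterator

# Run-time columns, assumed to match PECmd CSV headers.
RUN_FIELDS = ["LastRun"] + [f"PreviousRun{i}" for i in range(7)]


def explode_runs(row: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield one document per non-empty run timestamp in a prefetch row."""
    # Everything except the run-time columns is copied into each document.
    base = {k: v for k, v in row.items() if k not in RUN_FIELDS}
    for field in RUN_FIELDS:
        ts = (row.get(field) or "").strip()
        if not ts:
            continue  # PECmd leaves unused PreviousRun slots empty
        doc = dict(base)
        doc["run_time"] = ts        # hypothetical output key
        doc["run_source"] = field   # hypothetical output key
        yield doc


row = {
    "ExecutableName": "CMD.EXE",
    "RunCount": "2",
    "LastRun": "2018-09-01 10:00:00",
    "PreviousRun0": "2018-08-30 09:00:00",
    "PreviousRun1": "",
}
docs = list(explode_runs(row))
```

Each resulting document carries a single timestamp, which is what lets every recorded execution appear as its own point on a timeline.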

MODIFIED EXISTING PARSERS (Phil-approved)
  6501-kape_mftecmd.conf      Date stanzas now accept
                              yyyy-MM-dd HH:mm:ss.SSSSSSS + yyyy-MM-dd HH:mm:ss
                              in addition to ISO8601. Without this, csv2json'd
                              CSV input fires _dateparsefailure on every event.
  6503-kape_lecmd.conf        Same multi-format date fix.
  6504-kape_evtxecmd.conf     Same multi-format date fix. Also: host.role
                              tagging (Kerberos 4768/4769/4770/4771/4776 =
                              domain_controller; SMBServer 1006/3000/3001 =
                              file_server) + Computer -> host.hostname copy.
  6507-kape_recmd.conf        Adds host metadata enrichment: pulls host.os.name
                              + host.os.build + host.os.edition from
                              HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion,
                              host.role from \Control\ProductOptions\ProductType
                              (LanmanNT=DC, ServerNT=member_server, WinNT=
                              endpoint), and host.hostname from
                              \Control\ComputerName\ComputerName. Drives the Evidence
                              Inventory dashboard's OS donut + host-roles table.
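
The multi-format date handling above can be illustrated with a small Python sketch. It is not the Logstash date stanza itself; it only shows the same "try each accepted format in turn" idea. Treating naive timestamps as UTC is an assumption made for this example, and `parse_ez_timestamp` is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone


def parse_ez_timestamp(value: str) -> datetime:
    """Parse the timestamp shapes listed above; naive values are assumed UTC."""
    text = value.strip()
    # strptime's %f accepts at most 6 fractional digits, so a 7-digit
    # (100 ns) fraction is truncated to microseconds first.
    text = re.sub(r"(\.\d{6})\d+", r"\1", text)
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    # Fall back to ISO 8601; map a trailing "Z" to an explicit offset
    # so older Python versions accept it.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


seven_digit = parse_ez_timestamp("2018-09-01 10:00:00.1234567")
plain = parse_ez_timestamp("2018-09-01 10:00:00")
iso = parse_ez_timestamp("2018-09-01T10:00:00Z")
```

Without a fallback chain like this, any value that does not match the single expected format is left unparsed, which corresponds to the _dateparsefailure tag described above.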

ES INDEX TEMPLATES
  index-shellbags.json   (new)
  index-prefetch.json    (new)
  index-registry.json    (new)
  index-browser.json     (new)
  index-evtxlogs.json    Modified: winlog.payload.{event_data,user_data,
                         payload_data} mapped as 'flattened' so heterogeneous
                         EVTX events (Security 4624 vs SMBServer 1006 etc.)
                         coexist without mapping conflicts.
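
As a rough illustration of the index-evtxlogs.json change, the mapping fragment could be generated like this. Only the three field names and the `flattened` type come from the description above; the surrounding template structure is omitted, and the exact nesting shown here is an assumption for this sketch.

```python
import json

# The three payload sub-objects named in the description above.
PAYLOAD_FIELDS = ("event_data", "user_data", "payload_data")


def payload_mapping() -> dict:
    """Build a mappings fragment that stores each payload object as 'flattened'."""
    return {
        "mappings": {
            "properties": {
                "winlog": {
                    "properties": {
                        "payload": {
                            "properties": {
                                name: {"type": "flattened"}
                                for name in PAYLOAD_FIELDS
                            }
                        }
                    }
                }
            }
        }
    }


print(json.dumps(payload_mapping(), indent=2))
```

With `flattened`, the whole sub-object is indexed as one field of keyword-like leaf values, so two event types that disagree about the type of a same-named key no longer collide.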

KIBANA DATA VIEWS
  shellbags / prefetch / registry / browser / lnkfiles / evtxlogs / filesystem
  insider-threat (cross-artifact: spans all 7 indices for the unified dashboard)
  kape (existing — extended to include new indices)
  all-records (existing — extended to include shellbags-*/prefetch-*/registry-*/
               browser-* so the Evidence Inventory dashboard sees them too)

KIBANA SAVED SEARCHES + DASHBOARDS (9 dashboards, ~35 saved searches, 9 Lens)
  Shellbags / Prefetch / Registry / Browser / LNK / EventLog / Filesystem
  Each: 5-panel Discover layout, insider-threat-focused queries, sort by
  @timestamp desc, hero panel on top + 2x2 below.
  Plus: 'Insider Threat — Unified (cross-artifact)' — 6 concept-grouped panels
  (Execution / File Access / Exfil / Auth / Persistence / Unified Timeline)
  spanning every artifact index.
  Plus: 'Evidence Inventory — Insider Threat' — 9 Lens datatables answering
  'what evidence do I have and what time window does it cover'. Top strip:
  OS distribution donut + host roles table. Below: per-artifact files-ingested
  tables with record count + first/last @timestamp per source file.

FILEBEAT INPUTS
  kape.yml: added kape-sbecmd-json, kape-pecmd-json, kape-recmd-json,
            browser-hindsight-json inputs with appropriate file patterns.

PREPROCESS ROUTING
  1000-preprocess-all.conf: added kape_shellbags / kape_prefetch /
                            kape_registry / browser_history routing.

SUPPORTING SCRIPTS
  csv2json.py    Patched to handle every EZ Tools CSV variant:
                   - UTF-8 BOM auto-detect (RECmd, RegistryExplorer exports)
                   - UTF-16 LE/BE detection (PowerShell-redirected output)
                   - Embedded NUL byte stripping (RECmd RegBinary values,
                     MFTECmd $FILE_NAME bytes)
                   - csv.field_size_limit raised to 10 MB for big single
                     fields (PECmd FilesLoaded, RECmd ValueData)
                   - --preserve-case flag for upstream parsers that expect
                     EZ Tools' native PascalCase keys (6501/6503/6504)
                   - errors='replace' safety net for mangled-byte rows
  hindsight2json.py (new)  Converts Hindsight xlsx Timeline sheet to JSONL
                           with lowercased keys matching csv2json convention.
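
Several of the csv2json.py hardening steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patched script: `read_ez_csv` is a hypothetical helper, UTF-16 detection is left out, and only the UTF-8 BOM, NUL stripping, field size limit, and errors='replace' behaviors are shown.

```python
import csv
import io
from typing import Dict, List

# Raise the per-field limit to 10 MB, as described above for large fields.
csv.field_size_limit(10 * 1024 * 1024)


def read_ez_csv(raw: bytes) -> List[Dict[str, str]]:
    """Decode CSV bytes defensively and return rows as dictionaries."""
    # 'utf-8-sig' drops a leading UTF-8 BOM when present;
    # errors='replace' keeps rows that contain mangled bytes.
    text = raw.decode("utf-8-sig", errors="replace")
    # Embedded NUL characters can make the csv module reject a line.
    text = text.replace("\x00", "")
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return [dict(row) for row in reader]


sample = b"\xef\xbb\xbfValueName,ValueData\r\nRun,abc\x00def\r\n"
rows = read_ez_csv(sample)
```

Stripping the BOM matters because it would otherwise be glued onto the first header name, so the first column's key would not match what downstream parsers expect.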

TESTING
  Validated end-to-end against the SANS FOR509 'shieldbase.lan' KAPE
  collection (multi-host, multi-user). All parsers produce correctly-typed
  ES documents with accurate @timestamp from the actual event time
  (not ingest time). Dashboards populate cleanly across the
  2016-08 -> 2018-09 event range.

USE CASE
  Targets insider-threat investigations on Windows hosts where KAPE has
  been run as the collection tool. The cross-artifact 'Insider Threat —
  Unified' dashboard answers questions in investigator vocabulary
  ('what executed?', 'what files moved?', 'what left the host?')
  rather than artifact vocabulary, with the _index column on every panel
  pointing back to the dedicated dashboard for drill-down.
@RandyRandleman

Author

Everything was nominal on testing

NEW PARSERS
  6509-browser_nirsoft.conf   NirSoft BrowsingHistoryView CSV. Covers
                              Chrome, Firefox, IE 10/11, legacy Edge. Shares
                              browser-* index with Hindsight via
                              [browser][source_tool]; sets lowercase
                              [browser][family] for cross-tool grouping.
  6510-kape_jlecmd.conf       JLECmd Jump Lists. Single parser handles both
                              .automaticDestinations-ms (DestList + Pinned +
                              MRU + InteractionCount) and .customDestinations-ms.
                              Insider-threat tags: removable media (drive-letter
                              + DriveType), UNC + mapped network, cross-machine
                              link-tracker, pinned items, app classification,
                              sensitive file types.
  6511-volatility-pstree.conf TESTING SCAFFOLD around Phil's feature/volatility3
                              flatten-pstree.rb. Will rebase out via Phil's
                              merge to main. Calls the recursive flatten,
                              splits each process into its own ES doc,
                              insider-threat tags: LOLBins, misplaced system
                              binaries (svchost/lsass not in System32),
                              suspicious paths (Temp/AppData/Public),
                              evasive PowerShell (-enc/-nop/-w hidden),
                              no-cmdline processes, deep trees,
                              remote-access tooling.
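
A few of the process-tree heuristics listed for 6511-volatility-pstree.conf can be approximated in Python as below. The tag names, the `tag_process` helper, and the regular expressions are all hypothetical choices for this sketch; the actual parser's rules may differ.

```python
import re
from typing import List, Optional

# Binaries expected to run only from System32 (subset named above).
SYSTEM_BINARIES = {"svchost.exe", "lsass.exe"}
# Directories treated as suspicious launch locations (assumed patterns).
SUSPICIOUS_DIRS = re.compile(r"\\(temp|appdata|users\\public)\\", re.IGNORECASE)
# PowerShell switches often used to hide activity: -enc, -nop, -w hidden.
EVASIVE_PS = re.compile(
    r"(?<!\S)-(enc\w*|nop\w*|w(indowstyle)?\s+hidden)", re.IGNORECASE
)


def tag_process(image_path: Optional[str], cmdline: Optional[str]) -> List[str]:
    """Return hypothetical insider-threat tags for one process record."""
    tags = []
    path = (image_path or "").lower()
    name = path.rsplit("\\", 1)[-1]
    if name in SYSTEM_BINARIES and "\\windows\\system32\\" not in path:
        tags.append("misplaced_system_binary")
    if SUSPICIOUS_DIRS.search(path):
        tags.append("suspicious_path")
    if not (cmdline or "").strip():
        tags.append("no_cmdline")
    elif "powershell" in cmdline.lower() and EVASIVE_PS.search(cmdline):
        tags.append("evasive_powershell")
    return tags


fake_svchost = tag_process(
    r"C:\Users\bob\AppData\Local\Temp\svchost.exe",
    "powershell.exe -nop -w hidden -enc SQBFAFgA",
)
real_lsass = tag_process(r"C:\Windows\System32\lsass.exe", "")
```

Heuristics like these are triage aids rather than verdicts: legitimate software also runs from AppData, so the tags are best used to rank processes for review.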

MODIFIED EXISTING
  6508-browser_hindsight.conf  Adds cross-tool [browser][source_tool]='hindsight'
                              and lowercases [browser][family] so Browser
                              Distribution donut groups across both tools.
  1000-preprocess-all.conf     Routing: browser_nirsoft -> browser-*,
                              kape_jumplists -> jumplists-*, volatility_pstree
                              -> volatility-* (last is part of testing scaffold).
  kape.yml (filebeat)          New inputs: browser-nirsoft-json,
                              kape-jlecmd-json, volatility-pstree-json.
  csv2json.py                  Adds cp1252 auto-detect for NirSoft files
                              (try utf-8 first; if it fails, fall back to
                              cp1252, which decodes nearly every byte value).
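
The utf-8-then-cp1252 fallback can be sketched as follows. `decode_with_fallback` is a hypothetical helper, not the function in csv2json.py. One detail worth noting: Python's cp1252 codec leaves five byte values unassigned, so this sketch adds errors='replace' on the fallback path to guarantee it never raises.

```python
def decode_with_fallback(raw: bytes) -> str:
    """Try strict UTF-8 first, then fall back to Windows-1252."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # cp1252 has no mapping for 0x81, 0x8D, 0x8F, 0x90 and 0x9D in
        # Python, so errors='replace' substitutes U+FFFD for those bytes.
        return raw.decode("cp1252", errors="replace")


latin_text = decode_with_fallback(b"caf\xe9")            # cp1252 bytes
utf8_text = decode_with_fallback("café".encode("utf-8"))  # valid UTF-8
odd_byte = decode_with_fallback(b"\x81abc")               # unassigned in cp1252
```

Trying strict UTF-8 first is the safer order, because cp1252 would accept most UTF-8 input without error and silently turn multi-byte characters into mojibake.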

NEW ES TEMPLATES + DATA VIEWS
  index-jumplists.json + data view (jump-list-specific fields)
  index-volatility.json + data view (testing scaffold)
  all-records: extended to include jumplists-* + volatility-*
  insider-threat: extended to include jumplists-*

NEW SUPPORTING SCRIPTS
  volatility-flatten-pstree.rb  Phil's flatten script, copied verbatim from
                              feature/volatility3 for the testing scaffold.
                              Will rebase out via Phil's merge.
  vol2json.py                  BOM-aware Volatility-output -> clean UTF-8
                              NDJSON. PowerShell's default '>' redirection
                              writes UTF-16 LE with a BOM, which would
                              otherwise silently kill JSON parse. Same
                              pattern as csv2json/hindsight2json.
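
The BOM-aware conversion described for vol2json.py might look roughly like this. It is a sketch under assumptions, not the script itself: the helper names are hypothetical, and the input is assumed to be a single JSON array (or object) as produced by a JSON renderer.

```python
import codecs
import json
from typing import Iterator

# BOM signatures checked in order, paired with the codec to use.
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def decode_bom_aware(raw: bytes) -> str:
    """Pick the codec from a leading BOM; assume UTF-8 when there is none."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding)
    return raw.decode("utf-8")


def to_ndjson_lines(raw: bytes) -> Iterator[str]:
    """Yield one compact JSON string per top-level record."""
    records = json.loads(decode_bom_aware(raw))
    if isinstance(records, dict):
        records = [records]  # tolerate a single top-level object
    for record in records:
        yield json.dumps(record, ensure_ascii=False, separators=(",", ":"))


# Simulate PowerShell '>' output: UTF-16 LE with a BOM.
payload = '[{"PID": 4, "ImageFileName": "System"}]'
raw = codecs.BOM_UTF16_LE + payload.encode("utf-16-le")
lines = list(to_ndjson_lines(raw))
```

Writing each yielded string followed by a newline gives UTF-8 NDJSON with one record per line.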

DASHBOARD RENAMES (from 'X - Insider Threat' to plain functional names,
matching Phil's existing 'X Dashboard' naming convention)
  Shellbags / Prefetch / Registry / LNK Files / Jump Lists / Evidence Inventory
  Browser History (Hindsight) / Browser History (NirSoft)
  Insider Threat Overview / Volatility - Process Tree
  KAPE - Event Log / KAPE - Filesystem  (KAPE prefix avoids collision with
                                          your existing 'Eventlog Dashboard'
                                          and 'Filesystem Dashboard')

BUG FIXES
  bn-browser-distribution Lens   sourceField was 'browser.family.keyword';
                                  browser.family is mapped as keyword at top
                                  level, so .keyword pointed at nothing.
                                  Fixed to 'browser.family'.
  jl-app-distribution Lens       Same fix applied to 'jumplist.app_description'.

KNOWN GAP IDENTIFIED FOR FOLLOW-UP
  The post_merge.sh import path skips kibana/lens/, so any dashboard that
  references Lens objects (Evidence Inventory, both Browser History
  dashboards, Jump Lists, Volatility) fails to render until those objects
  are manually imported via Kibana's saved_objects/_import API. Workaround
  documented in SOFELK_PR.md. Worth fixing in post_merge.sh as a follow-up PR.
philhagen added a commit that referenced this pull request Jun 5, 2026
* Add Volatility3 memory forensics parsers for SOF-ELK integration (#395)

This contribution adds support for ingesting Volatility3 memory forensics output into SOF-ELK via Filebeat and Logstash.

Added components:
- 8 Logstash filter configurations for 6 Volatility3 plugins (pslist, pstree, psscan, netscan, cmdline, netstat)
- Filebeat input configuration for monitoring Volatility output directories
- Python script for converting Volatility JSON output to NDJSON format
- Documentation covering supported plugins and overview

Features:
- Process enumeration and hidden process detection
- Network connection analysis with GeoIP enrichment
- Command line analysis with attack pattern detection
- Suspicious indicator tagging for threat hunting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* additional volatility integration work

* fix path

* updates for volatility parsing and display

* allow multiline json input

* update preprocessor for invalid field names

* RC for pslist module

* finalize RC for pslist module; use tags to differentiate and a consolidated output ES index for all volatility output

* more parser migration/streamlining

* fix to name vs GH userid

* consolidate volatility parsing

* narrow scope to handle multiline artifacts

* fix list

* fix comment, correct grok syntax

* updates for jsonl-based volatility files

* updates to pslist parsing, consolidate postprocessing

* update all event.duration to nanoseconds

* fix yaml dictionary syntax, move volatility ip address tweaks earlier in the pipeline, fix community_id calculation

* finish moving volatility ip handling

* re-parenting

* restructure postprocessing configs

* correct paths in ECS doc, first stab at flattening pstree

* path correction

* swap null -> nil

* .except not available until ruby3

* RC for volatility pstree flattening

* correct test logic

* remove event.original for child processes.  saves storage space and simplifies visibility but events can still be reconstructed from event.original on top-level process

* rename function and add comment

* cleanup

* cleanup

* fix intend to 2 spaces, syntax

* update syntax for scripts, fix indents

* add volatility preprocessing script based almost entirely on Tony's submission from PR #401

* rename

* minor cleanup

---------

Co-authored-by: Raymond Garay-Paravisini <rgarayparavisini@augusta.edu>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>