
Add analyse gaps subcommand for detecting selective evtx record deletion #228

Merged
alexkornitzer merged 3 commits into WithSecureLabs:master from Fuzzdkk:analyse-gaps
Apr 27, 2026

Conversation

Contributor

@Fuzzdkk Fuzzdkk commented Apr 26, 2026

Adds an analyse gaps subcommand that flags two indicators of evtx record tampering inside a single file.

The first is RecordID gaps. Per-channel EventRecordID values are normally monotonically increasing, so a hole inside one evtx file (not at a log-rotation boundary) is the fingerprint left by tools that surgically delete records to avoid triggering EID 1102 ("log cleared"). One real-world example is Eventlogedit-style tampering.
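The RecordID check reduces to a pairwise scan over per-channel sorted IDs. A minimal sketch of the idea (hypothetical names, not the PR's actual code):

```rust
/// Return (channel, last_seen_id, next_id) for every hole in otherwise
/// monotonically increasing per-channel EventRecordIDs.
/// Assumes `records` is already sorted by (channel, record_id).
fn record_id_gaps(records: &[(String, u64)]) -> Vec<(String, u64, u64)> {
    records
        .windows(2)
        // Only compare within a channel; a jump at a channel boundary is not a gap.
        .filter(|w| w[0].0 == w[1].0 && w[1].1 > w[0].1 + 1)
        .map(|w| (w[0].0.clone(), w[0].1, w[1].1))
        .collect()
}
```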

The second is time gaps. Unexpectedly long quiet windows on a normally chatty channel can indicate that records inside the window were removed. The threshold is configurable via --min-time-gap-minutes (default 30).

Output defaults to human-readable text, with JSON via --json. Both detectors can be turned off independently with --no-record-id-gaps and --no-time-gaps. Standard -o, -q, and --skip-errors options are supported.

I went with a module rather than a Sigma rule because the detection is stateful and reasons about absent events, which Sigma's grammar does not support. The module could later emit synthetic events (something like chainsaw.gap_detected) for Sigma rules to match on if that would be useful downstream.

Tests

  • 5 unit tests in src/analyse/gaps.rs cover the gap-detection logic in isolation
  • 3 integration tests in tests/cli.rs exercise the CLI text and JSON paths against the existing tests/evtx/security_sample.evtx fixture
  • Verified zero false positives on the clean fixture and that the threshold-zero case forces a positive

Example output

$ chainsaw analyse gaps ./Logs/

=== ./Logs/Security.evtx ===
[+] Channels seen:
    - Security: 184302 records, RecordID 4521..188822, 2026-04-01T00:00:11Z -> 2026-04-25T18:42:09Z
[!] 2 RecordID gap(s) detected (possible selective record deletion):
    - Security: RecordID 92841 -> 92847 (5 missing) between 2026-04-12T03:14:08Z and 2026-04-12T03:14:09Z
    - Security: RecordID 138210 -> 138215 (4 missing) between 2026-04-19T22:01:33Z and 2026-04-19T22:01:34Z
[+] No suspicious time gaps detected

Happy to adjust defaults, rename the subcommand, or add the synthetic-event emission if you would prefer that direction.

Detects two indicators of evtx record tampering inside a single file:
RecordID holes (records surgically removed without triggering EID 1102)
and unusually long quiet windows on chatty channels.

Output defaults to text, with JSON via --json. Both detectors are
configurable. Includes 5 unit tests and 3 CLI integration tests.
@Fuzzdkk Fuzzdkk requested a review from alexkornitzer as a code owner April 26, 2026 00:35
Collaborator

@alexkornitzer alexkornitzer left a comment


Hey @Fuzzdkk, thanks for submitting this, it is a nifty little feature and I agree that it makes sense as an analysis subcommand.

I have left comments that would need to be addressed before I merge it in, but if I understand the intent of the feature correctly I would propose that the following might be a better approach for the implementation:

  1. Parse each file, storing an intermediate of the form Vec<(channel, record_id, timestamp)>. As we don't need to search the array, and every entry must be sorted for analysis, an array with a one-time sort 'should' be faster and smaller than a btree.
  2. Sort the intermediate array using the channel and record id.
  3. Consume the array in pairs, creating a new array of the form Vec<Gap>, where Gap would look something like:

enum Kind {
    ID,
    Timestamp,
}

struct Gap {
    kind: Kind,
    channel: String,
    from: u64,
    until: u64,
    start: i64,
    stop: i64,
}
  4. This array can now be passed over in linear sweeps to create the CLI output as desired, or it can simply be output to JSON or CSV due to its flat nature, with easy filtering.
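The proposed parse/sort/sweep pipeline could be sketched roughly as follows (illustrative only, under the assumptions above, not the merged code):

```rust
#[derive(Debug, PartialEq)]
enum Kind {
    Id,
    Timestamp,
}

#[derive(Debug, PartialEq)]
struct Gap {
    kind: Kind,
    channel: String,
    from: u64,
    until: u64,
    start: i64,
    stop: i64,
}

/// Walk a (channel, record_id, secs) array already sorted by channel then
/// RecordID, emitting one flat Gap per ID hole or over-threshold quiet window.
fn sweep(records: &[(String, u64, i64)], min_time_gap_secs: i64) -> Vec<Gap> {
    let mut gaps = Vec::new();
    for w in records.windows(2) {
        let (a, b) = (&w[0], &w[1]);
        if a.0 != b.0 {
            continue; // channel boundary, not a gap
        }
        if b.1 > a.1 + 1 {
            gaps.push(Gap { kind: Kind::Id, channel: a.0.clone(), from: a.1, until: b.1, start: a.2, stop: b.2 });
        }
        if b.2 - a.2 >= min_time_gap_secs {
            gaps.push(Gap { kind: Kind::Timestamp, channel: a.0.clone(), from: a.1, until: b.1, start: a.2, stop: b.2 });
        }
    }
    gaps
}
```

The flat Vec<Gap> then serializes directly, and the text renderer is just a linear pass over it.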

Review comment threads: src/analyse/gaps.rs (3, outdated), src/main.rs (2 outdated, 1 current).
Refactor to a flat Vec<Gap> with a Kind enum (record_id/timestamp).
Per-file flow now parses into Vec<(channel, record_id, secs)>, sorts
by channel then RecordID, then sweeps pairs. Drops the separate
RecordIdGap and TimeGap vecs along with the BTreeMap intermediate.

Other fixes from the review:
- Store timestamps as seconds only; format at print time.
- Drop the redundant empty check on the per-channel slice.
- Use cs_println! directly instead of buffering with writeln!.
- Wrap cs_println! body in {{ }} so multiple calls in one scope don't
  collide on use std::io::Write (same pattern as cs_print_json! and
  cs_eprintln!).
- Trim "(only flag X gaps)" from the two skip-flag doc strings.
- Add --from, --to, --local, --timezone, mirroring hunt.
- Update the integration test for the new JSON shape.
Contributor Author

Fuzzdkk commented Apr 26, 2026

Thanks, that's a much cleaner shape. Pushed the rewrite.

Per file it now parses into Vec<(channel, record_id, secs)>, sorts by channel then RecordID, and sweeps in pairs. Single Gap struct with a kind enum (record_id / timestamp) and your from/until/start/stop fields, so the two separate vecs are gone. JSON is just the flat gap array per file.

Inline ones are all done too: seconds only with formatting at print time, dropped the empty check, cs_println! direct calls instead of buffering, trimmed the bracketed bits from the doc strings, pulled in --from/--to/--local/--timezone from hunt.

One thing worth flagging: had to wrap the body of cs_println! in {{ }} otherwise calling it twice in the same scope errors on duplicate use std::io::Write. cs_print_json! and cs_eprintln! already do this. Happy to split it into its own PR if you'd rather keep the macro fix separate.
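The double-brace trick in isolation (a stand-in macro, not chainsaw's actual cs_println!): the outer braces delimit the macro arm, while the inner braces make each expansion its own block expression, so the `use` it contains cannot collide with a sibling expansion in the same scope.

```rust
// Stand-in for the cs_println!-style macro. With a single pair of braces the
// expansion would be bare statements, and two calls in one scope would emit
// two `use std::io::Write;` items in that scope (a duplicate-import error).
macro_rules! scoped_writeln {
    ($dst:expr, $($arg:tt)*) => {{
        use std::io::Write;
        writeln!($dst, $($arg)*).expect("write failed");
    }};
}

/// Two expansions in the same scope: compiles only because each expansion
/// is its own block, scoping its `use std::io::Write` locally.
fn demo() -> String {
    let mut buf: Vec<u8> = Vec::new();
    scoped_writeln!(buf, "first");
    scoped_writeln!(buf, "second");
    String::from_utf8(buf).unwrap()
}
```

This works because macro_rules hygiene covers local variables but not items: a `use` expanded into the caller's scope lands in that scope for real, so it needs its own block.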

Collaborator

alexkornitzer commented Apr 26, 2026

Ah wonderful, thanks for doing that, it is very refreshing to have people implement features rather than always requesting them :)

I am not all that precious on true commit isolation so having the macro hygiene fixes in here is fine.

One thing on the JSON output, we can skip this if you disagree, but I was thinking about it outputting truly flat like search or dump does, i.e. it would just be a flat array of gap entries (we could then inject the file path into each one using the wrapper flatten pattern). Then it should be trivial to add CSV & JSONL support as it's literally just a table.

#[derive(Serialize)]
struct JsonWrapper<'a> {
    path: &'a String,
    #[serde(flatten)]
    gap: &'a Gap,
}
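With #[serde(flatten)], the Gap fields would be lifted up alongside path, so each row in the flat array would presumably look something like this (field names taken from the Gap struct and the record_id/timestamp kind strings mentioned in the commits; epoch-second values purely illustrative):

```json
[
  { "path": "./Logs/Security.evtx", "kind": "record_id", "channel": "Security", "from": 92841, "until": 92847, "start": 1776000848, "stop": 1776000849 },
  { "path": "./Logs/Security.evtx", "kind": "record_id", "channel": "Security", "from": 138210, "until": 138215, "start": 1776675693, "stop": 1776675694 }
]
```

Because every row has the same flat schema, CSV is just the same rows with a header line, and JSONL is one row per line.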

After your thoughts on the above I'll squash merge this in and then do the usual project spring cleaning after which I will get this out as a new release.

Match the search/dump pattern: emit a flat array of gap rows with the
file path injected into each entry via a wrapper using
#[serde(flatten)]. Makes CSV/JSONL trivial to add later.

Side effect: the per-file ChannelStats preamble drops out of JSON
(can't really live in a flat row table). Text mode still shows it.
Contributor Author

Fuzzdkk commented Apr 26, 2026

Done in c61e8fc. Just flagging that the ChannelStats preamble drops out of the JSON in this approach since it doesn't really fit a flat row table, text mode still shows it.

And likewise, was a fun one to put together. More than happy to help out in future if anything comes up, and I'll keep iterating on this one if you think of anything else. If there are other ideas you haven't had time for I'm open to picking some up.

@alexkornitzer
Collaborator

Awesome, yep that is how it should be as those outputs tend to be for use in other tools which can usually trivially build the summaries themselves.

When I get a moment (probably tomorrow) I'll get this merged, thanks again for contributing.

The only other things I can think of, although I haven't looked at them in a while, would be the currently open issues, but they might need a once-over for staleness.

@alexkornitzer alexkornitzer merged commit d8924d6 into WithSecureLabs:master Apr 27, 2026