Add analyse gaps subcommand for detecting selective evtx record deletion #228
Detects two indicators of evtx record tampering inside a single file: RecordID holes (records surgically removed without triggering EID 1102) and unusually long quiet windows on chatty channels. Output defaults to text, with JSON via --json. Both detectors are configurable. Includes 5 unit tests and 3 CLI integration tests.
alexkornitzer
left a comment
Hey @Fuzzdkk, thanks for submitting this. It is a nifty little feature, and I agree that it makes sense as an analysis subcommand.
I have left comments that would need to be addressed before I merge it in, but if I understand the intent of the feature correctly, I would propose that the following might be a better approach for the implementation:
- Parse the file, storing an intermediate of the form Vec<(channel, record_id, timestamp)>. As we don't need to search the array and every entry must be sorted for analysis, an array with a one-time sort 'should' be faster and smaller than a btree.
- Sort the intermediate array using the channel and record id.
- Consume the array in pairs, creating a new array of the form Vec<Gap>, where Gap would look something like:
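enum Kind {
    ID,        // hole in the RecordID sequence
    Timestamp, // unusually long quiet window
}
struct Gap {
    kind: Kind,
    channel: String,
    // Presumably the record IDs on either side of the gap.
    from: u64,
    until: u64,
    // Presumably the timestamps of those two records.
    start: i64,
    stop: i64,
}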
- This array can now be passed in linear sweeps to create the CLI output as desired, or it can simply be output to JSON or CSV due to its flat nature with easy ability to filter.
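The parse, sort, and pairwise sweep described in the list above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `find_gaps`, `sample_records`, and the `min_time_gap_secs` parameter are hypothetical names, timestamps are assumed to be plain seconds, and the evtx parsing step is replaced with hard-coded tuples.

```rust
/// Which signal produced the gap (mirrors the `Kind` enum proposed above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Id,
    Timestamp,
}

/// One flat gap row; field meanings are an assumption based on the review comment.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Gap {
    kind: Kind,
    channel: String,
    from: u64,  // record ID before the gap
    until: u64, // record ID after the gap
    start: i64, // timestamp (seconds) before the gap
    stop: i64,  // timestamp (seconds) after the gap
}

/// Hypothetical helper: one-time sort by (channel, record_id), then a
/// single linear sweep over neighbouring pairs.
fn find_gaps(mut records: Vec<(String, u64, i64)>, min_time_gap_secs: i64) -> Vec<Gap> {
    records.sort_by(|a, b| a.0.cmp(&b.0).then(a.1.cmp(&b.1)));
    let mut gaps = Vec::new();
    for pair in records.windows(2) {
        let (prev, next) = (&pair[0], &pair[1]);
        // Pairs that straddle two channels are not comparable.
        if prev.0 != next.0 {
            continue;
        }
        // Safe subtraction: within a channel the sort guarantees next.1 >= prev.1.
        let id_hole = next.1 - prev.1 > 1;
        let quiet = next.2 - prev.2 >= min_time_gap_secs;
        for (hit, kind) in [(id_hole, Kind::Id), (quiet, Kind::Timestamp)] {
            if hit {
                gaps.push(Gap {
                    kind,
                    channel: prev.0.clone(),
                    from: prev.1,
                    until: next.1,
                    start: prev.2,
                    stop: next.2,
                });
            }
        }
    }
    gaps
}

/// Made-up records standing in for a parsed evtx file, deliberately unsorted.
fn sample_records() -> Vec<(String, u64, i64)> {
    vec![
        ("System".to_string(), 11, 30),
        ("Security".to_string(), 5, 4000),
        ("Security".to_string(), 1, 0),
        ("System".to_string(), 10, 0),
        ("Security".to_string(), 2, 60),
    ]
}

fn main() {
    // 30 minutes, matching the default threshold mentioned in the PR description.
    for gap in find_gaps(sample_records(), 30 * 60) {
        println!("{:?}", gap);
    }
}
```

Because the result is a flat `Vec<Gap>`, filtering by kind or channel is a plain iterator filter, and the same rows can feed text, JSON, or CSV output.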
Refactor to a flat Vec<Gap> with a Kind enum (record_id/timestamp).
Per-file flow now parses into Vec<(channel, record_id, secs)>, sorts
by channel then RecordID, then sweeps pairs. Drops the separate
RecordIdGap and TimeGap vecs along with the BTreeMap intermediate.
Other fixes from the review:
- Store timestamps as seconds only; format at print time.
- Drop the redundant empty check on the per-channel slice.
- Use cs_println! directly instead of buffering with writeln!.
- Wrap cs_println! body in {{ }} so multiple calls in one scope don't
collide on use std::io::Write (same pattern as cs_print_json! and
cs_eprintln!).
- Trim "(only flag X gaps)" from the two skip-flag doc strings.
- Add --from, --to, --local, --timezone, mirroring hunt.
- Update the integration test for the new JSON shape.
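The macro change in the list above is easiest to see in isolation. The sketch below uses a hypothetical `out_line!` macro, not the real `cs_println!`, and writes to an in-memory buffer so it stays self-contained. With a single pair of braces the `use std::io::Write;` item would be pasted straight into the caller's scope, so a second invocation in the same scope would import the name twice and fail to compile. The doubled braces make the expansion a block, which gives each invocation its own scope.

```rust
// Hypothetical macro for illustration; it is NOT chainsaw's cs_println!.
macro_rules! out_line {
    // The outer braces delimit the macro arm, the inner braces form a block
    // expression, so the `use` below is scoped to that block only.
    ($dst:expr, $($arg:tt)*) => {{
        use std::io::Write;
        writeln!($dst, $($arg)*).expect("write failed");
    }};
}

/// Calls the macro twice in one scope, which is the case that would
/// collide on the `Write` import without the inner block.
fn render() -> Vec<u8> {
    let mut buf: Vec<u8> = Vec::new();
    out_line!(buf, "gap {}", 1);
    out_line!(buf, "gap {}", 2);
    buf
}

fn main() {
    print!("{}", String::from_utf8_lossy(&render()));
}
```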
Thanks, that's a much cleaner shape. Pushed the rewrite. Per file it now parses into Inline ones are all done too: seconds only with formatting at print time, dropped the empty check, cs_println! direct calls instead of buffering, trimmed the bracketed bits from the doc strings, pulled in --from/--to/--local/--timezone from hunt. One thing worth flagging: had to wrap the body of
Ah wonderful, thanks for doing that, it is very refreshing to have people implement features rather than always requesting them :) I am not all that precious about true commit isolation, so having the macro hygiene fixes in here is fine. One thing on the JSON output (we can skip this if you disagree): I was thinking about it outputting truly flat like search or dump does, i.e. it would just be a flat array of gap entries (we could then inject the file path into each one using the wrapper flatten pattern). Then it should be trivial to add CSV & JSONL support, as it's literally just a table.
use serde::Serialize;

#[derive(Serialize)]
struct JsonWrapper<'a> {
    path: &'a String,
    // Inline the gap's fields next to `path` instead of nesting them.
    #[serde(flatten)]
    gap: &'a Gap,
}
After your thoughts on the above I'll squash merge this in and then do the usual project spring cleaning, after which I will get this out as a new release.
Match the search/dump pattern: emit a flat array of gap rows with the file path injected into each entry via a wrapper using #[serde(flatten)]. Makes CSV/JSONL trivial to add later. Side effect: the per-file ChannelStats preamble drops out of JSON (can't really live in a flat row table). Text mode still shows it.
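To illustrate why a flat row shape makes CSV easy to add later, here is a std-only sketch. It is not the PR's implementation, which relies on serde's `#[serde(flatten)]` for JSON. The `Row` struct, `csv_field`, `to_csv`, `sample_rows`, the column order, and the `record_id` kind string are all assumptions made for the example.

```rust
/// Simplified stand-in for the PR's gap type; `kind` is a plain string here.
struct Gap {
    kind: &'static str,
    channel: String,
    from: u64,
    until: u64,
    start: i64,
    stop: i64,
}

/// Hypothetical flat row: the file path sits alongside the gap's fields.
struct Row {
    path: String,
    gap: Gap,
}

/// Minimal CSV quoting: wrap the field in quotes and double any embedded
/// quotes when it contains a comma, a quote, or a line break.
fn csv_field(s: &str) -> String {
    if s.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", s.replace('"', "\"\""))
    } else {
        s.to_string()
    }
}

/// One header line, then one line per row. No nesting needs unpacking.
fn to_csv(rows: &[Row]) -> String {
    let mut out = String::from("path,kind,channel,from,until,start,stop\n");
    for r in rows {
        let g = &r.gap;
        out.push_str(&format!(
            "{},{},{},{},{},{},{}\n",
            csv_field(&r.path),
            g.kind,
            csv_field(&g.channel),
            g.from,
            g.until,
            g.start,
            g.stop
        ));
    }
    out
}

/// Made-up data; the comma in the path exercises the quoting branch.
fn sample_rows() -> Vec<Row> {
    vec![Row {
        path: "logs/a,b.evtx".to_string(),
        gap: Gap {
            kind: "record_id",
            channel: "Security".to_string(),
            from: 2,
            until: 5,
            start: 60,
            stop: 4000,
        },
    }]
}

fn main() {
    print!("{}", to_csv(&sample_rows()));
}
```

A per-file summary such as the ChannelStats preamble has no natural column in this layout, which is consistent with it dropping out of the flat output.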
Done in c61e8fc. Just flagging that the ChannelStats preamble drops out of the JSON in this approach since it doesn't really fit a flat row table; text mode still shows it. And likewise, it was a fun one to put together. More than happy to help out in future if anything comes up, and I'll keep iterating on this one if you think of anything else. If there are other ideas you haven't had time for, I'm open to picking some up.
Awesome, yep, that is how it should be, as those outputs tend to be for use in other tools, which can usually trivially build the summaries themselves. When I get a moment (probably tomorrow) I'll get this merged, thanks again for contributing. The only other things I can think of, although I haven't looked at them in a while, would be the currently open issues, but they might need a once over for staleness.
Adds an analyse gaps subcommand that flags two indicators of evtx record tampering inside a single file.
The first is RecordID gaps. Per-channel EventRecordID values are normally monotonically increasing, so a hole inside one evtx file (not at a log-rotation boundary) is the fingerprint left by tools that surgically delete records to avoid triggering EID 1102 ("log cleared"). One real-world example is Eventlogedit-style tampering.
The second is time gaps. Unexpectedly long quiet windows on a normally chatty channel can indicate that records inside the window were removed. The threshold is configurable via --min-time-gap-minutes (default 30).
Output defaults to human-readable text, with JSON via --json. Both detectors can be turned off independently with --no-record-id-gaps and --no-time-gaps. Standard -o, -q, and --skip-errors options are supported.
I went with a module rather than a Sigma rule because the detection is stateful and reasons about absent events, which Sigma's grammar does not support. The module could later emit synthetic events (something like chainsaw.gap_detected) for Sigma rules to match on if that would be useful downstream.
Tests
- Tests in src/analyse/gaps.rs cover the gap-detection logic in isolation
- Tests in tests/clo.rs exercise the CLI text and JSON paths against the existing tests/evtx/security_sample.evtx fixture
Example output
Happy to adjust defaults, rename the subcommand, or add the synthetic-event emission if you would prefer that direction.