Build for Detection Engineering, and Alerting Will Improve (Part 3)

Anton Chuvakin · Published in Anton on Security · 4 min read · Sep 28, 2023


This blog series was written jointly with Amine Besson, Principal Cyber Engineer, Behemoth CyberDefence and one more anonymous collaborator.

In this blog (#3 in the series), we will start to define and refine our detection engineering machinery to avoid the problems covered in Parts 1 and 2.

Adopting detection engineering practices should follow a roadmap and eventually become a program, effectively re-balancing where effort goes in a SOC by investing in high-quality detection creation (and the detection content lifecycle, of course).

Put simply, if you spend more time building better detections, then you spend less time triaging bad alerts. Simple, eh? If it were simple, everybody would do it!

Embracing leaner, consistent, purpose-driven detection workflows is key, and you may want to assess where you land on these key areas:

⚒️ Breakdown and Backlog: Build a continuous backlog of issues corresponding to threats to analyze and detection requirements to implement. What you are doing next for detection content should be clear in most cases; and yes, this is security, so there will be nasty surprises. Eventually, the only unpredictable tasks would be the genuinely rare surprises: your routine detection work should not surprise you.

🌊 Smoothen yer workflow: Remove people interfaces that don’t work, define minimal ceremonies, and put content reviews in the right places. Shorten approval times for releases, and ensure detection quality reviews are followed up on. If dealing with an MSSP/MDR, make sure to lay down a governance structure for building custom content jointly with you (JointOps, not finger-pointing).

☣️ Embrace a threat-driven approach: Study adversary tradecraft in detail before making educated calls on what to detect, and where/how. Starting from available telemetry data instead is, more often than not, prone to bias, inefficiency, and mistakes.

💡 Embed Intel: A CTI enclave in a SOC will often provide higher ROI than dealing with a separate team, as it understands SOC needs better and plugs directly into SOC processes (this is full of nuance, so YMMV).

⚡ Lower Intel-to-Rule KPIs: Quantify, with as much granularity as possible, how long it takes to go from intel input to an actual detection (ItR for Intel-to-Rule; ha-ha, we just made up a new acronym! Take that, Gartner! ;-)). A good ItR target would be turning high-risk threat intel into working detections within hours or a few days; a minimal measurement sketch appears right after this list.

👀 Visibility over assumptions: If you cannot accurately answer within minutes what your detection coverage is and which shortcomings it has, you likely need to start parsing your detection library and threat modeling into qualitative metrics and making that data transparent to the DE team (the second sketch after this list shows one way to start parsing).

🚀 Release Soon, Release Often!: There are new threat variants every week, and detection engineering scales directly with the quality of the intelligence input. This is where modern software engineering practices come in handy.

🔥 Quantify, Measure, Orient Operations: Define what healthy operations look like: what FP rate is acceptable for new detections? What turnaround time for tuning should be aimed at? Where are the quick wins, and where are the detection gaps? Where is capacity spent, and should it be reassigned to more urgent priorities? Where are the process bottlenecks? The third sketch after this list computes one such metric.

🦾 Automate the hard (and boring) part: Everything produced during R&D should generate rich, structured knowledge bases, metadata, and metrics. Version your detection library and roll it out with CI/CD tooling; the last sketch after this list shows a minimal CI gate. BTW, this advice alone is worth the price of this blog!
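To make the ItR KPI concrete, here is a minimal measurement sketch. It assumes you record a timestamp when intel arrives and another when the corresponding detection goes live; the record shape and field names below are hypothetical, not any particular tool’s schema.

```python
from datetime import datetime
from statistics import median

# Hypothetical records: when the intel arrived and when the matching
# detection went live. Field names are illustrative, not a real schema.
intel_to_rule = [
    {"intel_received": datetime(2023, 9, 1, 9, 0),
     "rule_deployed": datetime(2023, 9, 1, 16, 30)},
    {"intel_received": datetime(2023, 9, 4, 10, 0),
     "rule_deployed": datetime(2023, 9, 6, 11, 0)},
]

def itr_hours(record) -> float:
    """Intel-to-Rule lead time for one detection, in hours."""
    delta = record["rule_deployed"] - record["intel_received"]
    return delta.total_seconds() / 3600

lead_times = [itr_hours(r) for r in intel_to_rule]
print(f"median ItR: {median(lead_times):.1f}h, worst: {max(lead_times):.1f}h")
```

Tracking the median and the worst case separately matters: one stuck high-risk item can hide behind a healthy-looking average.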
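For the visibility point, one pragmatic starting place is to parse technique tags straight out of your detection library. The sketch below assumes Sigma-style `attack.tXXXX` tags inside YAML rule files; adjust the pattern and paths to whatever schema your library actually uses.

```python
import re
from collections import Counter
from pathlib import Path

# Assumes a Sigma-style library where rule files carry ATT&CK tags such
# as "attack.t1548" or "attack.t1548.002" -- adapt to your own schema.
TECHNIQUE_TAG = re.compile(r"attack\.(t\d{4}(?:\.\d{3})?)", re.IGNORECASE)

def coverage(rule_dir: str) -> Counter:
    """Count how many rules reference each ATT&CK technique."""
    techniques = Counter()
    for rule_file in Path(rule_dir).glob("**/*.yml"):
        text = rule_file.read_text(encoding="utf-8")
        for technique in TECHNIQUE_TAG.findall(text):
            techniques[technique.upper()] += 1
    return techniques

if __name__ == "__main__":
    for technique, rule_count in coverage("rules/").most_common():
        print(f"{technique}: {rule_count} rule(s)")
```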
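For quantifying operations, a crude but useful first metric is the per-rule false-positive rate against an explicit budget. The alert dispositions and the 50% budget below are invented for illustration; feed in whatever your case-management tool actually exports.

```python
from collections import defaultdict

# Hypothetical alert dispositions; the (rule, verdict) shape is
# illustrative, not a real export format.
alerts = [
    ("suspicious_lolbin_exec", "true_positive"),
    ("suspicious_lolbin_exec", "false_positive"),
    ("dns_tunnel_beacon", "false_positive"),
    ("dns_tunnel_beacon", "false_positive"),
]

verdicts = defaultdict(lambda: {"fp": 0, "total": 0})
for rule, verdict in alerts:
    verdicts[rule]["total"] += 1
    verdicts[rule]["fp"] += (verdict == "false_positive")

FP_BUDGET = 0.5  # acceptable FP rate for a new detection -- pick your own
for rule, stats in verdicts.items():
    rate = stats["fp"] / stats["total"]
    flag = "over budget" if rate > FP_BUDGET else "ok"
    print(f"{rule}: FP rate {rate:.0%} ({flag})")
```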
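And for the automation point, a detection library under version control can enforce its own hygiene in CI. This sketch is a minimal pre-merge gate that fails the pipeline when a rule file lacks required metadata; the key names are placeholders to align with your rule schema.

```python
import sys
from pathlib import Path

# Minimal CI gate: fail the pipeline if a rule lacks required metadata.
# Key names below are placeholders -- align them with your rule schema.
REQUIRED_KEYS = ("title:", "id:", "status:", "tags:")

def lint(rule_dir: str) -> int:
    """Return the number of rule files missing required metadata."""
    failures = 0
    for rule_file in Path(rule_dir).glob("**/*.yml"):
        text = rule_file.read_text(encoding="utf-8")
        missing = [key for key in REQUIRED_KEYS if key not in text]
        if missing:
            failures += 1
            print(f"{rule_file}: missing {', '.join(missing)}")
    return failures

if __name__ == "__main__":
    sys.exit(1 if lint("rules/") else 0)
```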

💎 Where does ATT&CK fit in the DE picture?

While a great model for categorizing adversary tradecraft, and a necessary tool in the detection engineering arsenal, ATT&CK is often overused as a palliative measure to map detections and generate coverage maps that skip the detail (“Do you cover T1548?” “Ehhh… YES?”)

This does not accurately represent SOC detection performance: techniques can be fairly broad (it is the purpose of the taxonomy to normalize specifics), and rule quantity does not equal rule quality (not when the rules drown analysts in false positives).

Detection hints from ATT&CK are also rather generic, since a technique is itself a concept that clusters different procedures together. Thus, while ATT&CK can point at what a SOC needs to develop, it does not give a way to achieve detection objectives, which is the detection engineer’s core concern.
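To see why raw rule counts mislead, consider a toy comparison between naive “rules per technique” counting and a crude quality-weighted view. The rules and quality scores below are entirely invented, and the weighting scheme is just one illustrative option, not a standard.

```python
from collections import defaultdict

# Toy illustration: weight each rule by a crude quality score instead of
# counting it as 1. Rules and scores here are invented for the example.
rules = [
    {"technique": "T1548", "name": "generic_uac_bypass", "quality": 0.2},
    {"technique": "T1548", "name": "sudo_caching_abuse", "quality": 0.7},
    {"technique": "T1059", "name": "tested_psh_downloader", "quality": 0.9},
]

naive, weighted = defaultdict(int), defaultdict(float)
for rule in rules:
    naive[rule["technique"]] += 1
    # Cap weighted coverage at 1.0 so stacking weak rules cannot fake depth.
    weighted[rule["technique"]] = min(
        1.0, weighted[rule["technique"]] + rule["quality"]
    )

for technique in naive:
    print(f"{technique}: {naive[technique]} rule(s), "
          f"weighted coverage {weighted[technique]:.1f}")
```

Here T1548 has twice the rules of T1059 but no more real coverage, which is exactly the gap a naive coverage map hides.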

The journey to Detection Engineering maturity is hard, but you should now have a clearer perspective on how to smooth the path toward building better detections.

But it all starts with quality input: in our next blog post, we’ll look in more detail at what exactly a Detection Engineering team needs from Threat Intelligence to be fully informed, and propose collaborative models. Stay tuned!

UPDATE: the story continues in “Focus Threat Intel Capabilities at Detection Engineering (Part 4)”
