Frontier Lab SOC Benchmark

Inspired by https://lnkd.in/gH8VBSjV and https://lnkd.in/gVeqCatZ, I realized we're missing realistic benchmarks for frontier-lab AI in the SOC, something to use as a baseline for any other SOC+AI work.

So how do we go about that? We need two core things:

  • A generic set of capabilities: a neutral environment anyone can reproduce
  • The frontier models put directly to work, with no hidden black box and the most minimal harness possible

I didn’t want to spend a week wrangling OSS solutions together into a base environment (too much of a moving target), so (surprise surprise) I used LC as the foundation. I think that’s fair: it’s a core set of capabilities anyone can access and replicate (using the community edition), and there isn’t a hint of hidden black-box capabilities. It also means the AI can interface through a single CLI.

Then, on the model side, I used the CLI from the three leading frontier labs.

What you end up with is a set of benchmarks for common SOC activities: GitHub - refractionPOINT/asw-bench (Agentic SecOps Workspace Benchmarks).

I wanted to start with a single solid end-to-end scenario and expand later, so I began with: “here is a detection, investigate it and report on it.”
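To make the shape of that scenario concrete, here is a minimal sketch of how a runner might hand the same detection task to each model CLI through one uniform prompt. The detection fields, CLI names, and flags below are illustrative placeholders I made up, not the benchmark’s actual format or commands:

```python
import json

# Placeholder detection, standing in for whatever the environment emits.
DETECTION = {
    "rule": "suspicious-powershell-download",
    "host": "win10-finance-03",
    "summary": "powershell.exe spawned with an encoded download cradle",
}

def build_prompt(detection: dict) -> str:
    """Wrap a detection in the same fixed task framing for every model."""
    return (
        "Here is a detection from the SOC. Investigate it using the "
        "available CLI tools and write a short incident report.\n\n"
        + json.dumps(detection, indent=2)
    )

def build_command(cli: str, prompt: str) -> list[str]:
    """Assemble a (hypothetical) CLI invocation for one model agent."""
    return [cli, "--prompt", prompt]

prompt = build_prompt(DETECTION)
for cli in ["model-cli-a", "model-cli-b", "model-cli-c"]:
    cmd = build_command(cli, prompt)
    # A real runner would launch the agent here (e.g. subprocess.run(cmd))
    # and capture its transcript and report for scoring; omitted in this sketch.
```

The point of the fixed framing is comparability: every model gets an identical task and the same set of capabilities, so differences in the reports reflect the models, not the harness.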

For me, the conclusion is just how good these top models are at running security operations out of the box. No, I don’t think they’re at the “fully autonomous” level yet.

To me it shows how far you can get with just a good set of capabilities and the frontier models: no secret sauce, no custom models, no hidden harnesses.

I would love feedback, suggestions for future benchmarks, etc.
I also think this could benefit greatly from a more realistic test environment; if there are attacker-simulation companies that make transparent, easily available scenarios, I would love to partner.
