Analyst memo
Research | 1 source | Developing
ITBench-AA Results: AI Models Underperform
The ITBench-AA benchmark reveals that leading AI models score below 50% on Site Reliability Engineering tasks, underscoring challenges for enterprise IT automation.
Published May 28, 2026, 4:16 AM. Updated May 28, 2026, 4:16 AM.
What happened
Artificial Analysis and IBM released ITBench-AA, a benchmark that assesses AI models on SRE tasks. Leading models scored below 50%.
Why it matters
The benchmark highlights how limited current AI models remain on complex IT tasks, which bears directly on how enterprises plan their automation strategies.
Who is affected
Enterprises relying on AI for IT management may need to recalibrate their expectations in light of these results.
Risks / uncertainty
Variability in model performance and cost considerations both pose risks to widespread adoption of AI for IT tasks.