GLM-5/.eval_results/terminal_bench_2.yaml
Add Terminal-Bench 2.0 evaluation result (52.4%) (#64)
73b81c6
- dataset:
    id: harborframework/terminal-bench-2.0
    task_id: terminalbench_2
  value: 52.4
  date: '2026-02-23'
  source:
    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
    name: Terminal-Bench Leaderboard
    user: SaylorTwift
  notes: "agent: Terminus 2"