EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments
Zefang Liu, Yinzhu Quan
arXiv preprint arXiv:2506.08136, 2025
EconWebArena is a benchmark of 360 human-curated tasks from real websites that evaluates autonomous agents’ ability to perform complex, multimodal economic reasoning and web navigation using authoritative sources.