reacher-z/README.md

Hi, I'm Yuxuan Zhang

PhD student at Vector Institute & University of British Columbia · Research on AI agents, LLMs, and RL

Website · Twitter · Google Scholar


  Top Projects

ClawBench — Can AI Agents Complete Everyday Online Tasks?

153 tasks · 144 live websites · 8 categories · Best model: 33.3%

Paper · Dashboard · Dataset · PyPI

VidGround — Watch Before You Answer

Visually grounded post-training for video LLMs.

Paper · HF Paper


  GitHub Activity

GitHub Contribution Graph

GitHub Stats


  News


  Contact

yuxuan.zhang(at)ubc.ca · Google Scholar · GitHub · Twitter · Website

Pinned

  1. ClawBench Public

    Open-source benchmark for browser AI agents on 153 everyday online tasks across 144 live websites. 5-layer recording + DOM-match + LLM judge. Top score 33.3%.

    Python · 61 stars · 7 forks

  2. vidground Public

    Watch Before You Answer: Learning from Visually Grounded Post-Training (arXiv 2604.05117)

    Python · 3 stars