First Run
Run a real environment, task, and reference agent end to end.
The first successful run should prove three things:
- your machine can start an environment
- the environment can reset successfully
- the framework loop can take actions and finish cleanly
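The three checks above can be folded into one small helper. The following is a minimal sketch, not part of the framework: `smoke_test` and `FakeEnv` are hypothetical names introduced here, and the helper only assumes an object with the `reset(seed=...)`, `step(actions, mark_done=...)`, and `close()` calls used on this page. It also assumes that the final `step` reports `done` as true once `mark_done=True` is passed, which may not hold for every environment.

```python
def smoke_test(env, action_batches, seed=42):
    """Reset env, send each batch of actions, mark the episode done, then close.

    Returns a dict recording how far the run got. The environment is closed
    even if reset or a step raises.
    """
    progress = {"reset_ok": False, "steps_taken": 0, "finished": False}
    try:
        env.reset(seed=seed)
        progress["reset_ok"] = True
        for batch in action_batches:
            env.step(batch)
            progress["steps_taken"] += 1
        # Mirror the final call used on this page: no actions, mark_done=True.
        # Assumption: the returned `done` flag is true after this call.
        _, _, done, _ = env.step([], mark_done=True)
        progress["finished"] = bool(done)
    finally:
        env.close()
    return progress


class FakeEnv:
    """Hypothetical stand-in, not part of gym_anything; lets the sketch run anywhere."""

    def __init__(self):
        self.closed = False

    def reset(self, seed=None):
        return {"screenshot": None}

    def step(self, actions, mark_done=False):
        # Returns (obs, reward, done, info), the same tuple shape used on this page.
        return {"screenshot": None}, 0.0, mark_done, {}

    def close(self):
        self.closed = True


if __name__ == "__main__":
    click = [{"mouse": {"left_click": [300, 200]}}]
    print(smoke_test(FakeEnv(), [click]))
```

Swapping `FakeEnv()` for a real environment object should exercise the same path. The `try`/`finally` is there so that a failed reset or step does not leave the environment running.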
Try A Task Directly
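from gym_anything import from_config

# Load the Moodle environment and select the task to run.
env = from_config(
    "benchmarks/cua_world/environments/moodle_env",
    task_id="enroll_student",
)

# Reset with a fixed seed.
obs = env.reset(seed=42)

# Send one action: a left click at [300, 200].
obs, reward, done, info = env.step([
    {"mouse": {"left_click": [300, 200]}},
])

# Send no further actions and mark the episode as done.
obs, reward, done, info = env.step([], mark_done=True)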
env.close()
Try The CLI
gym-anything run moodle --task enroll_student -i
For benchmark environment keys like moodle, the CLI resolves the environment for you.
Try An Example Agent
gym-anything benchmark moodle --task enroll_student --agent ClaudeAgent --model claude-opus-4
What To Check
After a run, check:
- the environment reset completed cleanly
- screenshots were captured into the episode directory
- summary.json exists after the run finishes
- the automatic checker output looks sane for the task you ran
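The file-based checks in this list can be scripted. The sketch below uses only the Python standard library, and `check_episode_dir` is a hypothetical helper, not part of the framework. It rests on two assumptions that this page does not confirm: screenshots are stored as `.png` files somewhere under the episode directory, and `summary.json` sits at the top level of that directory. Adjust both to match what a run actually produces.

```python
import json
from pathlib import Path


def check_episode_dir(episode_dir):
    """Report whether an episode directory contains screenshots and a readable summary."""
    episode = Path(episode_dir)

    # Assumption: screenshots are .png files anywhere under the episode directory.
    screenshot_count = sum(1 for _ in episode.rglob("*.png"))

    # Assumption: summary.json lives directly inside the episode directory.
    summary_path = episode / "summary.json"
    summary_exists = summary_path.is_file()

    summary_parses = False
    if summary_exists:
        try:
            json.loads(summary_path.read_text(encoding="utf-8"))
            summary_parses = True
        except json.JSONDecodeError:
            summary_parses = False

    return {
        "screenshot_count": screenshot_count,
        "summary_exists": summary_exists,
        "summary_parses": summary_parses,
    }
```

Call it with the path to the episode directory from a finished run. A screenshot count of zero or a missing summary suggests the run stopped before it finished. Whether the checker output looks sane still needs a manual read.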
If setup fails before reset, go back to Installation. If the loop behavior is confusing, read Core Overview.