Getting Started

This guide runs one SForge task end to end using the default Docker backend: install SForge, fetch task definitions, pull pre-built task images, start the judge server, run an agent, and inspect the result.

Prerequisites

  • Linux: SForge currently targets Linux hosts.

    WARNING

    The default Docker backend needs direct access to the host Docker daemon. Running SForge itself inside a container introduces Docker-in-Docker issues, so run it directly on the host.

  • Docker Engine: required to pull task images and run work/judge containers. The Docker daemon must be running; verify it with:

    bash
    docker run hello-world

  • Python >= 3.10.
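
The three prerequisites above can be checked in one pass. The following is a minimal sketch, not part of SForge: the helper names `version_at_least` and `check_prereqs` are hypothetical, and it assumes `python3` is the interpreter you will install SForge into. It uses `docker info`, which only queries the daemon, as a lighter alternative to `docker run hello-world`.

```shell
# Hypothetical helpers for checking the prerequisites listed above.

# Succeeds if version "$1" (MAJOR.MINOR[.PATCH]) is at least "$2" (MAJOR.MINOR).
version_at_least() {
  have_major=${1%%.*}; have_minor=${1#*.}; have_minor=${have_minor%%.*}
  want_major=${2%%.*}; want_minor=${2#*.}; want_minor=${want_minor%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Succeeds only if the host is Linux, the Docker daemon answers,
# and python3 reports version 3.10 or newer.
check_prereqs() {
  [ "$(uname -s)" = "Linux" ] || { echo "host is not Linux" >&2; return 1; }
  docker info >/dev/null 2>&1 || { echo "Docker daemon not reachable" >&2; return 1; }
  py=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])') ||
    { echo "python3 not found" >&2; return 1; }
  version_at_least "$py" 3.10 || { echo "Python $py is older than 3.10" >&2; return 1; }
}
```

After sourcing these definitions, `check_prereqs && echo "ready"` prints `ready` only when all three checks pass; otherwise the first failing check is reported on stderr.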

Install SForge

Install the released package:

bash
pip install sforge

Or install from source:

bash
git clone https://github.com/ByteDance-Seed/EdgeBench.git
cd EdgeBench
pip install -e .

Check that the CLI is available:

bash
sforge --help

Run Ad Placement Optimization

Ad Placement Optimization is a C++ optimization task in EdgeBench, used here as a small end-to-end example.

1. Fetch Task Definitions

bash
sforge fetch-tasks edgebench

This downloads the default edgebench task definitions into ./tasks. You can confirm that tasks are visible with:

bash
sforge list

2. Pull Images

bash
sforge pull --task ad_placement_optimization --registry seededge

This pulls the pre-built base, work, and judge images for the task. Image names are derived from the benchmark name, role, task/base key, and content hash, for example:

  • <benchmark>.base.cpp:<hash>
  • <benchmark>.work.ad_placement_optimization:<hash>
  • <benchmark>.judge.ad_placement_optimization:<hash>
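
To make the naming pattern concrete, here is a small sketch that assembles a name from its four parts and lists matching local images. The helper names `image_name` and `list_task_images` are hypothetical, and the pattern is inferred only from the three examples above; how SForge computes the hash, and whether the registry name is prepended, is not covered here.

```shell
# Hypothetical helper: compose an image name as <benchmark>.<role>.<key>:<hash>,
# following the pattern shown in the examples above.
image_name() {
  printf '%s.%s.%s:%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: list local images whose name contains ".<task-key>:".
# This matches the work and judge images for a task; the base image is keyed
# by its base name (e.g. "cpp"), so pass that key to find it instead.
list_task_images() {
  docker images --format '{{.Repository}}:{{.Tag}}' | grep -F ".$1:"
}
```

For example, `list_task_images ad_placement_optimization` should show the work and judge images once the pull has finished, and `list_task_images cpp` the base image.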

3. Start the Judge Server

Open a second terminal:

bash
sforge serve

The judge server listens on 0.0.0.0:8080 by default, that is, on port 8080 of every network interface on the host.
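
Before starting an agent, it can help to confirm that the judge server is accepting connections. The sketch below only tests that the TCP port is open; it does not call any SForge HTTP endpoint, since none are documented here. The function name `wait_for_port` is hypothetical, and the `/dev/tcp` path is a bash feature, so run it under bash.

```shell
# Hypothetical helper: poll until host:port accepts a TCP connection.
# Usage: wait_for_port HOST PORT [TRIES]   (one attempt per second, default 30)
wait_for_port() {
  host=$1; port=$2; tries=${3:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT;
    # the subshell closes it again as soon as it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

In the first terminal, `wait_for_port 127.0.0.1 8080 && echo "judge server is up"` returns as soon as the default port is reachable, or fails after 30 attempts.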

4. Run an Agent

In the first terminal:

bash
SFORGE_AGENT_API_KEY="sk-ant-xxxx" \
sforge run --task ad_placement_optimization --agent claude-code \
  --model claude-opus-4-8 \
  --timeout 7200 \
  --run-id ad-placement-001

The agent works inside the work container, submits code to the judge server, receives test feedback, and iterates until timeout or completion.

View Results

Start the web UI to monitor runs in real time:

bash
sforge visualizer

Then open http://127.0.0.1:8000/.

Run outputs are also written under logs/runs/<run-id>/<task-id>/ if you prefer to inspect the files directly:

bash
tail -f logs/runs/ad-placement-001/ad_placement_optimization/agent_output.txt
cat logs/runs/ad-placement-001/ad_placement_optimization/final_result.json
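
The output path is built from the run ID and the task ID, so a small wrapper saves retyping it. This is a sketch with hypothetical helper names (`run_dir`, `show_result`); it assumes only the directory layout and file name shown above and makes no assumption about the fields inside final_result.json. It pretty-prints the file with Python's standard `json.tool` module.

```shell
# Hypothetical helper: print the output directory for a run/task pair,
# following the logs/runs/<run-id>/<task-id>/ layout described above.
run_dir() {
  printf 'logs/runs/%s/%s\n' "$1" "$2"
}

# Hypothetical helper: pretty-print final_result.json for a run/task pair,
# or fail with a message if the file has not been written yet.
show_result() {
  result="$(run_dir "$1" "$2")/final_result.json"
  [ -f "$result" ] || { echo "no result yet: $result" >&2; return 1; }
  python3 -m json.tool "$result"
}
```

For the run above, call `show_result ad-placement-001 ad_placement_optimization` from the directory where you ran `sforge run`.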

Common Next Steps