HILLCLIMBER
V0.1.0 (pre-release)

An open-source /goal alternative.

Auto-improve your code. Define your goal, budget, and models—hillclimber orchestrates, executes, and monitors the work.
Open-source and harness-agnostic.

bash
$ uv tool install git+https://github.com/oleh-vell/hillclimber.git
$ hillclimber init --interactive
How it works
01
Define

Write a spec file

hillclimber.toml
path_to_artefact = "my_repo/src/"
strategy = "chain"

# Explicit target to climb
[goal]
direction = "maximize"
target = 1.0

# Hard stop: max cycles, tokens or money.
[budget]
cycles = 5

# Eval function
[scorer]
kind = "command"
cmd = "python eval.py"

# kind = "none" to switch the sandbox off.
[sandbox]
kind = "seatbelt"

# Proposes the next hypothesis for improving the artefact.
[agents.orchestrator]
harness = "claude"
model = "claude-opus-4-8"

# Applies the proposed change to the artefact.
[agents.worker]
harness = "claude"
model = "claude-opus-4-8"

The spec file is the core of a long-running experiment. It sets your goal, your budget, and the eval function that scores each change.
To generate the spec and eval files, execute hillclimber init.
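
As an illustration, here is a minimal sketch of what an eval.py behind the command scorer might look like. The page does not say how hillclimber reads the score from the command, so the sketch assumes it takes a single float printed on stdout. The extract_fields function and the labelled cases are hypothetical stand-ins for your own code and data.

```python
"""Hypothetical eval.py: scores an artefact as the fraction of passing cases."""


def extract_fields(text: str) -> dict[str, str]:
    """Stand-in for the code under improvement (hypothetical)."""
    fields = {}
    for part in text.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


# Hypothetical labelled cases: (input, expected output).
CASES = [
    ("name=Ada; year=1843", {"name": "Ada", "year": "1843"}),
    ("name = Alan", {"name": "Alan"}),
    ("", {}),
    ("<b>name</b>=Grace", {"name": "Grace"}),  # fails until markup is stripped
]


def score(cases) -> float:
    """Return the share of cases answered correctly, in [0.0, 1.0]."""
    if not cases:
        return 0.0
    passed = sum(1 for text, expected in cases if extract_fields(text) == expected)
    return passed / len(cases)


if __name__ == "__main__":
    # Assumption: the scorer reads one float from stdout.
    print(f"{score(CASES):.3f}")
```

A score bounded between 0 and 1 lines up with direction = "maximize" and target = 1.0 in the spec above. Keeping the eval deterministic makes the per-cycle deltas comparable.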

02
Execute

Run hillclimber

~/my_repo
$ hillclimber run

# preflight — score the untouched artefact, check models
 baseline 0.712
 models verified
 strategy: chain

# each cycle: propose → apply → score, keep what climbs
 cycle 001: Strip markup before matching field boundaries
 cycle 001 scored 0.781 (+0.069)
 cycle 002: Fuzzy-match malformed date fields
 cycle 002 scored 0.774 (-0.007)
 cycle 003: Normalize unicode before matching fields

# live status — redraws in place, gone when the run ends
 cycle 3/5 — applying the hypothesis           12:47
  baseline 0.712  ·  best 0.781
   Read(file_path='src/extract.py')
   Edit(file_path='src/extract.py', old_string=…)
   tool returned: ok

Hillclimber reads your spec and orchestrates the experiment. Each cycle runs in an isolated git worktree, with a dedicated coding agent and a tight feedback loop.
To start climbing, execute hillclimber run.
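
The propose → apply → score loop from the run output can be sketched as a greedy hill climb. This is a toy model, not hillclimber's actual implementation. The artefact is reduced to a single number, the hill_climb function and its parameters are hypothetical, and it assumes each cycle builds on the best result so far.

```python
from typing import Callable


def hill_climb(
    artefact: float,
    propose: Callable[[float], float],
    score: Callable[[float], float],
    cycles: int,
    target: float,
) -> tuple[float, float, list[float]]:
    """Greedy maximizing climb: keep a candidate only if it beats the best score."""
    best = artefact
    best_score = score(best)  # preflight: score the untouched artefact
    history = [best_score]
    for _ in range(cycles):  # hard stop: the cycle budget
        if best_score >= target:  # goal reached, stop early
            break
        candidate = propose(best)  # propose a change and apply it
        candidate_score = score(candidate)
        history.append(candidate_score)
        if candidate_score > best_score:  # keep what climbs
            best, best_score = candidate, candidate_score
    return best, best_score, history


if __name__ == "__main__":
    # Toy run: three scripted proposals, the second one regresses and is dropped.
    deltas = iter([0.069, -0.007, 0.1])
    _, final_score, scores = hill_climb(
        0.712, lambda x: x + next(deltas), lambda x: x, cycles=3, target=1.0
    )
    print([round(s, 3) for s in scores], round(final_score, 3))
```

In this sketch the budget and the target are the only two stopping conditions, which corresponds to the [budget] and [goal] tables in the spec.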

Why hillclimber
why.md

# Models are great at iteratively improving performance. But without explicit constraints and goals, you risk burning tokens and losing control.

# I built hillclimber to do two things:

1. Force you to be explicit upfront — what you want, and how much you're willing to spend.

2. Leave you free to choose any model provider you like.

You're in control

Explicitly set up the goal, budget, and models.

Free & open-source

It's completely free to use, and you are more than welcome to tweak the source code however you like.

Extendable by design

Architecture supports adding new strategies, harnesses, and sandboxes. Work with what suits you best.

Durable execution

If the agent crashes, you can always run hillclimber continue to resume where you left off.

Coming soon

Use with your harness

Let your harness do all the work and use hillclimber only as the experiment orchestrator.

Coming soon
Start climbing

Point it at your repo.

bash
$ uv tool install git+https://github.com/oleh-vell/hillclimber.git
$ hillclimber init --interactive