V0.1.0 (pre-release)

HILLCLIMBER

An open-source /goal alternative.

Auto-improve your code. Define your goal, budget, and models—hillclimber orchestrates, executes, and monitors the work.
Open-source and harness-agnostic.

bash

$ uv tool install git+https://github.com/oleh-vell/hillclimber.git

$ hillclimber init --interactive

How it works

Define

Write a spec file

hillclimber.toml

path_to_artefact = "my_repo/src/"
strategy = "chain"

# Explicit target to climb
[goal]
direction = "maximize"
target = 1.0

# Hard stop: max cycles, tokens or money.
[budget]
cycles = 5

# Eval function
[scorer]
kind = "command"
cmd = "python eval.py"

# kind = "none" to switch the sandbox off.
[sandbox]
kind = "seatbelt"

# Proposes the next hypothesis for improving the artefact.
[agents.orchestrator]
harness = "claude"
model = "claude-opus-4-8"

# Applies the proposed change to the artefact.
[agents.worker]
harness = "claude"
model = "claude-opus-4-8"

The spec file defines the core of the long-running experiment: the goal to climb toward, the budget that stops the run, and the eval function that scores each change.
To generate the spec and eval files, execute hillclimber init.
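As an illustration of what the scorer command might run, here is a minimal eval.py sketch. It assumes that hillclimber reads a single numeric score from the command's standard output; that contract is a guess and is not confirmed by anything above. The accuracy function and the sample data are hypothetical placeholders.

```python
# eval.py: hypothetical scorer sketch.
# ASSUMPTION: the scorer command reports its result by printing one float
# to stdout. Check what `hillclimber init` generates for the real contract.


def accuracy(predictions, expected):
    """Return the fraction of predictions that exactly match expected values."""
    if not expected:
        return 0.0
    hits = sum(1 for p, e in zip(predictions, expected) if p == e)
    return hits / len(expected)


def main():
    # Placeholder data. A real scorer would run the artefact on a fixed
    # evaluation set and compare its output with known-good answers.
    expected = ["2024-01-05", "2023-11-30", "2022-07-14", "2021-03-09"]
    predictions = ["2024-01-05", "2023-11-30", "2022-07-14", "2021-03-10"]
    score = accuracy(predictions, expected)
    # With direction = "maximize", a higher number means a better artefact.
    print(f"{score:.3f}")


if __name__ == "__main__":
    main()
```

Keeping the evaluation set fixed between cycles matters here: if the data changes from run to run, score differences stop reflecting changes in the artefact.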

Execute

Run hillclimber

~/my_repo

$ hillclimber run

# preflight — score the untouched artefact, check models
✓ baseline 0.712
✓ models verified
✓ strategy: chain

# each cycle: propose → apply → score, keep what climbs
◆ cycle 001: Strip markup before matching field boundaries
▴ cycle 001 scored 0.781 (+0.069)
◆ cycle 002: Fuzzy-match malformed date fields
▾ cycle 002 scored 0.774 (-0.007)
◆ cycle 003: Normalize unicode before matching fields

# live status — redraws in place, gone when the run ends
⠹ cycle 3/5 — applying the hypothesis           12:47
  baseline 0.712  ·  best 0.781
  │ Read(file_path='src/extract.py')
  │ Edit(file_path='src/extract.py', old_string=…)
  │ tool returned: ok

Hillclimber reads your spec and orchestrates the experiment. Each cycle runs in an isolated git worktree, with a dedicated coding agent and a tight feedback loop.
To start climbing, execute hillclimber run.
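The propose → apply → score loop in the log above can be sketched as a generic greedy hill climb. This is not hillclimber's source and may not match what the chain strategy does internally. The names climb, ClimbResult, and propose_and_score are hypothetical, and the callback stands in for the orchestrator, worker, and scorer together.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ClimbResult:
    baseline: float
    best: float
    # One (cycle, score, kept) entry per executed cycle.
    history: List[Tuple[int, float, bool]] = field(default_factory=list)


def climb(
    baseline: float,
    propose_and_score: Callable[[int], float],
    cycles: int,
    target: float,
    maximize: bool = True,
) -> ClimbResult:
    """Greedy hill climb: keep a change only if it beats the best score so far."""
    result = ClimbResult(baseline=baseline, best=baseline)
    for cycle in range(1, cycles + 1):
        # Stop early once the goal is met; otherwise the cycle budget is the hard stop.
        reached = result.best >= target if maximize else result.best <= target
        if reached:
            break
        score = propose_and_score(cycle)
        kept = score > result.best if maximize else score < result.best
        if kept:
            result.best = score
        result.history.append((cycle, score, kept))
    return result


if __name__ == "__main__":
    # Replay the two scored cycles from the log above.
    scores = {1: 0.781, 2: 0.774}
    run = climb(0.712, lambda c: scores[c], cycles=2, target=1.0)
    for cycle, score, kept in run.history:
        print(f"cycle {cycle:03d} scored {score:.3f} kept={kept}")
    print(f"baseline {run.baseline:.3f}  best {run.best:.3f}")
```

In this sketch, cycle 001 is kept because 0.781 beats the 0.712 baseline, and cycle 002 is dropped because 0.774 falls below the current best.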

Why hillclimber

why.md

# Models are great at iteratively improving performance. But without explicit constraints and goals, you risk burning tokens and losing control.

# I built hillclimber to do two things:

1. Force you to be explicit upfront — what you want, and how much you're willing to spend.

2. Leave you free to choose any model provider you like.

You're in control

Explicitly set the goal, budget, and models.

Free & open-source

It's completely free to use, and you are more than welcome to tweak the source code however you like.

Extendable by design

Architecture supports adding new strategies, harnesses, and sandboxes. Work with what suits you best.

Durable execution

If the agent crashes, you can always run hillclimber continue to resume where you left off.

Coming soon

Use with your harness

Let your harness do all the work and use hillclimber only as the experiment orchestrator.

Coming soon

Start climbing

Point it at your repo.

bash

$ uv tool install git+https://github.com/oleh-vell/hillclimber.git

$ hillclimber init --interactive