FunSearch harness (evaluate/register/status)
funsearch
Evaluate your Python programs in a sandbox for the cap_set or online_bin_packing problems. Register scored programs in a MAP-Elites database and retrieve the best variants for iterative evolution.
Instructions
Sandboxed program-search harness (FunSearch). action='evaluate' scores your Python program for the given problem_id ('cap_set' or 'online_bin_packing') in a sandbox with no network access, a timeout, and resource limits (rlimit). action='register' stores a scored program in the MAP-Elites DB. action='status' returns the best programs plus few-shot context for writing the next variant. Use it to evolve programs iteratively: you are the generator, and mathlas is the deterministic scorer. Args: action and problem_id (always), program_src (evaluate/register), score and behavior (register), timeout_s (evaluate), top_k (status).
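The evaluate, register, status cycle described above can be sketched as a single step function. This is a minimal sketch, not the harness's own client: `call` is a hypothetical callable standing in for whatever transport sends the argument dict to the tool and returns its response dict, and the names `evolve_step` and `ToolCall` are illustrative. Only the argument and response field names come from the schemas on this page.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes the tool's argument dict, returns its response dict.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def evolve_step(call: ToolCall, problem_id: str, program_src: str,
                timeout_s: float = 10, top_k: int = 3) -> Dict[str, Any]:
    """Evaluate one candidate, register it if it scored, then fetch status."""
    result = call({
        "action": "evaluate",
        "problem_id": problem_id,
        "program_src": program_src,
        "timeout_s": timeout_s,
    })
    if not result.get("ok"):
        # 'error' is documented as agent-actionable: fix the args and retry.
        return {"registered": False, "error": result.get("error")}

    # Pass back exactly the score and behavior that 'evaluate' returned.
    registration = call({
        "action": "register",
        "problem_id": problem_id,
        "program_src": program_src,
        "score": result["score"],
        "behavior": result["behavior"],
    })

    # 'status' supplies the best programs and few-shot context for the next variant.
    status = call({"action": "status", "problem_id": problem_id, "top_k": top_k})
    return {"registered": bool(registration.get("accepted")), "status": status}
```

A caller would write the next candidate program from `status["few_shot_context"]` and invoke `evolve_step` again with the new source.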
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | 'evaluate' = sandbox-score program_src; 'register' = store a scored program; 'status' = best programs + few-shot context | |
| problem_id | Yes | the problem: 'cap_set' or 'online_bin_packing' | |
| program_src | No | (evaluate/register) the candidate Python program source — YOU write it; it must define the problem's entry point | |
| score | No | (register) the score that action='evaluate' returned | |
| behavior | No | (register) the behaviour descriptor from action='evaluate' (selects the MAP-Elites cell) | |
| timeout_s | No | (evaluate) hard wall-clock timeout in seconds | 10 |
| top_k | No | (status) number of elite programs to include in the few-shot context | 3 |
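To make the per-action arguments concrete, here is a small sketch of a helper that assembles the argument dict before a call. The helper name `build_args` is hypothetical, and its checks are an assumption: the schema marks only `action` and `problem_id` as required, so treating `program_src`, `score`, and `behavior` as mandatory for their actions reflects the descriptions above rather than confirmed harness behavior.

```python
from typing import Any, Dict, Optional

ACTIONS = {"evaluate", "register", "status"}
PROBLEMS = {"cap_set", "online_bin_packing"}


def build_args(action: str, problem_id: str, *,
               program_src: Optional[str] = None,
               score: Any = None,
               behavior: Any = None,
               timeout_s: Optional[float] = None,
               top_k: Optional[int] = None) -> Dict[str, Any]:
    """Assemble tool arguments, keeping only the fields the action uses."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if problem_id not in PROBLEMS:
        raise ValueError(f"unknown problem_id: {problem_id!r}")

    args: Dict[str, Any] = {"action": action, "problem_id": problem_id}

    if action in ("evaluate", "register"):
        if not program_src:
            raise ValueError("program_src is needed for evaluate/register")
        args["program_src"] = program_src

    if action == "register":
        # Assumption: both values come straight from a prior 'evaluate' response.
        if score is None or behavior is None:
            raise ValueError("register needs score and behavior from evaluate")
        args["score"] = score
        args["behavior"] = behavior

    # Optional fields are omitted when unset so the documented defaults apply.
    if action == "evaluate" and timeout_s is not None:
        args["timeout_s"] = timeout_s
    if action == "status" and top_k is not None:
        args["top_k"] = top_k
    return args
```

Leaving `timeout_s` and `top_k` out of the dict when they are not given lets the harness fall back to its defaults of 10 and 3.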
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | No | | |
| problem_id | Yes | | |
| ok | No | (evaluate) program ran + scored | |
| score | No | | |
| behavior | No | | |
| error | No | agent-actionable: what failed and which args to fix | |
| accepted | No | (register) | |
| best_score | No | (status) | |
| best_program | No | (status) | |
| few_shot_context | No | (status) DATA for you to write the next program | |
| note | No | | |
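Since nearly every output field is optional, a caller has to check which ones are present. The sketch below shows one way to read a response defensively. `describe_response` is a hypothetical helper, and it assumes the response echoes the request's `action`, which the schema lists as optional.

```python
from typing import Any, Dict


def describe_response(resp: Dict[str, Any]) -> str:
    """Summarise a response using only fields listed in the output schema."""
    if resp.get("error"):
        # Check 'error' first: it says what failed and which args to fix.
        return f"error: {resp['error']}"

    action = resp.get("action")
    if action == "evaluate":
        return (f"evaluate ok={resp.get('ok')} "
                f"score={resp.get('score')} behavior={resp.get('behavior')}")
    if action == "register":
        return f"register accepted={resp.get('accepted')}"
    if action == "status":
        return f"status best_score={resp.get('best_score')}"

    # 'action' was absent or unrecognised; fall back to the one required field.
    return f"unrecognised response for problem {resp.get('problem_id')}"
```

Using `dict.get` throughout avoids a `KeyError` when the harness omits a field.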