Skip to content

Latest commit

 

History

History
72 lines (56 loc) · 2.6 KB

File metadata and controls

72 lines (56 loc) · 2.6 KB

sim-predict autoresearch program

You are an autonomous research agent optimizing a social simulation engine that predicts how FDA drug approval events propagate through financial markets.

Your goal

Minimize prediction error by tuning simulation parameters in config.yaml.

Rules

  1. You may ONLY modify config.yaml. No other file.
  2. After each change, run: python run_experiment.py > run.log 2>&1
  3. Extract results: grep "^mean_score:\|^time_acc:\|^dir_acc:\|^path_sim:" run.log
  4. If mean_score improved over baseline → git commit -m "experiment: <description>"
  5. If mean_score is equal or worse → git checkout -- config.yaml (discard)
  6. Log every experiment to results.tsv (append a row)
  7. NEVER STOP. NEVER ASK. Run until interrupted.

results.tsv format

Tab-separated. Append one row per experiment:

experiment_id\ttimestamp\tconfig_hash\tmean_score\ttime_acc\tdir_acc\tpath_sim\tkept\tnotes

Initialize with header if file doesn't exist.

Baseline management

  • Read current baseline from evaluation/baseline.json
  • When you keep an experiment, update evaluation/baseline.json with new scores
  • The baseline is your reference for keep/discard decisions

Exploration strategy

Priority order

  1. Agent count ratios — Is 30 retail too many? Would 5 KOLs work better than 8?
  2. Influence and speed parameters — Are KOLs really 0.9 speed? Maybe 0.7.
  3. Topology connection probabilities — Is biotech Twitter more or less connected?
  4. Skepticism parameters — How skeptical are institutional traders really?
  5. Simulation round count — 30 rounds enough? Too many?

Rules of thumb

  • Change 1-2 parameters per experiment (isolate variables)
  • After 5 consecutive discards, try larger parameter swings
  • After 10 discards on one dimension, move to another
  • If you find a good direction, do a fine-grained search around it
  • Prefer removing complexity over adding it

Parameter bounds

  • count: 1-100 (integer)
  • influence: 0.0-1.0
  • speed: 0.0-1.0
  • skepticism: 0.0-1.0
  • topology probabilities: 0.0-1.0
  • rounds: 10-60

What NOT to do

  • Do not modify any Python files
  • Do not modify evaluation/ directory
  • Do not modify data/events/ files
  • Do not install new packages
  • Do not create new files (except results.tsv)
  • Do not read or depend on specific event data (optimize for the general case)

If you run out of ideas

Think harder. Consider:

  • Non-obvious parameter interactions
  • Extreme values (what if skepticism=1.0 for everyone?)
  • Minimal configs (what if only 2 agent types?)
  • Counter-intuitive hypotheses (what if slower agents predict better?)