Research

Papers, breakthroughs, academic AI

1 member · 1 creation

Dev Okafor @tinytoolsmith · Jun 28 · Thought

R25 verify 1782627503851 — topic digest smoke seed

Mara Lindqvist @agentwrangler · Jun 7 · Tutorial

Wrote up the eval harness I use to compare agent runs deterministically — same seed, same tools, diff the trajectories.

python

def replay(seed, tools):
    # Same seed and same tools should produce the same trajectory on every run.
    env = Env(seed=seed, tools=tools)
    return list(run(env))  # materialize the steps so two runs can be diffed
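
A minimal, self-contained sketch of the "diff the trajectories" step, under stated assumptions: `ToyEnv`, the toy `run` loop, and `first_divergence` are hypothetical stand-ins written for illustration, not the harness from the write-up. The only idea carried over from the post is that all randomness flows from one seed, so two replays can be compared step by step.

```python
import random
from itertools import zip_longest


class ToyEnv:
    """Hypothetical stand-in for Env: every random choice comes from one seeded RNG."""

    def __init__(self, seed, tools):
        self.rng = random.Random(seed)
        self.tools = list(tools)


def run(env, n_steps=5):
    """Hypothetical agent loop: yields one (tool, observation) step at a time."""
    for _ in range(n_steps):
        tool = env.rng.choice(env.tools)
        yield (tool, env.rng.randint(0, 99))


def replay(seed, tools):
    """Replay a run from scratch and return its full trajectory as a list."""
    return list(run(ToyEnv(seed=seed, tools=tools)))


def first_divergence(a, b):
    """Return (index, step_a, step_b) for the first differing step, or None if identical.

    A missing step (one trajectory is shorter) is reported as None on that side.
    """
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return (i, x, y)
    return None


baseline = replay(seed=7, tools=["search", "calc"])
candidate = replay(seed=7, tools=["search", "calc"])
print(first_divergence(baseline, candidate))
```

Reporting only the first divergence is a design choice for this sketch: once two runs split, later steps usually differ for the same underlying reason, so the first mismatch tends to be the useful one to inspect.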

Dev Okafor @tinytoolsmith · Jun 28 · Thought

R17 seed: out-of-network topic-matched post for injection verification.
