Import
The import command converts agent session transcripts and external eval configs into AgentV formats. Transcript imports let you grade past runs offline without re-running the agent. Config imports help migrate existing suites into AgentV YAML.
Supported Providers
| Provider | Command | Source |
|---|---|---|
| Claude Code | agentv import claude | ~/.claude/projects/<path>/<uuid>.jsonl |
| Codex CLI | agentv import codex | ~/.codex/sessions/<YYYY>/<MM>/<DD>/rollout-*.jsonl |
| Copilot CLI | agentv import copilot | ~/.copilot/session-state/<uuid>/events.jsonl |
| promptfoo | agentv import promptfoo | promptfooconfig.yaml, .json, .json5 |
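The source locations in the table can also be explored directly. The sketch below is not AgentV's discovery code: it assumes only the directory layouts shown in the Source column, and the helper name `find_sessions` and both dictionaries are hypothetical. It lists candidate transcript files for one provider, newest first.

```python
from pathlib import Path

# Glob patterns relative to each tool's base directory, derived from the
# "Source" column above. Names here are illustrative, not AgentV internals.
SESSION_GLOBS = {
    "claude": "*/*.jsonl",              # <path>/<uuid>.jsonl
    "codex": "*/*/*/rollout-*.jsonl",   # <YYYY>/<MM>/<DD>/rollout-*.jsonl
    "copilot": "*/events.jsonl",        # <uuid>/events.jsonl
}

DEFAULT_DIRS = {
    "claude": Path.home() / ".claude" / "projects",
    "codex": Path.home() / ".codex" / "sessions",
    "copilot": Path.home() / ".copilot" / "session-state",
}


def find_sessions(provider, base_dir=None):
    """Return candidate transcript files for a provider, newest first."""
    root = Path(base_dir) if base_dir is not None else DEFAULT_DIRS[provider]
    if not root.is_dir():
        return []
    files = [p for p in root.glob(SESSION_GLOBS[provider]) if p.is_file()]
    # Sort by modification time so the most recent session comes first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

Passing `base_dir` plays the same role as the directory override flags listed under Options.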
import promptfoo
Convert a promptfoo config into an AgentV EVAL.yaml.

    agentv import promptfoo ./promptfooconfig.yaml

Dry run
Print the generated AgentV YAML without writing a file:

    agentv import promptfoo ./promptfooconfig.yaml --dry-run

Custom output path

    agentv import promptfoo ./promptfooconfig.yaml -o ./evals/EVAL.yaml

Default output: EVAL.yaml beside the promptfoo config file.
What v1 converts cleanly
- inline prompts and file-backed text / chat JSON prompts
- inline tests and external YAML / JSON / JSONL / CSV test files
- defaultTest.assert promoted to suite-level assertions
- per-test vars, description, threshold, metadata, prompt filters, and provider filters
- deterministic assertions that map directly to AgentV: equals, contains, icontains, regex, starts-with, ends-with, contains-any, contains-all, icontains-any, icontains-all, is-json, latency, cost
- rubric-style assertions mapped to llm-grader: llm-rubric, g-eval, factuality, context-faithfulness, context-recall
What still needs manual migration
The importer fails explicitly instead of doing a lossy conversion when it sees promptfoo features that need a runtime translation layer or AgentV-specific redesign. Current examples:
- javascript, python, similar, assert-set, contains-json, trajectory assertions, and other non-direct assertion types
- CSV/XLSX features beyond the common __expected* / __description / __threshold / __metadata:* columns
- prompt or test generators, executable prompts, options.transform, options.transformVars, file-backed vars, and providerOutput
If the import stops on one of these, keep the generated config for the supported parts and migrate the flagged feature manually.
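Before running the importer, it can help to check which assertion types in a promptfoo config fall into which group. The following is a minimal sketch built only from the two lists above; the function name `triage` is hypothetical, and anything not named in the lists (including negated `not-` variants, which the lists do not mention) is conservatively treated as needing manual migration.

```python
# Assertion types listed above as direct, deterministic mappings.
DIRECT = {
    "equals", "contains", "icontains", "regex", "starts-with", "ends-with",
    "contains-any", "contains-all", "icontains-any", "icontains-all",
    "is-json", "latency", "cost",
}

# Rubric-style types listed above as mapping to llm-grader.
RUBRIC = {
    "llm-rubric", "g-eval", "factuality",
    "context-faithfulness", "context-recall",
}


def triage(assertion_types):
    """Group promptfoo assertion types by how the lists above treat them."""
    groups = {"direct": [], "llm-grader": [], "manual": []}
    for kind in assertion_types:
        if kind in DIRECT:
            groups["direct"].append(kind)
        elif kind in RUBRIC:
            groups["llm-grader"].append(kind)
        else:
            # e.g. javascript, python, similar, assert-set, contains-json
            groups["manual"].append(kind)
    return groups
```

A non-empty `manual` group suggests the import is likely to stop and flag those assertions.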
import claude
Import a Claude Code session transcript.
List available sessions

    agentv import claude --list

Output:

    Found 5 session(s):
      4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22  2m ago  -home-user-myproject
      087b801a-7a63-48ff-b348-62563a290b23  1h ago  -home-user-myproject
      ed8b8c62-4414-49fb-8739-006d809c8588  3h ago  -home-user-other-project

Import a specific session

    agentv import claude --session-id 4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22

Filter by project path

    agentv import claude --list --project-path /home/user/myproject

Custom output path

    agentv import claude --session-id <uuid> -o transcripts/my-session.jsonl

Default output: .agentv/transcripts/claude-<session-id-short>.jsonl
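The naming conventions in this section can be sketched as two small helpers. Both are assumptions inferred from the examples on this page rather than AgentV's actual code: the project label appears to be the project path with `/` replaced by `-`, and the short session ID appears to be the first group of the UUID (the Workflow section uses `claude-4c4f9e4e.jsonl`). The function names are hypothetical.

```python
from pathlib import Path


def encode_project_path(project_path):
    """Mirror the listing above: /home/user/myproject -> -home-user-myproject."""
    # Assumption: only "/" is replaced; other characters are left untouched.
    return project_path.replace("/", "-")


def default_claude_transcript_path(session_id):
    """Build .agentv/transcripts/claude-<session-id-short>.jsonl."""
    # Assumption: the short ID is the first hyphen-separated group of the UUID.
    short_id = session_id.split("-")[0]
    return Path(".agentv") / "transcripts" / f"claude-{short_id}.jsonl"
```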
import codex
Import a Codex CLI session transcript.
List available sessions

    agentv import codex --list

Import a specific session

    agentv import codex --session-id 019d5cff-9f02-7bc3-8f98-2071ba17ef0e

import copilot
Import a Copilot CLI session transcript.
List available sessions

    agentv import copilot --list

Import a specific session

    agentv import copilot --session-id 9ca6d90c-1d80-40d1-b805-c59ee31fc007

Options
The three transcript providers (claude, codex, and copilot) share the same core flags:
| Flag | Description |
|---|---|
| --session-id <uuid> | Import a specific session by UUID |
| --list | List available sessions instead of importing |
| --output, -o <path> | Custom output file path |
Provider-specific flags:
| Flag | Provider | Description |
|---|---|---|
| --project-path <path> | Claude | Filter sessions by project path |
| --projects-dir <dir> | Claude | Override ~/.claude/projects directory |
| --date <YYYY-MM-DD> | Codex | Filter sessions by date |
| --sessions-dir <dir> | Codex | Override ~/.codex/sessions directory |
| --session-state-dir <dir> | Copilot | Override ~/.copilot/session-state directory |
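When scripting imports, the flag tables above can be encoded so that a provider-specific flag is never passed to the wrong provider. This is a minimal sketch that only assembles an argument list from the documented flags; the helper name `build_import_argv` and its keyword names are hypothetical, and it does not run the CLI itself.

```python
# Provider-specific flags, copied from the table above.
PROVIDER_FLAGS = {
    "claude": {"project_path": "--project-path", "projects_dir": "--projects-dir"},
    "codex": {"date": "--date", "sessions_dir": "--sessions-dir"},
    "copilot": {"session_state_dir": "--session-state-dir"},
}


def build_import_argv(provider, session_id=None, list_sessions=False,
                      output=None, **extra):
    """Assemble an `agentv import <provider>` command line as a list."""
    if provider not in PROVIDER_FLAGS:
        raise ValueError(f"unknown transcript provider: {provider!r}")
    argv = ["agentv", "import", provider]
    if list_sessions:
        argv.append("--list")
    if session_id is not None:
        argv += ["--session-id", session_id]
    if output is not None:
        argv += ["--output", output]
    for name, value in extra.items():
        flag = PROVIDER_FLAGS[provider].get(name)
        if flag is None:
            # Reject flags the table does not list for this provider.
            raise ValueError(f"{name!r} is not a documented flag for {provider}")
        argv += [flag, str(value)]
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a machine where agentv is installed.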
Output Format
The imported transcript is written as JSONL, one Message object per line:

    {"role":"user","content":"Fix the bug in auth.ts"}
    {"role":"assistant","content":"I'll fix the authentication bug.","toolCalls":[{"tool":"Read","input":{"file_path":"src/auth.ts"},"id":"toolu_01...","output":"...file contents..."}]}

Each message follows AgentV’s standard Message interface with role, content, and optional toolCalls (including tool outputs paired from subsequent events).
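Because each line is a standalone JSON object, an imported transcript can be inspected with a few lines of Python. The sketch below assumes only the fields shown in the sample (role, content, toolCalls with a tool name); the helper names `load_transcript` and `tool_names` are hypothetical.

```python
import json


def load_transcript(path):
    """Read a JSONL transcript into a list of message dicts."""
    messages = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                messages.append(json.loads(line))
    return messages


def tool_names(messages):
    """List the tools called across all messages, in order."""
    return [
        call["tool"]
        for message in messages
        for call in (message.get("toolCalls") or [])
    ]
```

For the two-line sample above, `tool_names` would return a single entry, `Read`.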
What Gets Parsed
| Claude Event | AgentV Message |
|---|---|
| user | { role: 'user', content } |
| assistant | { role: 'assistant', content, toolCalls } |
| tool_use blocks | ToolCall { tool, input, id } |
| tool_result blocks | Paired with matching tool_use by ID |
| progress, system | Skipped |
| Subagent events | Filtered out (v1) |
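The mapping in the table can be illustrated with a small converter. This is a sketch under stated assumptions, not AgentV's parser: it assumes each event is a dict with a `type`, a `message.content` that is either a string or a list of blocks, tool results that reference their call through `tool_use_id`, and an `isSidechain` marker on subagent events. Those field names and the function names are assumptions for illustration.

```python
def text_of(content):
    """Flatten string or block-list content into plain text."""
    if isinstance(content, str):
        return content
    return "".join(b.get("text", "") for b in content if b.get("type") == "text")


def to_messages(events):
    """Convert raw session events into role/content/toolCalls messages."""
    messages = []
    pending = {}  # tool_use id -> ToolCall dict still waiting for its result
    for event in events:
        if event.get("type") not in ("user", "assistant"):
            continue  # progress, system, and other event types are skipped
        if event.get("isSidechain"):
            continue  # assumed marker for subagent events, which are filtered out
        content = event["message"]["content"]
        blocks = content if isinstance(content, list) else []
        # Attach each tool result to the earlier call with the same ID.
        results = [b for b in blocks if b.get("type") == "tool_result"]
        for block in results:
            call = pending.pop(block.get("tool_use_id"), None)
            if call is not None:
                call["output"] = block.get("content")
        if event["type"] == "assistant":
            calls = []
            for block in blocks:
                if block.get("type") == "tool_use":
                    call = {"tool": block["name"], "input": block["input"], "id": block["id"]}
                    pending[block["id"]] = call
                    calls.append(call)
            message = {"role": "assistant", "content": text_of(content)}
            if calls:
                message["toolCalls"] = calls
            messages.append(message)
        elif not results:
            # User events that carry tool results are not emitted as new turns.
            messages.append({"role": "user", "content": text_of(content)})
    return messages
```

Keeping unresolved calls in a dictionary keyed by ID lets a result arrive any number of events after its call.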
Token usage is aggregated from the final cumulative value per LLM request. Duration is computed from first-to-last event timestamp.
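The two aggregation rules can be sketched as follows. The event field names (`requestId`, `usage`, `input_tokens`, `output_tokens`, `timestamp`) and the function names are assumptions made for illustration; the source does not specify them. The idea is to keep only the last usage value seen for each request, sum across requests, and take the span between the earliest and latest timestamps.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize(events):
    """Aggregate token usage per request and compute session duration."""
    last_usage = {}  # request id -> most recent cumulative usage
    for event in events:
        usage = event.get("usage")
        if usage is not None and "requestId" in event:
            # Later values overwrite earlier ones, so the final one wins.
            last_usage[event["requestId"]] = usage
    tokens = {"input": 0, "output": 0}
    for usage in last_usage.values():
        tokens["input"] += usage.get("input_tokens", 0)
        tokens["output"] += usage.get("output_tokens", 0)
    stamps = [parse_ts(e["timestamp"]) for e in events if "timestamp" in e]
    # min/max gives the first-to-last span even if events are out of order.
    duration_ms = round((max(stamps) - min(stamps)).total_seconds() * 1000) if stamps else 0
    return {"tokens": tokens, "duration_ms": duration_ms}
```

Summing every intermediate usage value instead would double count streamed updates for the same request, which is why only the final cumulative value is kept.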
Workflow
Import a session, then run graders against it:

    # 1. List sessions and pick one
    agentv import claude --list

    # 2. Import a session by ID
    agentv import claude --session-id 4c4f9e4e-e6f1-490b-a1b1-9aef543ebf22

    # 3. Run graders against the imported transcript
    agentv eval evals/my-eval.yaml --transcript .agentv/transcripts/claude-4c4f9e4e.jsonl

See examples/features/import-claude/ for a complete working example.
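To automate the step from listing to importing, the `--list` output can be parsed into records. This is a sketch that assumes the human-readable layout shown in the Claude listing example (UUID, relative age, project label, separated by whitespace); that layout is not documented as stable and may differ for other providers, and the function name `parse_session_list` is hypothetical.

```python
import re

# One session per line: <uuid> <age, e.g. "2m ago"> <project label>
SESSION_LINE = re.compile(
    r"^\s*([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"\s+(.+?)\s+(\S+)\s*$"
)


def parse_session_list(text):
    """Extract session records from `agentv import claude --list` output."""
    sessions = []
    for line in text.splitlines():
        match = SESSION_LINE.match(line)
        if match:  # header lines such as "Found 5 session(s):" do not match
            session_id, age, project = match.groups()
            sessions.append({"id": session_id, "age": age, "project": project})
    return sessions
```

The first record's `id` can then be passed to `--session-id` in step 2.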
HuggingFace Datasets (SWE-bench)
Use scripts/import-huggingface.py to convert HuggingFace benchmark datasets into AgentV eval files. It currently supports SWE-bench-style datasets.

    uv run scripts/import-huggingface.py \
      --repo SWE-bench/SWE-bench_Verified \
      --split test \
      --limit 10 \
      --output evals/swebench/

Each instance becomes an EVAL.yaml with:
- input: the problem statement
- workspace.docker.image: the pre-built SWE-bench Docker image (ghcr.io/epoch-research/swe-bench.eval.x86_64.<instance_id>:latest)
- workspace.repos[].checkout.base_commit: the commit to reset to before the agent runs
- assertions: code-grader tasks that run FAIL_TO_PASS and PASS_TO_PASS pytest suites inside the container
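The field mapping above can be pictured as a transformation of one SWE-bench record. The sketch below is not the import script: it assumes the usual SWE-bench column names (`instance_id`, `problem_statement`, `base_commit`, `FAIL_TO_PASS`, `PASS_TO_PASS`), and the shape of each `assertions` entry is hypothetical because the code-grader schema is not shown here.

```python
import json

IMAGE_TEMPLATE = "ghcr.io/epoch-research/swe-bench.eval.x86_64.{instance_id}:latest"


def as_list(value):
    """Accept test lists given as JSON-encoded strings or as real lists."""
    return json.loads(value) if isinstance(value, str) else list(value)


def instance_to_eval(instance):
    """Build a dict mirroring the EVAL.yaml fields described above."""
    return {
        "input": instance["problem_statement"],
        "workspace": {
            "docker": {
                "image": IMAGE_TEMPLATE.format(instance_id=instance["instance_id"]),
            },
            "repos": [{"checkout": {"base_commit": instance["base_commit"]}}],
        },
        # Hypothetical entry shape: one code-grader task per test list.
        "assertions": [
            {"type": "code-grader", "name": name, "tests": as_list(instance[name])}
            for name in ("FAIL_TO_PASS", "PASS_TO_PASS")
        ],
    }
```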
Run an imported SWE-bench eval against any coding agent target:
    # Import one instance
    uv run scripts/import-huggingface.py \
      --repo SWE-bench/SWE-bench_Verified \
      --limit 1 \
      --output /tmp/swebench-eval/

    # Run with a coding agent target
    agentv eval /tmp/swebench-eval/*.EVAL.yaml --target codex

The Docker workspace spins up the pre-built SWE-bench image, checks out base_commit, runs the agent to apply a patch, then grades by running the test suite inside the container.