OpenDFM/Xcientist

Externalizing Research Synthesis and Validation in AI Scientists through a Research Harness

Project website 2606.18874 Python 3.12

Xcientist system overview

Xcientist is a multi-agent research workflow for turning a topic into survey artifacts, structured ideas, executable experiments, and technical blog articles. The repository currently centers on four agent stacks:

  • Survey Agent: collects papers, builds topic clusters, and writes survey outputs.
  • Idea Agent (LigAgent): turns a topic or seed idea into a research proposal with survey-grounded retrieval, graph-backed references, and Memory-Guided MCTS.
  • Experiment Agent (SuperAgent): prepares a workspace, generates code, runs experiments, and integrates iteration reports.
  • Blog Agent: reads an experiment workspace and writes a technical blog article with generated figures and quality checks.

The repo also contains a prototype loop runner for Survey -> Idea -> Experiment -> Blog, shared configuration, and a reusable memory subsystem.

🗂️ Repository Map

Xcientist/
├── README.md / README_CN.md       # documentation
├── pyproject.toml / uv.lock        # Python package metadata and locked uv environment
├── requirements.txt                # compatibility dependency list
├── run_survey.sh                   # wrapper for `xcientist survey`
├── run_idea.sh                     # wrapper for `xcientist idea`
├── run_experiment.sh               # wrapper for `xcientist experiment`
├── run_blog.sh                     # wrapper for `xcientist blog`
├── run_pipeline.sh                 # wrapper for `xcientist pipeline`
├── scripts/                        # setup and environment helper scripts
│   ├── install_base.sh
│   ├── install_heavy.sh
│   ├── install_mcp_wrappers.sh
│   └── sync_claude_anthropic_env.py
├── src/
│   ├── __main__.py                 # `python -m src` entrypoint
│   ├── cli.py                      # unified `xcientist` CLI
│   ├── config/
│   │   ├── __init__.py             # unified config loader
│   │   └── default.yaml            # main project config
│   ├── pipeline/                   # Survey -> Idea -> Experiment -> Blog loop
│   ├── agents/
│   │   ├── survey_agent/           # paper retrieval, clustering, survey generation
│   │   ├── idea_agent/             # LigAgent proposal generation
│   │   ├── experiment_agent/       # SuperAgent experiment orchestration
│   │   └── blog_agent/             # technical blog generation
│   └── memory/                     # shared vector/symbolic memory APIs
├── graph/                          # graph retrieval service and indexing scripts
├── database/                       # local caches used by retrieval workflows
├── assets/                         # project images and static assets
└── workspace/                      # default runtime workspace, created/used locally

🔄 How The Pieces Fit Together

Topic
  -> Survey Agent
     output: survey.md + survey.json
  -> Idea Agent
     output: idea_result.json
  -> Experiment Agent
     output: workspace, results, ablation_results.json
  -> Blog Agent
     output: blog workspace, article draft, generated figures

The pipeline runner in src/pipeline/run_loop.py automates the full Survey -> Idea -> Experiment -> Blog flow, but the individual agents remain the clearest way to operate and debug the system.
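The flow above can be sketched as a plan of CLI invocations, one per stage. This is a minimal illustration, not the logic in src/pipeline/run_loop.py: the helper name `stage_commands` is hypothetical, it only assembles flags that appear in this README, and it does not execute anything.

```python
from __future__ import annotations

import shlex


def stage_commands(topic: str, experiment: str, idea_json: str) -> list[list[str]]:
    """Return one argv list per stage, in Survey -> Idea -> Experiment -> Blog order.

    Hypothetical helper: it only assembles commands documented in this README
    and never runs them.
    """
    return [
        ["xcientist", "survey", "--topic", topic],
        ["xcientist", "idea", "--topic", topic],
        ["xcientist", "experiment", "--experiment", experiment, "--idea-json", idea_json],
        ["xcientist", "blog", "--experiment", experiment],
    ]


def render(plan: list[list[str]]) -> str:
    """Format the plan as shell-quoted lines so it can be reviewed before running."""
    return "\n".join(shlex.join(argv) for argv in plan)
```

Printing `render(stage_commands("Training-Free Memory System for LLM Agents", "my_exp", "/abs/path/to/idea_result.json"))` gives four lines that can be run one at a time, which matches the advice above to operate and debug the agents individually.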

✅ Prerequisites

  • uv
  • Python 3.12
  • node and npx for Experiment Agent MCP servers
  • API keys depending on which agent you run
  • Local assets for graph-backed retrieval and memory-enabled workflows
    • Paper-Graph resources: download them from the resource download link and put them into <repo_root>/data/processed.
    • Embedding model download:
    mkdir -p models/bge-m3
    mkdir -p models/all-MiniLM-L6-v2
    modelscope download --model baai/bge-m3 --local_dir <repo_root>/models/bge-m3
    modelscope download --model sentence-transformers/all-MiniLM-L6-v2 --local_dir <repo_root>/models/all-MiniLM-L6-v2
    

⚙️ Installation

The default setup path is now uv.

git clone --depth 1 https://github.com/OpenDFM/Xcientist.git
cd Xcientist
uv sync
source .venv/bin/activate
cp .env.example .env
xcientist doctor

Common group combinations:

# Base CLI / config / API-only workflows
uv sync

# Memory-enabled and local-model workflows
uv sync --group memory --group ml

# PDF parsing stack
uv sync --group pdf

# Blog Agent full workflow: PDF parsing + image generation / OCR / text removal
uv sync --group pdf --group blog

# Full local environment
uv sync --all-groups

If you want local MCP wrapper scripts for Experiment Agent:

xcientist install-mcp-wrappers

environment.yml is still available as a legacy/full-environment fallback, but uv sync is the primary path for Survey + Idea + Experiment + Blog + Pipeline. The dependency layout is now split so the default install stays lightweight and heavy local-model / PDF stacks are opt-in. After activation, the project exposes CLI entrypoints such as xcientist, xcientist-survey, and xcientist-idea directly in the shell.

🔐 Environment Variables

Different agents read slightly different variables. In practice, these are the most useful ones to define:

export OPENAI_API_KEY=...
export OPENAI_BASE_URL=...
export SEMANTIC_SCHOLAR_API_KEY=...
export ANTHROPIC_API_KEY=...
export ANTHROPIC_BASE_URL=...
export SERPER_API_KEY=...
export GITHUB_AI_TOKEN=...
export JINA_API_KEY=...
export TAVILY_API_KEY=...
export HF_TOKEN=...

Notes:

  • Set both OPENAI_API_BASE and OPENAI_BASE_URL if you use a custom OpenAI-compatible endpoint.
  • The CLI loads repo-root .env first and still falls back to src/config/.env for older setups.
  • src/config/default.yaml is the main configuration file for the current unified workflow.
  • Survey, Idea, Experiment, and Blog still have some agent-specific conventions on top of the unified config.
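The .env lookup order and a quick variable check can be sketched as follows. This is an illustration of the behavior described in the notes, not the CLI's actual loader; the function names are hypothetical, and the list of variables to check is supplied by the caller because this README does not say which ones each agent requires.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def resolve_env_file(repo_root: Path) -> Path | None:
    """Return the repo-root .env if present, else the legacy src/config/.env."""
    for candidate in (repo_root / ".env", repo_root / "src" / "config" / ".env"):
        if candidate.is_file():
            return candidate
    return None


def missing_variables(
    names: list[str], environ: Mapping[str, str] = os.environ
) -> list[str]:
    """List the variables that are unset or empty in the given environment."""
    return [name for name in names if not environ.get(name)]
```

For a custom OpenAI-compatible endpoint, `missing_variables(["OPENAI_API_KEY", "OPENAI_API_BASE", "OPENAI_BASE_URL"])` returns whichever of the three still need to be set.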

📦 Optional Local Assets

Some retrieval-heavy paths expect local assets that are not stored in the repository:

  • data/processed/graph.db
  • data/processed/core_component_summary_vector_store/
  • models/bge-m3/
  • models/all-MiniLM-L6-v2/
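A small pre-flight check for the paths listed above might look like the sketch below. The helper name `missing_assets` is hypothetical, and what `xcientist doctor` itself verifies is not specified in this README, so treat this as a separate convenience check.

```python
from pathlib import Path

# Paths taken from the list above, relative to the repository root.
EXPECTED_ASSETS = (
    "data/processed/graph.db",
    "data/processed/core_component_summary_vector_store",
    "models/bge-m3",
    "models/all-MiniLM-L6-v2",
)


def missing_assets(repo_root: Path) -> list[str]:
    """Return the expected asset paths that do not exist under repo_root."""
    return [rel for rel in EXPECTED_ASSETS if not (repo_root / rel).exists()]
```

The check only tests existence. An empty models/bge-m3 directory created by `mkdir -p` passes, so it does not prove that a download finished.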

If you use graph-backed retrieval, start the graph service from the repository root:

uvicorn graph.server:app --host 127.0.0.1 --port 8000

Health check:

curl http://127.0.0.1:8000/health
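The same health check can be done from Python before launching a retrieval-heavy run. This is a sketch using only the standard library; the function name is hypothetical, and because the response body of /health is not documented here, only the HTTP status is inspected.

```python
import http.client
import urllib.request


def graph_service_healthy(
    url: str = "http://127.0.0.1:8000/health", timeout: float = 2.0
) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    # An empty ProxyHandler keeps a localhost check from being routed through
    # any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and non-2xx responses all land here.
        return False
```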

🚀 Quick Start

Recommended first-time flow:

uv sync --group memory --group ml
source .venv/bin/activate
cp .env.example .env
xcientist doctor

If doctor passes and your local assets are in place, use the commands below.

Fastest Path

The commands below use the provided "Training-Free Memory System for LLM Agents" example.

Generate a survey only:

xcientist survey --topic "Training-Free Memory System for LLM Agents"

Run ideation from the provided sample survey:

xcientist idea --topic "Training-Free Memory System for LLM Agents"

Run experiment from the provided sample idea:

xcientist experiment --experiment agent_memory --idea-json <repo_root>/src/agents/idea_agent/example/idea_result.json

Start blog generation from the sample experiment workspace:

xcientist blog --experiment agent_memory --source-workspace <repo_root>/workspace/training-free-memory-example

For further configuration changes, edit src/config/default.yaml.

1. Run Survey Agent

Primary entrypoint:

xcientist survey

Override the topic directly:

xcientist survey --topic <your_topic_name>

Typical outputs:

  • src/agents/survey_agent/outputs/.../survey.md
  • src/agents/survey_agent/outputs/.../survey.json
  • src/agents/survey_agent/outputs/.../evaluation.txt

2. Run Idea Agent

Primary entrypoint:

xcientist idea

Override the topic directly:

xcientist idea --topic <your_topic_name>

The default run uses src/config/default.yaml, materializes a run directory under src/agents/idea_agent/runs/, and writes idea_result.json plus logs.
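Since the experiment command takes the idea file by path, a helper can locate the newest idea_result.json and build that invocation. This is a sketch under assumptions: the layout inside runs/ is not specified here, so the search is recursive, and both function names are hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def latest_idea_result(runs_dir: Path) -> Path | None:
    """Return the most recently modified idea_result.json under runs_dir, if any."""
    candidates = list(runs_dir.rglob("idea_result.json"))
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


def experiment_command(experiment: str, idea_json: Path) -> list[str]:
    """Build the documented experiment invocation with an absolute idea path."""
    return [
        "xcientist", "experiment",
        "--experiment", experiment,
        "--idea-json", str(idea_json.resolve()),
    ]
```

Calling `latest_idea_result(Path("src/agents/idea_agent/runs"))` from the repository root and passing the result to `experiment_command("my_exp", ...)` yields an argv list suitable for `subprocess.run`.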

3. Run Experiment Agent

Primary entrypoint:

xcientist experiment --experiment my_exp --idea-json /abs/path/to/idea_result.json

Prepare only:

xcientist experiment --experiment my_exp --idea-json /abs/path/to/idea_result.json --prepare-only

Direct entrypoint:

python -m src.agents.experiment_agent.main --experiment my_exp --resume --verbose

Key workspace outputs live under workspace/<experiment_id>/ by default and usually include:

  • idea.json
  • project/
  • dataset_candidate/
  • results/
  • agent_reports/
  • ablation_results.json

4. Run Blog Agent

Blog Agent generates a technical blog article from an existing experiment workspace.

Recommended entrypoint:

xcientist blog --experiment my_exp

With the default workspace root at <repo_root>/workspace, this reads the source experiment from:

<repo_root>/workspace/my_exp

You can also pass that experiment workspace explicitly:

xcientist blog --experiment my_exp --source-workspace <repo_root>/workspace/my_exp

If the experiment workspace is not under the blog agent's default source path, pass it explicitly:

xcientist blog --experiment my_exp --source-workspace /abs/path/to/experiment_workspace
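The choice of source workspace can be sketched as a small resolution function. The precedence shown here (explicit path, then the BLOG_AGENT_SOURCE_WORKSPACE variable mentioned in the Configuration Guide, then the default location) is an assumption for illustration and is not confirmed by the project; the function name is hypothetical.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def resolve_source_workspace(
    experiment: str,
    repo_root: Path,
    explicit: str | None = None,
    environ: Mapping[str, str] = os.environ,
) -> Path:
    """Pick the experiment workspace a blog run would read from.

    Assumed precedence (not confirmed by the project): explicit path,
    then BLOG_AGENT_SOURCE_WORKSPACE, then <repo_root>/workspace/<experiment>.
    """
    if explicit:
        return Path(explicit)
    from_env = environ.get("BLOG_AGENT_SOURCE_WORKSPACE")
    if from_env:
        return Path(from_env)
    return repo_root / "workspace" / experiment
```

Checking that the returned path exists before starting a blog run is a cheap way to catch a mistyped experiment name.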

Resume an existing blog workspace:

xcientist blog --experiment my_exp --resume

./run_blog.sh remains available as a compatibility wrapper and delegates to the same xcientist blog command.

5. Run The Prototype Pipeline

Run the integrated loop with the topic from src/config/default.yaml:

xcientist pipeline

Override the research topic at launch time:

xcientist pipeline --topic "Training-Free Memory System for LLM Agents"

Use a custom config file:

xcientist pipeline --config /abs/path/to/config.yaml --topic "Your Research Topic"

🧭 Configuration Guide

The current configuration layout is mixed by design:

| Area | Primary source |
| --- | --- |
| Global project config | src/config/default.yaml |
| Survey Agent | survey: block in src/config/default.yaml plus src/agents/survey_agent/config/*.yaml |
| Idea Agent | idea: block in src/config/default.yaml |
| Experiment Agent | experiment: block in src/config/default.yaml and environment variables |
| Blog Agent | blog: block in src/config/default.yaml plus BLOG_AGENT_SOURCE_WORKSPACE when needed |
| Pipeline | pipeline: block in src/config/default.yaml |

If you are starting fresh, edit src/config/default.yaml first. It is the most reliable single file to understand current defaults.

Citation

If you use Xcientist in your research, please cite:

@misc{
   to be released
}

About

The official repo for the paper "Externalizing Research Synthesis and Validation in AI Scientists through a Research Harness"
