Files
ONE-OS/axhub-make/skills/third-party/deep-research/reference/methodology.md
王冕 a27e3b8e43 feat: sync full workspace including web modules, docs, and configurations to Gitea
Optimized the root .gitignore to exclude virtual environments, node modules,
and temp folders to ensure clean and lightweight version tracking.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-09 18:12:25 +08:00

14 KiB

Deep Research Methodology: 8-Phase Pipeline

Overview

This document contains the detailed methodology for conducting deep research. The 8 phases represent a comprehensive approach to gathering, verifying, and synthesizing information from multiple sources.


Phase 1: SCOPE - Research Framing

Objective: Define research boundaries and success criteria

Activities:

  1. Decompose the question into core components
  2. Identify stakeholder perspectives
  3. Define scope boundaries (what's in/out)
  4. Establish success criteria
  5. List key assumptions to validate

Ultrathink Application: Use extended reasoning to explore multiple framings of the question before committing to scope.

Output: Structured scope document with research boundaries


Phase 2: PLAN - Strategy Formulation

Objective: Create an intelligent research roadmap

Activities:

  1. Identify primary and secondary sources
  2. Map knowledge dependencies (what must be understood first)
  3. Create search query strategy with variants
  4. Plan triangulation approach
  5. Estimate time/effort per phase
  6. Define quality gates

Graph-of-Thoughts: Branch into multiple potential research paths, then converge on optimal strategy.

Output: Research plan with prioritized investigation paths


Phase 3: RETRIEVE - Parallel Information Gathering

Objective: Systematically collect information from multiple sources using parallel execution for maximum speed

CRITICAL: Execute ALL searches in parallel using a single message with multiple tool calls

Query Decomposition Strategy

Before launching searches, decompose the research question into 5-10 independent search angles:

  1. Core topic (semantic search) - Meaning-based exploration of main concept
  2. Technical details (keyword search) - Specific terms, APIs, implementations
  3. Recent developments (date-filtered) - What's new in 2024-2025
  4. Academic sources (domain-specific) - Papers, research, formal analysis
  5. Alternative perspectives (comparison) - Competing approaches, criticisms
  6. Statistical/data sources - Quantitative evidence, metrics, benchmarks
  7. Industry analysis - Commercial applications, market trends
  8. Critical analysis/limitations - Known problems, failure modes, edge cases

Parallel Execution Protocol

Step 1: Launch ALL searches concurrently (single message)

CRITICAL: Use correct tool and parameters to avoid errors

Choose ONE search approach per research session:

Option A: Use WebSearch (built-in, no MCP required)

  • Standard web search with simple query string
  • Parameters: query (required)
  • Optional: allowed_domains, blocked_domains
  • Example: WebSearch(query="quantum computing 2025")

Option B: Use Exa MCP (if available, more powerful)

  • Advanced semantic + keyword search
  • Tool name: mcp__Exa__exa_search
  • Parameters: query (required), type (auto/neural/keyword), num_results, start_published_date, include_domains
  • Example: mcp__Exa__exa_search(query="quantum computing", type="neural", num_results=10)

NEVER mix parameter styles - this causes "Invalid tool parameters" errors.

Step 2: Spawn parallel deep-dive agents

Use Task tool with general-purpose agents (3-5 agents) for:

  • Academic paper analysis (PDFs, detailed extraction)
  • Documentation deep dives (technical specs, API docs)
  • Repository analysis (code examples, implementations)
  • Specialized domain research (requires multi-step investigation)

Example parallel execution (using WebSearch):

[Single message with multiple tool calls]
- WebSearch(query="quantum computing 2025 state of the art")
- WebSearch(query="quantum computing limitations challenges")
- WebSearch(query="quantum computing commercial applications 2024-2025")
- WebSearch(query="quantum computing vs classical comparison")
- WebSearch(query="quantum error correction research", allowed_domains=["arxiv.org", "scholar.google.com"])
- Task(subagent_type="general-purpose", description="Analyze quantum computing papers", prompt="Deep dive into quantum computing academic papers from 2024-2025, extract key findings and methodologies")
- Task(subagent_type="general-purpose", description="Industry analysis", prompt="Analyze quantum computing industry reports and market data, identify commercial applications")
- Task(subagent_type="general-purpose", description="Technical challenges", prompt="Extract technical limitations and challenges from quantum computing research")

Example parallel execution (using Exa MCP - if available):

[Single message with multiple tool calls]
- mcp__Exa__exa_search(query="quantum computing state of the art", type="neural", num_results=10, start_published_date="2024-01-01")
- mcp__Exa__exa_search(query="quantum computing limitations", type="keyword", num_results=10)
- mcp__Exa__exa_search(query="quantum computing commercial", type="auto", num_results=10, start_published_date="2024-01-01")
- mcp__Exa__exa_search(query="quantum error correction", type="neural", num_results=10, include_domains=["arxiv.org"])
- Task(subagent_type="general-purpose", description="Academic analysis", prompt="Analyze quantum computing academic papers")

Step 3: Collect and organize results

As results arrive:

  1. Extract key passages with source metadata (title, URL, date, credibility)
  2. Track information gaps that emerge
  3. Follow promising tangents with additional targeted searches
  4. Maintain source diversity (mix academic, industry, news, technical docs)
  5. Monitor for quality threshold (see FFS pattern below)

First Finish Search (FFS) Pattern

Adaptive completion based on quality threshold:

Quality gate: Proceed to Phase 4 when FIRST threshold reached:

  • Quick mode: 10+ sources with avg credibility >60/100 OR 2 minutes elapsed
  • Standard mode: 15+ sources with avg credibility >60/100 OR 5 minutes elapsed
  • Deep mode: 25+ sources with avg credibility >70/100 OR 10 minutes elapsed
  • UltraDeep mode: 30+ sources with avg credibility >75/100 OR 15 minutes elapsed

Continue background searches:

  • If threshold reached early, continue remaining parallel searches in background
  • Additional sources used in Phase 5 (SYNTHESIZE) for depth and diversity
  • Allows fast progression without sacrificing thoroughness

Quality Standards

Source diversity requirements:

  • Minimum 3 source types (academic, industry, news, technical docs)
  • Temporal diversity (mix of recent 2024-2025 + foundational older sources)
  • Perspective diversity (proponents + critics + neutral analysis)
  • Geographic diversity (not just US sources)

Credibility tracking:

  • Score each source 0-100 using source_evaluator.py
  • Flag low-credibility sources (<40) for additional verification
  • Prioritize high-credibility sources (>80) for core claims

Techniques:

  • Use WebSearch for current information (primary tool)
  • Use WebFetch for deep dives into specific sources (secondary)
  • Use Exa search (via WebSearch with type="neural") for semantic exploration
  • Use Grep/Read for local documentation
  • Execute code for computational analysis (when needed)
  • Use Task tool to spawn parallel retrieval agents (3-5 agents)

Output: Organized information repository with source tracking, credibility scores, and coverage map


Phase 4: TRIANGULATE - Cross-Reference Verification

Objective: Validate information across multiple independent sources

Activities:

  1. Identify claims requiring verification
  2. Cross-reference facts across 3+ sources
  3. Flag contradictions or uncertainties
  4. Assess source credibility
  5. Note consensus vs. debate areas
  6. Document verification status per claim

Quality Standards:

  • Core claims must have 3+ independent sources
  • Flag any single-source information
  • Note recency of information
  • Identify potential biases

Output: Verified fact base with confidence levels


Phase 4.5: OUTLINE REFINEMENT - Dynamic Evolution (WebWeaver 2025)

Objective: Adapt research direction based on evidence discovered

Problem Solved: Prevents "locked-in" research when evidence points to different conclusions or uncovers more important angles than initially planned.

When to Execute:

  • Standard/Deep/UltraDeep modes only (Quick mode skips this)
  • After Phase 4 (TRIANGULATE) completes
  • Before Phase 5 (SYNTHESIZE)

Activities:

  1. Review Initial Scope vs. Actual Findings

    • Compare Phase 1 scope with Phase 3-4 discoveries
    • Identify unexpected patterns or contradictions
    • Note underexplored angles that emerged as critical
    • Flag overexplored areas that proved less important
  2. Evaluate Outline Adaptation Need

    Signals for adaptation (ANY triggers refinement):

    • Major findings contradict initial assumptions
    • Evidence reveals more important angle than originally scoped
    • Critical subtopic emerged that wasn't in original plan
    • Original research question was too broad/narrow based on evidence
    • Sources consistently discuss aspects not in initial outline

    Signals to keep current outline:

    • Evidence aligns with initial scope
    • All key angles adequately covered
    • No major gaps or surprises
  3. Refine Outline (if needed)

    Update structure to reflect evidence:

    • Add sections for unexpected but important findings
    • Demote/remove sections with insufficient evidence
    • Reorder sections based on evidence strength and importance
    • Adjust scope boundaries based on what's actually discoverable

    Example adaptation:

    Original outline:
    1. Introduction
    2. Technical Architecture
    3. Performance Benchmarks
    4. Conclusion
    
    Refined after Phase 4 (evidence revealed security as critical):
    1. Introduction
    2. Technical Architecture
    3. **Security Vulnerabilities (NEW - major finding)**
    4. Performance Benchmarks (demoted - less critical than expected)
    5. **Real-World Failure Modes (NEW - pattern emerged)**
    6. Synthesis & Recommendations
    
  4. Targeted Gap Filling (if major gaps found)

    If outline refinement reveals critical knowledge gaps:

    • Launch 2-3 targeted searches for newly identified angles
    • Quick retrieval only (don't restart full Phase 3)
    • Time-box to 2-5 minutes
    • Update triangulation for new evidence only
  5. Document Adaptation Rationale

    Record in methodology appendix:

    • What changed in outline
    • Why it changed (evidence-driven reasons)
    • What additional research was conducted (if any)

Quality Standards:

  • Adaptation must be evidence-driven (cite specific sources that prompted change)
  • No more than 50% outline restructuring (if more needed, scope was severely mis scoped)
  • Retain original research question core (don't drift into different topic entirely)
  • New sections must have supporting evidence already gathered

Output: Refined outline that accurately reflects evidence landscape, ready for synthesis

Anti-Pattern Warning:

  • DON'T adapt outline based on speculation or "what would be interesting"
  • DON'T add sections without supporting evidence already in hand
  • DON'T completely abandon original research question
  • DO adapt when evidence clearly indicates better structure
  • DO document rationale for changes
  • DO stay within original topic scope

Phase 5: SYNTHESIZE - Deep Analysis

Objective: Connect insights and generate novel understanding

Activities:

  1. Identify patterns across sources
  2. Map relationships between concepts
  3. Generate insights beyond source material
  4. Create conceptual frameworks
  5. Build argument structures
  6. Develop evidence hierarchies

Ultrathink Integration: Use extended reasoning to explore non-obvious connections and second-order implications.

Output: Synthesized understanding with insight generation


Phase 6: CRITIQUE - Quality Assurance

Objective: Rigorously evaluate research quality

Activities:

  1. Review for logical consistency
  2. Check citation completeness
  3. Identify gaps or weaknesses
  4. Assess balance and objectivity
  5. Verify claims against sources
  6. Test alternative interpretations

Red Team Questions:

  • What's missing?
  • What could be wrong?
  • What alternative explanations exist?
  • What biases might be present?
  • What counterfactuals should be considered?

Output: Critique report with improvement recommendations


Phase 7: REFINE - Iterative Improvement

Objective: Address gaps and strengthen weak areas

Activities:

  1. Conduct additional research for gaps
  2. Strengthen weak arguments
  3. Add missing perspectives
  4. Resolve contradictions
  5. Enhance clarity
  6. Verify revised content

Output: Strengthened research with addressed deficiencies


Phase 8: PACKAGE - Report Generation

Objective: Deliver professional, actionable research

Activities:

  1. Structure report with clear hierarchy
  2. Write executive summary
  3. Develop detailed sections
  4. Create visualizations (tables, diagrams)
  5. Compile full bibliography
  6. Add methodology appendix

Output: Complete research report ready for use


Advanced Features

Graph-of-Thoughts Reasoning

Rather than linear thinking, branch into multiple reasoning paths:

  • Explore alternative framings in parallel
  • Pursue tangential leads that might be relevant
  • Merge insights from different branches
  • Backtrack and revise as new information emerges

Parallel Agent Deployment

Use Task tool to spawn sub-agents for:

  • Parallel source retrieval
  • Independent verification paths
  • Competing hypothesis evaluation
  • Specialized domain analysis

Adaptive Depth Control

Automatically adjust research depth based on:

  • Information complexity
  • Source availability
  • Time constraints
  • Confidence levels

Citation Intelligence

Smart citation management:

  • Track provenance of every claim
  • Link to original sources
  • Assess source credibility
  • Handle conflicting sources
  • Generate proper bibliographies