王冕 a27e3b8e43 feat: sync full workspace including web modules, docs, and configurations to Gitea

Optimized the root .gitignore to exclude virtual environments, node modules,
and temp folders to ensure clean and lightweight version tracking.

Co-authored-by: Cursor <cursoragent@cursor.com>

2026-06-09 18:12:25 +08:00

Deep Research Methodology: 8-Phase Pipeline

Overview

This document contains the detailed methodology for conducting deep research. The 8 phases represent a comprehensive approach to gathering, verifying, and synthesizing information from multiple sources.

Phase 1: SCOPE - Research Framing

Objective: Define research boundaries and success criteria

Activities:

Decompose the question into core components
Identify stakeholder perspectives
Define scope boundaries (what's in/out)
Establish success criteria
List key assumptions to validate

Ultrathink Application: Use extended reasoning to explore multiple framings of the question before committing to scope.

Output: Structured scope document with research boundaries

Phase 2: PLAN - Strategy Formulation

Objective: Create an intelligent research roadmap

Activities:

Identify primary and secondary sources
Map knowledge dependencies (what must be understood first)
Create search query strategy with variants
Plan triangulation approach
Estimate time/effort per phase
Define quality gates

Graph-of-Thoughts: Branch into multiple potential research paths, then converge on optimal strategy.

Output: Research plan with prioritized investigation paths

Phase 3: RETRIEVE - Parallel Information Gathering

Objective: Systematically collect information from multiple sources using parallel execution for maximum speed

CRITICAL: Execute ALL searches in parallel using a single message with multiple tool calls

Query Decomposition Strategy

Before launching searches, decompose the research question into 5-10 independent search angles:

Core topic (semantic search) - Meaning-based exploration of main concept
Technical details (keyword search) - Specific terms, APIs, implementations
Recent developments (date-filtered) - What's new in 2024-2025
Academic sources (domain-specific) - Papers, research, formal analysis
Alternative perspectives (comparison) - Competing approaches, criticisms
Statistical/data sources - Quantitative evidence, metrics, benchmarks
Industry analysis - Commercial applications, market trends
Critical analysis/limitations - Known problems, failure modes, edge cases
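The decomposition above can be sketched as a small template table. This is a minimal illustration, not part of the methodology itself: `SearchAngle`, `ANGLE_TEMPLATES`, `decompose`, and the query suffixes are all hypothetical names and wording chosen for the example. The last three angles have no search style named in the list, so they are labelled "general" here as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchAngle:
    name: str   # which search angle this query covers
    mode: str   # search style from the list; "general" where none is named
    query: str  # query string to hand to the search tool


# (angle name, search style, query template). The suffixes are illustrative
# assumptions; real queries should be written for the topic at hand.
ANGLE_TEMPLATES = [
    ("core topic", "semantic", "{topic}"),
    ("technical details", "keyword", "{topic} API implementation"),
    ("recent developments", "date-filtered", "{topic} 2024-2025"),
    ("academic sources", "domain-specific", "{topic} research paper"),
    ("alternative perspectives", "comparison", "{topic} alternatives criticism"),
    ("statistical/data sources", "general", "{topic} benchmarks metrics"),
    ("industry analysis", "general", "{topic} commercial applications market trends"),
    ("critical analysis/limitations", "general", "{topic} limitations failure modes"),
]


def decompose(topic: str) -> list[SearchAngle]:
    """Expand one research topic into one query per search angle."""
    return [
        SearchAngle(name, mode, template.format(topic=topic))
        for name, mode, template in ANGLE_TEMPLATES
    ]


angles = decompose("quantum computing")
for angle in angles:
    print(f"{angle.name:32} [{angle.mode}] {angle.query}")
```

Each resulting query is independent of the others, which is what makes the parallel launch in the next section possible.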

Parallel Execution Protocol

Step 1: Launch ALL searches concurrently (single message)

CRITICAL: Use correct tool and parameters to avoid errors

Choose ONE search approach per research session:

Option A: Use WebSearch (built-in, no MCP required)

Standard web search with simple query string
Parameters: query (required)
Optional: allowed_domains, blocked_domains
Example: WebSearch(query="quantum computing 2025")

Option B: Use Exa MCP (if available, more powerful)

Advanced semantic + keyword search
Tool name: mcp__Exa__exa_search
Parameters: query (required), type (auto/neural/keyword), num_results, start_published_date, include_domains
Example: mcp__Exa__exa_search(query="quantum computing", type="neural", num_results=10)

NEVER mix parameter styles (for example, passing Exa's type or num_results to WebSearch); this causes "Invalid tool parameters" errors.

Step 2: Spawn parallel deep-dive agents

Use Task tool with general-purpose agents (3-5 agents) for:

Academic paper analysis (PDFs, detailed extraction)
Documentation deep dives (technical specs, API docs)
Repository analysis (code examples, implementations)
Specialized domain research (requires multi-step investigation)

Example parallel execution (using WebSearch):

[Single message with multiple tool calls]
- WebSearch(query="quantum computing 2025 state of the art")
- WebSearch(query="quantum computing limitations challenges")
- WebSearch(query="quantum computing commercial applications 2024-2025")
- WebSearch(query="quantum computing vs classical comparison")
- WebSearch(query="quantum error correction research", allowed_domains=["arxiv.org", "scholar.google.com"])
- Task(subagent_type="general-purpose", description="Analyze quantum computing papers", prompt="Deep dive into quantum computing academic papers from 2024-2025, extract key findings and methodologies")
- Task(subagent_type="general-purpose", description="Industry analysis", prompt="Analyze quantum computing industry reports and market data, identify commercial applications")
- Task(subagent_type="general-purpose", description="Technical challenges", prompt="Extract technical limitations and challenges from quantum computing research")

Example parallel execution (using Exa MCP - if available):

[Single message with multiple tool calls]
- mcp__Exa__exa_search(query="quantum computing state of the art", type="neural", num_results=10, start_published_date="2024-01-01")
- mcp__Exa__exa_search(query="quantum computing limitations", type="keyword", num_results=10)
- mcp__Exa__exa_search(query="quantum computing commercial", type="auto", num_results=10, start_published_date="2024-01-01")
- mcp__Exa__exa_search(query="quantum error correction", type="neural", num_results=10, include_domains=["arxiv.org"])
- Task(subagent_type="general-purpose", description="Academic analysis", prompt="Analyze quantum computing academic papers")
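For readers more familiar with ordinary code than with agent tool calls, the fan-out in the examples above resembles launching coroutines with `asyncio.gather`. The sketch below is only an analogy: `run_search` is a hypothetical stub that simulates latency and does not call WebSearch, Exa, or any real API.

```python
import asyncio


async def run_search(query: str) -> dict:
    """Hypothetical stand-in for one search tool call; returns a stub result."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"query": query, "results": []}


async def launch_all(queries: list[str]) -> list[dict]:
    # gather() schedules every coroutine before awaiting any of them, so the
    # total wall time is roughly that of the slowest search, not the sum.
    return await asyncio.gather(*(run_search(q) for q in queries))


queries = [
    "quantum computing 2025 state of the art",
    "quantum computing limitations challenges",
    "quantum computing commercial applications 2024-2025",
    "quantum computing vs classical comparison",
]
batch = asyncio.run(launch_all(queries))
print(f"{len(batch)} searches completed")
```

Results come back in the same order as the queries, which keeps it simple to attach each result set to the angle that produced it.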

Step 3: Collect and organize results

As results arrive:

Extract key passages with source metadata (title, URL, date, credibility)
Track information gaps that emerge
Follow promising tangents with additional targeted searches
Maintain source diversity (mix academic, industry, news, technical docs)
Monitor for quality threshold (see FFS pattern below)

First Finish Search (FFS) Pattern

Adaptive completion based on quality threshold:

Quality gate: Proceed to Phase 4 as soon as the first condition for the active mode is met, either the source and credibility threshold or the time limit:

Quick mode: 10+ sources with avg credibility >60/100 OR 2 minutes elapsed
Standard mode: 15+ sources with avg credibility >60/100 OR 5 minutes elapsed
Deep mode: 25+ sources with avg credibility >70/100 OR 10 minutes elapsed
UltraDeep mode: 30+ sources with avg credibility >75/100 OR 15 minutes elapsed
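The four gates can be expressed as a single check. This is a minimal sketch under stated assumptions: `Gate`, `GATES`, and `gate_reached` are hypothetical names, and ">60/100" is read as a strict comparison on the mean score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    min_sources: int            # the "N+ sources" part of the gate
    min_avg_credibility: float  # mean score must exceed this (strict)
    max_minutes: float          # time limit that also opens the gate


# Thresholds copied from the mode list above.
GATES = {
    "quick": Gate(10, 60, 2),
    "standard": Gate(15, 60, 5),
    "deep": Gate(25, 70, 10),
    "ultradeep": Gate(30, 75, 15),
}


def gate_reached(mode: str, scores: list[float], elapsed_minutes: float) -> bool:
    """Return True once either condition for the given mode is met."""
    gate = GATES[mode]
    if elapsed_minutes >= gate.max_minutes:
        return True  # time limit reached, proceed with what we have
    if len(scores) < gate.min_sources:
        return False  # not enough sources yet
    return sum(scores) / len(scores) > gate.min_avg_credibility


print(gate_reached("quick", [65] * 10, elapsed_minutes=0.5))
print(gate_reached("standard", [65] * 10, elapsed_minutes=1.0))
```

The first call passes on source count and credibility; the second does not, because Standard mode needs 15 sources and the time limit has not yet been reached.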

Continue background searches:

If the threshold is reached early, let the remaining parallel searches finish in the background
Feed the additional sources into Phase 5 (SYNTHESIZE) for depth and diversity
This allows fast progression without sacrificing thoroughness

Quality Standards

Source diversity requirements:

Minimum 3 source types (academic, industry, news, technical docs)
Temporal diversity (mix of recent 2024-2025 + foundational older sources)
Perspective diversity (proponents + critics + neutral analysis)
Geographic diversity (not just US sources)
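A diversity check along these four axes might look like the sketch below. The `Source` fields, the stance labels, and the choice of 2024 as the cutoff for "recent" are assumptions made for illustration; the methodology does not prescribe a data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_type: str  # e.g. "academic", "industry", "news", "technical docs"
    year: int         # publication year
    stance: str       # "proponent", "critic", or "neutral" (assumed labels)
    region: str       # e.g. "US", "EU", "JP"


def diversity_gaps(sources: list[Source], recent_from: int = 2024) -> list[str]:
    """Return the names of the diversity requirements that are not yet met."""
    gaps = []
    if len({s.source_type for s in sources}) < 3:
        gaps.append("source types")
    years = [s.year for s in sources]
    has_recent = any(y >= recent_from for y in years)
    has_older = any(y < recent_from for y in years)
    if not (has_recent and has_older):
        gaps.append("temporal")
    if not {"proponent", "critic", "neutral"} <= {s.stance for s in sources}:
        gaps.append("perspective")
    if {s.region for s in sources} <= {"US"}:
        gaps.append("geographic")  # empty set or US-only
    return gaps


balanced = [
    Source("academic", 2025, "proponent", "EU"),
    Source("industry", 2024, "critic", "US"),
    Source("news", 2019, "neutral", "JP"),
]
narrow = [Source("news", 2025, "proponent", "US")]
print(diversity_gaps(balanced))
print(diversity_gaps(narrow))
```

Returning the list of unmet requirements, rather than a single boolean, makes it straightforward to turn each gap into a targeted follow-up search.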

Credibility tracking:

Score each source 0-100 using source_evaluator.py
Flag low-credibility sources (<40) for additional verification
Prioritize high-credibility sources (>80) for core claims
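The two cutoffs above partition sources into three groups. The sketch below assumes the scores have already been produced; the interface of `source_evaluator.py` is not shown in this document, so `triage` and the bucket names are hypothetical and the URLs are placeholders.

```python
def triage(scores: dict[str, int]) -> dict[str, list[str]]:
    """Bucket sources by credibility score (0-100) using the cutoffs above."""
    buckets: dict[str, list[str]] = {
        "needs_verification": [],  # score < 40
        "usable": [],              # 40 <= score <= 80
        "core": [],                # score > 80
    }
    for url, score in scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range for {url}: {score}")
        if score < 40:
            buckets["needs_verification"].append(url)
        elif score > 80:
            buckets["core"].append(url)
        else:
            buckets["usable"].append(url)
    return buckets


# Placeholder URLs and scores for illustration only.
example_scores = {
    "https://example.org/preprint": 85,
    "https://example.com/vendor-blog": 55,
    "https://example.net/forum-post": 30,
}
result = triage(example_scores)
print(result)
```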

Techniques:

Use WebSearch for current information (primary tool)
Use WebFetch for deep dives into specific sources (secondary)
Use Exa search (mcp__Exa__exa_search with type="neural", if the Exa MCP is available) for semantic exploration
Use Grep/Read for local documentation
Execute code for computational analysis (when needed)
Use Task tool to spawn parallel retrieval agents (3-5 agents)

Output: Organized information repository with source tracking, credibility scores, and coverage map

Phase 4: TRIANGULATE - Cross-Reference Verification

Objective: Validate information across multiple independent sources

Activities:

Identify claims requiring verification
Cross-reference facts across 3+ sources
Flag contradictions or uncertainties
Assess source credibility
Note consensus vs. debate areas
Document verification status per claim

Quality Standards:

Core claims must have 3+ independent sources
Flag any single-source information
Note recency of information
Identify potential biases
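The source-count rule can be sketched as a per-claim status check. One loud assumption: independence is approximated here by counting distinct hostnames, which is only a rough proxy, since two sites can republish the same underlying report. The function name, status labels, and URLs are hypothetical.

```python
from urllib.parse import urlparse


def verification_status(claims: dict[str, list[str]]) -> dict[str, str]:
    """Label each claim by how many distinct hostnames support it."""
    status = {}
    for claim, urls in claims.items():
        # Assumption: one hostname counts as one independent source.
        hosts = {urlparse(u).netloc.lower() for u in urls}
        if len(hosts) >= 3:
            status[claim] = "verified"
        elif len(hosts) == 2:
            status[claim] = "needs another source"
        elif len(hosts) == 1:
            status[claim] = "single-source"
        else:
            status[claim] = "unsupported"
    return status


# Placeholder claims and URLs for illustration only.
claims = {
    "claim A": [
        "https://a.example.org/1",
        "https://b.example.org/2",
        "https://c.example.org/3",
    ],
    "claim B": ["https://a.example.org/1", "https://a.example.org/4"],
    "claim C": ["https://a.example.org/1", "https://b.example.org/2"],
    "claim D": [],
}
statuses = verification_status(claims)
for claim, state in statuses.items():
    print(f"{claim}: {state}")
```

Claim B shows why counting URLs alone is not enough: two pages from the same host still count as a single source.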

Output: Verified fact base with confidence levels

Phase 4.5: OUTLINE REFINEMENT - Dynamic Evolution (WebWeaver 2025)

Objective: Adapt research direction based on evidence discovered

Problem Solved: Prevents "locked-in" research when evidence points to different conclusions or uncovers more important angles than initially planned.

When to Execute:

Standard/Deep/UltraDeep modes only (Quick mode skips this)
After Phase 4 (TRIANGULATE) completes
Before Phase 5 (SYNTHESIZE)

Activities:

Review Initial Scope vs. Actual Findings
- Compare Phase 1 scope with Phase 3-4 discoveries
- Identify unexpected patterns or contradictions
- Note underexplored angles that emerged as critical
- Flag overexplored areas that proved less important

Evaluate Outline Adaptation Need

Signals for adaptation (ANY triggers refinement):
- Major findings contradict initial assumptions
- Evidence reveals more important angle than originally scoped
- Critical subtopic emerged that wasn't in original plan
- Original research question was too broad/narrow based on evidence
- Sources consistently discuss aspects not in initial outline

Signals to keep current outline:
- Evidence aligns with initial scope
- All key angles adequately covered
- No major gaps or surprises

Refine Outline (if needed)

Update structure to reflect evidence:

Add sections for unexpected but important findings
Demote/remove sections with insufficient evidence
Reorder sections based on evidence strength and importance
Adjust scope boundaries based on what's actually discoverable

Example adaptation:

Original outline:
1. Introduction
2. Technical Architecture
3. Performance Benchmarks
4. Conclusion

Refined after Phase 4 (evidence revealed security as critical):
1. Introduction
2. Technical Architecture
3. **Security Vulnerabilities (NEW - major finding)**
4. Performance Benchmarks (demoted - less critical than expected)
5. **Real-World Failure Modes (NEW - pattern emerged)**
6. Synthesis & Recommendations

Targeted Gap Filling (if major gaps found)

If outline refinement reveals critical knowledge gaps:
- Launch 2-3 targeted searches for newly identified angles
- Quick retrieval only (don't restart full Phase 3)
- Time-box to 2-5 minutes
- Update triangulation for new evidence only

Document Adaptation Rationale

Record in methodology appendix:
- What changed in outline
- Why it changed (evidence-driven reasons)
- What additional research was conducted (if any)

Quality Standards:

Adaptation must be evidence-driven (cite specific sources that prompted change)
No more than 50% outline restructuring (if more is needed, the original scope was severely mis-scoped)
Retain original research question core (don't drift into different topic entirely)
New sections must have supporting evidence already gathered
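The methodology does not define how "50% restructuring" is measured. One possible reading, offered purely as an assumption, is to compare the ordered section titles with `difflib.SequenceMatcher` and treat one minus its similarity ratio as the share of the outline that changed. The function names below are hypothetical, and the titles are the ones from the example adaptation with the annotations removed.

```python
from difflib import SequenceMatcher


def restructuring_ratio(original: list[str], refined: list[str]) -> float:
    """Share of the outline that changed: 0.0 is identical, 1.0 is disjoint.

    Assumed metric: 1 - SequenceMatcher.ratio() over ordered section titles.
    """
    matcher = SequenceMatcher(None, original, refined, autojunk=False)
    return 1.0 - matcher.ratio()


def within_limit(original: list[str], refined: list[str], limit: float = 0.5) -> bool:
    """Check the 'no more than 50% restructuring' rule under the assumed metric."""
    return restructuring_ratio(original, refined) <= limit


original = [
    "Introduction",
    "Technical Architecture",
    "Performance Benchmarks",
    "Conclusion",
]
refined = [
    "Introduction",
    "Technical Architecture",
    "Security Vulnerabilities",
    "Performance Benchmarks",
    "Real-World Failure Modes",
    "Synthesis & Recommendations",
]
ratio = restructuring_ratio(original, refined)
print(f"restructured: {ratio:.0%}, within limit: {within_limit(original, refined)}")
```

Under this metric the example adaptation keeps three of the four original sections in order and stays below the limit. Other definitions, such as counting added and removed sections against the union, would give a different figure, so the metric should be fixed before the rule is enforced.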

Output: Refined outline that accurately reflects evidence landscape, ready for synthesis

Anti-Pattern Warning:

❌ DON'T adapt outline based on speculation or "what would be interesting"
❌ DON'T add sections without supporting evidence already in hand
❌ DON'T completely abandon original research question
✅ DO adapt when evidence clearly indicates better structure
✅ DO document rationale for changes
✅ DO stay within original topic scope

Phase 5: SYNTHESIZE - Deep Analysis

Objective: Connect insights and generate novel understanding

Activities:

Identify patterns across sources
Map relationships between concepts
Generate insights beyond source material
Create conceptual frameworks
Build argument structures
Develop evidence hierarchies

Ultrathink Integration: Use extended reasoning to explore non-obvious connections and second-order implications.

Output: Synthesized understanding with insight generation

Phase 6: CRITIQUE - Quality Assurance

Objective: Rigorously evaluate research quality

Activities:

Review for logical consistency
Check citation completeness
Identify gaps or weaknesses
Assess balance and objectivity
Verify claims against sources
Test alternative interpretations

Red Team Questions:

What's missing?
What could be wrong?
What alternative explanations exist?
What biases might be present?
What counterfactuals should be considered?

Output: Critique report with improvement recommendations

Phase 7: REFINE - Iterative Improvement

Objective: Address gaps and strengthen weak areas

Activities:

Conduct additional research for gaps
Strengthen weak arguments
Add missing perspectives
Resolve contradictions
Enhance clarity
Verify revised content

Output: Strengthened research with addressed deficiencies

Phase 8: PACKAGE - Report Generation

Objective: Deliver professional, actionable research

Activities:

Structure report with clear hierarchy
Write executive summary
Develop detailed sections
Create visualizations (tables, diagrams)
Compile full bibliography
Add methodology appendix

Output: Complete research report ready for use

Advanced Features

Graph-of-Thoughts Reasoning

Rather than linear thinking, branch into multiple reasoning paths:

Explore alternative framings in parallel
Pursue tangential leads that might be relevant
Merge insights from different branches
Backtrack and revise as new information emerges

Parallel Agent Deployment

Use Task tool to spawn sub-agents for:

Parallel source retrieval
Independent verification paths
Competing hypothesis evaluation
Specialized domain analysis

Adaptive Depth Control

Automatically adjust research depth based on:

Information complexity
Source availability
Time constraints
Confidence levels

Citation Intelligence

Smart citation management:

Track provenance of every claim
Link to original sources
Assess source credibility
Handle conflicting sources
Generate proper bibliographies
