Advanced Chain-of-Thought Applications #

Overview of Advanced Techniques #

text

┌─────────────────────────────────────────────────────────────┐
│            Advanced Chain-of-Thought Techniques             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                    Chain of Thought                         │
│                          │                                  │
│                          ▼                                  │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              Linear reasoning (basic CoT)             │  │
│  │  Problem ──> Step 1 ──> Step 2 ──> Step 3 ──> Answer  │  │
│  └───────────────────────────────────────────────────────┘  │
│                          │                                  │
│                          ▼                                  │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              Tree reasoning (Tree of Thoughts)        │  │
│  │            ┌──> Path 1 ──> Answer A                   │  │
│  │  Problem ──┼──> Path 2 ──> Answer B                   │  │
│  │            └──> Path 3 ──> Answer C                   │  │
│  └───────────────────────────────────────────────────────┘  │
│                          │                                  │
│                          ▼                                  │
│  ┌───────────────────────────────────────────────────────┐  │
│  │              Graph reasoning (Graph of Thoughts)      │  │
│  │  Problem ──> Thought <──> Thought ──> Answer          │  │
│  │                 ↑            ↓                        │  │
│  │                 └────────────┘                        │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Tree of Thoughts (ToT) #

How It Works #

Tree of Thoughts extends the chain of thought from a linear structure to a tree, which allows exploring several reasoning paths and backtracking from unpromising ones.

text

┌─────────────────────────────────────────────────────────────┐
│               Tree of Thoughts: How It Works                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Core idea:                                                 │
│  Model reasoning as a search problem and search the tree    │
│  of thoughts for the best solution.                         │
│                                                             │
│  Difference from basic CoT:                                 │
│  ┌───────────────────────────────────────────────────────┐  │
│  │ CoT: linear reasoning; commits to a single path       │  │
│  │ ToT: tree reasoning; can explore, evaluate, backtrack │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  Core components:                                           │
│  1. Thought generation                                      │
│  2. State evaluation                                        │
│  3. Search algorithm                                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Tree Structure Diagram #

text

                       Problem
                          │
            ┌─────────────┼─────────────┐
            │             │             │
        Thought A     Thought B     Thought C
       (score: 7)    (score: 5)    (score: 8)
            │             │             │
      ┌─────┴─────┐       │       ┌─────┴─────┐
      │           │       │       │           │
 Thought A1  Thought A2   │  Thought C1  Thought C2
 (score: 6)  (score: 8)   │  (score: 7)  (score: 9)
      │           │       │       │           │
      │           ▼       │       │           ▼
      │       Answer 1    │       │       Answer 2 ✓
      │                   │       │
      └───────────────────┴───────┘
         (backtrack and explore)
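
To make the scores in the diagram concrete, here is a minimal sketch of level-by-level selection that keeps only the top-scoring thoughts. The dictionary layout, the `beam_levels` helper, and the choice of `beam_width=2` are illustrative assumptions; only the node names and scores come from the diagram.

```python
import heapq

# Scores copied from the diagram; the dictionary layout is illustrative.
tree = {
    "Problem": {"Thought A": 7, "Thought B": 5, "Thought C": 8},
    "Thought A": {"Thought A1": 6, "Thought A2": 8},
    "Thought B": {},
    "Thought C": {"Thought C1": 7, "Thought C2": 9},
}


def beam_levels(tree, root, beam_width):
    """Return the nodes kept at each level when only the top-k survive."""
    beam = [root]
    levels = []
    while True:
        # Collect (score, name) for every child of every node in the beam.
        candidates = [
            (score, name)
            for node in beam
            for name, score in tree.get(node, {}).items()
        ]
        if not candidates:
            return levels
        kept = heapq.nlargest(beam_width, candidates)
        beam = [name for _, name in kept]
        levels.append(beam)


levels = beam_levels(tree, "Problem", beam_width=2)
print(levels)
# [['Thought C', 'Thought A'], ['Thought C2', 'Thought A2']]
```

With a width of 2, Thought B (score 5) is dropped after the first level, and the highest-scoring leaf, Thought C2, is the one the diagram marks as the chosen answer path.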

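Implementation #

python

from dataclasses import dataclass, field
from typing import List, Optional
import heapq

# Assumption: `llm` is a client object exposing `generate(prompt) -> str`.
# It is not defined in this listing.


@dataclass
class ThoughtNode:
    """A node in the tree of thoughts."""
    thought: str
    score: float
    children: List['ThoughtNode'] = field(default_factory=list)
    parent: Optional['ThoughtNode'] = None
    is_final: bool = False


class TreeOfThoughts:
    """
    Tree of Thoughts with beam search.

    `parse_thoughts` and `parse_score` are helper methods that this
    listing calls but does not define.
    """

    def __init__(self, max_depth=3, beam_width=3):
        self.max_depth = max_depth
        self.beam_width = beam_width

    def solve(self, problem):
        """Solve a problem with ToT and return the path of thoughts."""
        root = ThoughtNode(thought=problem, score=0, children=[])

        # Beam entries are (score, id, node); id() breaks score ties so
        # that the nodes themselves are never compared.
        beam = [(root.score, id(root), root)]

        for depth in range(self.max_depth):
            candidates = []

            for _, _, node in beam:
                thoughts = self.generate_thoughts(node)

                for thought in thoughts:
                    score = self.evaluate_thought(thought, problem)
                    child = ThoughtNode(
                        thought=thought,
                        score=score,
                        children=[],
                        parent=node
                    )
                    node.children.append(child)
                    candidates.append((score, id(child), child))

            if not candidates:
                break  # nothing new was generated; keep the previous beam

            # Keep only the top-k candidates for the next level.
            beam = heapq.nlargest(self.beam_width, candidates)

            for _, _, node in beam:
                if self.is_final_state(node):
                    node.is_final = True
                    return self.extract_solution(node)

        # No final state was found: fall back to the best node in the beam.
        _, _, best_node = max(beam)
        return self.extract_solution(best_node)

    def generate_thoughts(self, node, num_thoughts=3):
        """Generate candidate next thoughts."""
        prompt = f"""
        Current thought: {node.thought}

        Generate {num_thoughts} possible directions for the next step.
        Each direction should be one concrete reasoning step.
        """

        response = llm.generate(prompt)
        thoughts = self.parse_thoughts(response, num_thoughts)
        return thoughts

    def evaluate_thought(self, thought, problem):
        """Score the quality of a thought."""
        prompt = f"""
        Problem: {problem}
        Current thought: {thought}

        Rate how promising this line of thinking is (1-10):
        - Does it move toward solving the problem?
        - Is the reasoning sound?
        - Could it lead to the correct answer?

        Output only the score (1-10).
        """

        response = llm.generate(prompt)
        score = self.parse_score(response)
        return score

    def is_final_state(self, node):
        """Check whether a thought has reached a final answer."""
        prompt = f"""
        Current thought: {node.thought}

        Has this thought already reached the final answer? Answer yes or no.
        """

        response = llm.generate(prompt)
        # Match on the start of the reply; a substring test would also
        # accept negative replies that merely contain the keyword.
        return response.strip().lower().startswith("yes")

    def extract_solution(self, node):
        """Walk parent links back to the root and return the full path."""
        path = []
        current = node

        while current:
            path.append(current.thought)
            current = current.parent

        return list(reversed(path))


# Usage example
tot = TreeOfThoughts(max_depth=4, beam_width=3)

problem = """
Using 24 matchsticks, how can you form 4 squares of equal size?
Every side of every square must be made of whole matchsticks.
"""

solution = tot.solve(problem)

print("Solution path:")
for i, thought in enumerate(solution):
    print(f"Step {i+1}: {thought}")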
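
The listing above calls `parse_thoughts` and `parse_score` without showing them. The following is one possible sketch of such helpers, written as standalone functions. It assumes the model replies with one thought per line (optionally numbered or bulleted) and that the first number in a scoring reply is the score; both formats are assumptions, not something the original specifies.

```python
import re
from typing import List


def parse_thoughts(response: str, num_thoughts: int) -> List[str]:
    """Split a numbered or bulleted reply into at most num_thoughts items."""
    thoughts = []
    for line in response.splitlines():
        # Strip leading markers such as "1.", "2)", "-", or "*".
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        if cleaned:
            thoughts.append(cleaned)
    return thoughts[:num_thoughts]


def parse_score(response: str, default: float = 1.0) -> float:
    """Return the first number in the reply, clamped to the 1-10 range."""
    match = re.search(r"\d+(?:\.\d+)?", response)
    if match is None:
        return default  # assumed fallback when the reply has no number
    return min(10.0, max(1.0, float(match.group())))


reply = "1. Try a grid\n2) Count shared edges\n\n- Check corners\n* extra"
print(parse_thoughts(reply, 3))
print(parse_score("Score: 8"))
```

Clamping and a default score keep a malformed reply from crashing the search; whether a low default is the right fallback depends on how aggressively the search should prune.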

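Search Strategies #

text

┌─────────────────────────────────────────────────────────────┐
│                    ToT Search Strategies                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Breadth-first search (BFS)                              │
│     ├── Expands every node level by level                   │
│     ├── Finds the shallowest solution if nothing is pruned  │
│     └── High compute cost                                   │
│                                                             │
│  2. Depth-first search (DFS)                                │
│     ├── Follows one path as deep as it goes                 │
│     ├── Finds a solution quickly                            │
│     └── May miss the best solution                          │
│                                                             │
│  3. Beam search                                             │
│     ├── Keeps only the top-k nodes at each level            │
│     ├── Balances cost and quality                           │
│     └── A common default choice                             │
│                                                             │
│  4. Best-first search                                       │
│     ├── Expands the highest-scoring node first              │
│     ├── Relies on a heuristic evaluation                    │
│     └── Suits problems with a clear goal                    │
│                                                             │
└─────────────────────────────────────────────────────────────┘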
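
As an illustration of the fourth strategy, here is a minimal best-first search over a small hand-written tree. The tree, the scores, and the goal set are hypothetical stand-ins for LLM-generated thoughts and evaluations.

```python
import heapq

# Hypothetical toy tree: each node maps to (child, score) pairs.
TREE = {
    "start": [("a", 7), ("b", 5), ("c", 8)],
    "a": [("a1", 6), ("a2", 8)],
    "c": [("c1", 7), ("c2", 9)],
}
GOALS = {"a2", "c2"}


def best_first(tree, root, goals):
    """Always expand the highest-scoring open node; return (path, order)."""
    # heapq is a min-heap, so scores are negated to pop the largest first.
    frontier = [(0, root, [root])]
    expanded = []
    while frontier:
        _, node, path = heapq.heappop(frontier)
        expanded.append(node)
        if node in goals:
            return path, expanded
        for child, score in tree.get(node, []):
            heapq.heappush(frontier, (-score, child, path + [child]))
    return None, expanded


path, expanded = best_first(TREE, "start", GOALS)
print(path)      # ['start', 'c', 'c2']
print(expanded)  # ['start', 'c', 'c2']
```

Unlike beam search, nothing is discarded: nodes `a` and `b` stay in the frontier, so the search can return to them if the branch under `c` turns out to be a dead end.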

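Applicable Scenarios #

text

✅ Good fits for ToT:
├── Problems that need several paths explored
│   └── e.g. mathematical proofs, strategy games, creative writing
├── Problems whose intermediate steps need evaluation
│   └── e.g. complex decisions, planning problems
├── Problems that may require backtracking
│   └── e.g. constraint satisfaction, search problems
└── Problems that demand high accuracy
    └── e.g. high-stakes decisions, scientific research

⚠️ Poor fits for ToT:
├── Simple linear reasoning problems
├── Limited compute budget
├── Strict latency requirements
└── Problems that need no exploration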

Graph of Thoughts (GoT) #

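How It Works #

Graph of Thoughts generalizes the structure further, from a tree to a graph, which supports more complex reasoning patterns.

text

┌─────────────────────────────────────────────────────────────┐
│               Graph of Thoughts: How It Works               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Core idea:                                                 │
│  Thought nodes can have many-to-many relations, which       │
│  allows aggregation, decomposition, and loops.              │
│                                                             │
│  Difference from ToT:                                       │
│  ┌───────────────────────────────────────────────────────┐  │
│  │ ToT: tree; every node has exactly one parent          │  │
│  │ GoT: graph; nodes can have many inputs and outputs    │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  Core operations:                                           │
│  1. Decomposition: split a thought into sub-thoughts        │
│  2. Aggregation: merge several thoughts into one            │
│  3. Improvement: refine a thought based on feedback         │
│  4. Loop: repeat until a condition is met                   │
│                                                             │
└─────────────────────────────────────────────────────────────┘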

Graph Structure Diagram #

text

                      ┌────────────────┐
                      │    Problem     │
                      └───────┬────────┘
                              │
          ┌───────────────────┼───────────────────┐
          │                   │                   │
          ▼                   ▼                   ▼
  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐
  │   Thought A    │  │   Thought B    │  │   Thought C    │
  └───────┬────────┘  └───────┬────────┘  └───────┬────────┘
          │                   │                   │
          │         ┌─────────┴─────────┐         │
          │         │                   │         │
          ▼         ▼                   ▼         ▼
    ┌────────────────────┐         ┌────────────────────┐
    │   Sub-thought A1   │         │   Sub-thought B1   │
    └──────────┬─────────┘         └─────────┬──────────┘
               │                             │
               └──────────────┬──────────────┘
                              │
                              ▼
                      ┌────────────────┐
                      │   Aggregate    │
                      └───────┬────────┘
                              │
                              ▼
                      ┌────────────────┐
                      │     Answer     │
                      └────────────────┘
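
The diagram can be written down as a plain dependency map, which makes the difference from a tree visible: some nodes have more than one predecessor. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to derive a valid processing order; the English node names are labels chosen for this example.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (its predecessors).
# The edges mirror the diagram.
predecessors = {
    "Thought A": {"Problem"},
    "Thought B": {"Problem"},
    "Thought C": {"Problem"},
    "Sub-thought A1": {"Thought A", "Thought B"},
    "Sub-thought B1": {"Thought B", "Thought C"},
    "Aggregate": {"Sub-thought A1", "Sub-thought B1"},
    "Answer": {"Aggregate"},
}

# A topological order guarantees every thought is processed only after
# all of the thoughts it builds on.
order = list(TopologicalSorter(predecessors).static_order())

# Nodes with several predecessors are what a tree cannot represent.
multi_parent = sorted(n for n, p in predecessors.items() if len(p) > 1)

print(order[0], "->", order[-1])
print(multi_parent)
```

Here `Sub-thought A1`, `Sub-thought B1`, and `Aggregate` each combine more than one upstream thought, which is the structural feature that separates GoT from ToT.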

Implementation #

python

from dataclasses import dataclass, field
from typing import Dict, Set
from enum import Enum

# Assumption: `llm` is a client object exposing `generate(prompt) -> str`.
# It is not defined in this listing.


class ThoughtType(Enum):
    ORIGINAL = "original"
    DECOMPOSED = "decomposed"
    AGGREGATED = "aggregated"
    IMPROVED = "improved"


@dataclass
class GraphNode:
    """A node in the graph of thoughts."""
    id: str
    thought: str
    thought_type: ThoughtType
    score: float = 0.0
    predecessors: Set[str] = field(default_factory=set)
    successors: Set[str] = field(default_factory=set)


class GraphOfThoughts:
    """
    Graph of Thoughts driven by an explicit sequence of operations.

    `parse_thoughts` and `parse_score` are helper methods that this
    listing calls but does not define.
    """

    def __init__(self):
        self.nodes: Dict[str, GraphNode] = {}
        self.operations = {
            "decompose": self.decompose,
            "aggregate": self.aggregate,
            "improve": self.improve,
            "generate": self.generate,
        }

    def solve(self, problem, operations_sequence):
        """Run the given operations in order and return the best result."""
        root = GraphNode(
            id="root",
            thought=problem,
            thought_type=ThoughtType.ORIGINAL
        )
        self.nodes["root"] = root

        for op_name, params in operations_sequence:
            operation = self.operations[op_name]
            operation(**params)

        return self.get_best_solution()

    def decompose(self, node_id, num_parts=3):
        """Decompose: split one thought into several sub-thoughts."""
        node = self.nodes[node_id]

        prompt = f"""
        Current thought: {node.thought}

        Break this thought down into {num_parts} independent sub-problems or sub-tasks.
        Each sub-problem should be solvable on its own.
        """

        response = llm.generate(prompt)
        sub_thoughts = self.parse_thoughts(response, num_parts)

        for i, thought in enumerate(sub_thoughts):
            child_id = f"{node_id}_d{i}"
            child = GraphNode(
                id=child_id,
                thought=thought,
                thought_type=ThoughtType.DECOMPOSED,
                predecessors={node_id}
            )
            self.nodes[child_id] = child
            node.successors.add(child_id)

    def aggregate(self, node_ids, aggregation_strategy="summarize"):
        """Aggregate: merge several thoughts into one."""
        # `aggregation_strategy` is accepted but not used in this listing.
        thoughts = [self.nodes[nid].thought for nid in node_ids]

        prompt = f"""
        Here are several related thoughts:

        {chr(10).join(f'- {t}' for t in thoughts)}

        Combine these thoughts into a single overall conclusion.
        """

        response = llm.generate(prompt)

        agg_id = f"agg_{'_'.join(node_ids)}"
        agg_node = GraphNode(
            id=agg_id,
            thought=response,
            thought_type=ThoughtType.AGGREGATED,
            predecessors=set(node_ids),
            # Score the result so get_best_solution can rank it.
            score=self.evaluate(response)
        )
        self.nodes[agg_id] = agg_node

        for nid in node_ids:
            self.nodes[nid].successors.add(agg_id)

        return agg_id

    def improve(self, node_id, feedback=None):
        """Improve: refine a thought, optionally using feedback."""
        node = self.nodes[node_id]

        if feedback:
            instruction = f"Feedback: {feedback}"
        else:
            instruction = "Improve this thought so that it is more complete."

        prompt = f"""
        Current thought: {node.thought}

        {instruction}
        """

        response = llm.generate(prompt)

        improved_id = f"{node_id}_improved"
        improved_node = GraphNode(
            id=improved_id,
            thought=response,
            thought_type=ThoughtType.IMPROVED,
            predecessors={node_id},
            score=self.evaluate(response)
        )
        self.nodes[improved_id] = improved_node
        node.successors.add(improved_id)

        return improved_id

    def generate(self, node_id):
        """Generate: produce a new thought from an existing one."""
        node = self.nodes[node_id]

        prompt = f"""
        Based on the current thought: {node.thought}

        Generate the next step of reasoning.
        """

        response = llm.generate(prompt)

        # Calling generate twice on the same node reuses this id and
        # overwrites the earlier result.
        gen_id = f"{node_id}_gen"
        gen_node = GraphNode(
            id=gen_id,
            thought=response,
            thought_type=ThoughtType.ORIGINAL,
            predecessors={node_id},
            score=self.evaluate(response)
        )
        self.nodes[gen_id] = gen_node
        node.successors.add(gen_id)

        return gen_id

    def evaluate(self, thought):
        """Score the quality of a thought."""
        prompt = f"Rate the quality of the following thought (1-10): {thought}"
        response = llm.generate(prompt)
        return self.parse_score(response)

    def get_best_solution(self):
        """Return the highest-scoring aggregated or improved thought."""
        scored_nodes = [
            (node.score, node.thought)
            for node in self.nodes.values()
            if node.thought_type in [ThoughtType.AGGREGATED, ThoughtType.IMPROVED]
        ]

        if scored_nodes:
            return max(scored_nodes, key=lambda x: x[0])[1]

        return None


# Usage example
got = GraphOfThoughts()

problem = "How would you design an efficient urban transportation system?"

operations = [
    ("decompose", {"node_id": "root", "num_parts": 4}),
    ("generate", {"node_id": "root_d0"}),
    ("generate", {"node_id": "root_d1"}),
    ("generate", {"node_id": "root_d2"}),
    ("generate", {"node_id": "root_d3"}),
    ("aggregate", {"node_ids": ["root_d0_gen", "root_d1_gen", "root_d2_gen", "root_d3_gen"]}),
    ("improve", {"node_id": "agg_root_d0_gen_root_d1_gen_root_d2_gen_root_d3_gen"}),
]

solution = got.solve(problem, operations)
print(solution)

GoT Operation Types #

text

┌─────────────────────────────────────────────────────────────┐
│                     GoT Operation Types                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Decomposition                                           │
│     ┌─────────┐                                             │
│     │    A    │                                             │
│     └────┬────┘                                             │
│          │                                                  │
│     ┌────┼────┐                                             │
│     │    │    │                                             │
│     ▼    ▼    ▼                                             │
│    A1   A2   A3                                             │
│                                                             │
│  2. Aggregation                                             │
│    A1   A2   A3                                             │
│     │    │    │                                             │
│     └────┼────┘                                             │
│          │                                                  │
│     ┌────┴────┐                                             │
│     │    A    │                                             │
│     └─────────┘                                             │
│                                                             │
│  3. Improvement                                             │
│     ┌─────────┐       ┌─────────┐                           │
│     │    A    │──────>│    A'   │                           │
│     └─────────┘       └─────────┘                           │
│                        (improved)                           │
│                                                             │
│  4. Generation                                              │
│     ┌─────────┐       ┌─────────┐                           │
│     │    A    │──────>│    B    │                           │
│     └─────────┘       └─────────┘                           │
│                    (newly generated)                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
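
The decomposition and aggregation shapes can be tried without an LLM on a toy task. The sketch below sorts a list by splitting it into chunks, solving each chunk, and merging the results. The task and the function bodies are illustrative assumptions; in a real GoT pipeline each step would be a model call.

```python
import heapq
from typing import List


def decompose(numbers: List[int], num_parts: int) -> List[List[int]]:
    """Split one list (thought A) into contiguous chunks (A1..An)."""
    size = max(1, -(-len(numbers) // num_parts))  # ceiling division
    return [numbers[i:i + size] for i in range(0, len(numbers), size)]


def aggregate(parts: List[List[int]]) -> List[int]:
    """Merge already-sorted chunks back into a single sorted list."""
    return list(heapq.merge(*parts))


data = [9, 3, 7, 1, 8, 2, 6, 4, 5]
parts = decompose(data, num_parts=3)        # A -> A1, A2, A3
solved = [sorted(part) for part in parts]   # solve each sub-thought
result = aggregate(solved)                  # A1, A2, A3 -> A

print(parts)   # [[9, 3, 7], [1, 8, 2], [6, 4, 5]]
print(result)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The aggregation step has several inputs and one output, which is exactly the many-to-one edge pattern shown in the second panel of the diagram.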

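Multimodal CoT #

How It Works #

Multimodal CoT extends the text-only chain of thought to reasoning over several modalities, such as images and tables.

text

┌─────────────────────────────────────────────────────────────┐
│                Multimodal CoT: How It Works                 │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Core idea:                                                 │
│  Bring visual information into the reasoning process to     │
│  build a chain of thought that spans modalities.            │
│                                                             │
│  Use cases:                                                 │
│  1. Chart analysis: read a chart and reason about it        │
│  2. Image reasoning: draw logical inferences from an image  │
│  3. Document understanding: combine text and images         │
│  4. Science problems: analyze experiment images and data    │
│                                                             │
└─────────────────────────────────────────────────────────────┘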

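Implementation #

python

# Assumption: `llm` exposes `generate(prompt) -> str` and `vision_model`
# exposes `describe(image) -> str`; neither is defined in this listing.

def multimodal_cot(question, image=None, table=None):
    """
    Multimodal CoT.

    Args:
        question: the question text
        image: an image (optional)
        table: table data (optional)

    Returns:
        The model's reasoning output
    """
    prompt = f"""
    Question: {question}
    """

    if image is not None:
        prompt += """

        [Image analysis]
        Analyze the image first and extract the key information.
        """
        # Convert the image to text so a text-only model can reason over it.
        image_description = vision_model.describe(image)
        prompt += f"\nImage description: {image_description}"

    # `is not None` also works for table objects whose truth value is
    # ambiguous, such as a pandas DataFrame.
    if table is not None:
        prompt += f"""

        [Table data]
        {table}

        Analyze the table and extract the relevant information.
        """

    prompt += """

    Let's think step by step:
    """

    response = llm.generate(prompt)
    return response


# Usage example
question = """
Based on the bar chart below, analyze the quarterly sales trend for 2020-2023
and forecast sales for the first quarter of 2024.
"""

result = multimodal_cot(question, image="sales_chart.png")
print(result)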

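Multimodal Reasoning Example #

text

Question: Using the geometric figure in the image, compute the area of the shaded region.

[Image analysis]
The image shows a square containing a circle, which in turn contains a smaller square.
The shaded region lies between the circle and the small square.
- Large square side: 10 cm
- Circle diameter: 10 cm (the circle is inscribed in the large square)
- Small square diagonal: 10 cm (equal to the circle's diameter)

[Reasoning steps]
Step 1: Area of the large square (context only; not used in the final result)
       area = 10 × 10 = 100 cm²

Step 2: Area of the circle
       radius = 10 ÷ 2 = 5 cm
       area = π × 5² = 25π ≈ 78.5 cm²

Step 3: Area of the small square
       diagonal = 10 cm
       side = 10 ÷ √2 ≈ 7.07 cm
       area = diagonal² ÷ 2 = 100 ÷ 2 = 50 cm²

Step 4: Shaded area
       shaded = circle area - small square area
       shaded = 78.5 - 50 = 28.5 cm²

Answer: the shaded area is approximately 28.5 cm²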
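
The arithmetic in the worked example can be checked in a few lines. This is only a verification sketch; the variable names are chosen for this example.

```python
import math

side_large = 10.0                                # cm, from the image analysis
circle_area = math.pi * (side_large / 2) ** 2    # radius is half the side
small_square_area = side_large ** 2 / 2          # diagonal squared over 2
shaded = circle_area - small_square_area

print(f"circle: {circle_area:.2f} cm^2")         # circle: 78.54 cm^2
print(f"small square: {small_square_area:.2f} cm^2")  # small square: 50.00 cm^2
print(f"shaded: {shaded:.2f} cm^2")              # shaded: 28.54 cm^2
```

Using the exact value of the small square's area (50 cm²) rather than squaring the rounded side length avoids accumulating rounding error.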

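Technique Comparison and Selection #

text

┌─────────────────────────────────────────────────────────────┐
│              Comparison of Advanced Techniques              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Technique       Structure  Complexity Best for             │
│  ─────────────────────────────────────────────────────────  │
│  CoT             Linear     Low        Simple reasoning     │
│  ToT             Tree       Medium     Multi-path search    │
│  GoT             Graph      High       Complex reasoning    │
│  Multimodal CoT  Linear     Medium     Cross-modal reasoning│
│                                                             │
│  Selection guide:                                           │
│  ├── A single reasoning path suffices: use basic CoT        │
│  ├── Several paths must be explored: use ToT                │
│  ├── Several thoughts must be merged: use GoT               │
│  └── Images or tables are involved: use multimodal CoT      │
│                                                             │
└─────────────────────────────────────────────────────────────┘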

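Performance Optimization #

Token Optimization #

python

def optimized_cot(question, max_steps=5):
    """
    Token-efficient chain of thought.
    """
    prompt = f"""
    {question}

    Reason concisely in at most {max_steps} steps.
    Write only the key information in each step.
    """

    # `llm` is assumed to accept a `max_tokens` cap on the response length.
    return llm.generate(prompt, max_tokens=500)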
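
Token cost matters even more for tree search, because every node costs several model calls. As a rough sketch, the function below counts an upper bound on the calls made by the `TreeOfThoughts.solve` loop shown earlier in this document: one generation call per beam node, one evaluation call per generated thought, and one final-state check per node kept in the beam. It assumes every generation call returns exactly `num_thoughts` thoughts and that no early exit occurs.

```python
def tot_llm_calls(max_depth: int, beam_width: int, num_thoughts: int = 3) -> int:
    """Upper bound on LLM calls for a beam-search ToT run."""
    calls = 0
    beam_size = 1  # the search starts from the root alone
    for _ in range(max_depth):
        generate_calls = beam_size                  # one per beam node
        evaluate_calls = beam_size * num_thoughts   # one per new thought
        beam_size = min(beam_width, beam_size * num_thoughts)
        final_checks = beam_size                    # one per kept node
        calls += generate_calls + evaluate_calls + final_checks
    return calls


print(tot_llm_calls(max_depth=4, beam_width=3))  # 52
print(tot_llm_calls(max_depth=1, beam_width=3))  # 7
```

For the usage example's settings (depth 4, width 3), that is up to 52 calls compared with a single call for basic CoT, which is why short prompts and capped responses pay off quickly.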

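Caching Strategy #

python

from functools import lru_cache

@lru_cache(maxsize=100)
def cached_thought_generation(thought):
    """
    Generate a response for `thought`, caching results by prompt text.
    """
    # The string itself is the cache key. Passing hash(thought) instead
    # would send an integer to the model rather than the prompt.
    return llm.generate(thought)

def get_cached_thought(thought):
    """
    Return the cached response for `thought`, generating it on a miss.
    """
    return cached_thought_generation(thought)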
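
To see the cache at work without a real model, the sketch below wraps a hypothetical `fake_generate` function that records every call. The second request for the same prompt is served from the cache, which `cache_info()` confirms.

```python
from functools import lru_cache

calls = []  # records every prompt that actually reaches the fake model


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; a real client would go here."""
    calls.append(prompt)
    return f"thought about: {prompt}"


@lru_cache(maxsize=100)
def cached_generate(prompt: str) -> str:
    # The prompt string itself is the cache key.
    return fake_generate(prompt)


first = cached_generate("Split 24 matchsticks into squares")
second = cached_generate("Split 24 matchsticks into squares")
info = cached_generate.cache_info()

print(len(calls))              # 1
print(info.hits, info.misses)  # 1 1
```

Caching only helps when identical prompts recur and a repeated answer is acceptable; with sampling enabled, it also removes the variation between calls.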

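Parallel Processing #

python

import asyncio

async def parallel_thought_generation(thoughts):
    """
    Generate several thoughts concurrently.
    """
    # `llm.generate_async` is assumed to be a coroutine function.
    tasks = [llm.generate_async(thought) for thought in thoughts]
    # gather preserves the input order in its results.
    results = await asyncio.gather(*tasks)
    return results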
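
In practice, the number of simultaneous requests usually needs a cap, for example to stay within a provider's rate limit. The sketch below adds an `asyncio.Semaphore` around a hypothetical `fake_generate_async` coroutine; the function names and the limit of 2 are assumptions made for this example.

```python
import asyncio
from typing import List


async def fake_generate_async(prompt: str) -> str:
    """Hypothetical stand-in for an async LLM call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"result for {prompt}"


async def bounded_gather(prompts: List[str], max_concurrency: int = 2) -> List[str]:
    """Run all prompts concurrently, with a cap on calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(prompt: str) -> str:
        async with semaphore:  # at most max_concurrency calls at once
            return await fake_generate_async(prompt)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(p) for p in prompts))


results = asyncio.run(bounded_gather(["A", "B", "C"]))
print(results)  # ['result for A', 'result for B', 'result for C']
```

This pattern fits the sibling thoughts at one tree level well, since they are independent of each other; thoughts at different levels still have to run in sequence.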

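Next Steps #

Now that you have covered the advanced chain-of-thought techniques, move on to practical applications and consolidate what you have learned through real-world cases.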