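DSPy Optimizers #

What Is an Optimizer? #

Optimizers are one of DSPy's core ideas: given training data and a metric, an optimizer automatically tunes a module. Depending on the optimizer, that means generating few-shot examples, rewriting the prompt instructions, or even fine-tuning the underlying model.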

text

┌─────────────────────────────────────────────────────────────┐
│                    What an optimizer does                   │
├─────────────────────────────────────────────────────────────┤
│                                                             │
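│  Traditional workflow:                                      │
│  write prompt -> test -> tweak -> retest -> ...             │
│                                                             │
│  With a DSPy optimizer:                                     │
│  training data -> pick optimizer -> compile -> tuned module │
│                                                             │
│  An optimizer can:                                          │
│  1. Generate few-shot examples automatically                │
│  2. Refine the wording of instructions                      │
│  3. Select the best reasoning path                          │
│  4. Adapt prompts to different models                       │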
│                                                             │
└─────────────────────────────────────────────────────────────┘

Optimizer Overview #

text

┌─────────────────────────────────────────────────────────────┐
│                    DSPy optimizer types                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ BootstrapFewShot                                     │   │
│  │ Bootstraps few-shot examples from training data      │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ MIPRO                                                │   │
│  │ Jointly optimizes instructions and few-shot demos    │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ BootstrapFinetune                                    │   │
│  │ Bootstraps training data, then fine-tunes the model  │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ KNNFewShot                                           │   │
│  │ Picks the k most similar examples for each input     │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ Ensemble                                              │   │
│  │ Combines several optimized modules                   │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Preparation #

Creating Training Data #

python

from dspy import Example

# Each Example pairs a question with its gold answer;
# with_inputs("question") marks which field is the input.
trainset = [
    Example(
        question="Who created Python?",
        answer="Python was created by Guido van Rossum"
    ).with_inputs("question"),
    Example(
        question="Who created JavaScript?",
        answer="JavaScript was created by Brendan Eich"
    ).with_inputs("question"),
    Example(
        question="Who created Linux?",
        answer="Linux was created by Linus Torvalds"
    ).with_inputs("question"),
]

Defining Evaluation Metrics #

python

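from difflib import SequenceMatcher

def validate_answer(example, pred, trace=None):
    # Strict metric: case-insensitive exact match, returns True/False.
    return example.answer.lower() == pred.answer.lower()

def semantic_match(example, pred, trace=None):
    # Soft metric: character-level similarity ratio in [0, 1].
    # Despite the name, SequenceMatcher compares surface strings, not meaning.
    return SequenceMatcher(None, example.answer.lower(), pred.answer.lower()).ratio()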
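
Because a metric is just a function of an example and a prediction, it can be checked offline before any language model is involved. The sketch below repeats the two metrics and calls them with `SimpleNamespace` objects; those stand-ins (and the sample answers) are assumptions for illustration, not DSPy types.

```python
from difflib import SequenceMatcher
from types import SimpleNamespace

def validate_answer(example, pred, trace=None):
    # Case-insensitive exact match.
    return example.answer.lower() == pred.answer.lower()

def semantic_match(example, pred, trace=None):
    # Character-level similarity in [0, 1]; 1.0 means identical strings.
    return SequenceMatcher(None, example.answer.lower(), pred.answer.lower()).ratio()

# Stand-ins for an example and two predictions: anything with an .answer attribute.
gold = SimpleNamespace(answer="Guido van Rossum")
exact = SimpleNamespace(answer="guido van rossum")
wordy = SimpleNamespace(answer="Guido van Rossum created Python")

print(validate_answer(gold, exact))  # same text apart from letter case
print(validate_answer(gold, wordy))  # extra words break an exact match
print(round(semantic_match(gold, wordy), 2))
```

Running a metric on a handful of hand-written cases like this is a cheap way to see whether it is too strict or too lenient before spending optimizer budget on it.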

BootstrapFewShot #

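BootstrapFewShot is a common first choice. It runs the module on the training set and keeps the runs that pass the metric as few-shot examples.

Basic Usage #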

python

import dspy
from dspy.teleprompt import BootstrapFewShot

lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

class QA(dspy.Signature):
    """Answer the question."""
    question = dspy.InputField()
    answer = dspy.OutputField()

class SimpleQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.Predict(QA)
    
    def forward(self, question):
        return self.predict(question=question)

optimizer = BootstrapFewShot(
    metric=validate_answer,
    max_bootstrapped_demos=4,
    max_labeled_demos=16
)

optimized_qa = optimizer.compile(SimpleQA(), trainset=trainset)

Parameter Reference #

python

optimizer = BootstrapFewShot(
    metric=validate_answer,
    max_bootstrapped_demos=4,
    max_labeled_demos=16,
    max_rounds=1,
    max_errors=10
)

text

┌─────────────────────────────────────────────────────────────┐
│                    BootstrapFewShot parameters              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
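│  metric: evaluation metric function                         │
│  - Decides whether an output is correct                     │
│  - Returns True/False or a numeric score                    │
│                                                             │
│  max_bootstrapped_demos: max bootstrapped demos             │
│  - Demos generated by running the program itself            │
│  - Default: 4                                               │
│                                                             │
│  max_labeled_demos: max labeled demos                       │
│  - Examples taken directly from the training set            │
│  - Default: 16                                              │
│                                                             │
│  max_rounds: max bootstrapping rounds                       │
│  - Number of bootstrapping passes                           │
│  - Default: 1                                               │
│                                                             │
│  max_errors: max errors tolerated                           │
│  - Errors allowed before bootstrapping aborts               │
│  - Default varies by DSPy version                           │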
│                                                             │
└─────────────────────────────────────────────────────────────┘

How It Works #

text

┌─────────────────────────────────────────────────────────────┐
│                    BootstrapFewShot workflow                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
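│  1. Initialize                                              │
│     - Load the training data                                │
│     - Prepare the module                                    │
│                                                             │
│  2. Bootstrap                                               │
│     - Run the module on each training example               │
│     - Keep the run as a demo if the metric passes           │
│     - Repeat until the demo limit is reached                │
│                                                             │
│  3. Compile the module                                      │
│     - Attach the collected demos to the module              │
│     - Use them as few-shot examples in the prompt           │
│                                                             │
│  4. Return the optimized module                             │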
│                                                             │
└─────────────────────────────────────────────────────────────┘
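
The bootstrapping step above can be sketched in plain Python. This is a simplified illustration of the idea, not DSPy's implementation: `bootstrap_demos`, `toy_program`, and the arithmetic data are hypothetical names made up for the example, and a lookup table stands in for an LM-backed module.

```python
from types import SimpleNamespace

def bootstrap_demos(program, trainset, metric, max_demos=4):
    """Collect up to max_demos training examples that the program already solves."""
    demos = []
    for example in trainset:
        if len(demos) >= max_demos:
            break
        try:
            pred = program(example.question)
        except Exception:
            continue  # skip examples that make the program fail
        if metric(example, pred):
            # Keep the input together with the program's own validated output.
            demos.append(SimpleNamespace(question=example.question, answer=pred.answer))
    return demos

# Hypothetical stand-in for a module: a lookup table with one wrong entry.
_KNOWN = {"2+2": "4", "3+3": "6", "5+5": "11"}

def toy_program(question):
    return SimpleNamespace(answer=_KNOWN.get(question, "unknown"))

def exact_match(example, pred, trace=None):
    return example.answer == pred.answer

toy_trainset = [
    SimpleNamespace(question="2+2", answer="4"),
    SimpleNamespace(question="5+5", answer="10"),
    SimpleNamespace(question="3+3", answer="6"),
]

demos = bootstrap_demos(toy_program, toy_trainset, exact_match, max_demos=4)
print([d.question for d in demos])  # the "5+5" run fails the metric and is dropped
```

The point of filtering by the metric is that only runs known to be correct end up in the prompt, so a wrong answer is never shown to the model as a demonstration.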

Complete Example #

python

import dspy
from dspy import Example
from dspy.teleprompt import BootstrapFewShot

lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

class QA(dspy.Signature):
    """Answer questions about programming languages."""
    question = dspy.InputField()
    answer = dspy.OutputField()

class ProgrammingQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.predict = dspy.Predict(QA)
    
    def forward(self, question):
        return self.predict(question=question)

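trainset = [
    Example(question="What kind of language is Python?", answer="Python is an interpreted, object-oriented programming language").with_inputs("question"),
    Example(question="What are the main features of Java?", answer="Java is an object-oriented, cross-platform, strongly typed programming language").with_inputs("question"),
    Example(question="What is JavaScript used for?", answer="JavaScript is mainly used for web front-end development").with_inputs("question"),
    Example(question="What are the features of Go?", answer="Go is a statically typed, compiled language with strong concurrency support").with_inputs("question"),
]

def validate_answer(example, pred, trace=None):
    # Passes when the gold answer appears verbatim (case-insensitive) in the prediction.
    return example.answer.lower() in pred.answer.lower()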

optimizer = BootstrapFewShot(
    metric=validate_answer,
    max_bootstrapped_demos=3
)

optimized_qa = optimizer.compile(ProgrammingQA(), trainset=trainset)

result = optimized_qa(question="What are the main features of Rust?")
print(result.answer)

MIPRO #

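MIPRO (Multiprompt Instruction PRoposal Optimizer, shipped as MIPROv2 in recent DSPy releases) is a more powerful optimizer that tunes the instructions and the few-shot examples together.

Basic Usage #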

python

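import dspy
from dspy.teleprompt import MIPROv2  # recent DSPy releases export MIPROv2 rather than MIPRO

lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

# my_module, trainset, and valset are placeholders for your own module and data.
optimizer = MIPROv2(
    metric=validate_answer,
    auto="light",  # preset search budget: "light", "medium", or "heavy"
    num_threads=4,
    max_bootstrapped_demos=4,
    max_labeled_demos=16
)

optimized_module = optimizer.compile(
    my_module,
    trainset=trainset,
    valset=valset
)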

Parameter Reference #

python

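# MIPROv2 has no max_rounds argument; the search budget comes from `auto`,
# or from num_candidates plus a num_trials value passed to compile().
optimizer = MIPROv2(
    metric=validate_answer,
    num_threads=4,
    max_bootstrapped_demos=4,
    max_labeled_demos=16,
    auto=None,          # turn off the preset budget so num_candidates applies
    num_candidates=10,
    teacher_settings={}
)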

text

┌─────────────────────────────────────────────────────────────┐
│                    MIPRO parameters                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
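│  metric: evaluation metric                                  │
│  num_threads: number of parallel threads                    │
│  max_bootstrapped_demos: max bootstrapped demos             │
│  max_labeled_demos: max labeled demos                       │
│  auto: preset search budget (light/medium/heavy)            │
│  num_candidates: number of candidate instructions           │
│  teacher_settings: settings for the teacher model           │
│                                                             │
│  MIPRO highlights:                                          │
│  - Optimizes instruction wording                            │
│  - Optimizes example selection                              │
│  - Searches over many trials                                │
│  - Supports parallel evaluation                             │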
│                                                             │
└─────────────────────────────────────────────────────────────┘

How It Works #

text

┌─────────────────────────────────────────────────────────────┐
│                    MIPRO workflow                           │
├─────────────────────────────────────────────────────────────┤
│                                                             │
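│  Stage 1: Bootstrapping                                     │
│  - Generate examples with the teacher model                 │
│  - Keep the examples that pass the metric                   │
│                                                             │
│  Stage 2: Instruction proposal                              │
│  - Analyze the task                                         │
│  - Generate several candidate instructions                  │
│                                                             │
│  Stage 3: Evaluation and selection                          │
│  - Score each candidate on the validation set               │
│  - Pick the best instruction/demo combination               │
│                                                             │
│  Stage 4: Iterative search                                  │
│  - Repeat proposal and evaluation over trials               │
│  - Stop when the trial budget is used up                    │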
│                                                             │
└─────────────────────────────────────────────────────────────┘
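
The evaluate-and-select stages can be illustrated with a small search over (instruction, demo set) pairs. This sketch deliberately swaps in plain random search, whereas MIPROv2 is documented as using Bayesian optimization; `search_prompt`, `toy_score`, and the candidate strings are all hypothetical, and `toy_score` stands in for "run the program on the validation set and average the metric".

```python
import itertools
import random

def search_prompt(instructions, demo_sets, score_fn, num_trials, seed=0):
    """Score up to num_trials random (instruction, demos) pairs and return the best."""
    rng = random.Random(seed)
    candidates = list(itertools.product(instructions, demo_sets))
    rng.shuffle(candidates)
    best_pair, best_score = None, float("-inf")
    for instruction, demos in candidates[:num_trials]:
        score = score_fn(instruction, demos)
        if score > best_score:
            best_pair, best_score = (instruction, demos), score
    return best_pair, best_score

candidate_instructions = ["Answer the question.", "Think step by step, then answer."]
candidate_demo_sets = [(), ("demo A",), ("demo A", "demo B")]

def toy_score(instruction, demos):
    # Made-up scoring rule: reward one phrasing and longer demo sets.
    return 10 * ("step by step" in instruction) + len(demos)

# With num_trials equal to the number of combinations, every pair is scored once.
best_pair, best_score = search_prompt(
    candidate_instructions, candidate_demo_sets, toy_score, num_trials=6
)
print(best_pair, best_score)
```

A smaller `num_trials` trades search quality for cost, which is the same trade-off a search budget setting controls in a real optimizer.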

BootstrapFinetune #

BootstrapFinetune bootstraps training data from the program's own runs and uses it to fine-tune the model.

Basic Usage #

python

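import dspy
from dspy.teleprompt import BootstrapFinetune

lm = dspy.LM('openai/gpt-4o-mini')
dspy.configure(lm=lm)

# Sketch for recent DSPy releases, where the model to fine-tune is the LM
# attached to the student module; older releases took different arguments.
# my_module and trainset are placeholders for your own module and data.
optimizer = BootstrapFinetune(
    metric=validate_answer,
    num_threads=4
)

# The LM whose weights will be fine-tuned (the provider must support fine-tuning).
my_module.set_lm(dspy.LM('openai/gpt-3.5-turbo'))

optimized_module = optimizer.compile(
    my_module,
    trainset=trainset
)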

When to Use It #

text

┌─────────────────────────────────────────────────────────────┐
│                    When to use BootstrapFinetune            │
├─────────────────────────────────────────────────────────────┤
│                                                             │
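│  Good fit:                                                  │
│  ✅ Plenty of training data                                 │
│  ✅ Need lower inference cost                               │
│  ✅ Need domain-specific tuning                             │
│  ✅ Need faster responses                                   │
│                                                             │
│  Poor fit:                                                  │
│  ❌ Very little training data                               │
│  ❌ Task changes frequently                                 │
│  ❌ No fine-tuning resources                                │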
│                                                             │
└─────────────────────────────────────────────────────────────┘

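KNN Optimizer (KNNFewShot) #

The KNN optimizer dynamically selects the training examples most similar to the current input and uses them as demos.

Basic Usage #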

python

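import dspy
from dspy.teleprompt import KNNFewShot  # the optimizer class is KNNFewShot

# Recent DSPy releases expect a vectorizer that embeds the examples;
# the embedding model named here is only an example.
optimizer = KNNFewShot(
    k=3,
    trainset=trainset,
    vectorizer=dspy.Embedder('openai/text-embedding-3-small')
)

# my_module is a placeholder for your own module.
optimized_module = optimizer.compile(my_module)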

How It Works #

text

┌─────────────────────────────────────────────────────────────┐
│                    How the KNN optimizer works              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
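│  1. Vectorize: embed every training example                 │
│                                                             │
│  2. Query: compare a new input with every example           │
│                                                             │
│  3. Select: keep the k most similar examples                │
│                                                             │
│  4. Generate: answer using the selected examples            │
│                                                             │
│  Advantages:                                                │
│  - Picks the most relevant examples per input               │
│  - Adapts to different kinds of input                       │
│  - No recompilation needed for new inputs                   │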
│                                                             │
└─────────────────────────────────────────────────────────────┘
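
The vectorize, query, and select steps can be sketched with the standard library alone. The version below assumes bag-of-words vectors and cosine similarity in place of a real embedding model; `vectorize`, `cosine`, `nearest_examples`, and the sample questions are hypothetical names and data for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words term counts; a real setup would use an embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_examples(query, examples, k=2):
    """Return the k examples whose question is most similar to the query."""
    q = vectorize(query)
    ranked = sorted(examples, key=lambda ex: cosine(q, vectorize(ex["question"])), reverse=True)
    return ranked[:k]

sample_examples = [
    {"question": "who created python", "answer": "Guido van Rossum"},
    {"question": "who created linux", "answer": "Linus Torvalds"},
    {"question": "what is a python decorator", "answer": "A function that wraps another function"},
]

selected = nearest_examples("what is a python generator", sample_examples, k=2)
for ex in selected:
    print(ex["question"])
```

Because selection happens per query, two different inputs can receive two different demo sets from the same training data.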

Ensemble Optimizer #

The Ensemble optimizer combines several optimized modules into one.

Basic Usage #

python

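import dspy
from dspy.teleprompt import BootstrapFewShot, Ensemble

# metric1, metric2, and MyModule are placeholders for your own metrics and module.
optimizer1 = BootstrapFewShot(metric=metric1, max_bootstrapped_demos=4)
optimizer2 = BootstrapFewShot(metric=metric2, max_bootstrapped_demos=4)

module1 = optimizer1.compile(MyModule(), trainset=trainset)
module2 = optimizer2.compile(MyModule(), trainset=trainset)

# Ensemble is itself an optimizer: compile() takes the list of programs,
# and reduce_fn merges their outputs (dspy.majority takes a majority vote).
ensemble = Ensemble(reduce_fn=dspy.majority).compile([module1, module2])
result = ensemble(question="A test question")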
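
One simple way to combine the outputs of several modules is a majority vote over their answers. The sketch below shows that idea in isolation; `majority_vote`, `run_ensemble`, and the three lambda "modules" are hypothetical stand-ins, not DSPy objects.

```python
from collections import Counter
from types import SimpleNamespace

def majority_vote(predictions):
    """Return the most common normalized answer; ties go to the answer seen first."""
    counts = Counter(p.answer.strip().lower() for p in predictions)
    winner, _ = counts.most_common(1)[0]
    return winner

def run_ensemble(programs, question):
    # Ask every program, then reduce the predictions to a single answer.
    return majority_vote([program(question) for program in programs])

# Hypothetical stand-ins for three separately compiled modules.
toy_programs = [
    lambda q: SimpleNamespace(answer="Paris"),
    lambda q: SimpleNamespace(answer="paris "),
    lambda q: SimpleNamespace(answer="Lyon"),
]

voted = run_ensemble(toy_programs, "What is the capital of France?")
print(voted)
```

Every member is called for every question, which is why an ensemble costs more latency and tokens than a single module.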

Choosing an Optimizer #

text

┌─────────────────────────────────────────────────────────────┐
│                    Optimizer selection guide                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
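│  BootstrapFewShot:                                          │
│  - Good default for general use                             │
│  - Moderate amount of training data                         │
│  - Need quick optimization                                  │
│                                                             │
│  MIPRO:                                                     │
│  - Need to optimize instruction wording                     │
│  - Ample compute budget                                     │
│  - Aiming for the best quality                              │
│                                                             │
│  BootstrapFinetune:                                         │
│  - Large amount of training data                            │
│  - Need to reduce cost                                      │
│  - Domain-specific tuning                                   │
│                                                             │
│  KNN:                                                       │
│  - Varied input types                                       │
│  - Need dynamic example selection                           │
│  - Evenly distributed training data                         │
│                                                             │
│  Ensemble:                                                  │
│  - Need the most stable results                             │
│  - Several optimized versions available                     │
│  - Higher latency is acceptable                             │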
│                                                             │
└─────────────────────────────────────────────────────────────┘

Evaluating the Results #

Using Evaluate #

python

from dspy.evaluate import Evaluate

evaluator = Evaluate(
    devset=testset,
    metric=validate_answer,
    num_threads=4,
    display_progress=True,
    display_table=5
)

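# original_module, optimized_module, and testset are placeholders for your own objects.
score_before = evaluator(original_module)
score_after = evaluator(optimized_module)

print(f"Before optimization: {score_before}")
print(f"After optimization: {score_after}")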

Side-by-Side Comparison #

python

def compare_modules(original, optimized, testset):
    results = []
    for example in testset:
        orig_result = original(question=example.question)
        opt_result = optimized(question=example.question)
        
        results.append({
            'question': example.question,
            'expected': example.answer,
            'original': orig_result.answer,
            'optimized': opt_result.answer
        })
    
    return results
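
A list of per-example rows like the one returned above is easier to read once it is reduced to counts. The following sketch assumes rows with `expected`, `original`, and `optimized` keys; `summarize`, `contains`, and the sample rows are hypothetical and only illustrate the bookkeeping.

```python
def summarize(results, is_correct):
    """Count how many examples improved, regressed, or stayed the same."""
    summary = {"improved": 0, "regressed": 0, "unchanged": 0}
    for row in results:
        before = is_correct(row["expected"], row["original"])
        after = is_correct(row["expected"], row["optimized"])
        if after and not before:
            summary["improved"] += 1
        elif before and not after:
            summary["regressed"] += 1
        else:
            summary["unchanged"] += 1
    return summary

def contains(expected, actual):
    # Lenient check: the expected text appears somewhere in the output.
    return expected.lower() in actual.lower()

# Made-up rows in the same shape as the comparison results.
sample_results = [
    {"expected": "Guido van Rossum", "original": "I am not sure", "optimized": "Guido van Rossum"},
    {"expected": "Brendan Eich", "original": "Brendan Eich created it", "optimized": "Brendan Eich"},
    {"expected": "Linus Torvalds", "original": "Linus Torvalds", "optimized": "Richard Stallman"},
]

report = summarize(sample_results, contains)
print(report)
```

Looking at regressions separately matters because an optimizer can raise the average score while still breaking examples that used to work.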

Saving and Loading Optimized Modules #

Saving #

python

optimized_module.save("optimized_qa.json")

Loading #

python

loaded_module = MyModule()
loaded_module.load("optimized_qa.json")

Custom Optimizers #

Basic Structure #

python

from dspy.teleprompt import Teleprompter

class MyOptimizer(Teleprompter):
    def __init__(self, metric, **kwargs):
        self.metric = metric
        self.kwargs = kwargs

    def compile(self, student, *, teacher=None, trainset):
        # Work on a copy so the caller's module is left untouched.
        optimized = student.deepcopy()

        for example in trainset:
            result = optimized(**example.inputs())
            if self.metric(example, result):
                pass  # placeholder: record or use the successful example here

        return optimized

Complete Example #

python

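from dspy.teleprompt import Teleprompter
import random

class RandomFewShot(Teleprompter):
    def __init__(self, metric, k=3):
        self.metric = metric
        self.k = k

    def compile(self, module, trainset):
        # Keep the training examples the unoptimized module already answers correctly.
        successful_examples = []

        for example in trainset:
            result = module(**example.inputs())
            if self.metric(example, result):
                successful_examples.append(example)

        # Randomly pick at most k of them as demos.
        demos = random.sample(
            successful_examples,
            min(self.k, len(successful_examples))
        )

        # Demos live on the module's predictors, so attach them there,
        # and do it on a copy so the input module is not mutated.
        compiled = module.deepcopy()
        for predictor in compiled.predictors():
            predictor.demos = demos
        return compiled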

Best Practices #

1. Data Quality #

python

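# Placeholder strings: in a real training set every example should be a
# distinct, accurate question/answer pair rather than 100 copies of one.
trainset = [
    Example(
        question="A high-quality question",
        answer="An accurate, detailed answer"
    ).with_inputs("question")
    for _ in range(100)
]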
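
Part of data quality is mechanical: removing duplicates and empty answers, and holding out a validation split. The sketch below shows one way to do that on plain (question, answer) tuples before wrapping them in examples; `dedupe_and_split`, its parameters, and the sample pairs are assumptions made for illustration.

```python
import random

def dedupe_and_split(pairs, val_fraction=0.25, seed=0):
    """Drop duplicate questions and empty answers, shuffle, then split train/validation."""
    seen, unique = set(), []
    for question, answer in pairs:
        key = question.strip().lower()
        if key and answer.strip() and key not in seen:
            seen.add(key)
            unique.append((question.strip(), answer.strip()))
    random.Random(seed).shuffle(unique)  # fixed seed keeps the split reproducible
    n_val = int(len(unique) * val_fraction)
    return unique[n_val:], unique[:n_val]

raw_pairs = [
    ("Who created Python?", "Guido van Rossum"),
    ("who created python? ", "Guido"),            # duplicate after normalization
    ("Who created Linux?", "Linus Torvalds"),
    ("Who created JavaScript?", "Brendan Eich"),
    ("Who created Go?", ""),                      # missing answer
    ("Who created Ruby?", "Yukihiro Matsumoto"),
]

train_pairs, val_pairs = dedupe_and_split(raw_pairs)
print(len(train_pairs), len(val_pairs))
```

Keeping the validation pairs out of the training list means a later score reflects generalization rather than memorized demos.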

2. Metric Design #

python

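# detailed_check and quick_check are placeholders for your own checks.
def good_metric(example, pred, trace=None):
    # trace is set while an optimizer is bootstrapping demos, so be strict there.
    if trace is not None:
        return detailed_check(example, pred)
    return quick_check(example, pred)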
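
A concrete version of a trace-aware metric might return a strict pass/fail while demos are being bootstrapped and a graded score during evaluation. The sketch below assumes a character-similarity score and a 0.8 threshold; both choices, and the name `graded_metric`, are illustrative rather than recommended values.

```python
from difflib import SequenceMatcher
from types import SimpleNamespace

def graded_metric(example, pred, trace=None, threshold=0.8):
    similarity = SequenceMatcher(None, example.answer.lower(), pred.answer.lower()).ratio()
    if trace is not None:
        # Bootstrapping: accept a run as a demo only if it clears the threshold.
        return similarity >= threshold
    # Evaluation: report the graded score.
    return similarity

gold_answer = SimpleNamespace(answer="Linus Torvalds")
good_pred = SimpleNamespace(answer="linus torvalds")
bad_pred = SimpleNamespace(answer="Bill Gates")

print(graded_metric(gold_answer, good_pred))            # graded score
print(graded_metric(gold_answer, good_pred, trace=[]))  # strict pass/fail
print(graded_metric(gold_answer, bad_pred, trace=[]))
```

Checking `trace is not None` rather than the truthiness of `trace` keeps the strict branch active even when the trace happens to be an empty list.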

3. Incremental Optimization #

python

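from dspy.teleprompt import BootstrapFewShot, MIPROv2

# Start with a cheap optimizer, then refine with a more expensive one.
optimizer1 = BootstrapFewShot(metric=metric, max_bootstrapped_demos=2)
module = optimizer1.compile(module, trainset=trainset)

# MIPROv2 takes a search budget (auto) rather than a max_rounds argument.
optimizer2 = MIPROv2(metric=metric, auto="light")
module = optimizer2.compile(module, trainset=trainset, valset=valset)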

Next Steps #

Now that you know how to use optimizers, the next topic is retrievers and how to build RAG applications.