Quick Start #

Your First MLflow Experiment #

Let's start with a simple machine learning experiment to try out MLflow's core features.

Prerequisites #

bash

pip install mlflow scikit-learn pandas numpy

Complete Example Code #

python

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Load the Iris dataset
data = load_iris()
X, y = data.data, data.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Select the experiment (it is created automatically if it does not exist)
mlflow.set_experiment("iris-classification")

with mlflow.start_run():
    # Hyperparameters for this run
    n_estimators = 100
    max_depth = 5
    random_state = 42

    # Log the hyperparameters
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)
    mlflow.log_param("random_state", random_state)

    # Train the model
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=random_state
    )
    model.fit(X_train, y_train)

    # Evaluate on the held-out test set
    y_pred = model.predict(X_test)

    accuracy = accuracy_score(y_test, y_pred)
    precision = precision_score(y_test, y_pred, average='weighted')
    recall = recall_score(y_test, y_pred, average='weighted')

    # Log the evaluation metrics
    mlflow.log_metric("accuracy", accuracy)
    mlflow.log_metric("precision", precision)
    mlflow.log_metric("recall", recall)

    # Log the trained model as an artifact of this run
    mlflow.sklearn.log_model(model, "model")

    print(f"Accuracy: {accuracy:.4f}")
    print(f"Precision: {precision:.4f}")
    print(f"Recall: {recall:.4f}")

Running the Experiment #

Save the code above as train.py, then run it:

bash

python train.py

Launching the MLflow UI #

Launch Command #

bash

mlflow ui

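Opening the UI #

Open http://localhost:5000 in your browser. Run the command from the directory where train.py was executed so the UI can find the mlruns folder.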

text

┌─────────────────────────────────────────────────────────────┐
│                    MLflow UI                                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Experiments                                         │   │
│  │  ├── Default                                         │   │
│  │  └── iris-classification (1 run)                     │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Runs                                                │   │
│  │  ┌───────────────────────────────────────────────┐   │   │
│  │  │ Run ID: abc123...                             │   │   │
│  │  │ Start Time: 2024-01-01 10:00:00               │   │   │
│  │  │ Metrics: accuracy=0.9667                      │   │   │
│  │  │ Parameters: n_estimators=100, max_depth=5     │   │   │
│  │  └───────────────────────────────────────────────┘   │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘

MLflow Core Concepts #

Experiment #

text

┌─────────────────────────────────────────────────────────────┐
│                      Experiment                             │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Definition: a container for a group of related runs        │
│                                                             │
│  Purpose:                                                   │
│  ├── Organize and manage related runs                       │
│  ├── Compare parameters and metrics across runs             │
│  └── Visualize results                                      │
│                                                             │
│  How to create:                                             │
│  ├── mlflow.set_experiment("name")                          │
│  ├── mlflow.create_experiment("name")                       │
│  └── Through the UI                                         │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Run #

text

┌─────────────────────────────────────────────────────────────┐
│                         Run                                 │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Definition: a record of one model training execution       │
│                                                             │
│  Contents:                                                  │
│  ├── Parameters: model hyperparameters                      │
│  ├── Metrics: evaluation metrics                            │
│  ├── Artifacts: output files (models, plots, etc.)          │
│  ├── Tags: custom labels                                    │
│  └── Metadata: time, user, code version, etc.               │
│                                                             │
│  Lifecycle (run status):                                    │
│  ├── RUNNING: in progress                                   │
│  ├── FINISHED: completed successfully                       │
│  ├── FAILED: ended with an error                            │
│  └── KILLED: terminated by the user                         │
│                                                             │
└─────────────────────────────────────────────────────────────┘

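Basic Operations #

Creating an Experiment #

python

import mlflow

# create_experiment raises an MlflowException if the name is already taken
experiment_id = mlflow.create_experiment(
    name="my-experiment",
    artifact_location="./mlruns",  # where this experiment's artifacts are stored
    tags={"project": "demo", "version": "1.0"}
)

print(f"Created experiment with ID: {experiment_id}")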

Setting the Active Experiment #

python

import mlflow

mlflow.set_experiment("my-experiment")

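Starting a Run #

python

import mlflow

# Option 1: context manager; the run ends automatically when the block exits
with mlflow.start_run():
    pass

# Option 2: give the run a readable name
with mlflow.start_run(run_name="my-run"):
    pass

# Option 3: start and end the run manually
run = mlflow.start_run()
try:
    pass
finally:
    mlflow.end_run()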

Logging Parameters #

python

import mlflow

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs", 100)
    
    params = {
        "optimizer": "adam",
        "dropout": 0.5
    }
    mlflow.log_params(params)

Logging Metrics #

python

import mlflow

with mlflow.start_run():
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("loss", 0.05)
    
    for epoch in range(10):
        loss = 0.1 * (10 - epoch) / 10
        mlflow.log_metric("train_loss", loss, step=epoch)

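Logging Artifacts #

python

import mlflow
import matplotlib.pyplot as plt

with mlflow.start_run():
    # Log a single file (config.yaml must already exist)
    mlflow.log_artifact("config.yaml")

    # Log every file in a directory (./outputs must already exist)
    mlflow.log_artifacts("./outputs")

    # Save a plot to disk, then log the image file
    plt.figure()
    plt.plot([1, 2, 3], [1, 4, 9])
    plt.savefig("plot.png")
    mlflow.log_artifact("plot.png")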

Setting Tags #

python

import mlflow

with mlflow.start_run():
    mlflow.set_tag("model_type", "RandomForest")
    mlflow.set_tag("dataset", "iris")
    mlflow.set_tags({
        "team": "ml-team",
        "priority": "high"
    })

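Autologging #

Scikit-learn Autologging #

python

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Enable autologging before training; fit() calls are then recorded automatically
mlflow.sklearn.autolog()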

data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, max_depth=5)
    model.fit(X_train, y_train)
    
    score = model.score(X_test, y_test)
    print(f"Test accuracy: {score:.4f}")

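TensorFlow/Keras Autologging #

python

import mlflow
import mlflow.tensorflow
import numpy as np
import tensorflow as tf

mlflow.tensorflow.autolog()

# Synthetic regression data so the example runs on its own
X_train = np.random.rand(200, 10).astype("float32")
y_train = np.random.rand(200, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer='adam', loss='mse')

with mlflow.start_run():
    # Training parameters and per-epoch losses are logged automatically
    model.fit(X_train, y_train, epochs=10, validation_split=0.2)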

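PyTorch Autologging #

python

import mlflow
import mlflow.pytorch
import torch
import torch.nn as nn

# Note: mlflow.pytorch.autolog() hooks into PyTorch Lightning's Trainer.fit().
# A hand-written training loop like the one below has to log explicitly.
mlflow.pytorch.autolog()

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

model = SimpleModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Synthetic data for illustration
X = torch.randn(100, 10)
y = torch.randn(100, 1)

with mlflow.start_run():
    mlflow.log_param("lr", 0.01)
    for epoch in range(10):
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimizer.step()
        mlflow.log_metric("train_loss", loss.item(), step=epoch)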

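Saving and Loading Models #

Saving a Model #

python

import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier

# X_train and y_train come from the train/test split in the first example
model = RandomForestClassifier()
model.fit(X_train, y_train)

with mlflow.start_run():
    # Store the model under the artifact path "model" of this run
    mlflow.sklearn.log_model(model, "model")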

Loading a Model #

python

import mlflow.sklearn

# Replace <run_id> with the ID of the run that logged the model
model = mlflow.sklearn.load_model("runs:/<run_id>/model")

predictions = model.predict(X_test)

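Using Model URIs #

python

import mlflow.sklearn

# These URIs refer to a model registered in the Model Registry as "my_model"

# By stage (stages are deprecated since MLflow 2.9; alias URIs of the form
# "models:/my_model@<alias>" are the recommended replacement)
model = mlflow.sklearn.load_model("models:/my_model/Production")

# By version number
model = mlflow.sklearn.load_model("models:/my_model/1")

# Latest registered version
model = mlflow.sklearn.load_model("models:/my_model/latest")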

Querying Experiment Results #

Listing Experiments #

python

import mlflow

experiments = mlflow.search_experiments()
for exp in experiments:
    print(f"Name: {exp.name}, ID: {exp.experiment_id}")

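Searching Runs #

python

import mlflow

# Returns a pandas DataFrame; replace "1" with the ID of your experiment
runs = mlflow.search_runs(
    experiment_ids=["1"],
    filter_string="metrics.accuracy > 0.9",
    order_by=["metrics.accuracy DESC"]
)

# Metric columns are prefixed with "metrics.", parameter columns with "params."
print(runs[["run_id", "metrics.accuracy", "params.n_estimators"]])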

Getting Run Details #

python

import mlflow

# Replace <run_id> with the ID of an existing run
run = mlflow.get_run("<run_id>")
print(f"Status: {run.info.status}")
print(f"Metrics: {run.data.metrics}")
print(f"Params: {run.data.params}")

UI Features #

Experiment List Page #

text

┌─────────────────────────────────────────────────────────────┐
│                    Experiment List Page                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Features:                                                  │
│  ├── View all experiments                                   │
│  ├── Create a new experiment                                │
│  ├── Search experiments                                     │
│  └── Delete experiments                                     │
│                                                             │
│  Displayed information:                                     │
│  ├── Experiment name                                        │
│  ├── Number of runs                                         │
│  ├── Creation time                                          │
│  └── Last update time                                       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Run Details Page #

text

┌─────────────────────────────────────────────────────────────┐
│                    Run Details Page                         │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Tabs:                                                      │
│  ├── Details: run details                                   │
│  │   ├── Run ID                                             │
│  │   ├── Status                                             │
│  │   ├── Start/end time                                     │
│  │   └── User                                               │
│  │                                                          │
│  ├── Parameters: list of parameters                         │
│  │                                                          │
│  ├── Metrics: list of metrics                               │
│  │   └── Can be shown as charts                             │
│  │                                                          │
│  ├── Artifacts: list of artifacts                           │
│  │   ├── Model files                                        │
│  │   ├── Plots                                              │
│  │   └── Other files                                        │
│  │                                                          │
│  └── Tags: list of tags                                     │
│                                                             │
└─────────────────────────────────────────────────────────────┘

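Comparing Runs #

text

┌─────────────────────────────────────────────────────────────┐
│                    Comparing Runs                           │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Select multiple runs → click Compare                       │
│                                                             │
│  Comparison views:                                          │
│  ├── Table view: parameters and metrics side by side        │
│  ├── Scatter plot: parameter vs. metric                     │
│  ├── Parallel coordinates plot: multi-dimensional view      │
│  └── Contour plot: effect of parameter combinations         │
│                                                             │
└─────────────────────────────────────────────────────────────┘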

Complete Example: Hyperparameter Tuning #

python

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier
import numpy as np

data = load_iris()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_experiment("hyperparameter-tuning")

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 7, None],
    "min_samples_split": [2, 5, 10]
}

best_score = 0
best_params = None

for n_est in param_grid["n_estimators"]:
    for max_d in param_grid["max_depth"]:
        for min_samples in param_grid["min_samples_split"]:
            with mlflow.start_run():
                mlflow.log_param("n_estimators", n_est)
                mlflow.log_param("max_depth", max_d if max_d else "None")
                mlflow.log_param("min_samples_split", min_samples)
                
                model = RandomForestClassifier(
                    n_estimators=n_est,
                    max_depth=max_d,
                    min_samples_split=min_samples,
                    random_state=42
                )
                
                cv_scores = cross_val_score(model, X_train, y_train, cv=5)
                mean_cv_score = cv_scores.mean()
                
                model.fit(X_train, y_train)
                test_score = model.score(X_test, y_test)
                
                mlflow.log_metric("cv_mean_accuracy", mean_cv_score)
                mlflow.log_metric("test_accuracy", test_score)
                
                mlflow.sklearn.log_model(model, "model")
                
                # Select on the cross-validation score; the test score is only
                # reported, so it remains an unbiased estimate
                if mean_cv_score > best_score:
                    best_score = mean_cv_score
                    best_params = {
                        "n_estimators": n_est,
                        "max_depth": max_d,
                        "min_samples_split": min_samples
                    }

                print(f"n_est={n_est}, max_depth={max_d}, min_samples={min_samples}")
                print(f"CV Score: {mean_cv_score:.4f}, Test Score: {test_score:.4f}")

print(f"\nBest CV Score: {best_score:.4f}")
print(f"Best Params: {best_params}")

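Best Practices #

1. Organize Experiments #

python

import mlflow

# Example values; encode the project and model type in the experiment name
project_name = "demo"
model_type = "random-forest"

mlflow.set_experiment(f"project-{project_name}/model-{model_type}")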

2. Naming Conventions #

python

import mlflow

with mlflow.start_run(run_name="rf-n100-d5-v1"):
    mlflow.set_tag("version", "1.0.0")
    mlflow.set_tag("author", "data-team")

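3. Record the Data Version #

python

import hashlib

import mlflow
import numpy as np

def get_data_hash(data):
    # Hash the raw bytes: str(array) abbreviates large arrays with "...",
    # so different datasets could otherwise produce the same hash
    array = np.ascontiguousarray(data)
    return hashlib.md5(array.tobytes()).hexdigest()[:8]

# X_train is the training array from the earlier examples
with mlflow.start_run():
    mlflow.log_param("data_hash", get_data_hash(X_train))
    mlflow.log_param("data_shape", X_train.shape)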

4. Use Parent Runs #

python

import mlflow

with mlflow.start_run(run_name="parent-run") as parent_run:
    for fold in range(5):
        with mlflow.start_run(run_name=f"fold-{fold}", nested=True):
            pass

Next Steps #

Now that you know the basics of MLflow, continue with experiment tracking to explore the advanced features of MLflow Tracking.