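Security Configuration #
Overview #
Security is a key requirement for enterprise Kubeflow deployments. This chapter covers authentication, authorization, network security, and secret management in Kubeflow so that you can build a secure machine learning platform.
Topics Covered #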
text
┌────────────────────────────────────────┐
│     Security Configuration Topics      │
├────────────────────────────────────────┤
│                                        │
│  Authentication:                       │
│  ├── Dex identity authentication       │
│  ├── OIDC integration                  │
│  ├── LDAP/AD integration               │
│  └── Static user configuration         │
│                                        │
│  Authorization:                        │
│  ├── RBAC configuration                │
│  ├── Namespace isolation               │
│  ├── Role management                   │
│  └── Permission control                │
│                                        │
│  Network security:                     │
│  ├── Network policies                  │
│  ├── Istio security                    │
│  ├── TLS configuration                 │
│  └── Access control                    │
│                                        │
│  Secret management:                    │
│  ├── Kubernetes Secrets                │
│  ├── Key rotation                      │
│  ├── External secret management        │
│  └── Audit logging                     │
│                                        │
└────────────────────────────────────────┘
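Authentication Configuration #
Dex Overview #
Dex is the identity service that Kubeflow uses for authentication. It delegates login to one or more upstream identity providers through connectors.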
text
Authentication methods supported by Dex:
├── OIDC (OpenID Connect)
│   ├── Google
│   ├── Azure AD
│   ├── Okta
│   └── Keycloak
├── LDAP/Active Directory
├── GitHub
├── GitLab
└── Static users
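Configuring Static Users #
yaml
apiVersion: v1
kind: Secret
metadata:
  name: dex-static-users
  namespace: dex
stringData:
  config.yaml: |
    # Static passwords require Dex's built-in password database.
    enablePasswordDB: true
    staticPasswords:
      # Each hash must be a complete 60-character bcrypt string; the values
      # shown here are truncated placeholders.
      - email: admin@example.com
        hash: $2a$10$2b2cU8CPhOTaGrs1HRQuA
        username: admin
        userID: "08a8684b-db88-4b73-90a9-3cd1661f5466"
      - email: user@example.com
        hash: $2a$10$2b2cU8CPhOTaGrs1HRQuA
        username: user
        userID: "08a8684b-db88-4b73-90a9-3cd1661f5467"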
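Dex stores static passwords as bcrypt hashes, and a complete bcrypt string is 60 characters long. The following is a minimal sketch, using only the Python standard library, of a format check that could be run on each `hash` value before the manifest is applied. The function name `looks_like_bcrypt` is hypothetical, and the check covers the shape of the string only; it does not verify any password.

```python
import re

# A bcrypt string is a version prefix ($2a$, $2b$ or $2y$), a two-digit
# cost, "$", then 53 characters (22 of salt followed by 31 of digest)
# drawn from the bcrypt base64 alphabet.
_BCRYPT_RE = re.compile(r"\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}")


def looks_like_bcrypt(value: str) -> bool:
    """Return True if `value` has the shape of a complete bcrypt hash."""
    return _BCRYPT_RE.fullmatch(value) is not None


# The placeholder used in the manifest is too short to be a real hash.
print(looks_like_bcrypt("$2a$10$2b2cU8CPhOTaGrs1HRQuA"))
```

A check like this is cheap to run in CI against every `staticPasswords` entry, which catches copy-and-paste truncation before Dex rejects a login.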
Configuring OIDC #
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
  # Adjust to the namespace where Dex runs; the upstream Kubeflow manifests use "auth".
  namespace: dex
data:
  config.yaml: |
    issuer: https://kubeflow.example.com/dex
    storage:
      type: kubernetes
      config:
        inCluster: true
    web:
      http: 0.0.0.0:5556
    connectors:
      - type: oidc
        name: Google
        id: google
        config:
          issuer: https://accounts.google.com
          clientID: your-client-id.apps.googleusercontent.com
          # Read from the Dex container's environment; supply the value from a Secret.
          clientSecret: $GOOGLE_CLIENT_SECRET
          redirectURI: https://kubeflow.example.com/dex/callback
          hostedDomains:
            - example.com
Configuring LDAP #
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
  namespace: dex
data:
  config.yaml: |
    # Only the connectors section is shown; merge it into the full Dex configuration.
    connectors:
      - type: ldap
        name: ActiveDirectory
        id: ad
        config:
          host: ldap.example.com:636
          insecureNoSSL: false
          bindDN: cn=admin,dc=example,dc=com
          # Placeholder; do not commit a real bind password in a ConfigMap.
          bindPW: admin-password
          usernamePrompt: Username
          userSearch:
            baseDN: dc=example,dc=com
            filter: "(objectClass=user)"
            username: sAMAccountName
            idAttr: DN
            emailAttr: mail
            nameAttr: displayName
          groupSearch:
            baseDN: dc=example,dc=com
            filter: "(objectClass=group)"
            userMatchers:
              - userAttr: DN
                groupAttr: member
            nameAttr: cn
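The Dex LDAP connector documentation describes the user lookup as the configured `filter` combined with an equality match on the `username` attribute, for example `(&(objectClass=user)(sAMAccountName=<login>))`. The sketch below reproduces that composition in Python so the effective query can be tested against the directory by hand. The helper names are hypothetical, and the escaping follows RFC 4515 rather than Dex's own source code.

```python
def escape_ldap_value(value: str) -> str:
    """Escape the characters RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_filter(base_filter: str, username_attr: str, login: str) -> str:
    """Combine the configured filter with an equality match on the login."""
    return f"(&{base_filter}({username_attr}={escape_ldap_value(login)}))"


# Values taken from the userSearch section of the connector above.
print(build_user_filter("(objectClass=user)", "sAMAccountName", "alice"))
```

Escaping the login matters because an unescaped `*` or `)` typed at the prompt would otherwise change the meaning of the search.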
Configuring GitHub #
yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: dex
  namespace: dex
data:
  config.yaml: |
    # Only the connectors section is shown; merge it into the full Dex configuration.
    connectors:
      - type: github
        name: GitHub
        id: github
        config:
          clientID: your-github-client-id
          clientSecret: $GITHUB_CLIENT_SECRET
          redirectURI: https://kubeflow.example.com/dex/callback
          orgs:
            - name: your-org
              teams:
                - ml-team
RBAC Configuration #
Kubeflow Roles #
text
Built-in roles:
├── kubeflow-admin
│   └── Full administrative access
├── kubeflow-edit
│   └── Create and modify resources
├── kubeflow-view
│   └── Read-only access
└── Custom roles
    └── Configured as needed
Creating a Custom Role #
yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ml-developer
  namespace: kubeflow-user-alice
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "services", "configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["kubeflow.org"]
    resources: ["notebooks", "tfjobs", "pytorchjobs", "experiments", "trials"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["serving.kserve.io"]
    resources: ["inferenceservices"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ml-developer-binding
  namespace: kubeflow-user-alice
subjects:
  - kind: User
    name: alice@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: ml-developer
  apiGroup: rbac.authorization.k8s.io
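Kubernetes RBAC is purely additive: a request is permitted when at least one rule matches its API group, resource, and verb. The sketch below models that evaluation for the `ml-developer` Role in plain Python. It is a simplified illustration that ignores `resourceNames`, non-resource URLs, and subresource wildcards, and the names `ML_DEVELOPER_RULES` and `is_allowed` are hypothetical.

```python
from typing import Dict, List

Rule = Dict[str, List[str]]

_VERBS = ["get", "list", "watch", "create", "update", "patch", "delete"]

# The three rules of the ml-developer Role, transcribed as plain data.
ML_DEVELOPER_RULES: List[Rule] = [
    {
        "apiGroups": [""],
        "resources": ["pods", "pods/log", "services", "configmaps", "secrets"],
        "verbs": _VERBS,
    },
    {
        "apiGroups": ["kubeflow.org"],
        "resources": ["notebooks", "tfjobs", "pytorchjobs", "experiments", "trials"],
        "verbs": _VERBS,
    },
    {
        "apiGroups": ["serving.kserve.io"],
        "resources": ["inferenceservices"],
        "verbs": _VERBS,
    },
]


def _matches(allowed: List[str], wanted: str) -> bool:
    """A rule field matches on an exact entry or on the "*" wildcard."""
    return "*" in allowed or wanted in allowed


def is_allowed(rules: List[Rule], api_group: str, resource: str, verb: str) -> bool:
    """Grant access if any single rule matches group, resource and verb."""
    return any(
        _matches(rule["apiGroups"], api_group)
        and _matches(rule["resources"], resource)
        and _matches(rule["verbs"], verb)
        for rule in rules
    )


print(is_allowed(ML_DEVELOPER_RULES, "kubeflow.org", "notebooks", "create"))
print(is_allowed(ML_DEVELOPER_RULES, "", "nodes", "get"))
```

On a live cluster, `kubectl auth can-i` remains the authoritative check; a model like this is mainly useful for reviewing a Role before it is applied.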
Cluster-Level Permissions #
yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kubeflow-viewer
rules:
  - apiGroups: ["kubeflow.org"]
    resources: ["profiles"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubeflow-viewer-binding
subjects:
  - kind: Group
    name: viewers@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: kubeflow-viewer
  apiGroup: rbac.authorization.k8s.io
Network Security #
Network Policies #
yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: kubeflow-user-alice
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-istio-ingress
  namespace: kubeflow-user-alice
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label that Kubernetes sets automatically on every namespace (1.21+).
              kubernetes.io/metadata.name: istio-system
      ports:
        - protocol: TCP
          port: 8888
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: kubeflow-user-alice
spec:
  podSelector: {}
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: UDP
          port: 53
        # DNS falls back to TCP for large responses.
        - protocol: TCP
          port: 53
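NetworkPolicies combine additively: once any policy selects a pod, traffic is denied unless some rule in some policy allows it. The sketch below models the ingress side of the `default-deny-all` and `allow-istio-ingress` policies in Python. It is a deliberately reduced model that assumes every policy selects every pod in the namespace and matches sources by namespace name only; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class IngressRule:
    # None means "no restriction" on that dimension.
    from_namespaces: Optional[Set[str]] = None
    ports: Optional[Set[Tuple[str, int]]] = None


@dataclass
class Policy:
    name: str
    ingress: List[IngressRule] = field(default_factory=list)


def ingress_allowed(policies: List[Policy], src_ns: str, proto: str, port: int) -> bool:
    """Decide whether a connection into the namespace is admitted."""
    if not policies:
        # Pods that no policy selects accept all traffic.
        return True
    for policy in policies:
        for rule in policy.ingress:
            ns_ok = rule.from_namespaces is None or src_ns in rule.from_namespaces
            port_ok = rule.ports is None or (proto, port) in rule.ports
            if ns_ok and port_ok:
                return True  # allow rules are additive across policies
    return False


POLICIES = [
    Policy("default-deny-all"),  # selects every pod, allows nothing
    Policy("allow-istio-ingress", [IngressRule({"istio-system"}, {("TCP", 8888)})]),
]

print(ingress_allowed(POLICIES, "istio-system", "TCP", 8888))
print(ingress_allowed(POLICIES, "default", "TCP", 8888))
```

The model shows why the deny-all policy needs no explicit "deny" rule: it simply selects the pods and contributes no allow rule of its own.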
Istio Security Configuration #
yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: kubeflow-user-alice
spec:
  mtls:
    mode: STRICT
---
# An AuthorizationPolicy with an empty spec allows nothing, so requests to
# workloads in this namespace are denied unless another policy allows them.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-all
  namespace: kubeflow-user-alice
spec: {}
---
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-istio-ingress
  namespace: kubeflow-user-alice
spec:
  rules:
    - from:
        - source:
            principals: ["cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"]
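The `principals` entry follows Istio's `<trust-domain>/ns/<namespace>/sa/<service-account>` form, with `cluster.local` as the default trust domain. The short Python sketch below builds and parses such strings, which helps avoid typos when writing policies for other service accounts. Both function names are hypothetical helpers, not part of any Istio client library.

```python
def istio_principal(namespace: str, service_account: str,
                    trust_domain: str = "cluster.local") -> str:
    """Build the principal string for a workload's service account."""
    return f"{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_principal(principal: str) -> dict:
    """Split a principal back into trust domain, namespace and account."""
    parts = principal.split("/")
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa":
        raise ValueError(f"not a service-account principal: {principal!r}")
    return {
        "trust_domain": parts[0],
        "namespace": parts[2],
        "service_account": parts[4],
    }


# Reproduces the principal used in the allow-istio-ingress policy.
print(istio_principal("istio-system", "istio-ingressgateway-service-account"))
```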
TLS Configuration #
yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      # Assumes an NGINX ingress controller answers the HTTP-01 challenge;
      # adjust the solver to match your ingress.
      - http01:
          ingress:
            class: nginx
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: kubeflow-tls
  namespace: istio-system
spec:
  secretName: kubeflow-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - kubeflow.example.com
Secret Management #
Secret Configuration #
yaml
apiVersion: v1
kind: Secret
metadata:
  name: ml-credentials
  namespace: kubeflow-user-alice
type: Opaque
stringData:
  # Placeholder values; never commit real credentials to version control.
  aws-access-key: AKIAIOSFODNN7EXAMPLE
  aws-secret-key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
  mlflow-tracking-uri: https://mlflow.example.com
  mlflow-username: mlflow-user
  mlflow-password: mlflow-password
Using Secrets in a Pod #
yaml
apiVersion: kubeflow.org/v1
kind: Notebook
metadata:
  name: secure-notebook
  namespace: kubeflow-user-alice
spec:
  template:
    spec:
      containers:
        - name: notebook
          image: python:3.9
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: ml-credentials
                  key: aws-access-key
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: ml-credentials
                  key: aws-secret-key
          volumeMounts:
            - name: credentials
              mountPath: /etc/credentials
              readOnly: true
      volumes:
        - name: credentials
          secret:
            secretName: ml-credentials
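Inside the notebook, the two AWS values arrive as environment variables, while the volume exposes each key of the Secret as a file under `/etc/credentials`. The sketch below shows one way notebook code might collect both; `load_credentials` is a hypothetical helper and the returned dictionary layout is an assumption made for illustration.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def load_credentials(env: Mapping[str, str] = os.environ,
                     mount_dir: str = "/etc/credentials") -> Dict[str, str]:
    """Collect credentials from env vars and from the mounted Secret volume."""
    creds = {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
    }
    # Each key of the Secret appears as one file named after the key.
    # Entries starting with "." are kubelet bookkeeping and are skipped.
    for path in sorted(Path(mount_dir).iterdir()):
        if path.name.startswith(".") or not path.is_file():
            continue
        creds[path.name] = path.read_text()
    return creds
```

Calling `load_credentials()` with no arguments inside the pod would return the two AWS values plus entries such as `mlflow-username` and `mlflow-password`. Mounted files are refreshed when the Secret changes, whereas environment variables keep their startup value until the pod restarts.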
External Secret Management #
yaml
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets-manager
  namespace: kubeflow-user-alice
spec:
  provider:
    aws:
      # No auth block: assumes the operator obtains AWS credentials from
      # its own environment (for example an IAM role bound to its service account).
      service: SecretsManager
      region: us-west-2
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ml-credentials
  namespace: kubeflow-user-alice
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: ml-credentials
    creationPolicy: Owner
  data:
    - secretKey: aws-access-key
      remoteRef:
        key: ml-credentials
        property: access-key
    - secretKey: aws-secret-key
      remoteRef:
        key: ml-credentials
        property: secret-key
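Each `data` entry maps one JSON property of the remote secret to one key of the generated Kubernetes Secret. The sketch below imitates that mapping in Python for the two entries above. It assumes the remote `ml-credentials` value is a single JSON object, and `resolve_external_secret` is a hypothetical name rather than an operator API.

```python
import json
from typing import Dict, List

# The data section of the ExternalSecret, transcribed as plain data.
DATA: List[dict] = [
    {"secretKey": "aws-access-key",
     "remoteRef": {"key": "ml-credentials", "property": "access-key"}},
    {"secretKey": "aws-secret-key",
     "remoteRef": {"key": "ml-credentials", "property": "secret-key"}},
]


def resolve_external_secret(remote_value: str, data: List[dict]) -> Dict[str, str]:
    """Map JSON properties of one remote secret to keys of the target Secret."""
    remote = json.loads(remote_value)
    return {
        item["secretKey"]: remote[item["remoteRef"]["property"]]
        for item in data
    }


# Illustrative remote payload; real values live only in the external store.
example_remote = '{"access-key": "AKIAIOSFODNN7EXAMPLE", "secret-key": "example"}'
print(sorted(resolve_external_secret(example_remote, DATA)))
```

A missing property raises `KeyError` in this sketch, which mirrors the practical advice to verify the remote JSON keys before relying on the synced Secret.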
Audit Logging #
Enabling Audit Logging #
yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Rules are evaluated in order; the first matching rule sets the audit level.
rules:
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  - level: RequestResponse
    resources:
      - group: "kubeflow.org"
        resources: ["*"]
  - level: Metadata
    omitStages:
      - RequestReceived
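Audit policy rules are evaluated top to bottom and the first match decides the level, so the order of the three rules matters. The sketch below models that lookup in Python for group and resource only. It leaves out users, verbs, namespaces, and stages, and `audit_level` is a hypothetical helper rather than anything the API server exposes.

```python
from typing import List, Optional

# The policy above, transcribed as plain data.
POLICY_RULES: List[dict] = [
    {"level": "Metadata",
     "resources": [{"group": "", "resources": ["secrets"]}]},
    {"level": "RequestResponse",
     "resources": [{"group": "kubeflow.org", "resources": ["*"]}]},
    {"level": "Metadata"},  # no resources key: matches every request
]


def audit_level(rules: List[dict], group: str, resource: str) -> Optional[str]:
    """Return the level of the first matching rule, or None if none matches."""
    for rule in rules:
        selectors = rule.get("resources")
        if selectors is None:
            return rule["level"]
        for sel in selectors:
            if sel["group"] == group and (
                "*" in sel["resources"] or resource in sel["resources"]
            ):
                return rule["level"]
    return None


print(audit_level(POLICY_RULES, "", "secrets"))
print(audit_level(POLICY_RULES, "kubeflow.org", "notebooks"))
```

Moving the catch-all rule to the top would make every request log at `Metadata`, which is why the broad rule sits last. Keeping Secrets at `Metadata` also avoids writing secret payloads into the audit log.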
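Analyzing Audit Logs #
bash
# These commands assume the API server writes audit events to stdout
# (--audit-log-path=-); otherwise read the audit log file on the control-plane node.
# The pod name kube-apiserver-master depends on the node name.
# View audit events
kubectl logs -n kube-system kube-apiserver-master | grep audit
# Show operations performed by a specific user
kubectl logs -n kube-system kube-apiserver-master | grep '"username":"alice@example.com"'
# Show access to sensitive resources
kubectl logs -n kube-system kube-apiserver-master | grep -E '"resource":"(secrets|configmaps)"'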
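Audit events are JSON objects, one per line, with the caller under `user.username` and the target under `objectRef.resource`. Parsing them is more reliable than pattern matching on raw text. The sketch below is a minimal Python filter under that assumption; the function name and the way lines are fed in are hypothetical.

```python
import json
from typing import Iterable, Iterator, Optional


def filter_audit_events(lines: Iterable[str],
                        username: Optional[str] = None,
                        resource: Optional[str] = None) -> Iterator[dict]:
    """Yield audit events that match the given user and/or resource."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip ordinary log lines mixed into the stream
        if not isinstance(event, dict):
            continue
        user = (event.get("user") or {}).get("username")
        target = (event.get("objectRef") or {}).get("resource")
        if username is not None and user != username:
            continue
        if resource is not None and target != resource:
            continue
        yield event
```

One possible use is to pipe the log into a script that calls `filter_audit_events(sys.stdin, username="alice@example.com", resource="secrets")` and prints the `verb` and `requestURI` of each event.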
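Security Best Practices #
Access Control #
text
1. Principle of least privilege
   ├── Grant only the permissions that are required
   ├── Review permissions regularly
   └── Revoke permissions promptly when they are no longer needed
2. Authentication
   ├── Enforce a strong password policy
   ├── Enable multi-factor authentication
   └── Rotate credentials regularly
3. Session management
   ├── Set reasonable session timeouts
   ├── Limit concurrent sessions
   └── Log sign-in events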
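Data Security #
text
1. Data encryption
   ├── Enable etcd encryption at rest
   ├── Encrypt traffic in transit with TLS
   └── Store sensitive data in Secrets
2. Key management
   ├── Rotate keys regularly
   ├── Use an external secret manager
   └── Restrict access to keys
3. Data isolation
   ├── Namespace isolation
   ├── Network policy isolation
   └── Storage isolation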
Container Security #
yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: app:latest
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop:
            - ALL
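The settings in this Pod can also be checked mechanically before a manifest is applied. The sketch below takes a pod spec that has already been loaded into a Python dictionary and lists which of the hardening settings shown above are missing. It is a simplified illustration: `hardening_findings` is a hypothetical helper, and it only looks for `runAsNonRoot` at the pod level.

```python
from typing import List


def hardening_findings(pod_spec: dict) -> List[str]:
    """List the hardening settings from the example that a pod spec lacks."""
    findings = []
    pod_ctx = pod_spec.get("securityContext") or {}
    if pod_ctx.get("runAsNonRoot") is not True:
        findings.append("pod: runAsNonRoot is not true")
    for container in pod_spec.get("containers", []):
        name = container.get("name", "?")
        ctx = container.get("securityContext") or {}
        if ctx.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation is not false")
        if ctx.get("readOnlyRootFilesystem") is not True:
            findings.append(f"{name}: readOnlyRootFilesystem is not true")
        dropped = (ctx.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            findings.append(f"{name}: capabilities.drop does not include ALL")
    return findings


# The spec of secure-pod, transcribed as a dictionary.
SECURE_POD_SPEC = {
    "securityContext": {"runAsNonRoot": True, "runAsUser": 1000, "fsGroup": 1000},
    "containers": [{
        "name": "app",
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
            "capabilities": {"drop": ["ALL"]},
        },
    }],
}

print(hardening_findings(SECURE_POD_SPEC))
```

Note that `allowPrivilegeEscalation` is tested with `is not False` because leaving the field unset does not disable escalation.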
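Security Scanning #
bash
# Scan a container image for vulnerabilities
trivy image my-image:latest
# Audit the configuration of workloads in a namespace
kubeaudit all -n kubeflow-user-alice
# List the RBAC permissions a user holds in a namespace
kubectl auth can-i --list -n kubeflow-user-alice --as=alice@example.com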
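Security Checklist #
text
Authentication:
  □ Configure strong identity authentication
  □ Enable multi-factor authentication
  □ Configure session timeouts
  □ Disable static users (production environments)
Authorization:
  □ Configure least-privilege RBAC
  □ Enable namespace isolation
  □ Review permissions regularly
  □ Record permission changes
Network security:
  □ Configure network policies
  □ Enable TLS
  □ Configure Istio security
  □ Restrict external access
Secret management:
  □ Store sensitive information in Secrets
  □ Rotate keys regularly
  □ Enable etcd encryption
  □ Configure audit logging
Container security:
  □ Run as a non-root user
  □ Use a read-only root filesystem
  □ Disable privileged mode
  □ Run security scans regularly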
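Next Steps #
Now that you have covered security configuration, continue with Monitoring and Logging to learn how monitoring and log management work in Kubeflow.
Last updated: 2026-04-05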