Job and CronJob #

1. Job Overview #

A Job runs a one-off task and ensures that a specified number of Pods complete successfully.

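1.1 Job Types #

text
Job types
    │
    ├── Single Job
    │   └── Runs one Pod until it completes successfully
    │
    ├── Fixed completion count Job
    │   └── Runs Pods until the number set in completions have succeeded
    │
    └── Work queue Job
        └── Runs several Pods in parallel to process multiple tasks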

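1.2 Job Features #

text
Job features
    │
    ├── Run to completion
    │   └── Failed Pods are retried automatically
    │
    ├── Retry policy
    │   └── The number of retries is configurable (backoffLimit)
    │
    ├── Parallel execution
    │   └── Multiple Pods can run in parallel
    │
    └── Timeout control
        └── A deadline can be set for the whole Job (activeDeadlineSeconds)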

2. Creating a Job #

2.1 Single Job #

yaml
# single-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: single-job
spec:
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["echo", "Hello Kubernetes"]
      restartPolicy: Never
bash
# Create the Job
kubectl apply -f single-job.yaml

# List Jobs
kubectl get jobs

# Example output
NAME          COMPLETIONS   DURATION   AGE
single-job    1/1           5s         10s

# List the Pods created by the Job
kubectl get pods -l job-name=single-job

# View the logs
kubectl logs job/single-job

2.2 Fixed Completion Count Job #

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: fixed-job
spec:
  completions: 5
  parallelism: 2
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["sh", "-c", "echo Job $HOSTNAME && sleep 5"]
      restartPolicy: Never

2.3 Work Queue Job #

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-job
spec:
  parallelism: 3
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["sh", "-c", "echo Processing && sleep 10"]
      restartPolicy: Never

3. Job Configuration #

3.1 Retry Policy #

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: retry-job
spec:
  backoffLimit: 4
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["sh", "-c", "exit 1"]
      restartPolicy: Never

3.2 Timeout #

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: timeout-job
spec:
  activeDeadlineSeconds: 60
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["sh", "-c", "sleep 120"]
      restartPolicy: Never

3.3 Cleanup Policy #

yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cleanup-job
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["echo", "done"]
      restartPolicy: Never

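3.4 Parallelism Settings #

| Parameter | Description |
| --- | --- |
| completions | Number of Pods that must complete successfully |
| parallelism | Maximum number of Pods running at the same time |
| backoffLimit | Maximum number of retries before the Job is marked as failed (default 6) |
| activeDeadlineSeconds | Maximum run time of the whole Job, in seconds |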
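
As a minimal sketch of how these four parameters combine, the manifest below sets all of them on one Job. The name `batch-demo`, the values, and the `busybox` command are illustrative assumptions and do not come from the examples above.

```yaml
# batch-demo.yaml (hypothetical name and values, for illustration only)
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-demo
spec:
  completions: 6              # the Job is complete once 6 Pods have succeeded
  parallelism: 2              # at most 2 Pods run at the same time
  backoffLimit: 3             # mark the Job as failed after 3 retries
  activeDeadlineSeconds: 300  # fail the Job if it is still running after 300 seconds
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo working on $HOSTNAME && sleep 5"]
      restartPolicy: Never
```

With these values the controller keeps up to two Pods running until six have succeeded. `activeDeadlineSeconds` takes precedence over `backoffLimit`, so the Job stops at the deadline even if retries remain.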

4. CronJob Overview #

A CronJob runs Jobs on a recurring schedule defined by a cron expression.

4.1 Cron Expressions #

text
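Cron expression format
    │
    └── minute hour day-of-month month day-of-week

Examples
    │
    ├── "*/1 * * * *" ─── every minute
    ├── "0 * * * *" ─── every hour, on the hour
    ├── "0 0 * * *" ─── every day at 00:00
    ├── "0 0 * * 0" ─── every Sunday at 00:00
    └── "0 0 1 * *" ─── at 00:00 on the 1st of every month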
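
To show how the five fields map onto a real schedule, here is a small sketch that uses a range in the day-of-week field. The name `weekday-report` and the schedule itself are hypothetical, chosen only for illustration.

```yaml
# weekday-report.yaml (hypothetical name and schedule, for illustration only)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: weekday-report
spec:
  # minute=30, hour=8, any day of month, any month, day of week 1-5 (Monday to Friday)
  schedule: "30 8 * * 1-5"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: report
            image: busybox
            command: ["echo", "weekday report"]
          restartPolicy: OnFailure
```

This would start one Job at 08:30 on each weekday and none on Saturday or Sunday.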

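4.2 Time Zone #

text
CronJob time zone
    │
    ├── Default: the local time zone of kube-controller-manager (UTC on many clusters)
    │
    └── Configurable: the spec.timeZone field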
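
A minimal sketch of setting the time zone explicitly is shown below. The field is spelled `timeZone` (camel case) and sits next to `schedule`. This assumes a cluster version that supports `spec.timeZone` (the field became stable in Kubernetes v1.27); the name `tz-demo` and the chosen zone are illustrative assumptions.

```yaml
# tz-demo.yaml (hypothetical name and values, for illustration only)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: tz-demo
spec:
  schedule: "0 2 * * *"      # 02:00 every day, interpreted in the zone below
  timeZone: "Asia/Shanghai"  # IANA time zone name; omit to use the controller's local zone
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["date"]
          restartPolicy: OnFailure
```

Pinning the zone in the manifest keeps the schedule independent of how the control plane happens to be configured.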

5. Creating a CronJob #

5.1 Basic Example #

yaml
# cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-cronjob
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "Hello from CronJob"]
          restartPolicy: OnFailure
bash
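# Create the CronJob
kubectl apply -f cronjob.yaml

# List CronJobs
kubectl get cronjobs

# Example output
NAME             SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello-cronjob    */1 * * * *   False     0        30s             2m

# List Jobs. Jobs created by the CronJob are named hello-cronjob-<timestamp>;
# no cronjob-name label is added by default, so filter by name prefix instead
kubectl get jobs

# List Pods
kubectl get pods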

5.2 Full Configuration #

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-cronjob
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 300
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:v1
            command: ["/scripts/backup.sh"]
            env:
            - name: DB_HOST
              value: "mysql-service"
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

6. CronJob Configuration #

6.1 Concurrency Policy #

yaml
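spec:
  concurrencyPolicy: Forbid

| Policy | Description |
| --- | --- |
| Allow | Allow Jobs to run concurrently (default) |
| Forbid | Forbid concurrent runs; a new run is skipped while the previous Job is still running |
| Replace | Cancel the currently running Job and start a new one |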
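
The difference between the policies only shows up when a run outlasts the schedule interval. The sketch below, with a hypothetical name `replace-demo` and made-up timings, deliberately creates that overlap to illustrate `Replace`.

```yaml
# replace-demo.yaml (hypothetical name and values, for illustration only)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: replace-demo
spec:
  schedule: "*/5 * * * *"     # start a run every 5 minutes
  concurrencyPolicy: Replace  # a run that is still active is cancelled when the next one starts
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: slow-task
            image: busybox
            # sleeps 10 minutes, longer than the 5-minute schedule interval
            command: ["sh", "-c", "echo started && sleep 600"]
          restartPolicy: Never
```

With `Forbid` in place of `Replace`, the first run would be left to finish and the overlapping scheduled runs would be skipped; with `Allow`, the runs would pile up side by side.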

6.2 History Limits #

yaml
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1

6.3 Starting Deadline #

yaml
spec:
  startingDeadlineSeconds: 300

6.4 Suspending a CronJob #

yaml
spec:
  suspend: true
bash
# Suspend the CronJob
kubectl patch cronjob hello-cronjob -p '{"spec":{"suspend":true}}'

# Resume the CronJob
kubectl patch cronjob hello-cronjob -p '{"spec":{"suspend":false}}'

7. Practical Examples #

7.1 Database Backup #

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mysql-backup
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 7
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: mysqldump
            image: mysql:8.0
            command:
            - /bin/sh
            - -c
            - |
              mysqldump -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" --all-databases > "/backup/backup-$(date +%Y%m%d).sql"
            env:
            - name: MYSQL_HOST
              value: mysql-service
            - name: MYSQL_USER
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: username
            - name: MYSQL_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: password
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: backup-pvc
          restartPolicy: OnFailure

7.2 Log Cleanup #

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-cleanup
spec:
  schedule: "0 0 * * 0"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cleanup
            image: busybox
            command:
            - sh
            - -c
            - find /logs -name "*.log" -mtime +30 -delete
            volumeMounts:
            - name: logs
              mountPath: /logs
          volumes:
          - name: logs
            persistentVolumeClaim:
              claimName: logs-pvc
          restartPolicy: OnFailure

7.3 Data Sync #

yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-sync
spec:
  schedule: "*/30 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      backoffLimit: 3
      activeDeadlineSeconds: 1800
      template:
        spec:
          containers:
          - name: sync
            image: sync-tool:v1
            command: ["/scripts/sync.sh"]
            env:
            - name: SOURCE_URL
              value: "https://api.example.com/data"
            - name: TARGET_DB
              value: "postgresql://db-service:5432/app"
          restartPolicy: OnFailure

8. Managing Jobs #

8.1 Checking Job Status #

bash
# List Jobs
kubectl get jobs

# Show Job details
kubectl describe job <job-name>

# View Job logs
kubectl logs job/<job-name>

# List the Pods created by the Job
kubectl get pods -l job-name=<job-name>

8.2 Manually Triggering a CronJob #

bash
# Create a Job from the CronJob
kubectl create job --from=cronjob/hello-cronjob manual-job-001

# Check the Job that was created
kubectl get jobs

8.3 Deleting Jobs #

bash
# Delete a Job
kubectl delete job <job-name>

# Delete a CronJob
kubectl delete cronjob <cronjob-name>

# Delete all Jobs in the current namespace
kubectl delete jobs --all

9. Troubleshooting #

9.1 Common Problems #

bash
# View Job events
kubectl describe job <job-name>

# Check Pod status
kubectl get pods -l job-name=<job-name>

# View Pod logs
kubectl logs <pod-name>

# Check CronJob status
kubectl describe cronjob <cronjob-name>

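9.2 Diagnosis #

| Problem | Cause | Solution |
| --- | --- | --- |
| Job keeps failing | Wrong command | Check the command and the image |
| Job times out | Task runs too long | Increase activeDeadlineSeconds |
| CronJob does not run | Time zone mismatch | Check the time zone configuration |
| Concurrency conflict | Restricted by the concurrency policy | Check concurrencyPolicy |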

10. Best Practices #

10.1 Job Configuration #

yaml
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 3600
  ttlSecondsAfterFinished: 3600
  template:
    spec:
      restartPolicy: OnFailure

10.2 CronJob Configuration #

yaml
spec:
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 300

10.3 Resource Limits #

yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
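
The `resources` fragment belongs to a single container, not to the Job spec itself. As a sketch of where it goes, the manifest below places the same values inside a complete Job; the name `resource-demo` and the command are hypothetical.

```yaml
# resource-demo.yaml (hypothetical name, for illustration only)
apiVersion: batch/v1
kind: Job
metadata:
  name: resource-demo
spec:
  template:
    spec:
      containers:
      - name: job
        image: busybox
        command: ["echo", "done"]
        resources:            # set per container, under spec.template.spec.containers[]
          requests:
            cpu: 100m         # amount the scheduler reserves for the container
            memory: 128Mi
          limits:
            cpu: 500m         # upper bound enforced at runtime
            memory: 512Mi
      restartPolicy: Never
```

For a CronJob the same block sits one level deeper, under `spec.jobTemplate.spec.template.spec.containers[]`.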

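11. Summary #

11.1 Key Points #

| Aspect | Job | CronJob |
| --- | --- | --- |
| Purpose | One-off task | Scheduled task |
| Scheduling | Triggered manually (runs once when created) | Cron expression |
| Retries | backoffLimit | Inherited from the Job settings in jobTemplate |
| Cleanup | ttlSecondsAfterFinished | History limits |

11.2 Next Steps #

Now that you know Job and CronJob, the next topic is Service: how to expose and access applications.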

Last updated: 2026-03-28