3.5.7
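| Dimension | include | template |
|---|---|---|
| Output | Returns the rendered text as a string | Writes the rendered text straight into the output |
| Can be piped into other functions | Yes | No |
| Recommendation | Recommended | Rarely needed |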
Correct example:
labels:
  app.kubernetes.io/name: {{ include "mychart.name" . }}
  app.kubernetes.io/instance: {{ .Release.Name }}
  app.kubernetes.io/managed-by: {{ .Release.Service }}
Hooks are special resources that run at specific points in the release lifecycle:
| Hook | Trigger |
|---|---|
| pre-install | Before install |
| post-install | After install |
| pre-delete | Before delete |
| post-delete | After delete |
| pre-upgrade | Before upgrade |
| post-upgrade | After upgrade |
| pre-rollback | Before rollback |
| post-rollback | After rollback |
| test | Triggered by helm test |
Example: a Job hook that runs database migrations
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "mychart.fullname" . }}-migrate
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["/app/migrate.sh"]
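The `helm.sh/hook-weight` annotation decides the order in which hooks of the same type run: lower weights run first. The sketch below mimics that ordering with plain shell tools. The hook names `notify-slack` and `warm-cache` are hypothetical, and the tie-break here is by name only, whereas Helm also compares the resource kind before the name.

```shell
#!/usr/bin/env bash
# Sketch: order hooks by ascending hook-weight, ties broken by name.
# Input lines have the form "<weight> <name>".
order_hooks() {
  sort -k1,1n -k2,2 | awk '{print $2}'
}

# Hypothetical hooks; only myapp-migrate (weight -5) mirrors the Job above.
printf '%s\n' "0 notify-slack" "-5 myapp-migrate" "10 warm-cache" | order_hooks
```

With these inputs the migration Job is listed first, because -5 sorts before 0 and 10.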
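Helm 3 can store charts as OCI artifacts, in the same registry that holds container images:
# Log in to the OCI registry (HARBOR_PASSWORD is an assumed environment variable; stdin keeps the password out of shell history)
echo "$HARBOR_PASSWORD" | helm registry login harbor.example.com -u admin --password-stdin
# Push the chart
helm push mychart-1.0.0.tgz oci://harbor.example.com/charts
# Inspect a pushed chart (helm list shows releases, not registry contents; use the registry UI or API to browse)
helm show chart oci://harbor.example.com/charts/mychart --version 1.0.0
# Pull
helm pull oci://harbor.example.com/charts/mychart --version 1.0.0
In production, do not pass the password on the command line. helm registry login can reuse the credentials in ~/.docker/config.json, and the HELM_REGISTRY_CONFIG environment variable points Helm at a different registry credentials file.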
| Protocol | Tool | Example |
|---|---|---|
| HTTP API (chart repository) | ChartMuseum | https://charts.example.com |
| OCI artifact | Harbor, Gitea, ACR, ECR | oci://registry.example.com/charts |
| Local directory | Offline use | file://./charts |
| HTTP files | Static file server | https://example.com/index.yaml |
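Follow the order "define the conventions first, then build the repository, and finally wire up GitOps":
Step 1: Chart design conventions (1-2 days)
Step 2: Shared base library (1 week): _helpers.tpl, _labels.tpl, _tplvalues.tpl
Step 3: Unit tests (1 week): the helm-unittest plugin
Step 4: CI integration (3-5 days): helm lint, helm template, ct lint
Step 5: Repository setup (2-3 days)
Step 6: GitOps integration (1-2 weeks)
Step 7: Upgrade process (ongoing)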
# 1. Create the chart skeleton
helm create mychart
cd mychart
# 2. Remove the default templates (keep _helpers.tpl).
#    The generated file set varies by Helm version, so -f skips files that do not exist.
rm -f templates/deployment.yaml
rm -f templates/service.yaml
rm -f templates/serviceaccount.yaml
rm -f templates/hpa.yaml
rm -f templates/ingress.yaml
rm -f templates/secret.yaml
rm -f templates/configmap.yaml
rm -f templates/pvc.yaml
rm -f templates/NOTES.txt
rm -f templates/tests/test-connection.yaml
apiVersion: v2
name: mychart
description: |
Production-grade application chart. Use this as a template for new charts.
type: application
# Chart 版本
version: 0.1.0
# 应用版本
appVersion: "1.0.0"
# 关键字(helm search 可见)
keywords:
- application
- microservice
- production
# 维护者
maintainers:
- name: Platform Team
email: platform@example.com
url: https://wiki.example.com/platform
# 仓库地址
home: https://github.com/example/mychart
sources:
- https://github.com/example/mychart-src
# 图标
icon: https://example.com/icon.png
# Kubernetes 版本要求
kubeVersion: ">=1.24.0-0"
# 应用分类
annotations:
category: Application
licenses: Apache-2.0
# 依赖
dependencies:
- name: common
version: "2.x.x"
repository: "oci://registry.example.com/charts"
condition: common.enabled
# 应用配置
image:
repository: myorg/myapp
tag: "" # 默认用 .Chart.AppVersion
pullPolicy: IfNotPresent
imagePullSecrets: []
# 副本
replicaCount: 3
# 服务账户
serviceAccount:
create: true
annotations: {}
name: ""
# Pod 配置
podAnnotations: {}
podLabels: {}
podSecurityContext: {}
# 资源
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
# 探针
livenessProbe:
httpGet:
path: /healthz
port: http
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: http
initialDelaySeconds: 5
periodSeconds: 5
# 调度
nodeSelector: {}
tolerations: []
affinity: {}
topologySpreadConstraints: []
# 服务
service:
type: ClusterIP
port: 80
targetPort: 8080
# Ingress
ingress:
enabled: false
className: nginx
annotations: {}
hosts:
- host: chart-example.local
paths:
- path: /
pathType: ImplementationSpecific
tls: []
# HPA
autoscaling:
enabled: false
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 80
# ConfigMap
config:
log_level: info
env: production
features: {}
# Secret(实际值不放这里)
secret:
enabled: false
existingSecret: ""
data: {}
# 数据库
database:
enabled: false
type: postgresql
host: ""
port: 5432
name: ""
user: ""
existingSecret: ""
# 中间件
redis:
enabled: false
host: ""
port: 6379
existingSecret: ""
{{/*
Expand the name of the chart.
*/}}
{{- define "mychart.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Create a default fully qualified app name.
*/}}
{{- define "mychart.fullname" -}}
{{- if .Values.fullnameOverride }}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- $name := default .Chart.Name .Values.nameOverride }}
{{- if contains $name .Release.Name }}
{{- .Release.Name | trunc 63 | trimSuffix "-" }}
{{- else }}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
{{- end }}
{{- end }}
{{- end }}
{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "mychart.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
{{- end }}
{{/*
Common labels
*/}}
{{- define "mychart.labels" -}}
helm.sh/chart: {{ include "mychart.chart" . }}
{{ include "mychart.selectorLabels" . }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
{{/*
Selector labels
*/}}
{{- define "mychart.selectorLabels" -}}
app.kubernetes.io/name: {{ include "mychart.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}
{{/*
ServiceAccount name
*/}}
{{- define "mychart.serviceAccountName" -}}
{{- if .Values.serviceAccount.create }}
{{- default (include "mychart.fullname" .) .Values.serviceAccount.name }}
{{- else }}
{{- default "default" .Values.serviceAccount.name }}
{{- end }}
{{- end }}
{{/*
Image reference
*/}}
{{- define "mychart.image" -}}
{{- $tag := .Values.image.tag | default .Chart.AppVersion }}
{{- printf "%s:%s" .Values.image.repository $tag }}
{{- end }}
{{/*
Validate required values
*/}}
{{- define "mychart.validateValues" -}}
{{- if and .Values.database.enabled (not .Values.database.host) }}
{{- fail "database.host is required when database.enabled is true" }}
{{- end }}
{{- end }}
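The naming rules in `mychart.fullname` are easy to misread, so here is a minimal shell sketch of the same decision tree. It is an illustration rather than Helm code, and it ignores `nameOverride` for brevity; the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of the "mychart.fullname" rules (illustrative only).
# trunc63 mirrors `trunc 63 | trimSuffix "-"`.
trunc63() {
  local s="${1:0:63}"
  printf '%s' "${s%-}"   # drop one trailing dash, like trimSuffix "-"
}

# Args: release name, chart name, optional fullnameOverride.
fullname() {
  local release="$1" chart="$2" override="${3:-}"
  if [ -n "$override" ]; then
    trunc63 "$override"
  elif [[ "$release" == *"$chart"* ]]; then
    trunc63 "$release"             # release already contains the chart name
  else
    trunc63 "${release}-${chart}"
  fi
}

fullname myapp mychart; echo          # myapp-mychart
fullname mychart-prod mychart; echo   # mychart-prod
```

One consequence worth noting: a release named `myapp` of the chart `mychart` produces resources named `myapp-mychart`, not `myapp`.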
{{- include "mychart.validateValues" . }}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "mychart.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        {{- include "mychart.selectorLabels" . | nindent 8 }}
        {{- with .Values.podLabels }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
      annotations:
        # Force a rolling update whenever the ConfigMap content changes
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
        {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 8 }}
        {{- end }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "mychart.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            runAsNonRoot: true
            runAsUser: 65532
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
          image: {{ include "mychart.image" . | quote }}
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: {{ .Values.service.targetPort }}
              protocol: TCP
          livenessProbe:
            {{- toYaml .Values.livenessProbe | nindent 12 }}
          readinessProbe:
            {{- toYaml .Values.readinessProbe | nindent 12 }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          envFrom:
            - configMapRef:
                name: {{ include "mychart.fullname" . }}
            {{- if .Values.secret.enabled }}
            - secretRef:
                name: {{ include "mychart.fullname" . }}
            {{- end }}
          {{- if .Values.database.enabled }}
          env:
            - name: DB_HOST
              value: {{ .Values.database.host | quote }}
            - name: DB_PORT
              value: {{ .Values.database.port | default 5432 | quote }}
            - name: DB_NAME
              value: {{ .Values.database.name | quote }}
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.database.existingSecret }}
                  key: username
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: {{ .Values.database.existingSecret }}
                  key: password
          {{- end }}
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.topologySpreadConstraints }}
      topologySpreadConstraints:
        {{- toYaml . | nindent 8 }}
      {{- end }}
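The `checksum/config` annotation works because any change in the rendered ConfigMap changes its SHA-256 digest, which changes the pod template and so triggers a rollout. A minimal sketch of that idea, assuming `sha256sum` is available (GNU coreutils); the sample ConfigMap lines are made up:

```shell
#!/usr/bin/env bash
# Sketch: digest of the rendered ConfigMap text, as the annotation would carry it.
config_checksum() {
  sha256sum | awk '{print $1}'
}

old=$(printf 'log_level: info\n'  | config_checksum)
new=$(printf 'log_level: debug\n' | config_checksum)

if [ "$old" != "$new" ]; then
  echo "annotation value changed, so the Deployment rolls its pods"
fi
```

Identical content always yields the same digest, so an upgrade that does not touch the ConfigMap leaves the pods alone.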
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
data:
  {{- range $key, $value := .Values.config }}
  {{- if ne $key "features" }}
  {{ $key }}: {{ $value | quote }}
  {{- end }}
  {{- end }}
  # Render the nested features map as an extra config file
  # (quoting it in the loop above would emit the Go string "map[...]")
  {{- with .Values.config.features }}
  features.yaml: |
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- if .Values.secret.enabled }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "mychart.fullname" . }}
  labels:
    {{- include "mychart.labels" . | nindent 4 }}
type: Opaque
data:
  {{- range $key, $value := .Values.secret.data }}
  {{ $key }}: {{ $value | b64enc | quote }}
  {{- end }}
{{- end }}
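When checking a rendered Secret by hand, it helps to reproduce what `b64enc` does to a value. The sketch below assumes the standard `base64` tool; the helper name `b64enc` is reused here only for readability.

```shell
#!/usr/bin/env bash
# Sketch: encode a value the way the Secret template does.
# printf avoids the trailing newline that echo would add to the payload.
b64enc() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

b64enc test; echo        # dGVzdA==
echo test | base64       # dGVzdAo=  (differs: the newline was encoded too)
```

The second command shows a common mistake: values encoded with `echo` carry a newline and will not match what the chart renders.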
{{/*
Render a dict with nindent support.
Usage: {{- include "mychart.tplvalues.render" ( dict "value" .Values.path "context" $ ) }}
*/}}
{{- define "mychart.tplvalues.render" -}}
{{- if typeIs "string" .value }}
{{- tpl .value .context }}
{{- else }}
{{- tpl (.value | toYaml) .context }}
{{- end }}
{{- end -}}
values.schema.json:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"required": ["image", "replicaCount", "service"],
"properties": {
"image": {
"type": "object",
"required": ["repository"],
"properties": {
"repository": {"type": "string", "minLength": 1},
"tag": {"type": "string"},
"pullPolicy": {"enum": ["Always", "IfNotPresent", "Never"]}
}
},
"replicaCount": {
"type": "integer",
"minimum": 1,
"maximum": 100
},
"service": {
"type": "object",
"required": ["port", "targetPort"],
"properties": {
"port": {"type": "integer", "minimum": 1, "maximum": 65535},
"targetPort": {"type": "integer", "minimum": 1, "maximum": 65535},
"type": {"enum": ["ClusterIP", "NodePort", "LoadBalancer"]}
}
}
}
}
# 1. Lint
helm lint ./mychart
# 2. 模板渲染并保存
helm template myrelease ./mychart \
--namespace prod \
--values values-prod.yaml \
> rendered.yaml
# 3. 干跑安装
helm install myrelease ./mychart \
--namespace prod \
--values values-prod.yaml \
--dry-run \
--debug
# 4. 干跑升级
helm upgrade myrelease ./mychart \
--namespace prod \
--values values-prod.yaml \
--dry-run \
--debug
Expected output (dry run):
NAME: myrelease
LAST DEPLOYED: Tue Jun 17 10:00:00 2026
NAMESPACE: prod
STATUS: pending-install
REVISION: 1
TEST SUITE: None
HOOKS:
MANIFEST:
---
# Source: mychart/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
...
Add to Chart.yaml:
dependencies:
  - name: postgresql
    version: "12.x.x"
    repository: "https://charts.bitnami.com/bitnami"
    condition: postgresql.enabled
    alias: db
Update the dependencies:
helm dependency update ./mychart
# Generates Chart.lock and downloads the resolved postgresql-12.*.tgz into charts/
# Verify
helm dependency list ./mychart
# NAME        VERSION  REPOSITORY                          STATUS
# postgresql  12.x.x   https://charts.bitnami.com/bitnami  ok
# 1. 创建 namespace
kubectl create namespace prod
# 2. 准备生产 values
cat > values-prod.yaml <<EOF
replicaCount: 5
image:
repository: myorg/myapp
tag: "1.21.0"
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 2000m
memory: 2Gi
database:
enabled: true
host: db.prod.svc
port: 5432
name: myapp
user: myapp
existingSecret: myapp-db-credentials
secret:
enabled: true
data:
apiKey: "your-api-key-here"
EOF
# 3. 模拟跑一次
helm install myapp ./mychart \
--namespace prod \
--values values-prod.yaml \
--dry-run
# 4. 真正安装
helm install myapp ./mychart \
--namespace prod \
--values values-prod.yaml \
--wait \
--timeout 5m
# 5. 查看
helm list -n prod
kubectl get all -n prod -l app.kubernetes.io/instance=myapp
# 1. Edit the values
vi values-prod.yaml
# Change image.tag: 1.21.0 -> 1.22.0
# 2. Install the helm-diff plugin (one-time step, required by the next command)
helm plugin install https://github.com/databus23/helm-diff
# 3. Preview the upgrade as a diff
helm diff upgrade myapp ./mychart \
  --namespace prod \
  --values values-prod.yaml
# 4. Run the real upgrade
helm upgrade myapp ./mychart \
  --namespace prod \
  --values values-prod.yaml \
  --wait \
  --timeout 10m
# 5. Check the upgrade history
helm history myapp -n prod
# REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
# 1 Tue Jun 17 10:00:00 2026 superseded mychart-0.1.0 1.0.0 Install complete
# 2 Tue Jun 17 14:00:00 2026 deployed mychart-0.1.0 1.0.0 Upgrade complete
# 1. 查看历史
helm history myapp -n prod
# 2. 回滚到上一版本
helm rollback myapp -n prod
# 或指定 REVISION
helm rollback myapp 1 -n prod
# 3. 验证
helm list -n prod
kubectl get pods -n prod -l app.kubernetes.io/instance=myapp
# 1. Dry run first to see what would be removed
helm uninstall myapp -n prod --dry-run --debug
# 2. Uninstall for real
helm uninstall myapp -n prod
# 3. Verify
kubectl get all -n prod -l app.kubernetes.io/instance=myapp
# Expected: No resources found
# 4. Alternative to step 2: keep the release history (it is deleted by default)
helm uninstall myapp -n prod --keep-history
Install helm-unittest:
helm plugin install https://github.com/helm-unittest/helm-unittest
tests/deployment_test.yaml (helm-unittest looks for tests/*_test.yaml by default; files under templates/ would be rendered as manifests):
suite: test deployment
# deployment.yaml renders configmap.yaml for its checksum annotation, so both
# must be in scope; each test then names the template it asserts on.
# The chart's own values.yaml is loaded automatically.
templates:
  - deployment.yaml
  - configmap.yaml
  - secret.yaml
release:
  name: myapp
  namespace: prod
tests:
  - it: should render deployment with default values
    template: deployment.yaml
    asserts:
      - hasDocuments:
          count: 1
      - containsDocument:
          kind: Deployment
          apiVersion: apps/v1
          # fullname is <release>-<chart> because "myapp" does not contain "mychart"
          name: myapp-mychart
      - equal:
          path: spec.replicas
          value: 3
      - equal:
          path: spec.template.spec.containers[0].image
          value: "myorg/myapp:1.0.0"
  - it: should respect replicaCount override
    template: deployment.yaml
    set:
      replicaCount: 5
    asserts:
      - equal:
          path: spec.replicas
          value: 5
  - it: should set resource limits
    template: deployment.yaml
    set:
      resources:
        limits:
          cpu: 1
          memory: 1Gi
    asserts:
      - equal:
          path: spec.template.spec.containers[0].resources.limits.cpu
          value: 1
  - it: should fail when required value missing
    template: deployment.yaml
    set:
      database:
        enabled: true
        host: ""
    asserts:
      - failedTemplate:
          errorMessage: "database.host is required when database.enabled is true"
  - it: should not render secret when disabled
    template: secret.yaml
    set:
      secret:
        enabled: false
    asserts:
      - hasDocuments:
          count: 0
  - it: should render secret when enabled
    template: secret.yaml
    set:
      secret:
        enabled: true
        data:
          apiKey: "test"
    asserts:
      - containsDocument:
          kind: Secret
          apiVersion: v1
Run the tests:
helm unittest ./mychart
# Expected:
# PASSED test deployment
# Tests: 6 passed, 0 failed, 0 skipped
ci/ct.yaml (chart-testing configuration; lint-conf and chart-yaml-schema can additionally point at a yamllint config and a Chart.yaml schema):
chart-dirs:
  - charts
validate-maintainers: false
check-version-increment: true
Run chart-testing:
# Install ct (chart-testing)
brew install chart-testing # macOS
# or
curl -sSL https://github.com/helm/chart-testing/releases/download/v3.10.1/chart-testing_3.10.1_linux_amd64.tar.gz | tar xz
sudo mv ct /usr/local/bin/
# The archive also ships etc/chart_schema.yaml and etc/lintconf.yaml, which ct looks for in ~/.ct or /etc/ct
mkdir -p ~/.ct && mv etc/* ~/.ct/
# Lint (by default only charts changed relative to the target branch; add --all to lint everything)
ct lint --config ci/ct.yaml
# Expected:
# Linting chart...
# Linting mychart...
# All charts linted successfully
# Install test (needs a Kubernetes cluster)
ct install --config ci/ct.yaml
docker run -d \
--name chartmuseum \
-p 8080:8080 \
-e DEBUG=true \
-e STORAGE=local \
-e STORAGE_LOCAL_ROOTDIR=/charts \
-v /opt/chartmuseum:/charts \
ghcr.io/helm/chartmuseum:v0.16.0
docker run -d \
--name chartmuseum \
-p 8080:8080 \
-e STORAGE=amazon \
-e STORAGE_AMAZON_BUCKET=my-charts \
-e STORAGE_AMAZON_PREFIX=charts \
-e STORAGE_AMAZON_REGION=cn-north-1 \
-e AWS_ACCESS_KEY_ID=xxx \
-e AWS_SECRET_ACCESS_KEY=yyy \
ghcr.io/helm/chartmuseum:v0.16.0
Passing AWS keys through environment variables is not recommended in production; use an IAM role instead.
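# Add the repository
helm repo add chartmuseum https://charts.example.com
helm repo update
# Push the chart (needs the helm-push plugin; the built-in helm push only targets OCI registries)
helm plugin install https://github.com/chartmuseum/helm-push
helm cm-push ./mychart-0.1.0.tgz chartmuseum
# Search
helm search repo chartmuseum/mychart
# Pull
helm pull chartmuseum/mychart --version 0.1.0
# Upload through the HTTP API
curl -F "chart=@mychart-0.1.0.tgz" https://charts.example.com/api/charts
# List
curl https://charts.example.com/api/charts | jq
# Delete (rejected if the server runs with DISABLE_DELETE=true; ALLOW_OVERWRITE only governs re-uploading an existing version)
curl -X DELETE https://charts.example.com/api/charts/mychart/0.1.0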
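Harbor is the most direct way to get an OCI registry.
# 1. Create the project "charts" in Harbor
# 2. Log in
helm registry login harbor.example.com -u admin
# 3. Push
helm push ./mychart-0.1.0.tgz oci://harbor.example.com/charts
# 4. Inspect the pushed chart (helm list shows releases, not registry contents; browse the project in the Harbor UI)
helm show chart oci://harbor.example.com/charts/mychart --version 0.1.0
# 5. Pull
helm pull oci://harbor.example.com/charts/mychart --version 0.1.0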
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: myapp-prod
namespace: argocd
spec:
project: production
source:
repoURL: https://charts.example.com
targetRevision: 1.4.2
chart: mychart
helm:
releaseName: myapp
valueFiles:
- values-prod.yaml
parameters:
- name: replicaCount
value: "5"
- name: image.tag
value: "1.22.0"
destination:
server: https://kubernetes.default.svc
namespace: prod
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- CreateNamespace=false
- PrunePropagationPolicy=foreground
- PruneLast=true
retry:
limit: 5
backoff:
duration: 10s
factor: 2
maxDuration: 5m
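To get a feel for the `retry.backoff` settings above, the sketch below computes the wait before each retry under the assumption that the delay starts at `duration`, is multiplied by `factor` after every attempt, and is capped at `maxDuration`. This is an approximation for intuition, not Argo CD's implementation; all values are in seconds.

```shell
#!/usr/bin/env bash
# Sketch: exponential backoff schedule.
# Args: initial delay, factor, max delay, retry limit.
backoff_schedule() {
  local delay="$1" factor="$2" max="$3" limit="$4" i out=""
  for ((i = 0; i < limit; i++)); do
    out+="${delay} "
    delay=$((delay * factor))
    if ((delay > max)); then delay="$max"; fi
  done
  printf '%s\n' "${out% }"
}

# duration: 10s, factor: 2, maxDuration: 5m, limit: 5
backoff_schedule 10 2 300 5   # 10 20 40 80 160
```

With these settings the five retries span roughly five minutes in total, so a sync that keeps failing is reported fairly quickly.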
argocd/
├── applications/
│ ├── myapp-dev.yaml
│ ├── myapp-staging.yaml
│ └── myapp-prod.yaml
└── projects/
└── production.yaml
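Each environment pins its own targetRevision. A new chart version is promoted by bumping dev first, then staging, then prod; once a promotion completes, all three match:
# myapp-dev.yaml
source:
  chart: mychart
  targetRevision: 1.4.2 # first to receive a new chart version
# myapp-staging.yaml
source:
  chart: mychart
  targetRevision: 1.4.2 # bumped after dev is verified
# myapp-prod.yaml
source:
  chart: mychart
  targetRevision: 1.4.2 # pinned production version, bumped last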
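Argo CD does not roll back automatically when a sync fails. selfHeal reverts drift in the cluster to the declared state, and retry re-attempts a failed sync; to roll back, revert targetRevision in Git:
syncPolicy:
  automated:
    selfHeal: true
  retry:
    limit: 3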
An Argo CD ApplicationSet combined with progressive delivery is a more advanced staged-rollout option:
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: myapp
spec:
generators:
- list:
elements:
- cluster: dev
url: https://kubernetes.default.svc
revision: 1.4.3
- cluster: staging
url: https://staging.example.com
revision: 1.4.3
template:
metadata:
name: 'myapp-{{cluster}}'
spec:
project: production
source:
repoURL: https://charts.example.com
targetRevision: '{{revision}}'
chart: mychart
destination:
server: '{{url}}'
namespace: myapp
Argo CD Image Updater can track image tags for the myapp chart automatically:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  annotations:
    argocd-image-updater.argoproj.io/image-list: myapp=myorg/myapp:1.22.0
    argocd-image-updater.argoproj.io/myapp.update-strategy: latest
    argocd-image-updater.argoproj.io/myapp.allow-tags: regexp:^1\.\d+\.\d+$
spec:
  source:
    chart: mychart
    targetRevision: 1.4.2
# 创建
helm create mychart
# Lint
helm lint ./mychart
# 模板渲染
helm template myrelease ./mychart
helm template myrelease ./mychart --values values-prod.yaml
# 干跑
helm install myrelease ./mychart --dry-run --debug
helm upgrade myrelease ./mychart --dry-run --debug
# 渲染并保存
helm template myrelease ./mychart --values values-prod.yaml > rendered.yaml
# 验证渲染结果
kubectl apply --dry-run=client -f rendered.yaml
# 更新依赖
helm dependency update ./mychart
# 列出依赖
helm dependency list ./mychart
# 构建依赖(下载到 charts/)
helm dependency build ./mychart
# 列出已下载的 chart
ls -la ./mychart/charts/
# 安装
helm install myrelease ./mychart --namespace prod --values values.yaml
helm install myrelease ./mychart --namespace prod --values values.yaml --wait --timeout 5m
helm install myrelease ./mychart --namespace prod --values values.yaml --create-namespace
# 升级
helm upgrade myrelease ./mychart --namespace prod --values values.yaml
helm upgrade --install myrelease ./mychart --namespace prod --values values.yaml
# 历史
helm history myrelease -n prod
# 状态
helm status myrelease -n prod
# 列出所有 release
helm list -A
helm list -A --filter 'mychart'
# 卸载
helm uninstall myrelease -n prod
helm uninstall myrelease -n prod --keep-history
# 回滚到上一版本
helm rollback myrelease -n prod
# 回滚到指定版本
helm rollback myrelease 3 -n prod
# 查看历史
helm history myrelease -n prod
# 添加
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add chartmuseum https://charts.example.com
# 更新
helm repo update
# 列出
helm repo list
# 搜索
helm search repo nginx
helm search hub nginx
# 移除
helm repo remove chartmuseum
# 登录
helm registry login harbor.example.com -u admin
# 推送
helm push ./mychart-0.1.0.tgz oci://harbor.example.com/charts
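# Inspect a chart in the registry (helm list shows releases, not registry contents)
helm show chart oci://harbor.example.com/charts/mychart --version 0.1.0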
# 拉取
helm pull oci://harbor.example.com/charts/mychart --version 0.1.0
# 安装 diff
helm plugin install https://github.com/databus23/helm-diff
# 安装 unittest
helm plugin install https://github.com/helm-unittest/helm-unittest
# 列出
helm plugin list
# 升级
helm plugin update diff
# 卸载
helm plugin uninstall diff
# 跑测试
helm unittest ./mychart
# 详细输出
helm unittest ./mychart -v
# Select specific test files (path is relative to the chart root)
helm unittest ./mychart -f tests/deployment_test.yaml
# 打包
helm package ./mychart
# 生成 mychart-0.1.0.tgz
# 同时更新 index.yaml
helm repo index ./
# Push to ChartMuseum (helm-push plugin; the built-in helm push only targets OCI registries)
helm cm-push ./mychart-0.1.0.tgz chartmuseum
# 显示 chart 信息
helm show chart ./mychart
# 显示 values
helm show values ./mychart
# 显示 README
helm show readme ./mychart
# 显示所有
helm show all ./mychart
# Resources created by the release
kubectl get all -n prod -l app.kubernetes.io/instance=myapp
# Release records (Helm 3 stores them as Secrets by default)
kubectl get secret -n prod -l owner=helm
# Record of the currently deployed revision
kubectl get secret -n prod -l owner=helm,status=deployed
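Helm 3 keeps one record per revision, by default as a Secret named `sh.helm.release.v1.<release>.v<revision>`. The helper below just builds that name so a specific revision can be fetched directly; the function name is made up for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: name of the Secret that stores one revision of a release.
release_secret_name() {
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}

release_secret_name myapp 3   # sh.helm.release.v1.myapp.v3

# Usage against a cluster (not executed here):
#   kubectl get secret -n prod "$(release_secret_name myapp 3)"
```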
apiVersion: v2
name: production-app
description: |
Production application chart for MyApp service.
Includes ConfigMap, Secret, Deployment, Service, Ingress, HPA.
type: application
version: 1.4.2
appVersion: "1.22.0"
kubeVersion: ">=1.24.0-0"
home: https://wiki.example.com/charts/production-app
sources:
- https://github.com/example/production-app
icon: https://example.com/icon.png
keywords:
- production
- application
- microservice
maintainers:
- name: Platform Team
email: platform@example.com
dependencies:
- name: common
version: "2.x.x"
repository: "oci://registry.example.com/charts"
condition: common.enabled
- name: postgresql
version: "12.5.6"
repository: "https://charts.bitnami.com/bitnami"
condition: database.postgresql.enabled
alias: pg
- name: redis
version: "17.x.x"
repository: "https://charts.bitnami.com/bitnami"
condition: cache.redis.enabled
alias: cache
annotations:
category: Application
licenses: Apache-2.0
mychart/
├── values.yaml # 默认
├── values-dev.yaml # 开发
├── values-staging.yaml # 预发
├── values-prod.yaml # 生产
├── values-prod-shanghai.yaml # 生产-上海
└── values-prod-beijing.yaml # 生产-北京
values-prod.yaml:
replicaCount: 5
image:
repository: myorg/myapp
tag: "1.22.0"
pullPolicy: IfNotPresent
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 2000m
memory: 2Gi
autoscaling:
enabled: true
minReplicas: 5
maxReplicas: 20
ingress:
enabled: true
className: nginx
hosts:
- host: app.example.com
paths:
- path: /
pathType: Prefix
tls:
- hosts:
- app.example.com
secretName: app-tls
database:
enabled: true
postgresql:
enabled: false # 走外部
host: pg.prod.svc
port: 5432
name: myapp
user: myapp
existingSecret: myapp-db-creds
cache:
redis:
enabled: true
host: redis.prod.svc
port: 6379
existingSecret: myapp-redis-creds
.gitlab-ci.yml:
stages:
- lint
- test
- package
- publish
variables:
HELM_VERSION: "3.14.4"
CT_VERSION: "3.10.1"
# 1. Lint
helm-lint:
stage: lint
image: alpine/helm:${HELM_VERSION}
script:
- helm version
- helm lint ./charts/mychart
- helm lint ./charts/mychart --values ./charts/mychart/values-prod.yaml
only:
changes:
- charts/mychart/**/*
# 2. 单元测试
helm-unittest:
stage: test
image: alpine/helm:${HELM_VERSION}
before_script:
- helm plugin install https://github.com/helm-unittest/helm-unittest
script:
- helm unittest ./charts/mychart
only:
changes:
- charts/mychart/**/*
# 3. 渲染 + kubeconform 校验
helm-render:
stage: test
image:
name: alpine/helm:${HELM_VERSION}
entrypoint: [""]
before_script:
- apk add --no-cache curl
- curl -sSL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz | tar xz
- mv kubeconform /usr/local/bin/
script:
- helm template myapp ./charts/mychart --values ./charts/mychart/values-prod.yaml > rendered.yaml
- kubeconform -strict -summary rendered.yaml
only:
changes:
- charts/mychart/**/*
# 4. 打包
helm-package:
stage: package
image: alpine/helm:${HELM_VERSION}
script:
- helm package ./charts/mychart --destination ./dist
artifacts:
paths:
- dist/*.tgz
only:
- main
- tags
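# 5. Publish to ChartMuseum
helm-publish:
  stage: publish
  image:
    name: alpine/helm:${HELM_VERSION}
    entrypoint: [""]
  before_script:
    - helm plugin install https://github.com/chartmuseum/helm-push
    # cm-push resolves the repository by name, so it must be added first.
    # CHARTMUSEUM_USER and CHARTMUSEUM_PASS are assumed CI/CD variables.
    - helm repo add chartmuseum https://charts.example.com --username "$CHARTMUSEUM_USER" --password "$CHARTMUSEUM_PASS"
  script:
    - helm cm-push ./dist/mychart-*.tgz chartmuseum
  only:
    - main
    - tags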
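# $values refers to a second source declared with "ref: values".
# Argo CD does not read Helm value files from a Kubernetes Secret, so keep
# values-prod.yaml in a Git repository (the URL below is a placeholder).
apiVersion: argoproj.io/v1alpha1
kind: Application
spec:
  sources:
    - repoURL: https://charts.example.com
      chart: mychart
      targetRevision: 1.4.2
      helm:
        valueFiles:
          - $values/values-prod.yaml
    - repoURL: https://git.example.com/platform/myapp-values.git
      targetRevision: main
      ref: values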
templates/migrate-job.yaml:
{{- if .Values.migration.enabled }}
apiVersion: batch/v1
kind: Job
metadata:
name: {{ include "mychart.fullname" . }}-migrate
labels:
{{- include "mychart.labels" . | nindent 4 }}
annotations:
"helm.sh/hook": pre-upgrade,pre-install
"helm.sh/hook-weight": "-5"
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
template:
metadata:
name: {{ include "mychart.fullname" . }}-migrate
spec:
restartPolicy: Never
containers:
- name: migrate
image: {{ include "mychart.image" . | quote }}
command: ["/app/migrate.sh"]
envFrom:
- secretRef:
name: {{ include "mychart.fullname" . }}
backoffLimit: 3
{{- end }}
templates/NOTES.txt:
应用已成功部署!
访问地址:
{{- if .Values.ingress.enabled }}
{{- range $host := .Values.ingress.hosts }}
{{- range .paths }}
https://{{ $host.host }}{{ .path }}
{{- end }}
{{- end }}
{{- else if contains "NodePort" .Values.service.type }}
export NODE_PORT=$(kubectl get -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "mychart.fullname" . }})
export NODE_IP=$(kubectl get nodes -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT
{{- else if contains "LoadBalancer" .Values.service.type }}
kubectl get -o jsonpath="{.status.loadBalancer.ingress[0].ip}" services {{ include "mychart.fullname" . }}
{{- else if contains "ClusterIP" .Values.service.type }}
kubectl port-forward svc/{{ include "mychart.fullname" . }} 8080:{{ .Values.service.port }}
curl http://127.0.0.1:8080
{{- end }}
查看状态:
kubectl get all -l app.kubernetes.io/instance={{ .Release.Name }} -n {{ .Release.Namespace }}
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: myapp
spec:
goTemplate: true
goTemplateOptions:
- missingkey=error
generators:
- list:
elements:
- cluster: dev
url: https://dev-k8s.example.com
revision: "1.4.2"
env: dev
- cluster: staging
url: https://staging-k8s.example.com
revision: "1.4.2"
env: staging
- cluster: prod
url: https://prod-k8s.example.com
revision: "1.4.2"
env: prod
template:
metadata:
name: 'myapp-{{.cluster}}'
spec:
project: production
source:
repoURL: https://charts.example.com
targetRevision: '{{.revision}}'
chart: mychart
helm:
valueFiles:
- 'values-{{.env}}.yaml'
destination:
server: '{{.url}}'
namespace: myapp
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- CreateNamespace=true
- PrunePropagationPolicy=foreground
Helm is a client-side tool and keeps no runtime log of its own. Common ways to debug:
# Debug 模式
helm install myapp ./mychart --debug
# 输出包括:
# - 模板渲染过程
# - 生成的 manifest
# - 错误信息
# 1. 列出
helm list -A
# 输出字段:
# NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
# myapp prod 3 2026-06-17 14:00:00.123456 +0800 CST deployed mychart-1.4.2 1.22.0
# 2. 详细状态
helm status myapp -n prod
# 输出包括:
# NAME: myapp
# LAST DEPLOYED: Tue Jun 17 14:00:00 2026
# NAMESPACE: prod
# STATUS: deployed
# REVISION: 3
# TEST SUITE: None
# NOTES: ...
# 3. 历史
helm history myapp -n prod
All resources of a Helm release live in Kubernetes and can be found by label:
# 列出 release 所有资源
kubectl get all -l app.kubernetes.io/instance=myapp -n prod
# Pod 日志
kubectl logs -l app.kubernetes.io/instance=myapp -n prod -f
# 特定容器
kubectl logs myapp-xxxx -c main -n prod
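| STATUS | Meaning | Action |
|---|---|---|
| deployed | Install or upgrade succeeded | None |
| uninstalled | Uninstall succeeded | None |
| superseded | Replaced by a newer revision | None |
| failed | Operation failed | Investigate |
| pending-install | Install in progress | Wait |
| pending-upgrade | Upgrade in progress | Wait |
| pending-rollback | Rollback in progress | Wait |
| uninstalling | Uninstall in progress | Wait |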
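In a script, the status table can be reduced to a small lookup. The sketch below maps a release status to the suggested action; the function name and the three action words are choices made for this example.

```shell
#!/usr/bin/env bash
# Sketch: map a Helm release status to the action from the table.
status_action() {
  case "$1" in
    deployed|uninstalled|superseded) echo ok ;;
    failed) echo investigate ;;
    pending-install|pending-upgrade|pending-rollback|uninstalling) echo wait ;;
    *) echo unknown ;;
  esac
}

status_action deployed          # ok
status_action pending-upgrade   # wait

# Usage against a cluster (not executed here; needs helm and jq):
#   status_action "$(helm status myapp -n prod -o json | jq -r .info.status)"
```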
With GitOps, release state is inspected in Argo CD:
# CLI
argocd app list
argocd app get myapp-prod
argocd app history myapp-prod
# UI: ArgoCD WebUI
# 状态:
# - Synced: 期望状态与实际一致
# - OutOfSync: 期望与实际不一致
# - Healthy: 资源健康
# - Degraded: 资源异常
# - Progressing: 升级中
# 1. Number of stored revisions of the release (Helm 3 keeps them in Secrets by default)
kubectl get secret -A -l owner=helm,name=myapp -o name | wc -l
# 2. Unhealthy applications
argocd app list -o json | jq '.[] | select(.status.health.status != "Healthy") | {name: .metadata.name, status: .status.health.status}'
# 3. Sync status
argocd app list -o json | jq '.[] | select(.status.sync.status != "Synced") | {name: .metadata.name, sync: .status.sync.status}'
groups:
- name: helm-releases
rules:
- alert: HelmReleaseDegraded
expr: argocd_app_info{health_status="Degraded"} == 1
for: 5m
labels:
severity: critical
annotations:
summary: "Helm release {{ $labels.name }} 处于 Degraded 状态"
description: "请检查 K8s 资源状态"
- alert: HelmReleaseOutOfSync
expr: argocd_app_info{sync_status="OutOfSync"} == 1
for: 30m
labels:
severity: warning
annotations:
summary: "Helm release {{ $labels.name }} 长时间 OutOfSync"
Symptom: helm template fails
Error: template: mychart/templates/deployment.yaml:23:5: executing "mychart" at <.Values.something>: nil pointer evaluating interface{}.something
Steps:
# 1. Locate the error
# mychart/templates/deployment.yaml:23:5
# Line 23, column 5
# 2. Common causes:
# - the value (or its parent map) is nil and the template does not guard against it
# - the values files do not set the field
# 3. Fix: provide a default
{{ .Values.something | default "x" }}
# 4. Or fail early with required
{{ required "msg" .Values.something }}
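Template errors always carry a `file:line:column` triple, and pulling it out mechanically is handy in CI logs. A minimal sketch using `sed`; the function name is made up, and the sample message is the one quoted above.

```shell
#!/usr/bin/env bash
# Sketch: extract "file line N col M" from a Helm template error on stdin.
error_location() {
  sed -nE 's/.*template: ([^:]+):([0-9]+):([0-9]+):.*/\1 line \2 col \3/p'
}

msg='Error: template: mychart/templates/deployment.yaml:23:5: executing "mychart" at <.Values.something>: nil pointer evaluating interface{}.something'
printf '%s\n' "$msg" | error_location
# mychart/templates/deployment.yaml line 23 col 5
```

Lines that do not look like template errors produce no output, so the helper can be fed a whole log.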
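Symptom: helm lint reports "failed to parse"
Steps:
# 1. Read the full error
helm lint ./mychart 2>&1 | head -30
# 2. Common causes:
# - wrong YAML indentation
# - misspelled field in Chart.yaml
# - values reference a field that does not exist
# - template function error
# 3. Validate the YAML syntax
python3 -c "import yaml; yaml.safe_load(open('mychart/Chart.yaml'))"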
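Symptom: helm dependency update fails
Error: failed to fetch https://charts.bitnami.com/bitnami/index.yaml : 404
Steps:
# 1. Check connectivity
curl -I https://charts.bitnami.com/bitnami/index.yaml
# 2. Check the repository URL in Chart.yaml
# Correct: https://charts.bitnami.com/bitnami
# Wrong: https://charts.bitnami.com (missing /bitnami)
# 3. Offline: helm dependency build recreates charts/ from Chart.lock and still needs the repository.
#    For air-gapped builds, vendor the .tgz files under charts/ ahead of time, then check them:
helm dependency list ./mychart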
Symptom: helm upgrade succeeds, but the pod stays in CrashLoopBackOff
Steps:
# 1. Check pod status
kubectl get pods -n prod -l app.kubernetes.io/instance=myapp
# NAME READY STATUS RESTARTS AGE
# myapp-xxxxxxxxxx-yyy 0/1 CrashLoopBackOff 3 2m
# 2. Check pod events
kubectl describe pod myapp-xxx -n prod | tail -30
# 3. Check the logs of the previous container
kubectl logs myapp-xxx -n prod --previous
# 4. Common causes:
# - image cannot be pulled: kubectl describe pod shows ImagePullBackOff
# - bad configuration: the application logs "config not found" at startup
# - resource limits: OOMKilled
# - port conflict: the container port is already in use
# 5. Emergency: roll back
helm rollback myapp -n prod
Symptom: Argo CD keeps showing OutOfSync and does not sync automatically
Steps:
# 1. Show the diff
argocd app diff myapp-prod
# 2. Show the sync status
argocd app get myapp-prod
# 3. Sync manually
argocd app sync myapp-prod
# 4. Force overwrite (dangerous)
argocd app sync myapp-prod --force
# 5. Show the sync history
argocd app history myapp-prod
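Symptom: helm list shows pending-install and the operation times out
Steps:
# 1. Check the release status
helm status myapp -n prod
# 2. Check the Kubernetes resources
kubectl get all -l app.kubernetes.io/instance=myapp -n prod
# 3. Common causes:
# - hook failure: the migration Job stays pending
# - webhook failure: an admission webhook rejects the resources
# - resource quota: the namespace has run out of quota
# 4. Recover: a first install has no earlier revision to roll back to, so uninstall and retry.
#    Use helm rollback only when an earlier deployed revision exists.
helm uninstall myapp -n prod
# 5. Clean up leftover Jobs
kubectl delete job -n prod -l app.kubernetes.io/instance=myapp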
Symptom: helm repo update reports "connection refused"
Steps:
# 1. Test the repository
curl -I https://charts.example.com/index.yaml
# 2. Things to check:
# - DNS resolution
# - HTTPS certificate
# - proxy settings
# - the repository service is down
# 3. For a repository with a self-signed certificate, skip TLS verification (test environments only)
helm repo add test https://charts-internal.example.com --insecure-skip-tls-verify
# 4. Verify
helm repo list
helm search repo mychart
Symptom: helm push reports "unauthorized"
Steps:
# 1. Log in again
helm registry logout harbor.example.com
helm registry login harbor.example.com -u admin
# 2. Check that the project exists
# In the Harbor UI, confirm that the charts project exists and the account may push to it
# 3. Check OCI support
# Harbor 2.x supports OCI artifacts by default
# 4. Verify
helm push ./mychart-0.1.0.tgz oci://harbor.example.com/charts
Risk: metadata.labels, namespace, port and similar fields are hardcoded, so the chart cannot be reused across environments.
# Wrong
metadata:
  namespace: prod # hardcoded
  name: myapp
spec:
  port: 8080 # hardcoded
# Right
metadata:
  namespace: {{ .Release.Namespace }}
  name: {{ include "mychart.fullname" . }}
spec:
  port: {{ .Values.service.port }}
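Risk: auth.password is replaced by existingSecret while version stays at 1.4.2, so existing users break on upgrade.
# Wrong: a field was removed but version was not bumped
auth:
# the password field was removed, but version is still 1.4.2
existingSecret: ""
# Right: bump MAJOR
# version: 2.0.0
# In CHANGELOG.md:
## [2.0.0] - 2026-06-17
### BREAKING
- auth.password is replaced by auth.existingSecret
- Migration: move the value of auth.password into a Secret and reference it through existingSecret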
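The rule behind this example (a removed values key needs a MAJOR bump, an added key a MINOR bump) can be checked from two lists of values keys. This is a simplified sketch: `required_bump` is a hypothetical helper, and it takes the flattened key lists as space-separated strings rather than parsing values.yaml itself.

```shell
# Decide the minimum SemVer bump from the old and new sets of values keys.
# $1 = old keys, $2 = new keys, both space-separated (e.g. "auth.password auth.username").
required_bump() {
  old_keys=$1
  new_keys=$2
  for k in $old_keys; do
    case " $new_keys " in
      *" $k "*) ;;                       # key still present
      *) echo "major"; return 0 ;;       # a key was removed: breaking change
    esac
  done
  for k in $new_keys; do
    case " $old_keys " in
      *" $k "*) ;;
      *) echo "minor"; return 0 ;;       # a key was added: backward-compatible feature
    esac
  done
  echo "patch"
}
```

For the case in this section, `required_bump "auth.password" "auth.existingSecret"` prints `major`, which is the bump that was missed when version stayed at 1.4.2.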
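Risk: a dependency in Chart.yaml (requirements.yaml in Helm 2) uses a range such as ~9.0, so different machines can resolve different versions.
# Wrong
dependencies:
- name: mysql
version: "~9.0" # range match
# Right: pin and lock
# Chart.yaml
dependencies:
- name: mysql
repository: https://charts.bitnami.com/bitnami
version: "9.10.0" # pinned
# Chart.lock (generated by helm dependency update, must be committed)
dependencies:
- name: mysql
repository: https://charts.bitnami.com/bitnami
version: 9.10.0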
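A CI check can reject range constraints before they reach Chart.yaml. The sketch below treats only an exact MAJOR.MINOR.PATCH string (with an optional pre-release suffix) as pinned; `is_pinned_version` is a hypothetical helper, and extracting the version strings from Chart.yaml is left to the caller.

```shell
# Succeed only when the argument is an exact version, not a range like ~9.0 or >=9.0.0.
is_pinned_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$'
}

# Usage:
#   is_pinned_version "9.10.0" && echo pinned
#   is_pinned_version "~9.0"   || echo "range constraint, pin it"
```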
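Risk: helm upgrade overwrites a chart-managed Secret whenever its rendered content changes. A Secret created by hand with kubectl create is not managed by Helm; if the chart later renders a Secret with the same name, the upgrade can fail on an ownership conflict or, once the Secret is adopted, replace its content.
# 1. Reference an existing Secret through existingSecret
database:
existingSecret: my-db-creds # not created by the Helm templates
# 2. Manage Secrets separately with Sealed Secrets / External Secrets Operator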
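Risk: a pre-upgrade hook is a Job; if it fails or hangs, the upgrade is blocked and the release can be left in failed or pending-upgrade.
# Mitigation:
# 1. Bound the hook's runtime: set activeDeadlineSeconds and backoffLimit on the Job spec,
# and pass --timeout to helm upgrade (the annotations below control cleanup and ordering, not timeout)
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
"helm.sh/hook-weight": "-5"
# 2. On failure, clean up by hand
kubectl delete job myapp-migrate -n prod
helm rollback myapp -n prod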
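Risk: the kubeconfig the Helm client uses to reach the K8s cluster is leaked.
# Mitigation:
# 1. Grant the kubeconfig least-privilege permissions
# 2. In CI, authenticate with a service account token stored as a secret, never in plaintext
# 3. For chart integrity, charts stored as OCI artifacts (OCI support is GA since Helm 3.8) can be signed and verified with cosign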
Risk: with a values file over 1 MB, or loops that emit 1000+ resources, rendering can stall.
# Mitigation:
# 1. Trim the values
# 2. Split the chart
# 3. Avoid complex loops in templates
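The 1 MB threshold mentioned above can be enforced with a size check in CI. This is a minimal sketch: `values_too_large` is a hypothetical helper, and the default limit of 1048576 bytes simply mirrors the figure in this section.

```shell
# Succeed (exit 0) when the file is larger than the limit in bytes.
# $1 = path to a values file, $2 = optional limit (defaults to 1 MiB).
values_too_large() {
  limit=${2:-1048576}
  size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding wc adds
  [ "$size" -gt "$limit" ]
}

# Usage:
#   if values_too_large values-prod.yaml; then
#     echo "values-prod.yaml exceeds 1 MiB, consider splitting the chart" >&2
#   fi
```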
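Risk: the parent chart reads a subchart's values through .Values.subchart.field, but inside the subchart the same value is .Values.field. Mixing up the two scopes renders empty values; a subchart cannot see the parent's values, apart from .Values.global.
# Parent chart values.yaml
subchart:
field: value
# Access from a parent chart template
{{ .Values.subchart.field }} # correct
# Inside the subchart's own templates it is
{{ .Values.field }} # the subchart only sees the values under its own name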
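Use lookup sparingly
Risk: the lookup function queries the K8s API, which slows rendering. During helm template and --dry-run Helm does not contact the cluster, so lookup returns an empty result and templates that depend on it render differently or fail.
# Not recommended
{{- $existing := (lookup "v1" "ConfigMap" "ns" "name").data }}
# Recommended: drive it from values
{{- if .Values.config.data }}
data:
{{- toYaml .Values.config.data | nindent 2 }}
{{- end }}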
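Risk: ChartMuseum has no authentication by default, so anyone can push.
# Mitigation:
# 1. Enable Basic Auth (environment variables passed to the ChartMuseum container)
-e BASIC_AUTH_USER=admin
-e BASIC_AUTH_PASS=xxx
# 2. Use the RBAC of Harbor or a cloud registry
# 3. Sign charts with cosign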
helm lint ./mychart
# Expected: [INFO] Chart.yaml: icon is recommended
# followed by: 1 chart(s) linted, 0 chart(s) failed
# With chart-testing (--config points at the ct configuration file)
ct lint --config ct.yaml
# Render
helm template myapp ./mychart --values values-prod.yaml > rendered.yaml
# Validate against the K8s schema with kubeconform
kubeconform -strict -summary rendered.yaml
# Expected: Summary: N resources found in 1 file - Valid: N, Invalid: 0, Errors: 0, Skipped: 0
# Validate with kubectl --dry-run
kubectl apply --dry-run=client -f rendered.yaml
# Expected: each resource reported as created (dry run) or configured (dry run)
helm unittest ./mychart
# Expected:
# PASS test deployment
# Tests: 6 passed, 6 total
helm install myapp ./mychart --values values-prod.yaml --dry-run
# Expected: STATUS: pending-install, with no errors
# 1. Create a temporary namespace
kubectl create namespace helm-test
# 2. Rehearse the install
helm install myapp ./mychart --namespace helm-test --values values-test.yaml
# 3. Verify the resources
kubectl get all -n helm-test
# 4. Rehearse the upgrade
helm upgrade myapp ./mychart --namespace helm-test --values values-test-v2.yaml
# 5. Rehearse the rollback
helm rollback myapp --namespace helm-test
# 6. Clean up
helm uninstall myapp -n helm-test
kubectl delete namespace helm-test
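# 1. After committing, check ArgoCD status
argocd app get myapp-prod
# Status: Synced / Healthy
# 2. Simulate a change
# Edit values and commit, then check whether ArgoCD syncs automatically
# 3. Simulate a failure
# Point the image at a tag that does not exist; the sync should fail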
# Add the repository
helm repo add chartmuseum https://charts.example.com
# Search
helm search repo chartmuseum/mychart
# Pull
helm pull chartmuseum/mychart --version 1.4.2
# Push
helm push ./mychart-1.4.2.tgz oci://harbor.example.com/charts
# Inspect (helm list shows releases, not registry contents, so use helm show)
helm show chart oci://harbor.example.com/charts/mychart --version 1.4.2
# Pull
helm pull oci://harbor.example.com/charts/mychart --version 1.4.2
# Install the diff plugin
helm plugin install https://github.com/databus23/helm-diff
# View the diff
helm diff upgrade myapp ./mychart --values values-new.yaml
# Expected: a list of modified, added, and removed resources
# 1. Render the chart and inspect its hooks (dry run, nothing is installed)
helm install myapp ./mychart --dry-run --debug | grep -A 30 "HOOKS"
# 2. After a real install, check that the hook Job succeeded
kubectl get jobs -n prod
kubectl logs job/myapp-migrate -n prod
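# 1. View history
helm history myapp -n prod
# 2. Roll back to the previous revision
helm rollback myapp -n prod
# 3. Roll back to a specific revision
helm rollback myapp 5 -n prod
# 4. Verify
kubectl get pods -n prod -l app.kubernetes.io/instance=myapp
helm status myapp -n prod
Limits of rollback:
- Resources created out-of-band with kubectl create are not tracked by Helm, so a rollback leaves them untouched
# 1. View history
argocd app history myapp-prod
# 2. Roll back to the previous version (no ID given)
# Note: rollback is refused while automated sync is enabled on the Application
argocd app rollback myapp-prod
# 3. Roll back to a specific history ID (the ID is a positional argument)
argocd app rollback myapp-prod 5
# 4. Verify
argocd app get myapp-prod
In the UI: ArgoCD WebUI → Application → History → Rollback.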
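If the chart includes database migrations, they must be rolled back separately, because helm rollback does not revert schema changes:
# 1. Identify the migration tool in use
# For example Flyway, Liquibase, Prisma Migrate
# 2. Roll back by hand
kubectl exec -it myapp-xxx -n prod -- /app/migrate.sh rollback
# 3. Verify (this query assumes Flyway's default history table)
psql -h pg.prod.svc -U myapp -c "SELECT * FROM flyway_schema_history ORDER BY installed_rank DESC LIMIT 5"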
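If the upgrade caused data problems:
# 1. Stop the application immediately (so bad writes do not spread)
kubectl scale deployment myapp -n prod --replicas=0
# 2. Restore the PVC (if a snapshot exists)
kubectl get pvc -n prod
# Restore through Velero or a cloud provider snapshot
# 3. Roll back the Helm release
helm rollback myapp -n prod
# 4. Verify the data
kubectl exec -it myapp-xxx -n prod -- /app/db-check.sh
# 5. Restore the replica count
kubectl scale deployment myapp -n prod --replicas=5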
If the entire K8s cluster is down:
# 1. Restore through Velero
velero restore create --from-backup backup-2026-06-17
# 2. Bring ArgoCD back up
kubectl apply -f argocd-install.yaml
# 3. ArgoCD syncs all Applications automatically
# 4. Verify
argocd app list
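If a bad version was pushed to ChartMuseum:
# 1. Delete it (the ChartMuseum API must be enabled; with ALLOW_OVERWRITE=true the same version can be re-pushed without deleting)
curl -X DELETE https://charts.example.com/api/charts/mychart/1.4.3
# 2. Push the correct version again (helm cm-push comes from the helm-push plugin; plain helm push only targets OCI registries)
helm cm-push ./mychart-1.4.3.tgz chartmuseum
# 3. Tell users not to upgrade to the bad build
The same applies to an OCI registry:
# Delete the tag in the Harbor UI, or through the artifact API
curl -X DELETE -u admin:Harbor12345 \
https://harbor.example.com/api/v2.0/projects/charts/repositories/mychart/artifacts/1.4.3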
With GitOps the values files live in Git, so a revert is enough:
# 1. Review the changes
git log -p values-prod.yaml
# 2. Revert
git revert HEAD
git push
# 3. ArgoCD syncs automatically
argocd app get myapp-prod
If CI bumped the chart version automatically, push a revert commit:
# 1. Find the commit with the bad version
git log --oneline | head -20
# 2. Revert it
git revert <commit-hash>
git push
# 3. CI runs again
| Component | Recommended sizing | Notes |
|---|---|---|
| ArgoCD | 2 CPU / 4 GB / 20 GB storage | Needs more beyond 100 Applications |
| ChartMuseum | 1 CPU / 1 GB / 50 GB storage | Stores chart tgz files |
| Harbor (OCI) | 4 CPU / 8 GB / 100 GB storage | Stores both images and charts |
| Helm client | 100 MB disk | Client-side tool |
| etcd (K8s) | Depends on cluster size | Holds Helm release state |
Harbor project permissions:
# Roles in the Harbor project "charts-prod"
- name: admin
users: [platform-team]
- name: developer
users: [release-manager]
- name: guest
users: [app-team]
ArgoCD Project permissions:
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
name: production
spec:
sourceRepos:
- https://charts.example.com
destinations:
- server: '*'
namespace: 'prod-*'
clusterResourceWhitelist:
- group: '*'
kind: '*'
namespaceResourceWhitelist:
- group: '*'
kind: '*'
--show-only, or Kustomize as an alternative
| Environment | Sync mode | Gate |
|---|---|---|
| dev | Automatic | None |
| staging | Automatic | lint + unit test |
| prod | Manual | lint + unit test + peer review |
GitOps across environments:
argocd/
├── apps/
│ ├── myapp-dev.yaml # automatic sync
│ ├── myapp-staging.yaml # automatic sync
│ └── myapp-prod.yaml # manual sync
├── projects/
│ ├── dev.yaml
│ ├── staging.yaml
│ └── production.yaml
└── repos/
└── chart-repos.yaml
Use Argo Rollouts for blue-green / canary releases:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: myapp
spec:
strategy:
canary:
steps:
- setWeight: 10
- pause: {duration: 5m}
- setWeight: 30
- pause: {duration: 10m}
- setWeight: 60
- pause: {duration: 10m}
- setWeight: 100
Every chart must ship a README:
# mychart
Production-grade application chart.
## Maintainers
Platform Team <platform@example.com>
## Dependencies
- PostgreSQL 12.x
- Redis 17.x
## Upgrading
From 1.3.x to 1.4.x: see [CHANGELOG.md](./CHANGELOG.md)
## Emergency rollback
See [RUNBOOK.md](./RUNBOOK.md)
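Before every upgrade:
- Run helm lint, helm template, ct lint, and kubeconform
- Run helm unittest
- Rehearse helm rollback
Charts stored as OCI artifacts (OCI support is GA since Helm 3.8) can be signed with cosign:
# Sign
cosign sign harbor.example.com/charts/mychart:1.4.2
# Verify (the '.*' identity pattern accepts any signer; in production, restrict it to the expected signing identity)
cosign verify harbor.example.com/charts/mychart:1.4.2 --certificate-identity-regexp '.*' --certificate-oidc-issuer 'https://accounts.google.com'
ArgoCD does not verify cosign signatures on charts out of the box, so run cosign verify in CI before a chart version is promoted.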
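New fields must be backward compatible:
# Old field
auth:
password: xxx
# New field added (optional)
auth:
password: xxx
existingSecret: ""
# Old configurations keep working
# Remove the password field only after all users have migrated
# and bump the MAJOR version at the same time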
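Putting Helm charts into engineering practice comes down to three parts:
The most common pitfalls:
- Skipping the helm-diff review → the upgrade changes resources you did not intend to change
- Running helm upgrade without separating environments → one upgrade affects every environment
Production priorities:
Next steps to consider: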
References (consult as needed; the official documentation takes precedence):