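1. Core Concepts and Architecture
Kubernetes is a container orchestration platform. Its core components are the API Server, Scheduler, Controller Manager, etcd, kubelet, and kube-proxy. The basic objects are Pod, Deployment, Service, Ingress, ConfigMap/Secret, Namespace, and RBAC.
Control plane
- API Server: the single entry point for all cluster operations
- etcd: consistent key-value store for cluster state
- Scheduler: assigns Pods to nodes
- Controller Manager: reconciles actual state toward the desired state
Data plane
- kubelet: node agent
- kube-proxy/CNI: network forwarding
- Container runtime: containerd/CRI-O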
2. Local Environment Setup (minikube / kind / k3d)
minikube
minikube start --driver=docker
kubectl get nodes
minikube dashboard
kind (Kubernetes in Docker)
kind create cluster --name dev
kubectl cluster-info --context kind-dev
kind delete cluster --name dev
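A multi-node kind cluster can be described in a config file instead of relying on the single-node default. The sketch below assumes a file named kind-config.yaml (a hypothetical name); the node counts are illustrative.

```yaml
# kind-config.yaml (hypothetical file name)
# One control-plane node and two workers, all running as Docker containers.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
  - role: worker
```

It would be passed at creation time: kind create cluster --name dev --config kind-config.yaml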
k3d (a lightweight cluster based on k3s)
k3d cluster create demo --agents 2
kubectl get nodes
k3d cluster delete demo
3. Core Objects in Detail (YAML Examples)
Deployment + Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports: [{ containerPort: 80 }]
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP
  selector: { app: web }
  ports:
    - port: 80
      targetPort: 80
Ingress (using NGINX as an example)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  # ingressClassName supersedes the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  rules:
    - host: web.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
ConfigMap / Secret
apiVersion: v1
kind: ConfigMap
metadata: { name: app-config }
data:
  LOG_LEVEL: info
---
apiVersion: v1
kind: Secret
metadata: { name: app-secret }
type: Opaque
stringData:
  DB_PASSWORD: supersecret
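Defining a ConfigMap and a Secret does nothing until a Pod references them. A minimal sketch of one way to consume both as environment variables follows; the container name and image are reused from the Deployment example purely for illustration.

```yaml
# Fragment of a Pod template (spec.template.spec), illustrative only
containers:
  - name: web
    image: nginx:1.25
    envFrom:
      - configMapRef:
          name: app-config        # every key becomes an env var, e.g. LOG_LEVEL
    env:
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: app-secret      # Secret defined above
            key: DB_PASSWORD
```

envFrom imports all keys at once, while valueFrom picks a single key, which keeps the set of exposed secrets explicit.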
Probes and resource limits
# Fields of a container entry (under spec.template.spec.containers)
livenessProbe:
  httpGet: { path: /healthz, port: 8080 }
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet: { path: /ready, port: 8080 }
resources:
  requests: { cpu: "100m", memory: "128Mi" }
  limits: { cpu: "500m", memory: "512Mi" }
4. Developer Workflow (kubectl / Kustomize / Helm)
kubectl basics
kubectl apply -f k8s/
kubectl get pods,svc,ingress -A
kubectl describe pod <name>
kubectl logs -f deploy/web
kubectl port-forward svc/web 8080:80
Kustomize (overlays for per-environment differences)
k8s/
  base/
    deployment.yaml
    service.yaml
    kustomization.yaml
  overlays/
    dev/kustomization.yaml
    prod/kustomization.yaml
---
# run from the repository root
kubectl kustomize k8s/overlays/dev | kubectl apply -f -
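The directory layout above implies two kinds of kustomization.yaml. A minimal sketch of what they might contain is shown here; these are two separate files, separated by --- only for display, and the replica count and image tag are illustrative assumptions.

```yaml
# k8s/base/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
---
# k8s/overlays/dev/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base            # inherit everything from the base
replicas:
  - name: web             # Deployment name from the base
    count: 1              # dev runs a single replica (assumed value)
images:
  - name: nginx
    newTag: "1.25"        # pin the tag per environment (assumed value)
```

The overlay only states what differs from the base, so a prod overlay could reuse the same structure with a higher count.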
Helm (templating and releases)
helm create web
helm install web ./web -n default
helm upgrade --install web ./web --set image.tag=1.2.3
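For more than one or two overrides, a values file is usually easier to review than repeated --set flags. The sketch below assumes the chart still uses the default value keys generated by helm create; the file name values-prod.yaml and all values are hypothetical.

```yaml
# values-prod.yaml (hypothetical file name)
replicaCount: 3
image:
  repository: ghcr.io/org/app
  tag: "1.2.3"
service:
  type: ClusterIP
  port: 80
resources:
  requests: { cpu: 100m, memory: 128Mi }
  limits: { cpu: 500m, memory: 512Mi }
```

It would be applied with: helm upgrade --install web ./web -f values-prod.yaml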
5. Observability and Debugging
Everyday commands
kubectl get events --sort-by=.lastTimestamp -A
kubectl top pod -A # requires metrics-server
kubectl exec -it deploy/web -- sh
kubectl cp ./local.txt default/web-xxxxx:/tmp/local.txt
Ecosystem tools
- stern: tail logs from multiple Pods
- k9s: terminal UI for cluster management
- Prometheus + Grafana: metrics monitoring
- Loki + Tempo + Grafana: logs / traces
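6. Production Deployment Essentials
High availability and elasticity
- Replica count and spreading across nodes (PodAntiAffinity)
- HPA (CPU / memory / custom metrics)
- PDB (PodDisruptionBudget)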
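The HPA and PDB items can be sketched as manifests for the web Deployment from section 3. The thresholds and replica bounds are illustrative assumptions, and CPU-based scaling additionally needs metrics-server and CPU requests on the containers.

```yaml
# Scale web between 2 and 6 replicas, targeting 70% average CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: web }
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
---
# Keep at least one web Pod running during voluntary disruptions (e.g. node drain)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata: { name: web }
spec:
  minAvailable: 1
  selector:
    matchLabels: { app: web }
```

When an HPA manages a Deployment, it is usually safer not to pin replicas in the Deployment manifest, since each apply would otherwise reset the count.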
Networking and ingress
- Ingress Controller (NGINX/Traefik)
- Service types (ClusterIP/NodePort/LoadBalancer)
- Global traffic management (Gateway API / Service Mesh)
Reliability and rollback
- RollingUpdate / Blue-Green / Canary
- Helm / Argo Rollouts strategies
- Immutable image tags and rollback
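As one concrete case, a RollingUpdate strategy is configured on the Deployment itself. The fragment below is a sketch with assumed values: it never drops below the desired replica count during a rollout and keeps a few old ReplicaSets for rollback.

```yaml
# Fragment of a Deployment spec, illustrative values
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra Pod during the rollout
      maxUnavailable: 0    # never go below the desired replica count
  revisionHistoryLimit: 5  # old ReplicaSets retained for rollback
```

A failed rollout can then be reverted with kubectl rollout undo deploy/web. Blue-Green and Canary are not built into Deployment and need extra tooling such as Argo Rollouts.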
Cost and resources
- Requests/Limits and QoS
- Vertical/horizontal scaling (VPA/HPA)
- Node autoscaling (Cluster Autoscaler)
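The QoS class of a Pod is derived from its requests and limits. It is Guaranteed when every container sets CPU and memory requests equal to its limits, BestEffort when no container sets any, and Burstable otherwise. The earlier example with 100m/128Mi requests and 500m/512Mi limits would therefore be Burstable. A sketch of a Guaranteed container, with assumed values:

```yaml
# Container-level fragment: requests equal limits for both CPU and memory,
# so a Pod whose containers all look like this is classed as Guaranteed
resources:
  requests: { cpu: "500m", memory: "512Mi" }
  limits: { cpu: "500m", memory: "512Mi" }
```

Guaranteed Pods are the last to be evicted under node memory pressure, at the cost of reserving their full limit on the node.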
7. Security and Compliance
RBAC and least privilege
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata: { name: app-read, namespace: default }
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
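A Role grants nothing until it is bound to a subject. The sketch below binds app-read to a ServiceAccount; the ServiceAccount name web and the binding name are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata: { name: web, namespace: default }
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata: { name: app-read-binding, namespace: default }
subjects:
  - kind: ServiceAccount
    name: web
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: app-read          # the Role defined above
```

The result can be checked with: kubectl auth can-i list pods --as=system:serviceaccount:default:web -n default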
Networking and secrets
- NetworkPolicy (isolate namespaces and services)
- Secret management: Sealed Secrets / External Secrets
- Image signing and supply-chain security (Cosign/SLSA)
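A common NetworkPolicy pattern is to deny all ingress in a namespace and then allow specific paths. The sketch below assumes the ingress controller runs in a namespace named ingress-nginx and that the cluster's CNI plugin enforces NetworkPolicy; without such a plugin the policies are accepted but have no effect.

```yaml
# 1) Deny all inbound traffic to every Pod in the default namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny-ingress, namespace: default }
spec:
  podSelector: {}
  policyTypes: [Ingress]
---
# 2) Allow only the ingress controller namespace to reach web on port 80
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: allow-ingress-to-web, namespace: default }
spec:
  podSelector:
    matchLabels: { app: web }
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx  # assumed namespace
      ports:
        - protocol: TCP
          port: 80
```

Policies are additive, so the second one opens a single path through the default deny set by the first.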
8. CI/CD and GitOps
GitHub Actions: build and push an image
name: ci
on: { push: { branches: [ main ] } }
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write   # lets GITHUB_TOKEN push to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: ghcr.io/org/app:${{ github.sha }}
Argo CD (GitOps)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd   # assumes Argo CD runs in its default argocd namespace
spec:
  project: default
  source:
    repoURL: https://github.com/org/infra
    path: apps/web
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated: { prune: true, selfHeal: true }
Flux (an alternative)
- GitRepository / Kustomization controllers
- Support for OCI Helm repositories
- Integrates with Kyverno/OPA for compliance
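For comparison with the Argo CD Application, a roughly equivalent Flux setup might look like the sketch below. It assumes Flux is installed in the flux-system namespace and reuses the repository URL and path from the Argo CD example; the intervals are illustrative.

```yaml
# Tells Flux where to fetch manifests from
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata: { name: infra, namespace: flux-system }
spec:
  interval: 1m
  url: https://github.com/org/infra
  ref:
    branch: main
---
# Applies the manifests under apps/web and removes resources deleted from Git
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata: { name: web, namespace: flux-system }
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: infra
  path: ./apps/web
  prune: true
  targetNamespace: default
```

Flux splits the source and the apply step into two objects, whereas Argo CD combines both in one Application.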
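9. Running in the Cloud (GKE / EKS / AKS)
GKE
- Autopilot mode reduces operational overhead
- Cloud NAT / Ingress / NEG
- Integrates with Cloud Build / Artifact Registry
EKS
- VPC CNI / AWS Load Balancer Controller (formerly ALB Ingress Controller)
- IRSA (IAM Roles for Service Accounts)
- EKS Blueprints for a quick start
AKS
- Azure CNI / App Gateway Ingress
- ACR / Managed Identity
- Built-in monitoring and security center integration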
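10. Troubleshooting Common Problems
Service unreachable:
- Check kubectl get svc,ingress and the backend port mapping
- Review network policies and, in the cloud, security groups
- Read the Ingress Controller logs
Pod restarts repeatedly:
- Find why probes fail: kubectl describe pod
- Check whether resource limits are causing OOMKilled
- Use kubectl logs -p to view the logs of the previous, crashed container
Deployment has no effect:
- Compare desired and current state: kubectl get deploy -o yaml
- Confirm that selectors and labels match
- Sort events by time: kubectl get events --sort-by=.lastTimestamp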
11. Command Cheat Sheet
# Resources and namespaces
kubectl get all -A
kubectl config get-contexts
kubectl config use-context <ctx>
kubectx kind-dev && kubens default  # separate tools; switch context first, then namespace
# Debugging and ports
kubectl describe pod <pod>
kubectl logs -f deploy/web
kubectl port-forward svc/web 8080:80
# Deploying applications
kubectl apply -f k8s/
helm upgrade --install web ./chart --namespace default
kubectl kustomize k8s/overlays/prod | kubectl apply -f -
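Best practices at a glance
- Set requests/limits for every container
- Configure Readiness/Liveness probes for critical services
- Enable HPA and PDB to improve availability
- Manage cluster changes with GitOps
Suggested learning path
- Get started with a local cluster → YAML basics → kubectl proficiency
- Kustomize/Helm → build images in CI → GitOps
- Monitoring and logging → scaling and rollback → production in the cloud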