# Container Orchestration (Kubernetes)
## Question
**Background:** As containerization has spread, container orchestration has become the key to managing large container clusters. Kubernetes, the de facto standard, provides automated deployment, scaling, and management of containerized applications.
**Questions**
1. What is container orchestration? Why do we need Kubernetes?
2. What are the core components of the Kubernetes architecture?
3. What is the relationship between Pod, Deployment, and Service?
4. Describe the Kubernetes network model.
5. How does Kubernetes implement service discovery and load balancing?
6. What are ConfigMap and Secret? How are they used?
7. What types of storage volumes (Volumes) does Kubernetes support?
8. Describe the Kubernetes scheduling flow.
9. What is Ingress? How does it differ from NodePort and LoadBalancer?
10. What pitfalls have you hit running Kubernetes in production?
---
## Reference Answer
### 1. Container Orchestration Overview
#### **Why container orchestration is needed**
```
Pain points of single-host Docker:
├─ Complex container lifecycle management
├─ Difficult service discovery and load balancing
├─ Complicated rolling updates and rollbacks
├─ Poor resource scheduling and utilization
├─ High availability and self-healing are hard to achieve
└─ Complex multi-host networking

What Kubernetes provides:
├─ Automated deployment and rollback
├─ Service discovery and load balancing
├─ Self-healing (restart on failure, rescheduling to healthy nodes)
├─ Automatic scaling (HPA)
├─ Storage orchestration
└─ Configuration and secret management
```
---
### 2. Kubernetes Core Architecture
#### **Architecture diagram**
```
┌─────────────────────────────────────────────┐
│                Control Plane                │
│               (master nodes)                │
│                                             │
│  ┌────────────┐ ┌───────────┐ ┌──────────┐  │
│  │ API Server │ │ Scheduler │ │Controller│  │
│  │            │ │           │ │ Manager  │  │
│  └────────────┘ └───────────┘ └──────────┘  │
│  ┌────────────┐ ┌───────────┐               │
│  │    etcd    │ │Cloud Ctrl │               │
│  │ (storage)  │ │ Manager   │               │
│  └────────────┘ └───────────┘               │
└─────────────────────────────────────────────┘
                      │  HTTP/REST API
┌─────────────────────────────────────────────┐
│                Worker Nodes                 │
├─────────────────────────────────────────────┤
│      Node 1               Node 2            │
│  ┌─────────────┐      ┌─────────────┐       │
│  │   kubelet   │      │   kubelet   │       │
│  │ (Pod agent) │      │             │       │
│  ├─────────────┤      ├─────────────┤       │
│  │ kube-proxy  │      │ kube-proxy  │       │
│  │ (net proxy) │      │             │       │
│  ├─────────────┤      ├─────────────┤       │
│  │  Container  │      │  Container  │       │
│  │  Runtime    │      │  Runtime    │       │
│  │ (Docker/...)│      │             │       │
│  ├─────────────┤      ├─────────────┤       │
│  │    Pods     │      │    Pods     │       │
│  │ ┌─────────┐ │      │ ┌─────────┐ │       │
│  │ │  App 1  │ │      │ │  App 2  │ │       │
│  │ └─────────┘ │      │ └─────────┘ │       │
│  └─────────────┘      └─────────────┘       │
└─────────────────────────────────────────────┘
```
(Note: kube-proxy runs on every worker node, not in the control plane.)
#### **Core components**
**1. API Server (kube-apiserver)**
- The front door of Kubernetes; every request goes through the API Server
- Authentication, authorization, and admission control
- Exposes a RESTful API
**2. etcd**
- Distributed key-value store
- Holds all cluster state
- Watch mechanism pushes changes to clients
**3. Scheduler**
- Decides which node each Pod runs on
- Scheduling considers resource requests, hardware constraints, and affinity/anti-affinity
**4. Controller Manager**
- Drives the cluster toward its desired state
- Common controllers:
  - Node Controller: handles node failures
  - Replication Controller: maintains replica counts
  - Endpoint Controller: manages Service endpoints
**5. kubelet**
- Runs on every node
- Manages the Pod lifecycle
- Reports node status
**6. kube-proxy**
- Maintains network rules on each node
- Implements Service load balancing
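All of the controllers above run the same control loop: observe actual state, compare it with desired state, and act to close the gap. A toy sketch of one pass of a ReplicaSet-style loop (names and shapes are illustrative, not the real client-go API):

```python
def reconcile(desired_replicas: int, running_pods: list) -> list:
    """One pass of a toy ReplicaSet-style control loop:
    emit create/delete actions until actual matches desired."""
    actions = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Scale out: create the missing pods
        actions = [("create", f"pod-{i}") for i in range(diff)]
    elif diff < 0:
        # Scale in: delete the surplus pods
        actions = [("delete", name) for name in running_pods[desired_replicas:]]
    return actions

print(reconcile(3, ["pod-a"]))                    # creates two pods
print(reconcile(1, ["pod-a", "pod-b", "pod-c"]))  # deletes two pods
```

The real controllers watch the API Server for changes and re-run this loop continuously, which is why a deleted Pod "comes back": the loop sees one replica fewer than desired and creates a replacement.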
---
### 3. Pod, Deployment, and Service
#### **How they relate**
```
Deployment (declarative deployment)
└── manages a ReplicaSet (replica set)
    └── manages Pods (the smallest schedulable unit)
        ├── Container 1 (application container)
        ├── Container 2 (sidecar)
        └── Shared Volume (shared storage)

Service (service discovery)
├── selects Pods via a label selector
└── provides a stable access point (IP/DNS)
```
#### **Pod example**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
    env: prod
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: sidecar
    image: fluentd:1.12
    volumeMounts:
    - name: log-volume
      mountPath: /var/log
  volumes:
  - name: log-volume
    emptyDir: {}
```
#### **Deployment example**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3          # 3 replicas
  selector:
    matchLabels:
      app: nginx
  template:            # Pod template
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 3
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 3
```
#### **Service example**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx         # selects Pods by label
  ports:
  - protocol: TCP
    port: 80           # Service port
    targetPort: 80     # Pod port
  type: ClusterIP      # Service type
```
**Three Service types**
```yaml
# 1. ClusterIP (default): reachable only inside the cluster
type: ClusterIP

# 2. NodePort: the same port is opened on every node
type: NodePort
ports:
- port: 80
  targetPort: 80
  nodePort: 30080    # every node exposes port 30080

# 3. LoadBalancer: the cloud provider provisions an external load balancer
type: LoadBalancer
```
---
### 4. Kubernetes Network Model
#### **Network requirements**
```
1. All Pods can communicate directly, without NAT
2. All nodes can communicate with all Pods
3. The IP a Pod sees for itself is the same IP others see
```
#### **Network architecture**
```
                    Internet
                       │
                 ┌──────────┐
                 │ Ingress  │
                 └──────────┘
                       │
         Service (ClusterIP: 10.0.0.1)
                       │
     ┌─────────────────┼─────────────────┐
     │                 │                 │
Pod (10.244.1.2)  Pod (10.244.1.3)  Pod (10.244.2.5)
   Node 1            Node 1            Node 2
```
#### **Network plugins (CNI)**
| Plugin | Type | Characteristics |
|------|------|------|
| Flannel | VXLAN/host-gw | Simple; average performance |
| Calico | BGP | Good performance; supports network policies |
| Cilium | eBPF | High performance; supports transparent proxying |
| Weave | VXLAN | Simple; supports encryption |
**Calico example**
```bash
# Install Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
```
```yaml
# Network policy: only allow ingress from Pods in the same namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
  - from:
    - podSelector: {}
```
---
### 5. Service Discovery and Load Balancing
#### **Service discovery**
```yaml
# 1. Environment variables
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app
    env:
    - name: DB_SERVICE_HOST
      value: "mysql-service"
    - name: DB_SERVICE_PORT
      value: "3306"

# 2. DNS (recommended)
# Pods can reach a Service by its DNS name:
# mysql-service.default.svc.cluster.local
```
**Kubernetes DNS architecture**
```
Pod starts → /etc/resolv.conf is configured:
  nameserver 10.96.0.10   # ClusterIP of kube-dns/CoreDNS
  search default.svc.cluster.local svc.cluster.local cluster.local
        │
Resolve the name
  mysql-service.default.svc.cluster.local
        │
DNS returns the Service ClusterIP (e.g. 10.0.0.1)
        │
kube-proxy load-balances to a backend Pod
```
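The `search` list in resolv.conf is why a Pod can use the short name `mysql-service`: the resolver appends each suffix in order until one resolves. A minimal sketch of that expansion (illustrative; the real resolver also applies an `ndots` rule):

```python
# Search suffixes as configured in the Pod's /etc/resolv.conf above
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

def candidate_fqdns(name: str) -> list:
    """Names the resolver will try, in order, for a short name.
    A trailing dot marks the name as fully qualified: try it as-is."""
    if name.endswith("."):
        return [name.rstrip(".")]
    return [f"{name}.{suffix}" for suffix in SEARCH] + [name]

print(candidate_fqdns("mysql-service")[0])
# mysql-service.default.svc.cluster.local
```

So the short name resolves first to the Service in the Pod's own namespace; a name like `mysql-service.other-ns` matches via the second suffix, reaching a Service in another namespace.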
#### **Load balancing**
**kube-proxy's three modes**
```yaml
# 1. userspace (legacy, poor performance)
mode: userspace

# 2. iptables (default): load balancing via iptables rules
mode: iptables

# 3. ipvs (recommended): uses IPVS, better performance at scale
mode: ipvs
```
**Session affinity**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  sessionAffinity: ClientIP       # session stickiness by client IP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800       # 3 hours
```
---
### 6. ConfigMap and Secret
#### **ConfigMap (configuration management)**
```yaml
# 1. Create a ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  application.properties: |
    server.port=8080
    spring.datasource.url=jdbc:mysql://localhost:3306/db
  log-level: "info"
  feature-flags: |
    featureA=true
    featureB=false

# 2. Use the ConfigMap
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: log-level
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config
```
#### **Secret (sensitive data)**
```yaml
# 1. Create a Secret
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=              # base64-encoded
  password: MWYyZDFlMmU2N2Rm

# 2. Use the Secret
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: password
```
**Creating Secrets from files**
```bash
# Create a TLS Secret
kubectl create secret tls my-tls-secret \
  --cert=path/to/cert.crt \
  --key=path/to/cert.key

# Create a Docker registry Secret
kubectl create secret docker-registry my-registry-secret \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=password
```
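A frequent interview follow-up: Secret `data` values are base64-encoded, not encrypted, so anyone who can read the Secret can read the value. A quick sketch of how a value like `YWRtaW4=` above is produced and decoded:

```python
import base64

def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears in a Secret's data field."""
    return base64.b64encode(plaintext.encode()).decode()

def decode_secret_value(encoded: str) -> str:
    """Decode a Secret data value back to plaintext."""
    return base64.b64decode(encoded).decode()

print(encode_secret_value("admin"))     # YWRtaW4=
print(decode_secret_value("YWRtaW4="))  # admin
```

For real confidentiality, enable encryption at rest for etcd and restrict who can `get` Secrets via RBAC.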
---
### 7. Storage Volumes (Volume)
#### **Common Volume types**
| Type | Description | Typical use |
|------|------|----------|
| emptyDir | Temporary directory; data is lost when the Pod is deleted | Scratch space, caches |
| hostPath | Host path; data survives Pod deletion | Log collection, monitoring agents |
| PersistentVolumeClaim | Persistent storage | Databases, application data |
| ConfigMap | Configuration files | Application config |
| Secret | Sensitive data | Keys, certificates |
#### **PV/PVC example**
```yaml
# 1. PersistentVolume (PV)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.1.100
    path: /data/nfs

# 2. PersistentVolumeClaim (PVC)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-example
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

# 3. Use the PVC in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app
    volumeMounts:
    - name: data-volume
      mountPath: /data
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: pvc-example
```
**StorageClass (dynamic provisioning)**
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  iopsPerGB: "10"
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
---
### 8. Kubernetes Scheduling Flow
#### **Scheduling flow**
```
1. A Pod is created
2. The API Server accepts the request and writes it to etcd
3. The Scheduler watches for unscheduled Pods
4. Filtering (Predicates): nodes that cannot run the Pod are removed
   - Enough resources (CPU, memory)?
   - Node selector (nodeSelector)
   - Affinity/anti-affinity
   - Taints and tolerations
5. Scoring (Priorities): the remaining nodes are scored
   - Resource utilization
   - Locally cached images
   - Pod spreading
6. The highest-scoring node is chosen
7. Binding: the Pod is bound to that node
8. The API Server updates the Pod's status
9. The node's kubelet sees the assignment and starts the containers
```
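Steps 4-6 above can be sketched as a toy two-phase scheduler. This is a deliberately simplified model (the `Node` fields and the single scoring rule are invented for illustration; the real scheduler combines many weighted scoring plugins):

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    cpu_free: float   # free CPU in cores
    mem_free: int     # free memory in MiB
    labels: dict

def schedule(pod_cpu: float, pod_mem: int, node_selector: dict, nodes: list):
    """Toy scheduler: filter (predicates), then score (priorities)."""
    # Filtering: drop nodes lacking resources or required labels
    feasible = [
        n for n in nodes
        if n.cpu_free >= pod_cpu and n.mem_free >= pod_mem
        and all(n.labels.get(k) == v for k, v in node_selector.items())
    ]
    if not feasible:
        return None  # the Pod stays Pending
    # Scoring: prefer the node left with the most free resources
    return max(feasible, key=lambda n: (n.cpu_free - pod_cpu) + (n.mem_free - pod_mem) / 1024)

nodes = [
    Node("node-1", cpu_free=0.2, mem_free=512,  labels={"disktype": "ssd"}),
    Node("node-2", cpu_free=2.0, mem_free=4096, labels={"disktype": "ssd"}),
    Node("node-3", cpu_free=4.0, mem_free=8192, labels={"disktype": "hdd"}),
]
best = schedule(pod_cpu=0.5, pod_mem=256, node_selector={"disktype": "ssd"}, nodes=nodes)
print(best.name)  # node-2: node-1 lacks CPU, node-3 fails the selector
```

The key takeaway for interviews: filtering is a hard yes/no per node, scoring is a soft preference among survivors, and an empty feasible set is exactly why Pods sit in `Pending`.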
#### **Scheduling constraint examples**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  # 1. nodeSelector (node selector)
  nodeSelector:
    disktype: ssd
  # 2. Node affinity
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - cn-shanghai-a
    # 3. Pod affinity (same affinity block; a spec may have only one affinity key)
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
        topologyKey: kubernetes.io/hostname
  # 4. Tolerations
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```
---
### 9. Ingress vs NodePort vs LoadBalancer
#### **Comparison**
| Type | Typical use | Pros | Cons |
|------|----------|------|------|
| NodePort | Testing, development | Simple | Port management is messy; modest performance |
| LoadBalancer | Production (on a cloud provider) | Automatic load balancing | Costly; tied to the cloud vendor |
| Ingress | Production (recommended) | Flexible; layer-7 routing | More complex to configure |
#### **NodePort example**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080   # must be in the 30000-32767 range
  selector:
    app: nginx
```
#### **LoadBalancer example**
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
```
#### **Ingress example**
```bash
# 1. Install an Ingress controller (e.g. ingress-nginx)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.0/deploy/static/provider/cloud/deploy.yaml
```
```yaml
# 2. Create the Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: example.com        # route by host name
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 80
  - host: api.example.com
    http:
      paths:
      - path: /v1
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 8080
```
**Advanced Ingress configuration**
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
  annotations:
    # TLS via cert-manager
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # Rate limiting
    nginx.ingress.kubernetes.io/limit-rps: "10"
    # Timeouts
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
spec:
  tls:
  - hosts:
    - example.com
    secretName: example-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 80
```
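Conceptually, the layer-7 routing an Ingress declares boils down to: match the Host header, then pick the longest matching `Prefix` path. A toy model of that decision (the rule table mirrors the two-host Ingress example above; real controllers handle `Exact` paths, wildcards, and a default backend):

```python
# (host, path prefix, backend service, port)
RULES = [
    ("example.com",     "/",   "nginx-service", 80),
    ("api.example.com", "/v1", "api-service",   8080),
]

def route(host: str, path: str):
    """Return (service, port) for a request, or None for the default backend."""
    matches = [r for r in RULES if r[0] == host and path.startswith(r[1])]
    if not matches:
        return None
    # Longest matching prefix wins
    best = max(matches, key=lambda r: len(r[1]))
    return best[2], best[3]

print(route("api.example.com", "/v1/users"))  # ('api-service', 8080)
print(route("example.com", "/index.html"))    # ('nginx-service', 80)
```

This is also the practical difference from NodePort/LoadBalancer: those operate at layer 4 (one port, one Service), while an Ingress multiplexes many Services behind one entry point by inspecting HTTP.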
---
### 10. Production Pitfalls
#### **Pitfall 1: Pod won't start (ImagePullBackOff)**
```bash
# Symptom: image pull fails
kubectl get pods
# NAME       READY  STATUS            RESTARTS  AGE
# nginx-pod  0/1    ImagePullBackOff  0         2m

# Diagnose
kubectl describe pod nginx-pod
# Events: Failed to pull image "nginx:latest": rpc error: code = Unknown

# Fix
# 1. Check the image name and tag
# 2. Check private registry credentials
kubectl create secret docker-registry my-registry-secret \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=password
```
```yaml
# 3. Reference the Secret in the Pod spec
spec:
  imagePullSecrets:
  - name: my-registry-secret
```
#### **Pitfall 2: CrashLoopBackOff**
```bash
# Symptom: the Pod keeps restarting
kubectl get pods
# NAME       READY  STATUS            RESTARTS  AGE
# nginx-pod  0/1    CrashLoopBackOff  5         10m

# Diagnose
kubectl logs nginx-pod
# Error: Cannot connect to database

# Fix
# 1. Check the application logs
# 2. Check the configuration
# 3. Check dependent services (database, Redis, ...)
kubectl describe pod nginx-pod   # inspect the Events section
```
#### **Pitfall 3: Badly tuned resource limits**
```yaml
# Symptom: Pods get killed (OOMKilled)
# Cause: the memory limit is too small
# Fix: set realistic requests and limits
resources:
  requests:
    memory: "256Mi"   # guaranteed minimum memory
    cpu: "500m"       # guaranteed minimum CPU
  limits:
    memory: "512Mi"   # memory ceiling
    cpu: "1000m"      # CPU ceiling
```
```bash
# Monitor actual usage (requires Metrics Server)
kubectl top pods
kubectl top nodes
```
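Interviewers sometimes check whether candidates can actually read these quantities: CPU is measured in millicores (`500m` = 0.5 cores) and memory in binary suffixes (`256Mi` = 256 × 1024² bytes). A toy parser for the two formats used above (illustrative; the real quantity grammar also covers decimal suffixes like `M`/`G` and exponents):

```python
def parse_cpu(q: str) -> float:
    """'500m' -> 0.5 cores, '2' -> 2.0 cores."""
    return int(q[:-1]) / 1000 if q.endswith("m") else float(q)

# Binary suffixes used by Kubernetes memory quantities
MEM_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}

def parse_memory(q: str) -> int:
    """'256Mi' -> bytes; a plain integer is already bytes."""
    for suffix, factor in MEM_SUFFIXES.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)

print(parse_cpu("500m"))      # 0.5
print(parse_memory("256Mi"))  # 268435456
```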
#### **Pitfall 4: Failed rolling update**
```bash
# Symptom: after an update, no Pods become available
kubectl rollout status deployment/nginx-deployment
# Waiting for deployment "nginx-deployment" to progress

# Fix
# 1. Roll back to the previous revision
kubectl rollout undo deployment/nginx-deployment
# 2. Inspect the revision history
kubectl rollout history deployment/nginx-deployment
```
```yaml
# 3. Add health checks so a bad revision is caught before the rollout completes
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
```
#### **Pitfall 5: DNS resolution failures**
```bash
# Symptom: a Pod cannot reach a Service by name
curl http://nginx-service.default.svc.cluster.local
# curl: (6) Could not resolve host

# Diagnose
kubectl exec -it my-app -- cat /etc/resolv.conf
# nameserver 10.96.0.10

# Fix
# 1. Check that kube-dns/CoreDNS is running
kubectl get pods -n kube-system
# 2. Check the DNS configuration
kubectl get configmap coredns -n kube-system -o yaml
# 3. Restart CoreDNS
kubectl rollout restart deployment/coredns -n kube-system
```
---
### 11. Real-World Project Experience
#### **Scenario 1: Highly available deployment**
```yaml
# Goal: keep the service available through node failures and restarts
# Approach:
# 1. Multiple replicas
replicas: 3
# 2. Pod anti-affinity (spread replicas across nodes)
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx
        topologyKey: kubernetes.io/hostname
# 3. Health checks
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
# 4. Resource limits
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "1000m"
```
#### **Scenario 2: Autoscaling (HPA)**
```bash
# 1. Install the Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
```yaml
# 2. Create the HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50    # scale out when CPU utilization exceeds 50%
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80    # scale out when memory utilization exceeds 80%
```
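The HPA's documented core rule is `desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue)`, clamped to the min/max bounds. A minimal sketch of that arithmetic (tolerance windows and stabilization behavior of the real controller are omitted):

```python
import math

def desired_replicas(current_replicas: int, current_utilization: float,
                     target_utilization: float,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """HPA rule: desired = ceil(current * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, desired))

# CPU at 90% against a 50% target: 3 replicas -> ceil(3 * 90/50) = 6
print(desired_replicas(3, 90, 50))  # 6
# CPU at 20%: scale in, but never below minReplicas
print(desired_replicas(3, 20, 50))  # 2
```

With multiple metrics configured (CPU and memory above), the HPA computes a desired count per metric and takes the largest.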
#### **Scenario 3: Configuration management**
```yaml
# Goal: different configuration per environment
# Approach: one ConfigMap per environment/namespace

# Development
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config-dev
  namespace: dev
data:
  spring.profiles.active: "dev"
  spring.datasource.url: "jdbc:mysql://dev-mysql:3306/db"

# Production
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config-prod
  namespace: prod
data:
  spring.profiles.active: "prod"
  spring.datasource.url: "jdbc:mysql://prod-mysql:3306/db"

# Pod consuming the ConfigMap
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: prod
spec:
  containers:
  - name: app
    image: my-app
    envFrom:
    - configMapRef:
        name: app-config-prod
```
---
### 12. Bonus Points for Senior Candidates (Alibaba P7)
**Architecture and design**
- Designed large Kubernetes clusters (1000+ nodes)
- Multi-cluster / multi-cloud Kubernetes management experience
- Built custom Controllers and Operators
**Deep understanding**
- Familiar with Kubernetes source code (scheduler, controllers, network model)
- Understands container runtimes (Docker, containerd, CRI-O)
- CNI plugin development experience
**Performance tuning**
- Tuned etcd (compaction, snapshot strategy)
- Tuned kubelet parameters (max Pods per node, image garbage collection)
- Optimized networking (CNI plugin choice, MTU configuration)
**Production practice**
- Led a migration from Docker Swarm to Kubernetes
- Resolved hard production incidents (network partitions, etcd data recovery)
- Implemented multi-tenant isolation in Kubernetes
**Cloud-native ecosystem**
- Helm chart development and templated deployments
- Monitored Kubernetes with Prometheus + Grafana
- Built Kubernetes CI/CD pipelines (GitOps, Argo CD)
**Security practice**
- Applied Pod Security Standards (successor to Pod Security Policy)
- RBAC permission management experience
- Enforced security policies with Falco/Kyverno
- Implemented image signing and verification (Notary)
**Cost optimization**
- Node autoscaling with Cluster Autoscaler
- Pod priority and preemption
- Spot instances to cut cost
- Resource quotas and limit ranges