interview/questions/ci-cd.md
yasinshaw d80d1cf553 feat: add infrastructure interview questions
Add comprehensive interview materials for:
- Service Mesh (Istio, Linkerd)
- RPC Framework (Dubbo, gRPC)
- Container Orchestration (Kubernetes)
- CI/CD (Jenkins, GitLab CI, GitHub Actions)
- Observability (Monitoring, Logging, Tracing)

Each file includes:
- 5-10 core questions
- Detailed standard answers
- Code examples
- Real-world project experience
- Alibaba P7 bonus points

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-01 00:06:28 +08:00

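# CI/CD (Continuous Integration / Continuous Deployment)
## Questions
**Background**: CI/CD is a core practice of modern software development. Automating build, test, and deployment raises both delivery speed and quality.
**Questions**:
1. What is CI/CD, and what problems does it solve?
2. What are the core concepts of Jenkins Pipeline?
3. Describe a complete CI/CD flow.
4. What are the differences between GitLab CI and GitHub Actions?
5. How do you implement blue-green deployment and canary release?
6. How are environment variables and secrets managed in CI/CD?
7. How do you implement Infrastructure as Code (IaC)?
8. How does a CI/CD pipeline integrate tests (unit, integration, and E2E tests)?
9. How do you roll back a failed deployment?
10. How do you design a CI/CD pipeline in a real project?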
---
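## Standard Answers
### 1. CI/CD Overview
#### **Definitions**
```
CI (Continuous Integration)
- Developers commit code frequently to a shared repository
- Every commit automatically triggers a build and tests
- Integration errors are caught early
CD (Continuous Delivery)
- Code that passes the tests is deployed to the staging environment automatically
- A production deployment is possible at any time
- The production deployment itself is triggered manually
CD (Continuous Deployment)
- Code that passes the tests is deployed to production automatically
- No human intervention
- Fully automated
```
#### **Problems It Solves**
```
Pain points of traditional development:
├─ Difficult integration (large merges with many conflicts)
├─ Slow test feedback (manual testing takes a long time)
├─ Risky deployments (manual deployments are error-prone)
├─ Long delivery cycles (weeks from development to release)
└─ Hard rollbacks (difficult to recover quickly when something breaks)
What CI/CD provides:
├─ Automated build and test (run on every commit)
├─ Fast feedback (test results within minutes)
├─ Automated deployment (one-click deployment to any environment)
├─ Fast delivery (multiple deployments per day)
└─ Easy rollback (previous versions are retained for a quick rollback)
```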
---
### 2. Jenkins Pipeline Core Concepts
#### **Pipeline Types**
```groovy
// 1. Declarative Pipeline (recommended)
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh 'mvn clean package'
            }
        }
        stage('Test') {
            steps {
                sh 'mvn test'
            }
        }
        stage('Deploy') {
            steps {
                sh 'kubectl apply -f k8s/'
            }
        }
    }
    post {
        success {
            echo 'Pipeline succeeded!'
        }
        failure {
            echo 'Pipeline failed!'
        }
    }
}
// 2. Scripted Pipeline (more flexible)
node {
    stage('Build') {
        sh 'mvn clean package'
    }
    stage('Test') {
        sh 'mvn test'
    }
    stage('Deploy') {
        sh 'kubectl apply -f k8s/'
    }
}
```
#### **Core Concepts**
**1. Agent**
```groovy
// Any available agent
agent any
// An agent with a specific label
agent { label 'linux' }
// Docker agent
agent {
    docker {
        image 'maven:3.8.1-openjdk-11'
        args '-v $HOME/.m2:/root/.m2'
    }
}
// Kubernetes agent (Pod template)
agent {
    kubernetes {
        yaml '''
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: maven
    image: maven:3.8.1-openjdk-11
    command: ["cat"]
    tty: true
'''
    }
}
```
**2. Stages**
```groovy
stages {
    stage('Build') {
        when {
            branch 'main' // Run only on the main branch
        }
        steps {
            sh 'mvn clean package'
        }
    }
    stage('Deploy to Staging') {
        when {
            branch 'develop'
        }
        steps {
            sh 'kubectl apply -f k8s/staging/'
        }
    }
    stage('Deploy to Production') {
        when {
            tag pattern: "v\\d+\\.\\d+\\.\\d+", comparator: "REGEXP"
        }
        steps {
            sh 'kubectl apply -f k8s/production/'
        }
    }
}
```
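The `when` conditions above amount to a small routing rule from branch or tag to target environment. A minimal Python sketch of that rule, assuming the `REGEXP` comparator has to match the whole tag name; the function names `is_release_tag` and `pick_environment` are illustrative and not part of Jenkins:

```python
import re
from typing import Optional

# Same pattern as the `tag pattern:` condition, written as a Python raw string.
RELEASE_TAG = re.compile(r"v\d+\.\d+\.\d+")


def is_release_tag(tag: str) -> bool:
    """True when the entire tag looks like v<major>.<minor>.<patch>."""
    return RELEASE_TAG.fullmatch(tag) is not None


def pick_environment(branch: Optional[str], tag: Optional[str]) -> Optional[str]:
    """Map a build to a deploy target, mirroring the stages above."""
    if tag is not None and is_release_tag(tag):
        return "production"
    if branch == "develop":
        return "staging"
    return None  # no deploy stage applies
```

Under these assumptions a tag such as `v1.2.3-rc1` does not trigger a production deploy, because the suffix is outside the pattern.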
**3. Post (post-build actions)**
```groovy
post {
    always {
        junit 'target/surefire-reports/*.xml' // Publish the test report
    }
    success {
        sh 'notify-success.sh' // Send a success notification
    }
    failure {
        sh 'notify-failure.sh' // Send a failure notification
    }
    unstable {
        echo 'This pipeline is unstable!'
    }
    changed {
        echo 'Pipeline status changed!'
    }
}
```
**4. Environment (environment variables)**
```groovy
// Fragment: an environment block placed inside a pipeline or a stage
environment {
    MAVEN_HOME = '/opt/maven'
    DATABASE_URL = credentials('database-url') // Read from Jenkins credentials
    DEPLOY_ENV = 'staging'
}
// Values can also be computed from a shell command
pipeline {
    agent any
    environment {
        APP_VERSION = sh(script: 'git describe --tags --always', returnStdout: true).trim()
    }
    stages {
        stage('Build') {
            steps {
                sh "mvn clean package -Dapp.version=${APP_VERSION}"
            }
        }
    }
}
```
**5. Parameters (parameterized builds)**
```groovy
pipeline {
    agent any
    parameters {
        string(name: 'DEPLOY_ENV', defaultValue: 'staging', description: 'Deploy environment')
        booleanParam(name: 'RUN_TESTS', defaultValue: true, description: 'Run tests')
        choice(name: 'TIER', choices: ['dev', 'staging', 'production'], description: 'Environment tier')
    }
    stages {
        stage('Deploy') {
            when {
                expression { params.DEPLOY_ENV == 'production' }
            }
            steps {
                sh "kubectl apply -f k8s/${params.DEPLOY_ENV}/"
            }
        }
    }
}
```
---
### 3. A Complete CI/CD Flow
#### **Flow Diagram**
```
Developer pushes code
                       ↓
CI/CD pipeline is triggered
                       ↓
┌──────────────────────────────────────────────┐
│ 1. Code checks                               │
│    - Style checks (ESLint, Checkstyle)       │
│    - Static analysis (SonarQube)             │
│    - Security scanning (Snyk, OWASP)         │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 2. Build                                     │
│    - Compile (Maven/Gradle/npm)              │
│    - Package (JAR/WAR/Docker image)          │
│    - Version tagging (git tag)               │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 3. Test                                      │
│    - Unit tests (JUnit, pytest)              │
│    - Integration tests (Testcontainers)      │
│    - Code coverage (JaCoCo)                  │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 4. Build image                               │
│    - Docker build                            │
│    - Push to image registry (Docker Hub/ECR) │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 5. Deploy to staging                         │
│    - Kubernetes deploy                       │
│    - Database migration (Flyway/Liquibase)   │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 6. Automated tests                           │
│    - E2E tests (Selenium/Cypress)            │
│    - Performance tests (JMeter/Gatling)      │
│    - Security tests (OWASP ZAP)              │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 7. Manual review (optional)                  │
│    - Review test reports                     │
│    - Review the change set                   │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 8. Deploy to production                      │
│    - Blue-green deployment / canary release  │
│    - Monitoring and verification             │
└──────────────────────────────────────────────┘
                       ↓
┌──────────────────────────────────────────────┐
│ 9. Release notification                      │
│    - DingTalk / WeCom / Slack                │
│    - Release log                             │
└──────────────────────────────────────────────┘
```
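The flow above is strictly sequential: a failed step stops everything after it. A minimal Python sketch of that fail-fast behaviour, assuming each stage is a callable that returns `True` on success; `run_pipeline` and the stage names in the usage note are hypothetical, not any CI tool's API:

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Run stages in order; after the first failure, mark the rest as skipped."""
    results: List[Tuple[str, str]] = []
    failed = False
    for name, action in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        ok = action()
        results.append((name, "passed" if ok else "failed"))
        failed = not ok
    return results
```

For example, `run_pipeline([("lint", check_style), ("build", compile_app), ("deploy", deploy)])` never calls `deploy` if `compile_app` returns `False`, which is the property that keeps a broken build out of staging.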
#### **Jenkinsfile Example**
```groovy
pipeline {
    agent any
    tools {
        maven 'Maven 3.8.1'
        jdk 'JDK 11'
    }
    environment {
        IMAGE_NAME = "my-app"
        IMAGE_TAG = "${env.BUILD_NUMBER}"
        REGISTRY = "registry.example.com"
        KUBECONFIG = credentials('kubeconfig')
    }
    stages {
        stage('Checkout') {
            steps {
                git branch: 'main', url: 'https://github.com/myorg/myapp.git'
            }
        }
        stage('Code Quality') {
            steps {
                sh 'mvn checkstyle:check'
                sh 'mvn spotbugs:check'
                script {
                    def scannerHome = tool 'SonarQube Scanner'
                    withSonarQubeEnv('MySonarQube') {
                        sh "${scannerHome}/bin/sonar-scanner"
                    }
                }
            }
        }
        stage('Build') {
            steps {
                sh 'mvn clean package -DskipTests'
                archiveArtifacts artifacts: 'target/*.jar', fingerprint: true
            }
        }
        stage('Unit Test') {
            steps {
                sh 'mvn test'
                junit 'target/surefire-reports/*.xml'
            }
        }
        stage('Integration Test') {
            steps {
                sh 'mvn verify -Pintegration-test'
                junit 'target/failsafe-reports/*.xml'
            }
        }
        stage('Build & Push Docker Image') {
            steps {
                script {
                    docker.withRegistry("https://${REGISTRY}", 'docker-registry-credentials') {
                        def image = docker.build("${IMAGE_NAME}:${IMAGE_TAG}")
                        image.push()
                        image.push('latest')
                    }
                }
            }
        }
        stage('Deploy to Staging') {
            steps {
                sh """
                    kubectl set image deployment/my-app \
                        my-app=${REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG} \
                        --namespace=staging
                """
                sh 'kubectl rollout status deployment/my-app --namespace=staging'
            }
        }
        stage('E2E Test') {
            steps {
                sh 'mvn verify -Pe2e-test'
            }
        }
        stage('Deploy to Production') {
            when {
                branch 'main'
            }
            steps {
                input message: 'Deploy to Production?', ok: 'Deploy'
                sh """
                    kubectl set image deployment/my-app \
                        my-app=${REGISTRY}/${IMAGE_NAME}:${IMAGE_TAG} \
                        --namespace=production
                """
                sh 'kubectl rollout status deployment/my-app --namespace=production'
            }
        }
    }
    post {
        success {
            sh 'notify-success.sh'
        }
        failure {
            sh 'notify-failure.sh'
        }
        cleanup {
            // cleanup runs after every other post condition, so the workspace
            // is still intact while the notification scripts run
            cleanWs()
        }
    }
}
```
---
### 4. GitLab CI vs GitHub Actions
#### **Comparison Table**
| Feature | GitLab CI | GitHub Actions |
|---------|-----------|----------------|
| Integration | Built into GitLab | Built into GitHub |
| Configuration file | .gitlab-ci.yml | .github/workflows/*.yml |
| Runner types | Shared (instance) / Group / Specific (project) | Hosted / Self-hosted |
| Caching | artifacts/cache | actions/cache |
| Secret management | Variables/Secrets | Secrets/Environments |
| Matrix builds | Supported | Supported |
| Reuse | Include/Template | Reusable Workflows |
| Community ecosystem | Rich | Growing rapidly |
#### **GitLab CI Example**
```yaml
# .gitlab-ci.yml
stages:
  - build
  - test
  - package
  - deploy
variables:
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"
  IMAGE_NAME: registry.example.com/my-app
  IMAGE_TAG: $CI_PIPELINE_ID
cache:
  paths:
    - .m2/repository/
# Build
build:
  stage: build
  image: maven:3.8.1-openjdk-11
  script:
    - mvn clean package -DskipTests
  artifacts:
    paths:
      - target/*.jar
    expire_in: 1 week
# Unit tests
unit-test:
  stage: test
  image: maven:3.8.1-openjdk-11
  script:
    - mvn test
  artifacts:
    reports:
      junit: target/surefire-reports/*.xml
# Code quality
code-quality:
  stage: test
  image: sonarsource/sonar-scanner-cli
  script:
    - sonar-scanner
  allow_failure: true
# Build the image in its own stage so the JAR artifact from `build` is available
build-image:
  stage: package
  image: docker:20.10.7
  services:
    - docker:20.10.7-dind
  before_script:
    - echo "$REGISTRY_PASSWORD" | docker login -u "$REGISTRY_USER" --password-stdin registry.example.com
  script:
    - docker build -t $IMAGE_NAME:$IMAGE_TAG .
    - docker push $IMAGE_NAME:$IMAGE_TAG
    - docker tag $IMAGE_NAME:$IMAGE_TAG $IMAGE_NAME:latest
    - docker push $IMAGE_NAME:latest
# Deploy to staging
deploy-staging:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""] # The image's default entrypoint is kubectl itself
  script:
    - kubectl set image deployment/my-app my-app=$IMAGE_NAME:$IMAGE_TAG --namespace=staging
    - kubectl rollout status deployment/my-app --namespace=staging
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - develop
# Deploy to production
deploy-production:
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  script:
    - kubectl set image deployment/my-app my-app=$IMAGE_NAME:$IMAGE_TAG --namespace=production
    - kubectl rollout status deployment/my-app --namespace=production
  environment:
    name: production
    url: https://example.com
  when: manual # Triggered manually
  only:
    - main
```
#### **GitHub Actions Example**
```yaml
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
  release:
    types: [ created ]
env:
  REGISTRY: registry.example.com
  IMAGE_NAME: my-app
jobs:
  # Jobs run in parallel unless they declare `needs`
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up JDK 11
        uses: actions/setup-java@v3
        with:
          java-version: '11'
          distribution: 'temurin'
          cache: maven
      - name: Build with Maven
        run: mvn clean package -DskipTests
      - name: Run Unit Tests
        run: mvn test
      - name: Upload Test Results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: test-results
          path: target/surefire-reports/*.xml
      - name: Build Docker Image
        run: |
          docker build -t ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.run_number }} .
          docker tag ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.run_number }} ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
      - name: Login to Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ secrets.REGISTRY_USERNAME }}
          password: ${{ secrets.REGISTRY_PASSWORD }}
      - name: Push Docker Image
        run: |
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.run_number }}
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
  # Depends on build-and-test
  deploy-staging:
    needs: build-and-test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    steps:
      - uses: actions/checkout@v3
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
      # Assumes cluster credentials (a kubeconfig) are already available to the job
      - name: Deploy to Staging
        run: |
          kubectl set image deployment/my-app my-app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.run_number }} --namespace=staging
          kubectl rollout status deployment/my-app --namespace=staging
  deploy-production:
    needs: build-and-test
    runs-on: ubuntu-latest
    if: github.event_name == 'release'
    environment:
      name: production
      url: https://example.com
    steps:
      - uses: actions/checkout@v3
      - name: Set up kubectl
        uses: azure/setup-kubectl@v3
      # Assumes cluster credentials (a kubeconfig) are already available to the job
      - name: Deploy to Production
        run: |
          kubectl set image deployment/my-app my-app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.run_number }} --namespace=production
          kubectl rollout status deployment/my-app --namespace=production
      - name: Notify
        run: |
          curl -X POST ${{ secrets.SLACK_WEBHOOK }} \
            -H 'Content-Type: application/json' \
            -d '{"text":"Deployment to Production successful!"}'
```
---
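### 5. Blue-Green Deployment and Canary Release
#### **Blue-Green Deployment**
```
Steps:
1. Deploy the new version to the Green environment
2. Verify the Green environment (health checks, E2E tests)
3. Switch traffic to the Green environment
4. Keep the Blue environment around for a quick rollback
Advantages:
- Zero-downtime deployment
- Fast rollback (switch back to Blue)
- Low risk
Disadvantages:
- Requires double the resources
- Database migrations need special handling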
```
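The four steps above can be sketched as a tiny state machine in Python. This is an illustration only: the class `BlueGreenRouter` and the `healthy` callback are hypothetical stand-ins for the real traffic switch (for example a Service selector) and the real health check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    live: str = "blue"  # color that currently receives traffic

    @property
    def idle(self) -> str:
        """The color that is not serving traffic (the deploy target)."""
        return "green" if self.live == "blue" else "blue"

    def release(self, healthy: Callable[[str], bool]) -> str:
        """Switch traffic to the idle color only if it passes verification."""
        candidate = self.idle
        if healthy(candidate):
            self.live = candidate
        return self.live

    def rollback(self) -> str:
        """Point traffic back at the previous color, which was kept running."""
        self.live = self.idle
        return self.live
```

Because the old color keeps running after the switch, `rollback()` is just another flip of the pointer, which is why blue-green rollbacks are fast and also why the approach needs double the resources.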
**Kubernetes Implementation**
```yaml
# 1. Deploy Blue (current version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-blue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: blue
  template:
    metadata:
      labels:
        app: my-app
        version: blue
    spec:
      containers:
      - name: my-app
        image: my-app:1.0.0
---
# 2. Deploy Green (new version)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: green
  template:
    metadata:
      labels:
        app: my-app
        version: green
    spec:
      containers:
      - name: my-app
        image: my-app:2.0.0
---
# 3. Service pointing at Blue
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue # Currently points at Blue
  ports:
  - port: 80
    targetPort: 8080
# To switch to Green, change selector.version to green
```
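**Jenkins Pipeline**
```groovy
stage('Blue-Green Deployment') {
    steps {
        script {
            // 1. Deploy Green
            sh 'kubectl apply -f k8s/green-deployment.yaml'
            // 2. Wait until Green is ready
            sh 'kubectl rollout status deployment/my-app-green'
            // 3. Health check: fail the stage if Green is still unhealthy after 30 attempts.
            //    Assumes a separate Service named my-app-green fronts the Green pods.
            //    seq is used because {1..30} is a bash extension and sh may not expand it.
            sh '''
                for i in $(seq 1 30); do
                    if curl -fsS http://my-app-green.default.svc.cluster.local/health; then
                        exit 0
                    fi
                    sleep 5
                done
                echo "Green did not become healthy" >&2
                exit 1
            '''
            // 4. Switch traffic to Green
            sh "kubectl patch service my-app -p '{\"spec\":{\"selector\":{\"version\":\"green\"}}}'"
            // 5. Observe Green; rejecting the prompt or timing out switches back to Blue
            timeout(time: 5, unit: 'MINUTES') {
                input message: 'Verify Green environment. OK to proceed?', ok: 'Keep Green'
            }
        }
    }
    post {
        unsuccessful {
            // Roll back to Blue (also covers an aborted or timed-out approval)
            sh "kubectl patch service my-app -p '{\"spec\":{\"selector\":{\"version\":\"blue\"}}}'"
        }
    }
}
```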
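#### **Canary Deployment**
```
Steps:
1. Deploy the new version to a small share of instances (e.g. 10%)
2. Watch metrics such as error rate and latency
3. Increase traffic step by step (10% → 50% → 100%)
4. Roll back immediately if a problem appears
Advantages:
- Controlled risk
- Progressive rollout
- Problems surface quickly
Disadvantages:
- Requires traffic management (e.g. Istio)
- Demands good monitoring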
```
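The steps above form a loop: raise the traffic weight, measure, and either continue or roll back. A minimal Python sketch under stated assumptions: `error_rate_at` is a hypothetical callback that shifts traffic to the given weight and returns the observed error ratio, and the 1% threshold is only an example value.

```python
from typing import Callable, Sequence, Tuple


def run_canary(
    weights: Sequence[int],
    error_rate_at: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> Tuple[int, bool]:
    """Walk through the traffic weights; return (final canary weight, promoted)."""
    final_weight = 0
    for weight in weights:
        observed = error_rate_at(weight)
        if observed > max_error_rate:
            # Roll back: the canary gets no traffic, the stable version gets all of it.
            return 0, False
        final_weight = weight
    return final_weight, True
```

Calling `run_canary([10, 50, 100], measure)` promotes the new version only if every step stays under the threshold; a single bad measurement ends the rollout with the canary at 0%.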
**Istio Implementation**
```yaml
# 1. Deploy v1 and v2
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-v1
spec:
  replicas: 9
  selector: # Required by apps/v1; must match the Pod template labels
    matchLabels:
      app: my-app
      version: v1
  template:
    metadata:
      labels:
        app: my-app
        version: v1
    spec:
      containers:
      - name: my-app
        image: my-app:1.0.0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      version: v2
  template:
    metadata:
      labels:
        app: my-app
        version: v2
    spec:
      containers:
      - name: my-app
        image: my-app:2.0.0
---
# 2. VirtualService that configures the canary split
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - route:
    - destination:
        host: my-app
        subset: v1
      weight: 90 # 90% of traffic to v1
    - destination:
        host: my-app
        subset: v2
      weight: 10 # 10% of traffic to v2
---
# 3. DestinationRule that defines the subsets
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-app
spec:
  host: my-app
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
```
**Jenkins Pipeline**
```groovy
stage('Canary Deployment') {
    steps {
        script {
            // 1. Deploy v2 (1 replica)
            sh 'kubectl apply -f k8s/v2-deployment.yaml'
            // 2. Route 10% of traffic to v2
            sh 'kubectl apply -f istio/10-percent-canary.yaml'
            // 3. Observe for 5 minutes
            sleep(time: 5, unit: 'MINUTES')
            // 4. Check the share of 5xx responses. curl URL-encodes the query,
            //    and jq -r prints the value without JSON quotes so toDouble() can parse it.
            def errorRate = sh(
                script: '''
                    curl -sG http://prometheus/api/v1/query \\
                      --data-urlencode 'query=sum(rate(requests_total{status=~"5.."}[5m])) / sum(rate(requests_total[5m]))' |
                      jq -r '.data.result[0].value[1] // "0"'
                ''',
                returnStdout: true
            ).trim()
            if (errorRate.toDouble() > 0.01) {
                error "Error rate too high: ${errorRate}"
            }
            // 5. Increase traffic step by step
            sh 'kubectl apply -f istio/50-percent-canary.yaml'
            sleep(time: 5, unit: 'MINUTES')
            // 6. Send 100% of traffic to v2
            sh 'kubectl apply -f istio/100-percent-canary.yaml'
        }
    }
    post {
        failure {
            // Roll back to v1
            sh 'kubectl apply -f istio/rollback-to-v1.yaml'
        }
    }
}
```
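The error-rate check in the pipeline above reads one number out of a Prometheus instant-query response. A minimal Python sketch of that parsing step, assuming the standard response shape in which each sample is a `[timestamp, "value"]` pair; the function names and the choice to abort when no data is returned are illustrative assumptions.

```python
import json
from typing import Optional


def first_sample_value(body: str) -> Optional[float]:
    """Return the first sample of an instant query, or None if the result is empty."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    result = payload["data"]["result"]
    if not result:
        return None
    # The value is transmitted as a string, so it has to be converted explicitly.
    return float(result[0]["value"][1])


def should_abort(body: str, threshold: float = 0.01) -> bool:
    """Abort the rollout when the error ratio is above the threshold or unknown."""
    value = first_sample_value(body)
    return value is None or value > threshold
```

Treating "no data" as a reason to abort is a conservative design choice: an empty result often means the metric name or label is wrong, and a silent pass would hide that.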
---
### 6. Managing Environment Variables and Secrets
#### **Jenkins Credentials**
```groovy
// 1. Use Jenkins credentials
withCredentials([
    string(credentialsId: 'database-url', variable: 'DATABASE_URL'),
    usernamePassword(credentialsId: 'docker-registry', usernameVariable: 'REGISTRY_USER', passwordVariable: 'REGISTRY_PASSWORD')
]) {
    // Single quotes: the shell expands the secrets, so Groovy never interpolates them into the command
    sh '''
        echo "$REGISTRY_PASSWORD" | docker login -u "$REGISTRY_USER" --password-stdin registry.example.com
        # Caution: values passed with --build-arg remain visible in the image history
        docker build -t myapp:${BUILD_NUMBER} --build-arg DATABASE_URL="$DATABASE_URL" .
    '''
}
// 2. Use a Secret File
withCredentials([file(credentialsId: 'kubeconfig', variable: 'KUBECONFIG')]) {
    sh 'kubectl --kubeconfig=$KUBECONFIG get pods'
}
// 3. Use Secret Text
withCredentials([string(credentialsId: 'slack-webhook', variable: 'SLACK_WEBHOOK')]) {
    sh '''
        curl -X POST "$SLACK_WEBHOOK" \
            -H 'Content-Type: application/json' \
            -d '{"text":"Build successful!"}'
    '''
}
```
#### **GitLab CI Secrets**
```yaml
# Set CI/CD variables in the GitLab UI
variables:
  # Plain variable
  DEPLOY_ENV: production
# Masked variables (hidden in job logs)
build:
  script:
    - echo "$REGISTRY_PASSWORD" | docker login -u "$REGISTRY_USER" --password-stdin
    - echo $DATABASE_URL # Shown as [MASKED] in the job log
# File-type variables (the value is written to a file and the variable holds its path)
deploy:
  script:
    - kubectl --kubeconfig=$KUBECONFIG get pods
```
#### **GitHub Actions Secrets**
```yaml
# Set secrets in the GitHub UI
env:
  DATABASE_URL: ${{ secrets.DATABASE_URL }}
steps:
  - name: Login to Registry
    uses: docker/login-action@v2
    with:
      registry: registry.example.com
      username: ${{ secrets.REGISTRY_USERNAME }}
      password: ${{ secrets.REGISTRY_PASSWORD }}
  - name: Deploy
    run: |
      kubectl --kubeconfig <(echo "${{ secrets.KUBECONFIG }}") get pods
```
#### **Kubernetes Secrets**
```yaml
# 1. Create the Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  database-url: BASE64_ENCODED_URL
  api-key: BASE64_ENCODED_KEY
---
# 2. Use it in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app
    env:
    - name: DATABASE_URL
      valueFrom:
        secretKeyRef:
          name: app-secrets
          key: database-url
    - name: API_KEY
      valueFrom:
        secretKeyRef:
          name: app-secrets
          key: api-key
```
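The `BASE64_ENCODED_*` placeholders above stand for base64-encoded values, which is what the `data` field of a Secret expects. A minimal Python sketch of producing them; the helper name `encode_secret_data` and the sample values in the usage note are made up for illustration.

```python
import base64
from typing import Dict


def encode_secret_data(values: Dict[str, str]) -> Dict[str, str]:
    """Base64-encode each value so it can be pasted under `data:` in a Secret."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
```

For example, `encode_secret_data({"api-key": "s3cr3t"})` returns a dict whose `api-key` entry decodes back to `s3cr3t`. Base64 is an encoding, not encryption, so a manifest containing these values still has to be kept out of version control or encrypted by other means.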
---
### 7. Infrastructure as Code (IaC)
#### **Terraform Example**
```hcl
# main.tf
provider "aws" {
  region = "us-west-2"
}
# VPC
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = {
    Name = "main-vpc"
  }
}
# EKS Cluster
resource "aws_eks_cluster" "main" {
  name     = "main-cluster"
  role_arn = aws_iam_role.eks_cluster.arn
  vpc_config {
    subnet_ids = aws_subnet.private[*].id
  }
  depends_on = [aws_iam_role_policy_attachment.eks_cluster_policy]
}
# Node Group
resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "main-node-group"
  node_role_arn   = aws_iam_role.eks_nodes.arn
  subnet_ids      = aws_subnet.private[*].id
  scaling_config {
    desired_size = 3
    max_size     = 5
    min_size     = 1
  }
  instance_types = ["t3.medium"]
  depends_on     = [aws_iam_role_policy_attachment.eks_nodes_policy]
}
# RDS Database
resource "aws_db_instance" "main" {
  identifier                = "main-db"
  engine                    = "mysql"
  engine_version            = "8.0"
  instance_class            = "db.t3.micro"
  allocated_storage         = 20
  storage_encrypted         = true
  db_name                   = "mydb"
  username                  = var.db_username
  password                  = var.db_password
  vpc_security_group_ids    = [aws_security_group.db.id]
  db_subnet_group_name      = aws_db_subnet_group.main.name
  backup_retention_period   = 7
  skip_final_snapshot       = false
  final_snapshot_identifier = "main-db-final-snapshot"
}
# Output
output "cluster_endpoint" {
  value = aws_eks_cluster.main.endpoint
}
output "db_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true
}
```
**Using Terraform in CI/CD**
```groovy
stage('Terraform Apply') {
    steps {
        withCredentials([
            string(credentialsId: 'aws-access-key-id', variable: 'AWS_ACCESS_KEY_ID'),
            string(credentialsId: 'aws-secret-access-key', variable: 'AWS_SECRET_ACCESS_KEY')
        ]) {
            script {
                dir('terraform') {
                    // 1. Initialize
                    sh 'terraform init'
                    // 2. Check formatting
                    sh 'terraform fmt -check'
                    // 3. Validate
                    sh 'terraform validate'
                    // 4. Plan
                    def plan = sh(
                        script: 'terraform plan -out=tfplan',
                        returnStdout: true
                    )
                    // 5. Manual review
                    input message: "Review Terraform Plan:\n${plan}", ok: 'Apply'
                    // 6. Apply
                    sh 'terraform apply tfplan'
                }
            }
        }
    }
}
```
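The manual review step above relies on a person reading the whole plan. One hedged way to help the reviewer is to pull the add/change/destroy counts out of the plan text and flag any destroy. The Python sketch below assumes the human-readable summary contains a phrase of the form `N to add, N to change, N to destroy`; that wording is meant for people and may differ between Terraform versions, so `terraform show -json tfplan` is the sturdier input for real tooling. Both function names are hypothetical.

```python
import re
from typing import Optional, Tuple

_SUMMARY = re.compile(r"(\d+) to add, (\d+) to change, (\d+) to destroy")


def parse_plan_summary(plan_output: str) -> Optional[Tuple[int, ...]]:
    """Return (add, change, destroy), or None when no summary line is present."""
    match = _SUMMARY.search(plan_output)
    if match is None:
        return None
    return tuple(int(count) for count in match.groups())


def needs_extra_approval(plan_output: str) -> bool:
    """Flag plans that would destroy at least one resource."""
    summary = parse_plan_summary(plan_output)
    return summary is not None and summary[2] > 0
```

A pipeline could use `needs_extra_approval(plan)` to change the wording of the `input` prompt or to require a second approver when resources are about to be destroyed.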
---
### 8. Integrating Tests
#### **Pipeline Test Stages**
```groovy
stage('Test') {
    parallel {
        stage('Unit Test') {
            steps {
                sh 'mvn test'
                junit 'target/surefire-reports/*.xml'
            }
        }
        stage('Integration Test') {
            steps {
                sh 'mvn verify -Pintegration-test'
                junit 'target/failsafe-reports/*.xml'
            }
        }
    }
}
// Runs after the tests so that target/jacoco.exec already exists
stage('Code Coverage') {
    steps {
        sh 'mvn jacoco:report'
        jacoco execPattern: 'target/jacoco.exec', classPattern: 'target/classes', sourcePattern: 'src/main/java'
    }
}
stage('E2E Test') {
    steps {
        sh 'mvn verify -Pe2e-test'
        publishHTML([
            reportDir: 'target/cypress-report',
            reportFiles: 'index.html',
            reportName: 'E2E Test Report'
        ])
    }
}
```
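#### **Testcontainers Example**
```java
import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.GenericContainer;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

@SpringBootTest
@Testcontainers
public class UserServiceIntegrationTest {
    // Static containers are started once and shared by all tests in the class
    @Container
    private static final PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:13");
    @Container
    private static final GenericContainer<?> redis = new GenericContainer<>("redis:6")
            .withExposedPorts(6379);
    // User and UserService are the application's own classes
    @Autowired
    private UserService userService;
    // Point Spring at the randomly mapped container ports
    @DynamicPropertySource
    static void configureProperties(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
        registry.add("spring.redis.host", redis::getHost);
        registry.add("spring.redis.port", () -> redis.getFirstMappedPort());
    }
    @Test
    void shouldCreateUser() {
        User user = new User("Alice", "alice@example.com");
        User saved = userService.save(user);
        assertNotNull(saved.getId());
    }
}
```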
---
### 9. Rolling Back a Failed Deployment
#### **Kubernetes Rollback**
```bash
# 1. List the revision history
kubectl rollout history deployment/my-app
# 2. Roll back to the previous revision
kubectl rollout undo deployment/my-app
# 3. Roll back to a specific revision
kubectl rollout undo deployment/my-app --to-revision=3
# 4. Pause the rollout (when a problem appears)
kubectl rollout pause deployment/my-app
# 5. Resume the rollout
kubectl rollout resume deployment/my-app
```
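#### **Jenkins Pipeline Rollback**
```groovy
stage('Deploy') {
    steps {
        script {
            // 1. Record the current version
            def currentVersion = sh(
                script: 'kubectl get deployment my-app -o jsonpath="{.spec.template.spec.containers[0].image}"',
                returnStdout: true
            ).trim()
            echo "Currently deployed image: ${currentVersion}"
            // 2. Deploy the new version (IMAGE and TAG are defined elsewhere in the pipeline)
            sh "kubectl set image deployment/my-app my-app=${IMAGE}:${TAG}"
            // 3. Wait for the rollout to finish
            timeout(time: 5, unit: 'MINUTES') {
                sh 'kubectl rollout status deployment/my-app'
            }
            // 4. Health check: fail the stage if the app is still unhealthy after 30 attempts.
            //    seq is used because {1..30} is a bash extension and sh may not expand it.
            sh '''
                for i in $(seq 1 30); do
                    if curl -fsS http://my-app.default.svc.cluster.local/health; then
                        exit 0
                    fi
                    sleep 5
                done
                echo "my-app did not become healthy" >&2
                exit 1
            '''
            // 5. Observe
            sleep(time: 2, unit: 'MINUTES')
            // 6. Check the share of 5xx responses. curl URL-encodes the query,
            //    and jq -r prints the value without JSON quotes so toDouble() can parse it.
            def errorRate = sh(
                script: '''
                    curl -sG http://prometheus/api/v1/query \\
                      --data-urlencode 'query=sum(rate(requests_total{status=~"5.."}[5m])) / sum(rate(requests_total[5m]))' |
                      jq -r '.data.result[0].value[1] // "0"'
                ''',
                returnStdout: true
            ).trim()
            if (errorRate.toDouble() > 0.05) {
                error "Error rate too high: ${errorRate}, rolling back..."
            }
        }
    }
    post {
        unsuccessful {
            script {
                // Roll back to the previous revision (also covers a timed-out rollout)
                sh 'kubectl rollout undo deployment/my-app'
                echo 'Rolled back to previous version'
            }
        }
    }
}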
```
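The health-check loop in the pipeline above only protects the deployment if it reports failure once the attempts run out. A minimal Python sketch of that bounded retry, with the probe and the sleep function passed in so the logic can be tested without a network; `wait_until_healthy` is an illustrative name, and the defaults mirror the 30 attempts and 5-second pause used above.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe up to `attempts` times; return False if it never succeeds."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay)  # no need to wait after the final attempt
    return False
```

A deploy script would treat a `False` result as a failed deployment and trigger the rollback, instead of carrying on as if the service were healthy.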
---
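### 10. Real-World Project Experience
#### **Scenario 1: CI/CD Pipeline for an E-commerce System**
```
Requirements:
- Build and test automatically after every commit
- Deploy to staging automatically once the tests pass
- Deploy to production manually after staging passes the E2E tests
- Support blue-green deployment in production
Solution:
1. Build the Docker image with GitHub Actions
2. Push it to a private image registry
3. Deploy to the Kubernetes staging environment
4. Run Cypress E2E tests
5. Deploy to production after manual review
6. Use Istio for the blue-green switch
```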
#### **Scenario 2: Automated Database Migration**
```yaml
# Run database migrations automatically with Flyway
migrate:
  stage: migrate
  image: flyway/flyway:7
  script:
    - flyway migrate -url=$DATABASE_URL -user=$DATABASE_USER -password=$DATABASE_PASSWORD
  only:
    - main
```
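A versioned-migration tool has to decide which scripts are still pending and in what order to run them. The Python sketch below illustrates that idea only; it is not Flyway's actual algorithm. It assumes file names of the form `V<version>__<description>.sql`, and the helper names are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

_NAME = re.compile(r"V(\d+(?:[._]\d+)*)__(.+)\.sql")


def version_key(filename: str) -> Tuple[int, ...]:
    """Turn 'V1_2__add_index.sql' into (1, 2) so versions compare numerically."""
    match = _NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a versioned migration: {filename}")
    return tuple(int(part) for part in re.split(r"[._]", match.group(1)))


def pending_migrations(files: Iterable[str], applied: Tuple[int, ...]) -> List[str]:
    """Return the migrations newer than `applied`, oldest first."""
    newer = [name for name in files if version_key(name) > applied]
    return sorted(newer, key=version_key)
```

Comparing versions as integer tuples is the important detail: a plain string sort would place `V10__...` before `V2__...` and run the scripts in the wrong order.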
#### **Scenario 3: Multi-Environment Configuration Management**
```yaml
# Deploy to multiple environments with Helm charts
deploy-staging:
  script:
    - helm upgrade --install my-app ./helm-chart --namespace staging --values helm-chart/values-staging.yaml
deploy-production:
  script:
    - helm upgrade --install my-app ./helm-chart --namespace production --values helm-chart/values-production.yaml
```
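The per-environment values files above only need to list what differs from the chart defaults, because the override is layered on top of the base values. A minimal Python sketch of that layering, assuming nested maps are merged key by key while scalars and lists are replaced; `merge_values` and the sample values in the usage note are illustrative and not Helm's implementation.

```python
from typing import Any, Dict


def merge_values(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return base with override layered on top, without mutating either input."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)  # recurse into nested maps
        else:
            merged[key] = value  # scalars and lists are replaced wholesale
    return merged
```

With defaults `{"replicaCount": 1, "image": {"repository": "my-app", "tag": "latest"}}` and a production override `{"replicaCount": 3, "image": {"tag": "1.4.2"}}`, the merged result keeps `repository` from the defaults and takes `replicaCount` and `tag` from the override.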
---
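### 11. Alibaba P7 Bonus Points
**Architecture design**
- Designed an enterprise-grade CI/CD platform (multi-language, multi-environment)
- Built a self-hosted runner cluster (Kubernetes executors)
- Experience isolating CI/CD across multiple tenants and teams
**Deep understanding**
- Familiar with Jenkins/GitLab CI source code and plugin development
- Understands distributed caching and build acceleration (build cache, Docker layer cache)
- Hands-on GitOps experience (ArgoCD, Flux)
**Performance optimization**
- Reduced build times (parallel builds, incremental builds, caching strategies)
- Implemented distributed builds (build farm)
- Improved runner resource utilization (dynamic scaling)
**Security practices**
- Implemented CI/CD security scanning (SAST, DAST, dependency scanning)
- Experience with signing and verification (container image signing, commit signing)
- Implemented secret rotation and credential management
**Monitoring and observability**
- Integrated CI/CD monitoring (build success rate, build time, deployment frequency)
- Implemented deployment tracking and change logs
- Designed automated performance testing (k6 and JMeter integration)
**DevSecOps**
- Implemented shift-left security (pre-commit hooks, PR checks)
- Integrated compliance checks (PCI-DSS, GDPR)
- Implemented supply-chain security (SBOM, vulnerability scanning)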