Nginx, a high-performance web server and reverse proxy, has become a staple of Internet infrastructure. Out of the box, however, a default Nginx configuration comes nowhere near its real performance potential.
This article walks through Nginx performance optimization systematically, from basic configuration to system-level tuning, and from single-machine optimization to cluster architecture.
What you will learn:
- ✅ In-depth analysis of Nginx configuration parameters
- ✅ System-level performance tuning
- ✅ Caching strategies and CDN integration
- ✅ SSL/TLS performance optimization
- ✅ Advanced load-balancing techniques
- ✅ Monitoring and troubleshooting
- ✅ Hands-on experience scaling from 10K to 100K+ concurrency

Keywords: Nginx optimization, high concurrency, performance tuning, load balancing, caching strategy
Why benchmark first?
- Know where the current bottleneck actually is
- Make every optimization gain measurable
- Avoid premature optimization

| Tool | Best for | Pros | Cons |
|---|---|---|---|
| ab | Quick tests | Simple to use | Limited features |
| wrk | Stress tests | Very fast, Lua scripting | More complex setup |
| siege | Concurrency tests | Detailed statistics | Mediocre performance |
| JMeter | Complex scenarios | Feature-rich, GUI | Resource-hungry |
| Locust | Distributed load tests | Python scripting, extensible | Learning curve |
# 1. Quick test with ab (Apache Bench)
ab -n 10000 -c 100 http://localhost/
# Options:
# -n: total number of requests
# -c: concurrency level
# -t: test duration in seconds
# -k: enable HTTP keep-alive
# Sample output:
# Requests per second: 3421.56 [#/sec] (mean)
# Time per request: 29.226 [ms] (mean)
# Transfer rate: 684.31 [Kbytes/sec] received
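The two headline numbers in the ab output are related by Little's law: at steady state, mean time per request equals concurrency divided by throughput. A quick cross-check of the sample figures above:

```shell
# Cross-check ab's "Time per request (mean)" from the sample output above:
# mean_ms = concurrency / QPS, expressed in milliseconds.
concurrency=100
qps=3421.56
mean_ms=$(awk -v c="$concurrency" -v q="$qps" 'BEGIN {printf "%.3f", c * 1000 / q}')
echo "mean time per request: ${mean_ms} ms"   # matches the 29.226 ms ab reported
```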
# 2. Stress test with wrk (recommended)
wrk -t 12 -c 400 -d 30s --latency http://localhost/
# Options:
# -t: number of threads (rule of thumb: = CPU cores)
# -c: number of connections
# -d: test duration
# --latency: print the latency distribution
# Sample output:
# Running 30s test @ http://localhost/
# 12 threads and 400 connections
# Thread Stats Avg Stdev Max +/- Stdev
# Latency 45.32ms 12.89ms 201.15ms 87.23%
# Req/Sec 8.12k 1.33k 12.45k 72.15%
# Latency Distribution
# 50% 43.21ms
# 75% 51.34ms
# 90% 62.45ms
# 99% 89.12ms
# 2897456 requests in 30.01s, 2.13GB read
# Requests/sec: 96523.45
# Transfer/sec: 72.67MB
# 3. Sustained load with siege
siege -c 200 -t 60s http://localhost/
# 4. Advanced wrk usage (custom requests)
cat > post.lua <<EOF
wrk.method = "POST"
wrk.body = '{"key":"value"}'
wrk.headers["Content-Type"] = "application/json"
EOF
wrk -t 4 -c 100 -d 30s -s post.lua http://localhost/api
| Metric | Meaning | Excellent | Good | Needs work |
|---|---|---|---|---|
| QPS | Requests per second | >50k | 10k-50k | <10k |
| Latency (P50) | Response time at the 50th percentile | <10ms | 10-50ms | >50ms |
| Latency (P99) | Response time at the 99th percentile | <50ms | 50-200ms | >200ms |
| Error rate | Share of failed requests | <0.01% | 0.01-0.1% | >0.1% |
| Bandwidth | Network throughput | Near the physical limit | >50% | <50% |
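The percentile columns are computed by sorting the latency samples and indexing in (nearest-rank method). A tiny awk sketch, using ten made-up values as stand-ins for numbers you would extract from `$request_time` in the access log:

```shell
# Nearest-rank percentiles over a small sample set (values are illustrative).
samples="12 45 8 33 97 21 64 15 28 51"
p50=$(printf '%s\n' $samples | sort -n | awk '{a[NR]=$1} END {i=int(NR*0.50); if (i<1) i=1; print a[i]}')
p99=$(printf '%s\n' $samples | sort -n | awk '{a[NR]=$1} END {i=int(NR*0.99); if (i<1) i=1; print a[i]}')
echo "P50=${p50}ms P99=${p99}ms"
```

With only ten samples P99 collapses onto the second-worst value; on a real log with millions of lines the same one-liner gives a meaningful tail estimate.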
# 1. Real-time system resource monitoring
# CPU usage
top -p $(pgrep nginx | head -1)
# Memory usage
ps aux | grep nginx | awk '{sum+=$6} END {print sum/1024 "MB"}'
# Connection count on port 80
netstat -an | grep :80 | wc -l
# 2. Nginx status monitoring
# Enable the stub_status module
# nginx.conf:
location /nginx_status {
    stub_status on;
    access_log off;
    allow 127.0.0.1;
    deny all;
}
# Query it
curl http://localhost/nginx_status
# Output:
# Active connections: 291
# server accepts handled requests
# 16630948 16630948 31070465
# Reading: 6 Writing: 179 Keepalive: 106
# 3. Connection state summary
ss -s
# 4. TCP listen-queue overflows
netstat -s | grep -i listen
# 5. File descriptor usage
lsof -n | grep nginx | wc -l
cat /proc/sys/fs/file-nr
| Bottleneck | Symptom | How to check |
|---|---|---|
| CPU | CPU usage >80% | top, perf |
| Memory | Frequent swapping | free -h, vmstat |
| Disk I/O | High iowait | iostat, iotop |
| Network bandwidth | Link saturated | iftop, nethogs |
| Connection limit | "Too many open files" | ulimit -n |
| Slow upstream | Slow backend responses | $upstream_response_time |
# nginx.conf core settings
# 1. Number of worker processes
worker_processes auto; # recommended: auto (matches the CPU core count)
# or set explicitly
# worker_processes 8;
# 2. Worker CPU affinity (pin workers to cores, fewer context switches)
worker_cpu_affinity auto;
# or pin manually (4-core example)
# worker_cpu_affinity 0001 0010 0100 1000;
# 8-core example:
# worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;
# 3. Worker priority (range -20 to 19; lower means higher priority)
worker_priority -5;
# 4. Maximum connections per worker
events {
    worker_connections 65535; # adjust to match the system ulimit
    # use epoll (high-performance I/O model on Linux)
    use epoll;
    # accept as many connections as possible per wakeup
    multi_accept on;
    # accept mutex (thundering-herd protection)
    accept_mutex off; # recommended off on Nginx 1.11.3+
}
# 5. Maximum open files per worker
worker_rlimit_nofile 65535;
Max concurrent connections = worker_processes × worker_connections
Theoretical QPS = max concurrent connections / average response time
Example:
8 workers × 65535 connections = 524,280 concurrent connections
With a 50ms average response time, theoretical QPS = 524,280 / 0.05 = 10,485,600
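The same capacity formula as shell arithmetic, with the example numbers above (this is a theoretical ceiling — real throughput is bounded by CPU, bandwidth, and the backend long before it):

```shell
# Back-of-the-envelope Nginx capacity estimate.
workers=8
conns_per_worker=65535
avg_rt_ms=50
max_conns=$((workers * conns_per_worker))
theoretical_qps=$((max_conns * 1000 / avg_rt_ms))
echo "max concurrent connections: $max_conns"   # 524280
echo "theoretical QPS ceiling: $theoretical_qps" # 10485600
```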
http {
    # ========== Basics ==========
    # Hide the Nginx version string (security)
    server_tokens off;
    # Efficient file transfer
    sendfile on;
    tcp_nopush on;  # accumulate packets up to a full frame before sending
    tcp_nodelay on; # send small packets immediately (no conflict with tcp_nopush)
    # ========== Timeouts ==========
    # Reading the client request headers
    client_header_timeout 15s;
    # Reading the client request body
    client_body_timeout 15s;
    # Sending the response to the client
    send_timeout 15s;
    # Keep-alive timeout (important!)
    keepalive_timeout 65s;
    keepalive_requests 100; # max requests per keep-alive connection
    # ========== Buffers ==========
    # Client request headers
    client_header_buffer_size 4k;
    large_client_header_buffers 4 32k;
    # Client request body
    client_body_buffer_size 128k;
    client_max_body_size 50m;
    # Output buffers
    output_buffers 4 32k;
    postpone_output 1460; # accumulate 1460 bytes (one MTU payload) before sending
    # ========== Gzip compression ==========
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6; # levels 1-9; 6 balances CPU cost against compression ratio
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml+rss
        application/rss+xml
        font/truetype
        font/opentype
        application/vnd.ms-fontobject
        image/svg+xml;
    gzip_min_length 1000; # skip responses smaller than 1000 bytes (~1KB)
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    # ========== Logging ==========
    # Access log format
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for" '
                    'rt=$request_time uct="$upstream_connect_time" '
                    'uht="$upstream_header_time" urt="$upstream_response_time"';
    # Buffered logging (fewer disk writes)
    access_log /var/log/nginx/access.log main buffer=32k flush=5s;
    # Under extreme concurrency the access log can be disabled entirely
    # access_log off;
    # Error log level (warn or error)
    error_log /var/log/nginx/error.log warn;
    # ========== File cache ==========
    # Cache open file descriptors and metadata
    open_file_cache max=10000 inactive=60s;
    open_file_cache_valid 30s;
    open_file_cache_min_uses 2;
    open_file_cache_errors on;
    # ========== Misc ==========
    # Free the resources of timed-out connections immediately
    reset_timedout_connection on;
    # Server name hash tables
    server_names_hash_bucket_size 128;
    server_names_hash_max_size 512;
    # MIME types hash table
    types_hash_max_size 2048;
}
server {
    listen 80;
    server_name static.example.com;
    root /var/www/static;
    # Static assets
    location ~* \.(jpg|jpeg|png|gif|ico|css|js|svg|woff|woff2|ttf|eot)$ {
        # Browser cache
        expires 1y;
        add_header Cache-Control "public, immutable";
        # Access log (safe to disable for static assets)
        access_log off;
        # CORS header
        add_header Access-Control-Allow-Origin *;
        # Serve pre-compressed .gz files
        gzip_static on; # requires building with --with-http_gzip_static_module
        # Zero-copy transfer (large files)
        sendfile on;
        tcp_nopush on;
        tcp_nodelay on;
        # Direct I/O for large files (bypasses the page cache)
        directio 4m;
        directio_alignment 512;
        # Chunked output for large files
        output_buffers 1 128k;
    }
    # Small-file tuning
    location ~* \.(html|xml|json)$ {
        expires 1h;
        add_header Cache-Control "public";
        # Cache open file handles and metadata for small files
        open_file_cache max=1000 inactive=20s;
        open_file_cache_valid 30s;
        open_file_cache_min_uses 2;
    }
}
upstream backend {
    # Load-balancing strategies:
    # 1. Default: round-robin
    # 2. least_conn - fewest active connections
    # 3. ip_hash - client IP hash (session affinity)
    # 4. hash $request_uri consistent - consistent hashing
    least_conn;
    # Backend servers
    server 192.168.1.101:8080 weight=3 max_fails=2 fail_timeout=30s;
    server 192.168.1.102:8080 weight=2 max_fails=2 fail_timeout=30s;
    server 192.168.1.103:8080 weight=1 max_fails=2 fail_timeout=30s;
    # Backend keep-alive pool (important!)
    keepalive 100;           # keep up to 100 idle connections
    keepalive_requests 100;  # max requests per connection
    keepalive_timeout 60s;   # idle connection timeout
}
server {
    listen 80;
    server_name api.example.com;
    location / {
        # Proxy target
        proxy_pass http://backend;
        # ========== Proxy headers ==========
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # ========== Timeouts ==========
        proxy_connect_timeout 5s;  # connecting to the backend
        proxy_send_timeout 60s;    # sending the request
        proxy_read_timeout 60s;    # reading the response
        # ========== Buffers ==========
        proxy_buffering on;
        proxy_buffer_size 8k;
        proxy_buffers 32 8k;
        proxy_busy_buffers_size 64k;
        # ========== Backend keep-alive (crucial!) ==========
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        # ========== Error handling ==========
        proxy_next_upstream error timeout http_500 http_502 http_503;
        proxy_next_upstream_tries 2;
        proxy_next_upstream_timeout 10s;
        # ========== Temp files ==========
        proxy_max_temp_file_size 0; # disable buffering to temp files
    }
}
# 1. Check the current limit
ulimit -n
# The default is usually 1024 -- far too low
# 2. Temporary change (lost on reboot)
ulimit -n 65535
# 3. Permanent change (recommended)
sudo vi /etc/security/limits.conf
# Add:
* soft nofile 65535
* hard nofile 65535
root soft nofile 65535
root hard nofile 65535
# 4. System-wide limits
sudo vi /etc/sysctl.conf
fs.file-max = 2097152
fs.nr_open = 2097152
# Apply
sudo sysctl -p
# 5. Systemd service limits (if nginx runs under systemd)
sudo mkdir -p /etc/systemd/system/nginx.service.d
sudo vi /etc/systemd/system/nginx.service.d/limits.conf
[Service]
LimitNOFILE=65535
LimitNPROC=65535
sudo systemctl daemon-reload
sudo systemctl restart nginx
# 6. Verify
cat /proc/$(pgrep nginx | head -1)/limits | grep "open files"
# /etc/sysctl.conf
# A complete high-performance network profile
# ========== Basic TCP parameters ==========
# Connection queues
net.core.somaxconn = 65535              # max listen backlog
net.core.netdev_max_backlog = 65535     # NIC receive queue
net.ipv4.tcp_max_syn_backlog = 65535    # SYN queue length
# Connection capacity
net.ipv4.ip_local_port_range = 1024 65535  # ephemeral port range
net.ipv4.tcp_max_tw_buckets = 20000        # cap on TIME_WAIT sockets
# ========== TCP performance ==========
# TCP Fast Open (shaves handshake latency)
net.ipv4.tcp_fastopen = 3
# Congestion control (BBR is usually best)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
# Window scaling
net.ipv4.tcp_window_scaling = 1
# Buffer sizes (auto-tuned within these bounds)
net.core.rmem_max = 16777216   # max receive buffer
net.core.wmem_max = 16777216   # max send buffer
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.ipv4.tcp_rmem = 4096 87380 16777216  # min default max
net.ipv4.tcp_wmem = 4096 65536 16777216
# ========== Connection recycling ==========
# Reuse TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1
# FIN timeout (release connections faster)
net.ipv4.tcp_fin_timeout = 15
# Keepalive probing
net.ipv4.tcp_keepalive_time = 600    # idle time before probing starts
net.ipv4.tcp_keepalive_probes = 3    # number of probes
net.ipv4.tcp_keepalive_intvl = 15    # probe interval
# ========== SYN protection ==========
# SYN cookies (SYN-flood mitigation)
net.ipv4.tcp_syncookies = 1
# SYN / SYN+ACK retry counts
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
# ========== Memory ==========
# TCP memory (unit: pages, 1 page = 4KB)
net.ipv4.tcp_mem = 786432 1048576 1572864  # min pressure max
# Discourage swapping (optional, depends on available RAM)
# vm.swappiness = 0
# ========== Misc ==========
# Timestamps (RTT estimation)
net.ipv4.tcp_timestamps = 1
# Selective acknowledgements
net.ipv4.tcp_sack = 1
# MTU probing
net.ipv4.tcp_mtu_probing = 1
# Apply
sudo sysctl -p
# Verify that BBR is active
sysctl net.ipv4.tcp_congestion_control
lsmod | grep bbr
BBR vs. traditional congestion control:
| Scenario | Cubic (traditional) | BBR | Gain |
|---|---|---|---|
| Low-latency network | 950Mbps | 980Mbps | +3% |
| High-latency network (200ms) | 120Mbps | 850Mbps | +708% |
| Lossy network (1% loss) | 450Mbps | 780Mbps | +73% |
Enabling BBR (requires kernel 4.9+):
# 1. Check the kernel version
uname -r
# Upgrade the kernel if it is below 4.9
# 2. Check BBR support
grep -i bbr /boot/config-$(uname -r)
# 3. Enable BBR
echo "net.core.default_qdisc=fq" | sudo tee -a /etc/sysctl.conf
echo "net.ipv4.tcp_congestion_control=bbr" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# 4. Verify
sysctl net.ipv4.tcp_congestion_control
# Output: net.ipv4.tcp_congestion_control = bbr
lsmod | grep bbr
# Output: tcp_bbr 20480 1
# 1. Check the NIC queue count
ethtool -l eth0
# Output:
# Channel parameters for eth0:
# Pre-set maximums:
# RX: 8
# TX: 8
# Combined: 8
# Current hardware settings:
# RX: 4
# TX: 4
# Combined: 4
# 2. Set the queue count (match the CPU core count)
sudo ethtool -L eth0 combined 8
# 3. Check interrupt distribution
cat /proc/interrupts | grep eth0
# 4. Enable RPS/RFS (software-level multi-queue)
# RPS: Receive Packet Steering
for i in /sys/class/net/eth0/queues/rx-*/rps_cpus; do
    echo "ff" | sudo tee $i
done
# RFS: Receive Flow Steering
echo 32768 | sudo tee /proc/sys/net/core/rps_sock_flow_entries
for i in /sys/class/net/eth0/queues/rx-*/rps_flow_cnt; do
    echo 2048 | sudo tee $i
done
# 1. Check the current ring buffer sizes
ethtool -g eth0
# Output:
# Ring parameters for eth0:
# Pre-set maximums:
# RX: 4096
# TX: 4096
# Current hardware settings:
# RX: 512
# TX: 512
# 2. Enlarge the ring buffers (fewer drops under bursts)
sudo ethtool -G eth0 rx 4096 tx 4096
# 3. Drop and error statistics
ethtool -S eth0 | grep -i drop
ethtool -S eth0 | grep -i error
# 4. Live monitoring
watch -n 1 'ethtool -S eth0 | grep -E "rx_dropped|tx_dropped"'
# 1. Real-time bandwidth
iftop -i eth0
# 2. Connection summary
ss -s
# Output:
# Total: 1324
# TCP: 1200 (estab 980, closed 180, orphaned 0, timewait 150)
# 3. Connections per TCP state
ss -ant | awk '{print $1}' | sort | uniq -c
# 4. Watch a specific port
watch -n 1 'ss -tan state established "( dport = :80 or sport = :80 )" | wc -l'
# 5. TCP retransmission counters
nstat -az | grep -i retrans
# 6. Network latency test
ping -c 100 -i 0.2 -q target-server
# -q: quiet mode, statistics only
http {
    # ========== Cache paths ==========
    # Proxy cache path
    proxy_cache_path /var/cache/nginx/proxy
        levels=1:2                  # two-level directory layout
        keys_zone=proxy_cache:100m  # in-memory key index (100MB)
        max_size=10g                # cap the on-disk cache at 10GB
        inactive=7d                 # evict entries unused for 7 days
        use_temp_path=off;          # write straight into the cache directory
    # FastCGI cache
    fastcgi_cache_path /var/cache/nginx/fastcgi
        levels=1:2
        keys_zone=fastcgi_cache:100m
        max_size=5g
        inactive=7d
        use_temp_path=off;
    # Cache key
    proxy_cache_key "$scheme$request_method$host$request_uri";
    # ========== Upstream servers ==========
    upstream backend {
        server 192.168.1.101:8080;
        server 192.168.1.102:8080;
        keepalive 100;
    }
    server {
        listen 80;
        server_name www.example.com;
        # ========== Proxy cache ==========
        location / {
            proxy_pass http://backend;
            # Enable caching
            proxy_cache proxy_cache;
            # TTL per HTTP status code
            proxy_cache_valid 200 302 1h;
            proxy_cache_valid 301 1d;
            proxy_cache_valid 404 1m;
            proxy_cache_valid any 1m;
            # What to cache
            proxy_cache_methods GET HEAD;
            proxy_cache_min_uses 2; # cache only after 2 requests
            # Cache lock (prevents stampedes on a cold key)
            proxy_cache_lock on;
            proxy_cache_lock_timeout 5s;
            proxy_cache_lock_age 5s;
            # Serve stale entries when the backend misbehaves
            proxy_cache_use_stale error timeout updating
                http_500 http_502 http_503 http_504;
            # Refresh expired entries in the background
            proxy_cache_background_update on;
            # Ignore the backend's cache-control headers
            proxy_ignore_headers Cache-Control Expires;
            # Expose the cache status in a response header
            add_header X-Cache-Status $upstream_cache_status;
            # HIT: served from cache
            # MISS: not in cache
            # EXPIRED: entry had expired
            # STALE: stale entry served
            # UPDATING: entry being refreshed
            # REVALIDATED: entry revalidated
            # BYPASS: cache bypassed
            # When to bypass the cache
            proxy_cache_bypass $http_pragma $http_authorization;
            proxy_no_cache $http_pragma $http_authorization;
            # Proxy headers
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
        # ========== Never cache dynamic content ==========
        location ~* \.(php|jsp|cgi|asp|aspx)$ {
            proxy_pass http://backend;
            proxy_cache off;
        }
        # ========== Cache purge endpoint ==========
        # Note: proxy_cache_purge requires the third-party ngx_cache_purge
        # module (or NGINX Plus)
        location ~ /purge(/.*) {
            allow 127.0.0.1;
            deny all;
            proxy_cache_purge proxy_cache "$scheme$request_method$host$1";
        }
    }
}
server {
    listen 80;
    server_name static.example.com;
    root /var/www/static;
    # ========== Strong caching ==========
    # Images and fonts (1 year)
    location ~* \.(jpg|jpeg|png|gif|ico|svg|woff|woff2|ttf|eot)$ {
        expires 1y;
        add_header Cache-Control "public, immutable";
        access_log off;
    }
    # CSS and JS (1 month)
    location ~* \.(css|js)$ {
        expires 30d;
        add_header Cache-Control "public";
        # ETag for conditional revalidation
        etag on;
    }
    # HTML (no strong caching; revalidate every time)
    location ~* \.html$ {
        expires -1;
        add_header Cache-Control "no-cache";
        etag on;
        if_modified_since exact;
    }
    # ========== Conditional (revalidation) caching ==========
    # Enable ETag
    etag on;
    # Last-Modified matching
    if_modified_since exact; # exact match
}
# Requires building with the third-party ngx_http_redis module
http {
    upstream redis {
        server 127.0.0.1:6379;
        keepalive 10;
    }
    server {
        listen 80;
        location /api/ {
            set $redis_key "$uri$is_args$args";
            redis_pass redis;
            default_type text/html;
            # Redis timeouts
            redis_connect_timeout 1s;
            redis_read_timeout 1s;
            # Fall back to the origin when Redis misses or is unavailable
            error_page 404 502 504 = @fallback;
        }
        location @fallback {
            proxy_pass http://backend;
        }
    }
}
# 1. Cache-status distribution over recent traffic
# (sort cannot stream, so sample the tail of the log instead of tail -f;
# this assumes $upstream_cache_status is the second-to-last log field)
tail -n 10000 /var/log/nginx/access.log | awk '{print $(NF-1)}' | sort | uniq -c
# Sample output:
# 1523 HIT
# 234 MISS
# 12 EXPIRED
# 2. Cache status breakdown over the whole log
awk '{print $(NF-1)}' /var/log/nginx/access.log |
awk '{count[$1]++} END {for(i in count) print i, count[i]}'
# 3. Cache size on disk
du -sh /var/cache/nginx/*
# 4. Number of cached files
find /var/cache/nginx/proxy -type f | wc -l
# 5. Hit-rate formula
# hit rate = HIT / (HIT + MISS) × 100%
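Plugging the sample counts above (1523 HIT, 234 MISS) into the hit-rate formula:

```shell
# Cache hit-rate arithmetic from the sample counts above.
hit=1523
miss=234
rate=$(awk -v h="$hit" -v m="$miss" 'BEGIN {printf "%.1f", h * 100 / (h + m)}')
echo "cache hit rate: ${rate}%"   # 86.7%
```

In practice you would feed `hit` and `miss` from the `uniq -c` output rather than hard-coding them.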
server {
    listen 443 ssl http2;
    server_name www.example.com;
    # ========== Certificates ==========
    ssl_certificate /etc/nginx/ssl/fullchain.pem;
    ssl_certificate_key /etc/nginx/ssl/privkey.pem;
    # Chain certificates (for OCSP verification and compatibility)
    ssl_trusted_certificate /etc/nginx/ssl/chain.pem;
    # ========== Protocols and cipher suites ==========
    # TLS 1.2 and 1.3 only
    ssl_protocols TLSv1.2 TLSv1.3;
    # Prefer the server's cipher order
    ssl_prefer_server_ciphers on;
    # Cipher suites (balance of security and performance)
    ssl_ciphers 'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384';
    # TLS 1.3 cipher suites
    ssl_conf_command Ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256;
    # ========== Session cache (the key optimization!) ==========
    # Shared across all worker processes
    ssl_session_cache shared:SSL:50m;
    # Session lifetime
    ssl_session_timeout 1d;
    # Session tickets (stateless resumption)
    ssl_session_tickets on;
    ssl_session_ticket_key /etc/nginx/ssl/ticket.key;
    # ========== OCSP stapling (performance) ==========
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 8.8.8.8 8.8.4.4 valid=300s;
    resolver_timeout 5s;
    # ========== SSL buffer ==========
    ssl_buffer_size 4k; # smaller values improve time-to-first-byte (good for small responses)
    # ========== HTTP/2 settings ==========
    # (obsolete on Nginx 1.19.7+, where header limits follow
    # large_client_header_buffers; only tune these on older versions)
    http2_max_field_size 16k;
    http2_max_header_size 32k;
    http2_max_requests 1000;
    # ========== Security headers ==========
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    location / {
        proxy_pass http://backend;
    }
}
# 1. SSL handshake benchmark
openssl s_time -connect www.example.com:443 -new -time 10
# Output:
# 1234 connections in 10.00s; 123.4 connections/user sec
# 2. Session-reuse test
echo | openssl s_client -connect www.example.com:443 -reconnect 2>/dev/null | grep "Session-ID"
# 3. Inspect the TLS configuration
openssl s_client -connect www.example.com:443 -tls1_3
# 4. Test OCSP stapling
openssl s_client -connect www.example.com:443 -status
# 5. SSL Labs test (online)
# https://www.ssllabs.com/ssltest/
# 1. Install Certbot
sudo apt install certbot python3-certbot-nginx
# 2. Obtain certificates
sudo certbot --nginx -d www.example.com -d example.com
# 3. Test renewal
sudo certbot renew --dry-run
# 4. Schedule automatic renewal
sudo crontab -e
# Add:
0 0,12 * * * /usr/bin/certbot renew --quiet --post-hook "systemctl reload nginx"
# 5. Check the certificate's validity dates
openssl x509 -in /etc/letsencrypt/live/example.com/fullchain.pem -noout -dates
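An expiry check worth alerting on alongside the cron renewal. To keep the sketch self-contained it generates a throwaway 90-day self-signed certificate in /tmp instead of reading the Let's Encrypt path above, and it relies on GNU `date -d`:

```shell
# Generate a short-lived demo cert (stand-in for your real certificate).
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" \
    -keyout /tmp/demo.key -out /tmp/demo.crt -days 90 2>/dev/null
# Parse the notAfter date and compute the remaining days.
not_after=$(openssl x509 -in /tmp/demo.crt -noout -enddate | cut -d= -f2)
days_left=$(( ($(date -d "$not_after" +%s) - $(date +%s)) / 86400 ))
echo "certificate expires in ${days_left} days"
```

Point the `-in` path at your live certificate and alert when `days_left` drops below, say, 14.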
upstream backend {
    # ========== Algorithm 1: round-robin (default) ==========
    # Requests are distributed in turn
    # server 192.168.1.101:8080;
    # server 192.168.1.102:8080;
    # ========== Algorithm 2: weighted round-robin ==========
    # Higher weight receives proportionally more requests
    # server 192.168.1.101:8080 weight=3;
    # server 192.168.1.102:8080 weight=2;
    # server 192.168.1.103:8080 weight=1;
    # ========== Algorithm 3: least_conn ==========
    # Each request goes to the server with the fewest active connections
    # (the live server list is under "Server parameters" below)
    least_conn;
    # ========== Algorithm 4: ip_hash ==========
    # The same client IP always reaches the same server (session affinity)
    # ip_hash;
    # server 192.168.1.101:8080;
    # server 192.168.1.102:8080;
    # ========== Algorithm 5: consistent hashing ==========
    # Hash of the request URI
    # hash $request_uri consistent;
    # server 192.168.1.101:8080;
    # server 192.168.1.102:8080;
    # ========== Server parameters ==========
    # weight=N         weight
    # max_fails=N      mark unavailable after N failures
    # fail_timeout=Ns  failure window / recovery time
    # backup           backup server
    # down             mark as unavailable
    # max_conns=N      connection cap
    server 192.168.1.101:8080 max_fails=3 fail_timeout=30s max_conns=1000;
    server 192.168.1.102:8080 max_fails=3 fail_timeout=30s max_conns=1000;
    server 192.168.1.103:8080 backup; # backup server
    # ========== Keep-alive pool ==========
    keepalive 100;
    keepalive_requests 100;
    keepalive_timeout 60s;
    # ========== Active health checks (NGINX Plus only) ==========
    # health_check interval=5s fails=3 passes=2;
}
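The weighted split is easy to reason about: with weights 3/2/1 the servers receive 3/6, 2/6, and 1/6 of the traffic. A quick awk sketch over a hypothetical batch of 600 requests (server names A/B/C are illustrative):

```shell
# Expected request distribution under weighted round-robin (weights 3/2/1).
total=600
shares=$(awk -v t="$total" 'BEGIN {
    w["A"] = 3; w["B"] = 2; w["C"] = 1
    sum = 3 + 2 + 1
    for (s in w) printf "%s gets %d requests\n", s, t * w[s] / sum
}' | sort)
echo "$shares"
```

Expected shares: A 300, B 200, C 100. `max_conns` then caps each server regardless of its weight.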
# ========== Option 1: IP hash ==========
upstream backend_iphash {
    ip_hash;
    server 192.168.1.101:8080;
    server 192.168.1.102:8080;
}
# ========== Option 2: sticky cookie ==========
upstream backend_cookie {
    # requires the nginx-sticky-module-ng module
    sticky cookie srv_id expires=1h domain=.example.com path=/;
    server 192.168.1.101:8080;
    server 192.168.1.102:8080;
}
# ========== Option 3: custom hash ==========
upstream backend_custom {
    # hash on the session_id cookie
    hash $cookie_session_id consistent;
    server 192.168.1.101:8080;
    server 192.168.1.102:8080;
}
# ========== Option 4: shared session store ==========
# Recommended: keep sessions in Redis so Nginx needs no affinity at all
# ========== 定义机房 ==========
upstream beijing_cluster {
zone beijing 64k;
server 10.1.1.101:8080;
server 10.1.1.102:8080;
keepalive 50;
}
upstream shanghai_cluster {
zone shanghai 64k;
server 10.2.1.101:8080;
server 10.2.1.102:8080;
keepalive 50;
}
# ========== 根据来源IP分配 ==========
geo $backend_cluster {
default shanghai_cluster;
# 北京地区IP段
1.0.0.0/8 beijing_cluster;
58.0.0.0/8 beijing_cluster;
# 上海地区IP段
60.0.0.0/8 shanghai_cluster;
61.0.0.0/8 shanghai_cluster;
}
server {
listen 80;
server_name www.example.com;
location / {
proxy_pass http://$backend_cluster;
}
}
# Scenario: canary release -- route 5% of traffic to the new version
split_clients "${remote_addr}" $backend_pool {
5% new_version;
* stable_version;
}
upstream stable_version {
server 192.168.1.101:8080;
server 192.168.1.102:8080;
}
upstream new_version {
server 192.168.1.201:8080;
server 192.168.1.202:8080;
}
server {
listen 80;
location / {
proxy_pass http://$backend_pool;
}
}
# ========== Architecture ==========
# Nginx master (VRRP priority 100) + Keepalived
# Nginx backup (VRRP priority 90) + Keepalived
# Virtual IP: 192.168.1.100
# ========== Install Keepalived ==========
sudo apt install keepalived
# ========== Master configuration ==========
# /etc/keepalived/keepalived.conf
global_defs {
router_id LB_MASTER
}
vrrp_script check_nginx {
script "/etc/keepalived/check_nginx.sh"
interval 2
weight -20
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1234
}
virtual_ipaddress {
192.168.1.100
}
track_script {
check_nginx
}
}
# ========== Backup configuration ==========
# Same as above, except:
# state BACKUP
# priority 90
# ========== Health-check script ==========
# /etc/keepalived/check_nginx.sh
#!/bin/bash
# If nginx is down, try to restart it; if the restart fails, stop
# keepalived so the VIP fails over to the backup node.
counter=$(ps -C nginx --no-heading | wc -l)
if [ "$counter" -eq 0 ]; then
    systemctl start nginx
    sleep 2
    counter=$(ps -C nginx --no-heading | wc -l)
    if [ "$counter" -eq 0 ]; then
        systemctl stop keepalived
    fi
fi
chmod +x /etc/keepalived/check_nginx.sh
# ========== Start the service ==========
sudo systemctl enable keepalived
sudo systemctl start keepalived
# ========== Verify ==========
ip addr show eth0 | grep 192.168.1.100
# ========== Enable stub_status ==========
server {
    listen 8080;
    server_name localhost;
    location /nginx_status {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }
}
# Query: curl http://localhost:8080/nginx_status
# Output:
# Active connections: 291
# server accepts handled requests
# 16630948 16630948 31070465
# Reading: 6 Writing: 179 Keepalive: 106
# Field reference:
# Active connections: current active connections
# accepts: total accepted connections
# handled: total handled connections
# requests: total requests
# Reading: connections currently reading request headers
# Writing: connections currently writing responses
# Keepalive: idle keep-alive connections
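The stub_status text parses easily with awk. One derived number worth tracking is requests divided by handled connections: it is the average number of requests served per connection, so values close to 1.0 mean keep-alive is effectively not being reused. The sample output above is embedded here as a stand-in for a live `curl` response:

```shell
# Parse the stub_status sample into metrics.
status='Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Keepalive: 106'
active=$(echo "$status" | awk '/^Active/ {print $3}')
handled=$(echo "$status" | awk 'NR==3 {print $2}')
requests=$(echo "$status" | awk 'NR==3 {print $3}')
req_per_conn=$(awk -v r="$requests" -v h="$handled" 'BEGIN {printf "%.2f", r / h}')
echo "active=$active requests_per_connection=$req_per_conn"
```

For the sample numbers this yields about 1.87 requests per connection, suggesting keep-alive is on but short-lived.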
# ========== Install nginx-prometheus-exporter ==========
wget https://github.com/nginxinc/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
tar -xzf nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
sudo mv nginx-prometheus-exporter /usr/local/bin/
# ========== Run the exporter ==========
nginx-prometheus-exporter -nginx.scrape-uri=http://localhost:8080/nginx_status
# ========== Prometheus configuration ==========
# prometheus.yml
scrape_configs:
  - job_name: 'nginx'
    static_configs:
      - targets: ['localhost:9113']
        labels:
          instance: 'web-server-1'
# ========== Grafana dashboard ==========
# Dashboard ID: 12708 (Nginx Overview)
# ========== Live access-log analysis ==========
# 1. Real-time QPS
tail -f /var/log/nginx/access.log | pv -l -i 1 -r > /dev/null
# 2. Status-code breakdown
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# 3. Response-time statistics
awk '{print $NF}' /var/log/nginx/access.log |
awk '{sum+=$1; count++} END {print "Avg:", sum/count, "Count:", count}'
# 4. Top 10 request URIs
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
# 5. Top 10 client IPs
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
# 6. Slow requests (response time > 1s)
awk '$NF > 1 {print $0}' /var/log/nginx/access.log
# 7. Error-log summary
grep -E "error|warn" /var/log/nginx/error.log | awk '{print $9}' | sort | uniq -c
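The status-code breakdown above feeds directly into an error-rate number. A sketch with a handful of made-up codes standing in for column 9 of a combined-format log:

```shell
# 5xx error-rate arithmetic over a small sample of status codes.
codes="200 200 200 502 200 404 200 200 500 200"
total=0; err5xx=0
for c in $codes; do
    total=$((total + 1))
    case $c in
        5??) err5xx=$((err5xx + 1));;  # count server-side errors only
    esac
done
rate=$(awk -v e="$err5xx" -v t="$total" 'BEGIN {printf "%.1f", e * 100 / t}')
echo "5xx error rate: ${rate}%"
```

On real traffic, replace the `codes` variable with `awk '{print $9}' access.log`.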
# ========== 1. System calls with strace ==========
sudo strace -p $(pgrep nginx | head -1) -c
# ========== 2. CPU profiling with perf ==========
sudo perf record -p $(pgrep nginx | head -1) -g -- sleep 10
sudo perf report
# ========== 3. FlameGraph ==========
git clone https://github.com/brendangregg/FlameGraph
sudo perf record -F 99 -p $(pgrep nginx | head -1) -g -- sleep 30
sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > nginx.svg
# ========== 4. SystemTap ==========
# Trace Nginx function calls
sudo stap -e 'probe process("/usr/sbin/nginx").function("*") {
    printf("%s -> %s\n", thread_indent(1), probefunc())
}'
Starting point:
- Configuration: Nginx defaults
- Performance: 5,000 QPS
- CPU: 40%
- Memory: 2GB

Optimization steps:
# Step 1: tune the worker settings
worker_processes auto; # 8-core CPU
worker_connections 65535;
worker_rlimit_nofile 65535;
# Result: QPS up to 8,000 (+60%)
# Step 2: enable backend keep-alive
upstream backend {
    keepalive 100;
}
proxy_http_version 1.1;
proxy_set_header Connection "";
# Result: QPS up to 15,000 (+87%)
# Step 3: enable proxy caching
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cache:100m;
proxy_cache cache;
proxy_cache_valid 200 1h;
# Result: QPS up to 35,000 (+133%), 80% cache hit rate
# Step 4: system-level tuning
# BBR + kernel parameters
echo "net.ipv4.tcp_congestion_control=bbr" >> /etc/sysctl.conf
sysctl -p
# Result: QPS up to 45,000 (+28%)
# Step 5: gzip compression
gzip on;
gzip_comp_level 6;
gzip_types text/plain text/css application/json;
# Result: bandwidth down 60%, QPS steady at 50,000
# Final numbers:
# - QPS: 50,000 (+900%)
# - Latency (P99): 45ms
# - CPU: 65%
# - Memory: 3GB
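The cumulative gain across the five steps, as simple shell arithmetic:

```shell
# From the baseline to the final result of the case study above.
baseline_qps=5000
final_qps=50000
gain=$(( (final_qps - baseline_qps) * 100 / baseline_qps ))
echo "overall improvement: +${gain}%"   # +900%
```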
Scenario:
- Campaign forecast: 1M concurrent users
- Peak QPS: 50,000
- Static assets: images, CSS, JS

Optimization plan:
# ========== 1. Separate static assets ==========
server {
    listen 80;
    server_name static.example.com;
    root /data/static;
    location ~* \.(jpg|png|css|js)$ {
        expires 1y;
        add_header Cache-Control "public, immutable";
        access_log off;
        # Serve pre-compressed files
        gzip_static on;
        # Zero-copy
        sendfile on;
        tcp_nopush on;
    }
}
# ========== 2. API rate limiting ==========
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
server {
    listen 80;
    server_name api.example.com;
    location /api/ {
        limit_req zone=api_limit burst=200 nodelay;
        proxy_pass http://backend;
    }
}
# ========== 3. Degradation switch ==========
# While degraded, whitelisted IPs pass through; everyone else gets
# the maintenance page.
geo $maintenance_bypass {
    default 0;
    # Whitelisted IP
    192.168.1.100 1;
}
server {
    listen 80;
    if ($maintenance_bypass = 0) {
        return 503;
    }
    error_page 503 @maintenance;
    location @maintenance {
        root /usr/share/nginx/html;
        rewrite ^(.*)$ /maintenance.html break;
    }
}
Results:
- ✅ Sustained 50K QPS throughout the campaign
- ✅ 95% static-asset cache hit rate
- ✅ API rate limiting prevented cascading failures
- ✅ The degradation switch protected core services

Scenario:
- Users spread across Beijing, Shanghai, and Shenzhen
- Servers: a single Beijing datacenter
- Problem: high latency (150ms+) for southern users
- Solution: SD-WAN multi-datacenter deployment
# ========== Architecture ==========
# Beijing DC: primary (10.168.1.100)
# Shanghai DC: secondary (10.168.2.100)
# Shenzhen DC: secondary (10.168.3.100)
# Interconnected via the StarryLink (星空组网) overlay network
# ========== Step 1: deploy the overlay network ==========
# Install the client in all three datacenters
curl -O https://dl.starrylink.cn/install.sh
sudo bash install.sh
# Join the same network
sudo starrylink-cli network join <network-id>
# Verify connectivity
ping 10.168.2.100
ping 10.168.3.100
# ========== Step 2: configure Nginx load balancing ==========
# Beijing primary configuration
upstream multi_region {
    # Prefer the local datacenter
    server 127.0.0.1:8080 weight=10;
    # Other datacenters as backups (reached over the overlay)
    server 10.168.2.100:8080 weight=1 backup;
    server 10.168.3.100:8080 weight=1 backup;
    keepalive 50;
}
# ========== Step 3: geo-aware DNS ==========
# Use a DNS provider with region-based resolution (e.g. DNSPod)
# Beijing users → beijing.example.com → 10.168.1.100
# Shanghai users → shanghai.example.com → 10.168.2.100
# Shenzhen users → shenzhen.example.com → 10.168.3.100
# ========== Step 4: data synchronization ==========
# Sync static assets over the overlay with rsync
# Beijing → Shanghai
rsync -avz --bwlimit=10000 /data/static/ admin@10.168.2.100:/data/static/
# Beijing → Shenzhen
rsync -avz --bwlimit=10000 /data/static/ admin@10.168.3.100:/data/static/
Results:
| Region | Latency before | Latency after | Improvement |
|---|---|---|---|
| Beijing | 15ms | 12ms | 20% |
| Shanghai | 150ms | 20ms | 87% |
| Shenzhen | 180ms | 25ms | 86% |

Key advantages:
- ✅ P2P direct connections cut latency by 85%+
- ✅ Automatic NAT traversal, no public IPs required
- ✅ One virtual network, simple to manage
- ✅ Encrypted transport
# ========== Nginx configuration checklist ==========
- [ ] worker_processes = auto or the CPU core count
- [ ] worker_connections >= 10000
- [ ] worker_rlimit_nofile >= 65535
- [ ] use epoll (Linux)
- [ ] multi_accept on
- [ ] sendfile on
- [ ] tcp_nopush on
- [ ] tcp_nodelay on
- [ ] keepalive_timeout set sensibly (30-65s)
- [ ] gzip on (compression level 5-6)
- [ ] access_log buffered or disabled (high concurrency)
- [ ] open_file_cache configured
# ========== System configuration checklist ==========
- [ ] ulimit -n >= 65535
- [ ] net.core.somaxconn >= 65535
- [ ] net.ipv4.tcp_max_syn_backlog >= 65535
- [ ] net.ipv4.ip_local_port_range = 1024 65535
- [ ] net.ipv4.tcp_tw_reuse = 1
- [ ] net.ipv4.tcp_fin_timeout <= 30
- [ ] net.ipv4.tcp_congestion_control = bbr
- [ ] fs.file-max >= 2097152
- [ ] vm.swappiness = 0 (optional)
# ========== Reverse-proxy checklist ==========
- [ ] proxy_buffering on
- [ ] proxy_http_version 1.1
- [ ] proxy_set_header Connection ""
- [ ] upstream keepalive configured
- [ ] proxy_cache enabled (where applicable)
- [ ] proxy_next_upstream configured for failover
# ========== SSL/TLS checklist ==========
- [ ] ssl_session_cache shared:SSL:50m
- [ ] ssl_session_timeout >= 1h
- [ ] ssl_protocols TLSv1.2 TLSv1.3
- [ ] ssl_stapling on
- [ ] http2 enabled
# ========== Monitoring checklist ==========
- [ ] stub_status enabled
- [ ] log format includes $request_time
- [ ] Prometheus exporter deployed
- [ ] Grafana dashboard configured
- [ ] Alert rules in place
| Concurrency tier | QPS | Configuration focus |
|---|---|---|
| 1K | <10K | Basic tuning is enough |
| 5K | 10K-50K | + keep-alive pools + system tuning |
| 10K | 50K-100K | + caching + load balancing |
| 50K | 100K-500K | + multiple datacenters + CDN |
| 100K+ | 500K+ | + dedicated solutions (LVS/F5) |
# ========== Problem 1: QPS won't rise ==========
1. Check CPU usage (pegged at 100%?)
2. Check network bandwidth (link saturated?)
3. Check file descriptors ("Too many open files")
4. Check backend response times
5. Check that keep-alive is enabled
# ========== Problem 2: high latency ==========
1. Compare $request_time with $upstream_response_time
2. Check disk I/O (iowait)
3. Check network latency (ping/mtr)
4. Check DNS resolution time
5. Check SSL handshake time
# ========== Problem 3: abnormal connection counts ==========
1. Count TIME_WAIT sockets (netstat -an | grep TIME_WAIT | wc -l)
2. Check whether tcp_tw_reuse is enabled
3. Check keepalive_timeout
4. Check backend keep-alive
# ========== Problem 4: 502/504 errors ==========
1. Check that the backend service is alive
2. Check proxy_connect_timeout
3. Check the backend logs
4. Check network connectivity
5. Check SELinux / firewall rules
# ========== Problem 5: cache not working ==========
1. Check the X-Cache-Status response header
2. Check the cache directory size
3. Check proxy_cache_valid
4. Check the backend's Cache-Control headers
5. Check error.log
1. [High] System-level tuning (file descriptors, kernel parameters)
   ├─ ROI: ⭐⭐⭐⭐⭐
   └─ Difficulty: ⭐⭐
2. [High] Core Nginx settings (workers, keep-alive)
   ├─ ROI: ⭐⭐⭐⭐⭐
   └─ Difficulty: ⭐
3. [Medium] Caching strategy (proxy_cache, browser caching)
   ├─ ROI: ⭐⭐⭐⭐
   └─ Difficulty: ⭐⭐⭐
4. [Medium] SSL/TLS optimization (session reuse, OCSP stapling)
   ├─ ROI: ⭐⭐⭐
   └─ Difficulty: ⭐⭐
5. [Low] Network-layer tuning (NIC multi-queue, ring buffers)
   ├─ ROI: ⭐⭐
   └─ Difficulty: ⭐⭐⭐⭐
| Metric | Target | How to monitor |
|---|---|---|
| QPS | >50K | wrk/ab |
| Latency (P99) | <50ms | $request_time |
| Error rate | <0.01% | 4xx/5xx counts |
| Cache hit rate | >80% | $upstream_cache_status |
| CPU usage | <70% | top/htop |
| Connections | < worker_connections | stub_status |
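The target table lends itself to an automated gate. A sketch with made-up measured values (`qps`, `p99_ms`, `err_pct`, `hit_pct` are placeholders you would wire in from your monitoring):

```shell
# "Are we meeting targets?" gate over the table above.
qps=52000; p99_ms=45; err_pct=0.005; hit_pct=86   # illustrative measurements
pass=1
[ "$qps" -ge 50000 ] || pass=0
[ "$p99_ms" -le 50 ] || pass=0
awk -v e="$err_pct" 'BEGIN {exit !(e < 0.01)}' || pass=0
[ "$hit_pct" -ge 80 ] || pass=0
echo "all targets met: $pass"
```

Run it from cron and alert whenever `pass` flips to 0.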
If this article helped you, a like or bookmark is appreciated! ⭐
Questions are welcome in the comments 👇