Overview
Nginx has many configuration directives, but only a few core ones directly impact performance. In production deployments, operations teams typically provide standard configurations. Unless specific performance requirements exist, adjusting Nginx's default settings is often unnecessary, as Nginx was designed with performance optimization in mind.
Operating System Considerations
Event Model
Nginx's event model should match the operating system. For Linux kernels 2.6 and above, the epoll model is recommended.
File Descriptor Limits
Each client connection consumes a file descriptor. Use the ulimit -a command to check system limits. For high-concurrency scenarios, first increase the kernel-level limit (e.g., by modifying /etc/security/limits.conf), then use the worker_rlimit_nofile directive in Nginx to adjust the maximum number of file descriptors a single worker process can open.
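A minimal sketch of raising the per-worker limit (the value and path are illustrative; the OS-level limit must be at least as high):

```nginx
# nginx.conf, main (top-level) context
# Requires a matching OS limit, e.g. a "nofile" entry in /etc/security/limits.conf
worker_rlimit_nofile 65535;
```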
Core Nginx Performance Directives
worker_processes
Defines the number of Nginx worker processes. The default is 1. It is generally recommended to set this equal to the number of physical CPU cores. Setting it higher does not significantly improve performance and may increase CPU overhead due to process scheduling. Increase this value if your application logic involves significant blocking I/O operations.
Since versions 1.2.5 and 1.3.8, the value auto sets this automatically based on the number of available CPU cores.
worker_connections
Defines the maximum number of simultaneous connections a single worker process can handle. Theoretically, Nginx's maximum client connections is worker_processes * worker_connections (roughly half that when proxying, since each client connection also requires an upstream connection). A common suggestion is to set this to (system file descriptor limit) / worker_processes. In practice, adjust based on actual traffic patterns, as each file descriptor consumes system resources.
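Putting the directives above together, a minimal worker setup might look like this (the connection count is illustrative):

```nginx
# Main context
worker_processes auto;          # one worker per CPU core

events {
    use epoll;                  # preferred event model on Linux 2.6+
    worker_connections 10240;   # per-worker limit; total ≈ workers × connections
}
```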
worker_cpu_affinity
Binds worker processes to specific CPU cores (FreeBSD and Linux only). Modern OS process schedulers are efficient; manual CPU affinity rarely provides significant gains and can sometimes hurt performance. Use only if monitoring reveals clear CPU load imbalance.
sendfile
When enabled, Nginx can send file data directly from kernel space to the network socket, avoiding extra data copies between user and kernel space and reducing CPU usage. It is generally recommended to keep it on. Older advice suggested turning it off for large files (>4MB), but Linux kernels since 2.4 support sendfile64, removing this limitation. Keep it enabled unless specific issues arise.
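A simple sketch of enabling it; the chunk size shown is an illustrative option, not a required setting:

```nginx
http {
    sendfile on;              # zero-copy transfer for static file responses
    sendfile_max_chunk 1m;    # optional: cap per-call transfer so one fast
                              # connection cannot monopolize a worker
}
```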
tcp_nodelay and tcp_nopush
These directives control TCP socket behavior, affecting data transmission timing and network efficiency.
- tcp_nopush: Works with sendfile. It tells the OS to wait until a packet fills an MSS (Maximum Segment Size) or no more data is available before sending, reducing small-packet overhead and improving network throughput. It maps to the TCP_CORK socket option on Linux (TCP_NOPUSH on FreeBSD), which has an effect similar to Nagle's algorithm.
- tcp_nodelay: Disables Nagle's algorithm, allowing data to be sent immediately to reduce latency. Important for low-latency interactive applications such as WebSocket.
By default, tcp_nodelay is on and tcp_nopush is off. A common optimization for HTTP is to enable sendfile and tcp_nopush, while keeping tcp_nodelay on for keep-alive connections.
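That common combination can be sketched as:

```nginx
http {
    sendfile    on;
    tcp_nopush  on;    # coalesce packets when sending response headers + file data
    tcp_nodelay on;    # applies to connections in the keep-alive state
}
```

Nginx resolves the apparent conflict itself: with both enabled, data is corked while a response is being sent and flushed immediately afterwards.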
keepalive_timeout
Sets the timeout for keep-alive connections. With HTTP keep-alive enabled, clients can send multiple requests over a single TCP connection, avoiding repeated three-way handshakes, improving page load speed. Set a reasonable value based on your application to prevent malicious clients from holding connections open.
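An illustrative setting (the values are examples, not recommendations for every workload):

```nginx
http {
    keepalive_timeout  65s;    # close idle keep-alive connections after 65 seconds
    keepalive_requests 1000;   # optional: cap requests served per connection
}
```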
Reducing Disk I/O
Nginx is efficient with CPU and memory; disk I/O is often the bottleneck. Minimize unnecessary disk read/write operations.
open_file_cache
Caches open file descriptors, sizes, modification times, etc., significantly reducing repeated stat() system calls for static file serving. Related directives like open_file_cache_valid and open_file_cache_min_uses control cache behavior.
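A sketch of a typical configuration (cache sizes and intervals are illustrative):

```nginx
http {
    open_file_cache          max=10000 inactive=30s;  # cache up to 10k entries,
                                                      # evict after 30s of no use
    open_file_cache_valid    60s;  # revalidate a cached entry every 60 seconds
    open_file_cache_min_uses 2;    # only keep files accessed at least twice
    open_file_cache_errors   on;   # also cache file lookup errors
}
```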
access_log and error_log
- access_log: Logs all access requests. Under high traffic, frequent log writes can become a bottleneck. If access logs are not needed for analysis or auditing, consider disabling them. Alternatively, write to faster storage (e.g., RAM disk) or use asynchronous writing.
- error_log: Logs error information. Since errors are relatively infrequent, enabling it usually has minimal performance impact. Keep it on for troubleshooting.
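The options above might be configured as follows (paths and buffer sizes are illustrative):

```nginx
http {
    # Either disable access logging entirely:
    # access_log off;
    # ...or buffer writes so entries are flushed in batches:
    access_log /var/log/nginx/access.log combined buffer=64k flush=5s;
    error_log  /var/log/nginx/error.log warn;   # keep for troubleshooting
}
```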
Buffer Size Settings
If buffers are too small, Nginx writes unprocessed data to temporary files, causing disk I/O.
- client_body_buffer_size: Sets the buffer size for client request bodies (e.g., POST data). If the body exceeds this, data is written to a temporary file.
- fastcgi_buffers and proxy_buffers: Set the number and size of buffers for caching responses from backends (e.g., PHP-FPM or upstream servers). If the response is too large, it is written to a temporary file. Default is typically 8 buffers of 4k or 8k (depending on system memory page size).
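A sketch of raising these buffers so typical requests and responses stay in memory (sizes are illustrative and should be tuned to real payloads):

```nginx
http {
    client_body_buffer_size 16k;    # bodies larger than this spill to a temp file
    proxy_buffers           8 16k;  # count and size of per-connection buffers
    fastcgi_buffers         8 16k;  # same idea for FastCGI backends (e.g. PHP-FPM)
}
```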
gzip Compression
While not directly reducing disk I/O, enabling gzip compression significantly reduces network data transfer, effectively using bandwidth and lowering latency. Default compression level is 1. Avoid very high levels (e.g., 9), as the diminishing returns in compression ratio are not worth the significant increase in CPU usage.
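A moderate configuration might look like this (level and MIME types are illustrative):

```nginx
http {
    gzip            on;
    gzip_comp_level 4;    # middle ground; levels near 9 burn CPU for little gain
    gzip_min_length 1k;   # skip responses too small to benefit
    gzip_types      text/css application/javascript application/json;
                          # text/html is always compressed when gzip is on
}
```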
Other Common Directives
Request Header Buffers
client_header_buffer_size and large_client_header_buffers cache client request headers. If headers exceed limits, Nginx returns a 400 or 414 error. Default values are usually sufficient and have minimal performance impact.
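For reference, the stock values, which rarely need changing:

```nginx
http {
    client_header_buffer_size   1k;    # default; fits most request lines/headers
    large_client_header_buffers 4 8k;  # default fallback for oversized headers
}
```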
Timeout Settings
- client_body_timeout and client_header_timeout: Define timeouts for reading client request bodies and headers, preventing malicious or slow clients from holding connections open. Note that client_body_timeout applies between two successive read operations, not to the transfer of the whole body. Default is 60 seconds.
- send_timeout: Sets the timeout for sending a response to the client; likewise it applies between two successive write operations, not to the whole transfer. Default is 60 seconds.
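Tightening these from their 60-second defaults might look like this (values are illustrative):

```nginx
http {
    client_header_timeout 15s;
    client_body_timeout   15s;   # between successive reads, not total body time
    send_timeout          15s;   # between successive writes, not total response time
}
```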
Summary
For most applications, using Nginx's default configuration with appropriate adjustments to worker_processes and worker_connections is sufficient. Tuning other parameters should be based on actual performance monitoring and analysis, targeting specific bottlenecks. Without clear requirements, over-optimization may yield little benefit or introduce unnecessary complexity.