详解php+nginx 服务发生500 502错误排查思路

2019-10-17 16:39:11王旭

概述

当线上的服务中访问中出现500或者502错误时,需要紧急处理,排查问题,该怎么做?可以通过分析一些错误日志或者跟踪php-fpm进程来进行问题定位。

nginx error_log

nginx的error_log在nginx的配置文件中定义的

server {
 listen  80;
 server_name localhost;
 root   /var/www;

 access_log /Users/jiao/logs/default.access.log;
 error_log /Users/jiao/logs/default.error.log;
 location / {
  index index.html index.htm index.php;
  autoindex on;
 }
 location = /info {
  allow 127.0.0.1;
  deny all;
  rewrite (.*) /.info.php;
 }
 location ~ .php$ {
  root /var/www;
  fastcgi_pass 127.0.0.1:9000;
  fastcgi_index index.php;
  fastcgi_param SCRIPT_FILENAME /var/www$fastcgi_script_name;
  include /usr/local/etc/nginx/fastcgi_params;
 }
}

查看error_log

➜  tail /Users/jiao/logs/default.error.log
2019/07/17 11:08:18 [error] 77416#0: *76 kevent() reported about an closed connection (54: Connection reset by peer) while reading response header from upstream, client: 127.0.0.1, server: localhost, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "localhost"

发现出现了Connection reset by peer,连接被重置了,此时可以再查看php-fpm的error_log进一步分析问题

php-fpm error_log

php-fpm的error_log在php-fpm.conf文件中配置中定义的

; Error log file
; If it's set to "syslog", log is sent to syslogd instead of being written
; in a local file.
; Note: the default prefix is /usr/local/var
; Default Value: log/php-fpm.log
error_log = log/php-fpm.log

error_log里面的内容是这样的

➜ tail /usr/local/var/log/php-fpm.log
[17-Jul-2019 10:49:54] NOTICE: [pool www] child 81948 started
[17-Jul-2019 11:08:18] WARNING: [pool www] child 77537, script '/var/www/index.php' (request: "GET /index.php") execution timed out (3.801267 sec), terminating
[17-Jul-2019 11:08:18] WARNING: [pool www] child 77537 exited on signal 15 (SIGTERM) after 1503.113967 seconds from start
[17-Jul-2019 11:08:18] NOTICE: [pool www] child 94339 started

可以看到是请求/var/www/index.php文件出现了超时

dtruss

dtruss是动态跟踪命令,可以根据PID,name跟踪进程

mac环境下使用dtruss,linux环境可以使用strace,pstack

➜ dtruss 
USAGE: dtruss [-acdefholLs] [-t syscall] { -p PID | -n name | command | -W name }
  -p PID   # examine this PID
  -n name   # examine this process name
  -t syscall  # examine this syscall only
  -W name   # wait for a process matching this name
  -a    # print all details
  -c    # print syscall counts
  -d    # print relative times (us)
  -e    # print elapsed times (us)
  -f    # follow children
  -l    # force printing pid/lwpid
  -o    # print on cpu times
  -s    # print stack backtraces
  -L    # don't print pid/lwpid
  -b bufsize  # dynamic variable buf size