> erlang:memory().
memory() shows the memory allocated by the Erlang emulator: the total, the memory consumed by atoms, by processes, and so on.
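The result is a proplist of {Kind, Bytes} pairs; its shape looks like the following (the numbers here are illustrative, not from the incident):
[{total,22010488},
 {processes,5426064},
 {processes_used,5426064},
 {system,16584424},
 {atom,194289},
 {atom_used,173525},
 {binary,979352},
 {code,4014771},
 {ets,305392}]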
Number of Erlang processes created
On the live system, most of the memory was being consumed by processes. The next step was to work out whether processes were leaking memory, or whether too many processes had been created.
> erlang:system_info(process_limit). %% maximum number of processes the system can create
> erlang:system_info(process_count). %% number of processes currently alive
system_info() returns information about the running system, such as the number of processes and ports. Running the commands above gave us a shock: the node had only 2-3k network connections, yet over 100,000 Erlang processes. Processes were being created, but because of the code or some other reason they piled up and were never released.
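To see what kind of process is piling up, one quick check is to group live processes by the function they are currently executing. This is a sketch using only standard OTP calls (erlang:processes/0, erlang:process_info/2, maps:update_with/4); the module name diag is made up for illustration:

-module(diag).
-export([count_by_current_function/0]).

%% Count live processes grouped by the function they are currently executing.
%% Processes that die mid-scan make process_info/2 return undefined, which the
%% generator pattern match filters out.
count_by_current_function() ->
    Fs = [F || P <- erlang:processes(),
               {current_function, F} <- [erlang:process_info(P, current_function)]],
    lists:foldl(fun(F, Acc) -> maps:update_with(F, fun(C) -> C + 1 end, 1, Acc) end,
                #{}, Fs).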
Inspecting a single process
Since the processes were piling up for some reason, the cause had to be found inside the processes themselves.
First, get the pid of one of the piled-up processes:
> i(). %% prints system information
> i(0,61,886). %% (0,61,886) is the pid
Many processes were hanging there. Inspecting a specific pid showed several unprocessed messages sitting in its message_queue. This is where the powerful erlang:process_info/2 comes in: it can retrieve a wealth of information about a process.
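When there are too many processes to eyeball with i/0, a helper like the sketch below surfaces the worst offenders by mailbox length (standard OTP calls only; top_mailboxes is an illustrative name):

%% Return the N pids with the largest message queues, longest first.
%% Dead processes make process_info/2 return undefined and are filtered out.
top_mailboxes(N) ->
    Pairs = [{Len, P} || P <- erlang:processes(),
                         {message_queue_len, Len} <-
                             [erlang:process_info(P, message_queue_len)]],
    lists:sublist(lists:reverse(lists:sort(Pairs)), N).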
> erlang:process_info(pid(0,61,886), current_stacktrace).
> rp(erlang:process_info(pid(0,61,886), backtrace)).
Looking at a process's backtrace turned up the following:
0x00007fbd6f18dbf8 Return addr 0x00007fbff201aa00 (gen_event:rpc/2 + 96)
y(0) #Ref<0.0.2014.142287>
y(1) infinity
y(2) {sync_notify,{log,{lager_msg,[], ..........}}
y(3) <0.61.886>
y(4) <0.89.0>
y(5) []
The y(2) slot holds a sync_notify log message and the return address points into gen_event:rpc/2: the process had hung in a synchronous call into lager, the third-party Erlang logging library.
Root cause
Checking lager's documentation turned up the following:
Prior to lager 2.0, the gen_event at the core of lager operated purely in synchronous mode. Asynchronous mode is faster, but has no protection against message queue overload. In lager 2.0, the gen_event takes a hybrid approach. It polls its own mailbox size and toggles the messaging between synchronous and asynchronous depending on mailbox size.
{async_threshold, 20}, {async_threshold_window, 5}
This will use async messaging until the mailbox exceeds 20 messages, at which point synchronous messaging will be used, and switch back to asynchronous, when size reduces to 20 - 5 = 15.
If you wish to disable this behaviour, simply set it to 'undefined'. It defaults to a low number to prevent the mailbox growing rapidly beyond the limit and causing problems. In general, lager should process messages as fast as they come in, so getting 20 behind should be relatively exceptional anyway.
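In practice these settings live in lager's application environment. A minimal sys.config sketch, assuming a standard OTP release layout and the values quoted above:

[
 {lager, [
   {async_threshold, 20},       %% switch to synchronous above 20 queued messages
   {async_threshold_window, 5}  %% drop back to async once the mailbox falls to 15
 ]}
].

Setting async_threshold to undefined disables the hybrid switch entirely, at the cost of an unbounded mailbox.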