
Troubleshooting an Alpha Hang

Source: 二三娱乐

Enter the Docker container

sudo docker exec -it mesos-00c24093-a05e-4909-ae16-6455e07dcb4b-S7.5188b4f0-ef4b-4465-a7f2-e7b4bdecd457 bash

Find the process ID

[root@2e157acaebef /]# ps -ef|grep alpha
root           1       0  0 11:01 ?        00:00:00 /bin/sh -c cd /home/alpha/alpha-exptmgr && sh run.sh prod
root           8       7  8 11:01 ?        00:02:10 java -jar alpha-exptmgr-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
root         267      92  0 11:28 ?        00:00:00 grep --color=auto alpha
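In the listing above, PID 1 is only the wrapper shell and PID 267 is the grep itself; the JVM is PID 8. The lookup can be narrowed to the JVM alone. This is a minimal sketch, assuming the `ps -ef` column layout shown above (command in the eighth field); the function name `java_pid` is hypothetical.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID of
# every process whose executable is `java` and whose command line contains
# the given jar name.
java_pid() {
  awk -v jar="$1" '$8 ~ /(^|\/)java$/ && index($0, jar) { print $2 }'
}

# Usage inside the container:
#   ps -ef | java_pid alpha-exptmgr
```

Matching on the executable field rather than on the whole line is what keeps the wrapper shell and the grep process out of the result.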

Check GC activity

[root@2e157acaebef /]# jstat -gc 8
    S0C      S1C      S0U  S1U         EC         EU         OC       OU       MC       MU     CCSC     CCSU  YGC   YGCT  FGC   FGCT    GCT
17920.0  28160.0  17753.6  0.0  1626112.0  1547980.4  1563648.0  38848.9  89984.0  86442.0  11648.0  10978.0   10  0.190    3  0.337  0.528
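`jstat -gc` reports capacities (EC, OC) and usage (EU, OU) in KB, so the raw columns are easier to judge as ratios. This is a minimal sketch, assuming the output has one header row followed by one data row as above; the function name `gc_usage` is hypothetical and not part of the JDK.

```shell
# Hypothetical helper: read `jstat -gc <pid>` output on stdin and print
# Eden and old-generation utilization as percentages.
gc_usage() {
  awk 'NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }  # map column name -> index
       NR == 2 { printf "eden=%.1f%% old=%.1f%%\n",
                        100 * $(col["EU"]) / $(col["EC"]),
                        100 * $(col["OU"]) / $(col["OC"]) }'
}

# Usage inside the container:
#   jstat -gc 8 | gc_usage
# For the sample above this prints: eden=95.2% old=2.5%
```

Looking columns up by header name rather than by position keeps the helper working if a JDK version emits the columns in a different order.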

Inspect class instance counts

[root@d4c126c56caa logs]# jmap -histo 8|head -n 30

 num     #instances         #bytes  class name
----------------------------------------------
   1:       9062448      471906376  [C
   2:       8457140      202971360  java.lang.String
   3:       1337073       96269256  com.data.yidian.model.metrics.UserAction
   4:        169298       74048608  [B
   5:       1510227       48327264  java.util.HashMap$Node
   6:       1672306       40135344  java.lang.Long
   7:         34210       21746440  [I
   8:        192689       19464136  [Ljava.lang.Object;
   9:        119403       17023656  [Ljava.util.HashMap$Node;
  10:        478429       11482296  java.lang.Double
  11:        424516       10188384  org.apache.thrift.protocol.TField
  12:        112076        5379648  java.util.HashMap
  13:        152360        3656640  java.util.ArrayList
  14:        102229        3271328  java.util.concurrent.ConcurrentHashMap$Node
  15:         98893        3164576  java.util.concurrent.FutureTask
  16:         29904        2631552  java.lang.reflect.Method
  17:         63953        2558120  com.data.yidian.model.PredictRequest
  18:         60141        2405640  yidian.data.morpheus.neo.useraction.UserAction
  19:         21127        2340856  java.lang.Class
  20:         87998        2111952  java.util.concurrent.ConcurrentSkipListMap$Node
  21:           250        1978552  [D
  22:         98566        1577056  java.util.HashMap$KeySet
  23:         63948        1534752  com.data.yidian.model.predict.PredictTask
  24:         36950        1478000  java.util.LinkedHashMap$Entry
  25:         60141        1443384  com.yidian.userstore.data.RecentClick
  26:         87548        1400768  java.lang.Integer
  27:         82292        1316672  java.lang.Object

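The output above shows that instances of com.data.yidian.model.metrics.UserAction occupy an abnormally large amount of memory: about 1.34 million instances totaling roughly 96 MB, the largest footprint of any application class in the list. Instances of this class are created in bulk whenever the scheduled task runs, so we can infer that the scheduled task is what causes the program to hang.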
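To follow a single suspect class across repeated snapshots, its histogram row can be reduced to a one-line summary. This is a minimal sketch, assuming the four-column `jmap -histo` layout shown above (rank, instances, bytes, class name); the function name `histo_class` is hypothetical.

```shell
# Hypothetical helper: read `jmap -histo <pid>` output on stdin and print the
# instance count, total size in MiB and average bytes per instance of one class.
histo_class() {
  awk -v cls="$1" '$4 == cls {
    printf "%d instances, %.1f MiB, %.0f bytes each\n", $2, $3 / 1048576, $3 / $2
  }'
}

# Usage inside the container:
#   jmap -histo 8 | histo_class com.data.yidian.model.metrics.UserAction
# For the histogram above this prints: 1337073 instances, 91.8 MiB, 72 bytes each
```

Running it before and after the scheduled task fires would show whether the instance count rises with the task, which is the correlation the conclusion relies on.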
