Enter the Docker container
sudo docker exec -it mesos-00c24093-a05e-4909-ae16-6455e07dcb4b-S7.5188b4f0-ef4b-4465-a7f2-e7b4bdecd457 bash
Find the process ID
[root@2e157acaebef /]# ps -ef|grep alpha
root 1 0 0 11:01 ? 00:00:00 /bin/sh -c cd /home/alpha/alpha-exptmgr && sh run.sh prod
root 8 7 8 11:01 ? 00:02:10 java -jar alpha-exptmgr-0.0.1-SNAPSHOT.jar --spring.profiles.active=prod
root 267 92 0 11:28 ? 00:00:00 grep --color=auto alpha
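The grep above also matches the sh wrapper and the grep process itself, so the Java PID (8 here) has to be picked out by eye. A minimal sketch of a filter that prints only the JVM's PID is shown below. The function name java_pids is hypothetical, and the sketch assumes the ps -ef column layout shown above, where the command starts in column 8.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin and prints the PID
# (column 2) of each line whose command (column 8) is "java" or a path
# ending in "/java", and whose full line contains the given keyword.
java_pids() {
    awk -v kw="$1" '$8 ~ /(^|\/)java$/ && index($0, kw) > 0 { print $2 }'
}

# Usage inside the container (assumes ps and awk are installed):
#   ps -ef | java_pids alpha-exptmgr
```

Where they are installed, pgrep -f or the JDK's jps -l can serve the same purpose.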
Check GC activity
[root@2e157acaebef /]# jstat -gc 8
S0C      S1C      S0U      S1U  EC         EU         OC         OU       MC       MU       CCSC     CCSU     YGC  YGCT   FGC  FGCT   GCT
17920.0  28160.0  17753.6  0.0  1626112.0  1547980.4  1563648.0  38848.9  89984.0  86442.0  11648.0  10978.0  10   0.190  3    0.337  0.528
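The raw jstat -gc columns are capacities and usages in KB, so the fill level of each generation has to be worked out by hand. The sketch below does that arithmetic. The function name gc_summary is hypothetical, and it assumes the header row names the columns EC, EU, OC, OU, YGC and FGC as in the output above.

```shell
# Hypothetical helper: reads `jstat -gc <pid>` output on stdin and prints
# Eden and old-generation utilization plus the young/full GC counts.
# Columns are looked up by header name, so the column order does not matter.
gc_summary() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        NF > 0 && $(col["EC"]) > 0 && $(col["OC"]) > 0 {
            printf "eden %.1f%% old %.1f%% ygc %d fgc %d\n",
                100 * $(col["EU"]) / $(col["EC"]),
                100 * $(col["OU"]) / $(col["OC"]),
                $(col["YGC"]), $(col["FGC"])
        }
    '
}

# Usage inside the container (8 is the Java PID found earlier):
#   jstat -gc 8 | gc_summary
```

For the sample above this gives an Eden utilization of about 95.2% and an old-generation utilization of about 2.5%, with 10 young and 3 full collections so far.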
Check class instance counts
[root@d4c126c56caa logs]# jmap -histo 8|head -n 30
num #instances #bytes class name
----------------------------------------------
1: 9062448 471906376 [C
2: 8457140 202971360 java.lang.String
3: 1337073 96269256 com.data.yidian.model.metrics.UserAction
4: 169298 74048608 [B
5: 1510227 48327264 java.util.HashMap$Node
6: 1672306 40135344 java.lang.Long
7: 34210 21746440 [I
8: 192689 19464136 [Ljava.lang.Object;
9: 119403 17023656 [Ljava.util.HashMap$Node;
10: 478429 11482296 java.lang.Double
11: 424516 10188384 org.apache.thrift.protocol.TField
12: 112076 5379648 java.util.HashMap
13: 152360 3656640 java.util.ArrayList
14: 102229 3271328 java.util.concurrent.ConcurrentHashMap$Node
15: 98893 3164576 java.util.concurrent.FutureTask
16: 29904 2631552 java.lang.reflect.Method
17: 63953 2558120 com.data.yidian.model.PredictRequest
18: 60141 2405640 yidian.data.morpheus.neo.useraction.UserAction
19: 21127 2340856 java.lang.Class
20: 87998 2111952 java.util.concurrent.ConcurrentSkipListMap$Node
21: 250 1978552 [D
22: 98566 1577056 java.util.HashMap$KeySet
23: 63948 1534752 com.data.yidian.model.predict.PredictTask
24: 36950 1478000 java.util.LinkedHashMap$Entry
25: 60141 1443384 com.yidian.userstore.data.RecentClick
26: 87548 1400768 java.lang.Integer
27: 82292 1316672 java.lang.Object
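The output above shows that instances of com.data.yidian.model.metrics.UserAction occupy an abnormally large amount of memory: about 1.34 million instances totalling roughly 96 MB, which makes it the first application class in a histogram otherwise led by JDK types. Instances of this class are created in bulk whenever the scheduled task runs, which points to the scheduled task as the cause of the program hanging.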
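Spotting the application class among the JDK entries can be done mechanically. The sketch below filters a jmap -histo listing down to non-JDK classes. The function name non_jdk_classes is hypothetical, and the list of prefixes treated as JDK classes (array types starting with "[", java., javax., sun., jdk.) is an assumption that may need adjusting for a given application.

```shell
# Hypothetical helper: reads `jmap -histo <pid>` output on stdin and prints
# "class instances megabytes" for every row whose class name does not look
# like a JDK type. The input order (descending by bytes) is preserved.
non_jdk_classes() {
    awk '
        $1 ~ /^[0-9]+:$/ && $4 !~ /^(\[|java\.|javax\.|sun\.|jdk\.)/ {
            printf "%s %d %.1f\n", $4, $2, $3 / 1000000
        }
    '
}

# Usage inside the container (8 is the Java PID found earlier):
#   jmap -histo 8 | non_jdk_classes | head -n 10
```

Third-party classes such as org.apache.thrift.protocol.TField still pass this filter, so the result is a shortlist to inspect rather than a verdict.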