Tuesday, December 30, 2014

Notes On Troubleshooting Tools For Java Programs

jps

-- Lists all Java processes along with related information, such as the fully qualified main class name and the arguments passed to the main method.

[hadoop@m15018 ~]$ jps -ml
30134 org.apache.hadoop.hdfs.server.datanode.DataNode
15141 org.apache.hadoop.mapred.YarnChild 192.168.7.27 28170 attempt_1417419796124_63329_m_000018_1 97
1820 org.apache.hadoop.mapred.YarnChild 192.168.7.14 36058 attempt_1417419796124_63299_m_000037_0 33
28191 org.apache.hadoop.mapred.YarnChild 192.168.7.21 55163 attempt_1417419796124_63322_m_000018_0 21
15275 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
18325 org.apache.hadoop.mapred.YarnChild 192.168.7.55 6926 attempt_1417419796124_63255_m_000014_1 166
10693 org.apache.hadoop.mapred.YarnChild 192.168.7.85 20883 attempt_1417419796124_63282_m_000055_0 34
6001 org.apache.hadoop.mapred.YarnChild 192.168.7.17 9187 attempt_1417419796124_62366_m_000050_0 64
30311 org.apache.hadoop.yarn.server.nodemanager.NodeManager
30111 org.apache.hadoop.mapred.YarnChild 192.168.7.75 23820 attempt_1417419796124_63324_m_000047_0 29
28712 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
4474 org.apache.hadoop.mapred.YarnChild 192.168.7.29 49996 attempt_1417419796124_62449_r_000019_0 101
6041 org.apache.hadoop.mapred.YarnChild 192.168.7.17 9187 attempt_1417419796124_62366_r_000008_0 57
6792 org.apache.hadoop.mapred.YarnChild 192.168.7.20 56878 attempt_1417419796124_63313_m_000016_0 18
25847 org.apache.hadoop.mapred.YarnChild 192.168.7.46 8277 attempt_1417419796124_63290_m_000005_0 7
6089 org.apache.hadoop.mapred.YarnChild 192.168.7.17 9187 attempt_1417419796124_62366_r_000005_0 50
20277 org.apache.hadoop.mapred.YarnChild 192.168.7.26 45093 attempt_1417419796124_63268_m_000014_0 20
5578 org.apache.hadoop.mapred.YarnChild 192.168.7.72 15929 attempt_1417419796124_63271_m_000011_0 14
26194 org.apache.hadoop.mapred.YarnChild 192.168.7.46 8277 attempt_1417419796124_63290_m_000011_0 13
18747 sun.tools.jps.Jps -ml

Commonly used options:
-m Output the arguments passed to the main method. The output may be null for embedded JVMs.
-l Output the full package name for the application’s main class or the full path name to the application’s JAR file.
-v Output the arguments passed to the JVM.
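
For completeness, a similar listing can be produced programmatically through the attach API shipped with the JDK. A minimal sketch, not from the original post (it needs tools.jar on the classpath and lists local JVMs roughly the way jps does, minus the extra options):

import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;

public class JpsLike {
    public static void main(String[] args) {
        // Enumerate the local JVMs visible to the attach mechanism, roughly what `jps -l` prints.
        for (VirtualMachineDescriptor vmd : VirtualMachine.list()) {
            System.out.println(vmd.id() + " " + vmd.displayName());
        }
    }
}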

jstack

-- Prints the stack trace of every thread in a running Java process. (A small programmatic alternative is sketched after the sample output below.)

[hadoop@K1213 ~]$ jstack 19552
2014-12-31 10:21:05
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.6-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x000000000a3ae000 nid=0x81a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"ResponseProcessor for block BP-714842383-192.168.7.11-1393991369860:blk_1111587326_1099569551548" daemon prio=10 tid=0x00002aaab8293000 nid=0x7098 runnable [0x00000000417e5000]
   java.lang.Thread.State: RUNNABLE
 at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
 at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
 at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
 at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
 - locked <0x000000077742ab40> (a sun.nio.ch.Util$2)
 - locked <0x000000077742ab50> (a java.util.Collections$UnmodifiableSet)
 - locked <0x000000077742aaf8> (a sun.nio.ch.EPollSelectorImpl)
 at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
 at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
 at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
 at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
 at java.io.FilterInputStream.read(FilterInputStream.java:83)
 at java.io.FilterInputStream.read(FilterInputStream.java:83)
 at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1490)
 at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:721)

"DataStreamer for file /user/monitor/test_with_reduce_zhudi/_temporary/1/_temporary/attempt_1417419796124_63366_m_000033_0/0-m-00033 block BP-714842383-192.168.7.11-1393991369860:blk_1111587326_1099569551548" daemon prio=10 tid=0x00002aaab4cbb000 nid=0x4ca3 runnable [0x0000000040316000]
   java.lang.Thread.State: RUNNABLE
 at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
 at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
 at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
 at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
 - locked <0x000000077742e4e8> (a sun.nio.ch.Util$2)
 - locked <0x000000077742e4f8> (a java.util.Collections$UnmodifiableSet)
 - locked <0x000000077742e4a0> (a sun.nio.ch.EPollSelectorImpl)
 at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
 at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
 at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
 at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
 at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
 at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
 - locked <0x0000000747910230> (a java.io.BufferedOutputStream)
 at java.io.DataOutputStream.write(DataOutputStream.java:107)
 - locked <0x0000000747910248> (a java.io.DataOutputStream)
 at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:278)
 at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:568)

...

"main" prio=10 tid=0x0000000009fad000 nid=0x4c61 in Object.wait() [0x000000004089e000]
   java.lang.Thread.State: WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:503)
 at org.apache.hadoop.hdfs.DFSOutputStream.waitAndQueueCurrentPacket(DFSOutputStream.java:1475)
 - locked <0x00000007770010a0> (a java.util.LinkedList)
 at org.apache.hadoop.hdfs.DFSOutputStream.writeChunk(DFSOutputStream.java:1543)
 - locked <0x00000007770abdc8> (a org.apache.hadoop.hdfs.DFSOutputStream)
 at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:175)
 at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:150)
 - locked <0x00000007770abdc8> (a org.apache.hadoop.hdfs.DFSOutputStream)
 at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:139)
 - eliminated <0x00000007770abdc8> (a org.apache.hadoop.hdfs.DFSOutputStream)
 at org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:130)
 at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)
 - locked <0x00000007770abdc8> (a org.apache.hadoop.hdfs.DFSOutputStream)
 at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:59)
 at java.io.DataOutputStream.write(DataOutputStream.java:107)
 - locked <0x0000000777294690> (a org.apache.hadoop.hdfs.client.HdfsDataOutputStream)
 at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.writeObject(TextOutputFormat.java:83)
 at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter.write(TextOutputFormat.java:104)
 - locked <0x000000077722f338> (a org.apache.hadoop.mapreduce.lib.output.TextOutputFormat$LineRecordWriter)
 at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:433)
 at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:411)
 at AdMonitorDMReaderJob$Mapper.map(AdMonitorDMReaderJob.java:45)
 at AdMonitorDMReaderJob$Mapper.map(AdMonitorDMReaderJob.java:30)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:772)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

"VM Thread" prio=10 tid=0x000000000a02c000 nid=0x4c62 runnable 

"VM Periodic Task Thread" prio=10 tid=0x00002aaab401a800 nid=0x4c69 waiting on condition 

JNI global references: 276
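
For a quick look from inside the process itself (for instance, in a debug hook), much of the same information is exposed by java.lang.Thread. A minimal sketch, not from the original post:

import java.util.Map;

public class ThreadDumpDemo {
    public static void main(String[] args) {
        // Print a rough thread dump of the current JVM, similar in spirit to jstack output.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            System.out.println("\"" + t.getName() + "\" daemon=" + t.isDaemon() + " state=" + t.getState());
            for (StackTraceElement frame : entry.getValue()) {
                System.out.println("    at " + frame);
            }
            System.out.println();
        }
    }
}

Note that jstack additionally reports lock ownership and native thread ids, which this simple approach does not.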


jinfo

-- Prints the configuration of a running Java process, including its Java system properties and the VM flags it was started with.

[hadoop@K1213 ~]$ jinfo 19552
Attaching to process ID 19552, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 23.6-b04
Java System Properties:

java.runtime.name = Java(TM) SE Runtime Environment
java.vm.version = 23.6-b04
sun.boot.library.path = /usr/java/jdk1.7.0_11/jre/lib/amd64
hadoop.root.logger = INFO,CLA
java.vendor.url = http://java.oracle.com/
java.vm.vendor = Oracle Corporation
path.separator = :
file.encoding.pkg = sun.io
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
sun.os.patch.level = unknown
sun.java.launcher = SUN_STANDARD
user.country = CN
user.dir = /home/data4/hdfsdir/nm-local-dir/usercache/monitor/appcache/application_1417419796124_63366/container_1417419796124_63366_01_000063
java.vm.specification.name = Java Virtual Machine Specification
java.runtime.version = 1.7.0_11-b21
java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
os.arch = amd64
java.endorsed.dirs = /usr/java/jdk1.7.0_11/jre/lib/endorsed
line.separator = 

java.io.tmpdir = /home/data4/hdfsdir/nm-local-dir/usercache/monitor/appcache/application_1417419796124_63366/container_1417419796124_63366_01_000063/tmp
yarn.app.container.log.dir = /home/workspace/hadoop/logs/userlogs/application_1417419796124_63366/container_1417419796124_63366_01_000063
java.vm.specification.vendor = Oracle Corporation
os.name = Linux
log4j.configuration = container-log4j.properties
sun.jnu.encoding = UTF-8
java.library.path = /home/data4/hdfsdir/nm-local-dir/usercache/monitor/appcache/application_1417419796124_63366/container_1417419796124_63366_01_000063:/home/workspace/hadoop/lib/native:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
java.specification.name = Java Platform API Specification
java.class.version = 51.0
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
os.version = 2.6.18-308.el5
yarn.app.container.log.filesize = 209715200
user.home = /home/hadoop
user.timezone = Asia/Shanghai
java.awt.printerjob = sun.print.PSPrinterJob
file.encoding = UTF-8
java.specification.version = 1.7
user.name = hadoop
java.class.path = ...
java.vm.specification.version = 1.7
sun.arch.data.model = 64
sun.java.command = org.apache.hadoop.mapred.YarnChild 192.168.7.86 14667 attempt_1417419796124_63366_m_000033_0 63
java.home = /usr/java/jdk1.7.0_11/jre
user.language = zh
java.specification.vendor = Oracle Corporation
awt.toolkit = sun.awt.X11.XToolkit
java.vm.info = mixed mode
java.version = 1.7.0_11
java.ext.dirs = /usr/java/jdk1.7.0_11/jre/lib/ext:/usr/java/packages/lib/ext
sun.boot.class.path = /usr/java/jdk1.7.0_11/jre/lib/resources.jar:/usr/java/jdk1.7.0_11/jre/lib/rt.jar:/usr/java/jdk1.7.0_11/jre/lib/sunrsasign.jar:/usr/java/jdk1.7.0_11/jre/lib/jsse.jar:/usr/java/jdk1.7.0_11/jre/lib/jce.jar:/usr/java/jdk1.7.0_11/jre/lib/charsets.jar:/usr/java/jdk1.7.0_11/jre/lib/jfr.jar:/usr/java/jdk1.7.0_11/jre/classes
java.vendor = Oracle Corporation
file.separator = /
java.vendor.url.bug = http://bugreport.sun.com/bugreport/
sun.io.unicode.encoding = UnicodeLittle
sun.cpu.endian = little
sun.cpu.isalist = 

VM Flags:

-XX:+UseSerialGC -Xms1024M -Xmx3096m -XX:PermSize=64m -XX:MaxPermSize=128M -Djava.io.tmpdir=/home/data4/hdfsdir/nm-local-dir/usercache/monitor/appcache/application_1417419796124_63366/container_1417419796124_63366_01_000063/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/workspace/hadoop/logs/userlogs/application_1417419796124_63366/container_1417419796124_63366_01_000063 -Dyarn.app.container.log.filesize=209715200 -Dhadoop.root.logger=INFO,CLA

strace

-- Traces the system calls a process makes; it is a general Linux tool, not specific to Java. It can also attach to a single JVM thread: the nid reported by jstack is the native thread id in hex, so converting it to decimal gives the id to pass to `strace -p` (the main thread's nid 0x4c61 above is 19553, which is exactly the second id traced below).

[hadoop@K1213 ~]$ strace -p 19552
Process 19552 attached - interrupt to quit
futex(0x408a09d0, FUTEX_WAIT, 19553, NULL <unfinished ...>
Process 19552 detached
[hadoop@K1213 ~]$ strace -p 19553
Process 19553 attached - interrupt to quit
futex(0x9fada54, FUTEX_WAIT_PRIVATE, 1747, NULL) = 0
futex(0x9fada28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x9fada54, FUTEX_WAIT_PRIVATE, 1749, NULL) = 0
futex(0x9fada28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x9fada54, FUTEX_WAIT_PRIVATE, 1751, NULL) = 0
futex(0x9fada28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x9fada54, FUTEX_WAIT_PRIVATE, 1753, NULL) = 0
futex(0x9fada28, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x9fada54, FUTEX_WAIT_PRIVATE, 1755, NULL) = 0
futex(0x9fada28, FUTEX_WAKE_PRIVATE, 1) = 0
read(160, "\0\2\4\4\0\31", 6)           = 6
read(160, "\t\0\0 \5\0\0\0\0\21\220\2\0\0\0\0\0\0\30\0%\0\0\2\0\243\255X\321\20\314g"..., 132121) = 68529
read(160, 0x2aaab4ca2457, 63592)        = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(175, EPOLL_CTL_ADD, 160, {EPOLLIN, {u32=160, u64=12336768231720091808}}) = 0
epoll_wait(175, {{EPOLLIN, {u32=160, u64=12336768231720091808}}}, 8192, 3600000) = 1
epoll_ctl(175, EPOLL_CTL_DEL, 160, {0, {u32=160, u64=719115788937592992}}) = 0
epoll_wait(175, {}, 8192, 0)            = 0
read(160, "\332a\375\f\237\331\235\31YD2\304\5\362;#\232A\225\37?\203<w0%\371%S/\275\232"..., 63592) = 10136
read(160, 0x2aaab4ca4bef, 53456)        = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(175, EPOLL_CTL_ADD, 160, {EPOLLIN, {u32=160, u64=4371809895922532512}}) = 0
...

jstat

-- Java Virtual Machine Statistics Monitoring Tool; it reports JIT compilation, garbage collection and other JVM statistics.

The invocation I use most often is pulling GC statistics from a running Java process:
[hadoop@K1213 ~]$ jstat -gc 11036
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
34944.0 34944.0 763.4   0.0   279616.0 218484.1  699072.0   572871.8  65536.0 22317.0     12    0.423   0      0.000    0.423
[hadoop@K1213 ~]$ jstat -gcutil 11036
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
  0.00   7.17  93.65  81.95  34.06     13    0.432     0    0.000    0.432

The meaning of each column is documented in the official jstat reference. With the commands above, I pay most attention to FGC (the number of full GCs): if FGC is relatively large and keeps growing, chances are that something is wrong in our code, such as a memory leak.
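
jstat can also sample repeatedly: `jstat -gcutil <pid> 5000` prints a new line every five seconds, which makes a steadily growing FGC easy to spot. As an illustration of the kind of code that produces that pattern, here is a minimal, hypothetical leak (not from the original post): every allocation stays reachable from a static list, so full GCs reclaim almost nothing and FGC keeps climbing.

import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // Everything added here stays strongly reachable forever.
    private static final List<byte[]> LEAK = new ArrayList<byte[]>();

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            LEAK.add(new byte[1024 * 1024]); // retain 1 MB per iteration
            Thread.sleep(50);
        }
    }
}

Run it with a small heap (for example `java -Xmx256m LeakDemo`) and watch `jstat -gcutil <pid> 1000`: the old-generation column stays pinned near 100% and FGC keeps increasing until the program dies with an OutOfMemoryError.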

jmap

-- Inspects the heap of a running process: which classes are loaded, how many instances of each exist, and how much memory they occupy. It is much the same information 'jhat' provides, only taken from the live process instead of a dump. In the -histo output below, names such as [B, [C, [D, [J and [I are the JVM's internal notation for byte[], char[], double[], long[] and int[] arrays.

[hadoop@k1291 ~]$ jmap -histo 11711 | head -n 50

 num     #instances         #bytes  class name
----------------------------------------------
   1:         22834      144531152  [B
   2:        699375      123144384  [D
   3:       1153782       94102184  [Ljava.util.HashMap$Entry;
   4:       1827000       89984520  [C
   5:       1148737       64329272  java.util.HashMap
   6:        331311       58298464  [J
   7:       1423855       34172520  java.lang.String
   8:        331138       29140144  com.miaozhen.yo.tcpreporter.report.Counter
   9:        659720       21111040  java.util.HashMap$Entry
  10:        271052       13010496  java.util.StringTokenizer
  11:        151911       10937592  com.miaozhen.yo.tcpreporter.report.RFPanel
  12:        243543        9741720  java.util.TreeMap$Entry
  13:         47619        6636000  <constMethodKlass>
  14:         47619        6486248  <methodKlass>
  15:        236361        5672664  java.lang.Long
  16:        168142        5380544  com.miaozhen.yo.tcpreporter.history.HistoryCount
  17:          3844        4571976  <constantPoolKlass>
  18:        165569        3973656  com.miaozhen.yo.tcpreporter.report.RFCounter
  19:         92516        3700640  java.util.HashMap$EntryIterator
  20:        146659        3519816  java.lang.StringBuffer
  21:         75587        3023480  com.miaozhen.app.MzSequenceFile$SMeta
  22:          3844        2944464  <instanceKlassKlass>
  23:        105924        2542176  java.lang.StringBuilder
  24:          3185        2520192  <constantPoolCacheKlass>
  25:          3288        2124824  [I
  26:         42750        1710000  sun.misc.FloatingDecimal
  27:         17819        1298352  [Ljava.lang.Object;
  28:          9867        1105104  com.miaozhen.yo.tcpreporter.Purelog
  29:         41796        1003104  com.miaozhen.tools.MyDouble
  30:         51896         830336  java.util.HashMap$EntrySet
  31:         33984         815616  java.lang.Double
  32:         15043         633656  [Ljava.lang.String;
  33:          1291         593104  <methodDataKlass>
  34:         11493         551664  java.nio.HeapByteBuffer
  35:         11487         551376  java.nio.HeapCharBuffer
  36:         33945         543120  com.miaozhen.yo.tcpreporter.history.HistoryLevelOne
  37:          4158         503104  java.lang.Class
  38:         19564         469536  java.util.Date
  39:          9777         469296  java.util.TreeMap$AscendingSubMap
  40:         11487         459480  java.util.ArrayList$SubList
  41:         11487         459480  java.util.ArrayList$SubList$1
  42:          6415         407520  [S
  43:          6775         369960  [[I
  44:         11816         283584  java.util.ArrayList
  45:         15982         255712  com.miaozhen.yo.tcpreporter.history.HistoryLevelTwo
  46:         10619         254856  org.apache.hadoop.io.Text
  47:          1571         175952  org.apache.hadoop.hdfs.protocol.DatanodeInfo

[hadoop@k1291 ~]$ jmap -heap 11711 
Attaching to process ID 11711, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 23.6-b04

using thread-local object allocation.
Mark Sweep Compact GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 3221225472 (3072.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 85983232 (82.0MB)
   G1HeapRegionSize = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 316669952 (302.0MB)
   used     = 276925888 (264.09710693359375MB)
   free     = 39744064 (37.90289306640625MB)
   87.44937315681912% used
Eden Space:
   capacity = 281542656 (268.5MB)
   used     = 266371136 (254.03131103515625MB)
   free     = 15171520 (14.46868896484375MB)
   94.61128902612896% used
From Space:
   capacity = 35127296 (33.5MB)
   used     = 10554752 (10.0657958984375MB)
   free     = 24572544 (23.4342041015625MB)
   30.04715193563433% used
To Space:
   capacity = 35127296 (33.5MB)
   used     = 0 (0.0MB)
   free     = 35127296 (33.5MB)
   0.0% used
tenured generation:
   capacity = 703594496 (671.0MB)
   used     = 625110416 (596.1517486572266MB)
   free     = 78484080 (74.84825134277344MB)
   88.84526805621856% used
Perm Generation:
   capacity = 25690112 (24.5MB)
   used     = 25664256 (24.475341796875MB)
   free     = 25856 (0.024658203125MB)
   99.89935427295919% used

8870 interned Strings occupying 738840 bytes.

The following command dumps the live objects on the heap to a file, which can then be analysed with 'jhat'.
jmap -dump:live,format=b,file=/home/data8/heapdump.log [PID]

HeapDumpOnOutOfMemoryError

When a program fails with an OutOfMemoryError in heap space, we can simply add the following arguments at JVM startup. A heap snapshot is then dumped automatically at the moment the OOM occurs, and the resulting file can be analysed with `jhat`.
-XX:+HeapDumpOnOutOfMemoryError  -XX:HeapDumpPath=/home/data8/heapspaceerror.out

FYI: to pass these arguments to the task JVMs of a MapReduce job, set them through the task JVM options, for example:
hadoop jar mr.jar MainClass -Dmapreduce.map.java.opts="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/data8/heapspaceerror.out" argu1 argu2
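
Note that the `-D` option above is only honoured when MainClass parses generic options via ToolRunner/GenericOptionsParser. Alternatively, the option can be hard-wired in the job driver. A minimal sketch, assuming the Hadoop 2.x (MRv2) Job API and a made-up driver class name:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class DriverWithHeapDump {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Ask each map task JVM to dump its heap when it hits an OutOfMemoryError.
        conf.set("mapreduce.map.java.opts",
                 "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/data8/heapspaceerror.out");
        Job job = Job.getInstance(conf, "job-with-heap-dump");
        // ... set jar, mapper, reducer, input/output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The same applies to mapreduce.reduce.java.opts for reduce tasks.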

jhat

-- jhat (Java Heap Analysis Tool) is used to analyse a heap dump file.

$ jhat heapdump.out
Reading from log.log...  
Dump file created Wed Mar 14 12:01:45 CST 2012  
Snapshot read, resolving...  
Resolving 6762 objects...  
Chasing references, expect 1 dots.  
Eliminating duplicate references.  
Snapshot resolved.  
Started HTTP server on port 7000  
Server is ready. 

Once the server is ready, we can open http://hostname:7000/ in a browser. The page lists every class present in the heap, which by itself is not very useful; the more helpful entry points are linked at the bottom of the page.


Following "Show instance counts for all classes" shows how many instances of each class exist, sorted by count.




In our case there were far too many LoggingEvent instances, which is what caused the heap-space OOM and gave us a strong hint about where the problem was.




Saturday, December 20, 2014

Way To Crack WEP Wifi Via Aircrack-ng In Mac OSX

Prerequisite

Symlink airport, the command-line utility behind the wireless stack in Mac OSX, into the /usr/sbin directory:
sudo ln -s /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport /usr/sbin/airport

Before installing aircrack-ng, the tool that does the actual WEP cracking, we first need MacPorts (a package manager for Mac OSX), Xcode, and the Xcode command line tools. Installation guides for all of these are easy to find; the one thing worth mentioning is to run `sudo port -v selfupdate` after installing MacPorts so that the ports index is up to date.

Finally, install aircrack-ng with a single command, then run `aircrack-ng` with no arguments to double-check that it is installed correctly.
sudo port install aircrack-ng

Cracking Process

First, scan with `airport` (e.g. `airport -s`) to find the BSSID and CHANNEL of the target WEP network.


Then sniff that channel with the following command; the captured packets are written to /tmp/*.cap. (en0 is my wireless interface name, which can be found with `ifconfig`; 6 is the channel of the target network.)
sudo airport en0 sniff 6

After collecting data packets for a while, run the following command to attempt the crack ('00:25:86:4c:c0:26' is the BSSID):
aircrack-ng -1 -a 1 -b 00:25:86:4c:c0:26 /tmp/airportSniffrekfv1.cap

Once enough data packets have accumulated, the key is finally recovered, as below:



As we can see, the recovered WEP key is '3030303030' (in hex; as ASCII that is the string '00000').




Wednesday, December 10, 2014

Way To Deploy Your Own Maven Dependency On Github

Recently I needed to share one of my own Maven artifacts with friends. Here is what I did to use GitHub as a makeshift private Maven repository.

Add the following configuration to the pom.xml of the project you want to share:
<build>
    <plugins>
        <plugin>
            <artifactId>maven-deploy-plugin</artifactId>
            <version>2.8.1</version>
            <configuration>
                <altDeploymentRepository>internal.repo::default::file://${project.build.directory}/mvn-repo</altDeploymentRepository>
            </configuration>
        </plugin>
    </plugins>
</build>

Run `mvn clean deploy`; a "target/mvn-repo" directory containing the deployed artifacts will appear under the project root.

Create a GitHub repository and push the project to it:
git init
git add .
git commit -m "update"
git remote add origin https://github.com/[github_repository]
git push -u origin master

Finally, in the pom.xml of any project that wants to use the dependency, add the repository and the dependency as below.
<repositories>
    <repository>
        <id>miaozhen-dm-sdk-repo</id>
        <url>https://github.com/judking4/com.miaozhen.dm.sdk/tree/master/target/mvn-repo</url>
    </repository>
</repositories>
...
<dependencies>
    <dependency>
        <groupId>com.myproject.dm.sdk</groupId>
        <artifactId>com.myproject.dm.sdk</artifactId>
        <version>0.0.1-SNAPSHOT</version>
    </dependency>
</dependencies>

The groupId, artifactId and version can all be found either in the mvn-repo directory layout or in the pom.xml of the shared project.



Monday, December 8, 2014

Some Notes On Java Access Modifiers: Public, Protected, Default And Private

The following table summarizes how the four access levels differ.
Modifier    | Class | Package | Subclass | Other package
public      |  y    |    y    |    y     |   y
protected   |  y    |    y    |    y     |   n
default     |  y    |    y    |    n     |   n
private     |  y    |    n    |    n     |   n

y: accessible
n: not accessible

But there are some details that have to be noted down.

#1
A top-level class can only be 'public' or 'default' (no modifier); a 'private' or 'protected' top-level class would make no sense given the semantics in the table above. An inner (nested) class, however, may use any of the four modifiers.

#2
It is worth pointing out that for 'default' (no modifier) members, whether a subclass can see its superclass's methods/fields depends on where the subclass lives. If the subclass is in another package, it cannot; if it is in the same package, it can. A small sketch follows.
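
A minimal sketch (package and class names made up for illustration):

// pkg_a/Base.java
package pkg_a;

public class Base {
    void packagePrivateMethod() { }       // 'default' (package-private) access
    protected void protectedMethod() { }
}

// pkg_b/Sub.java
package pkg_b;

import pkg_a.Base;

public class Sub extends Base {
    void demo() {
        protectedMethod();         // OK: protected members are inherited across packages
        // packagePrivateMethod(); // compile error: default access is invisible outside pkg_a
    }
}

Move Sub into pkg_a and the commented-out call compiles fine.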




Thursday, December 4, 2014

A Record On The Process Of Adding DataNode To Hadoop Cluster

Procedure for checking and configuring the Linux node


1. Check the hostname

Verify that the current hostname is exactly what it should be:
hostname

2. Check the file system type

Make sure it is the same as on all the other nodes in the Hadoop cluster:
df -T


3. Change the owner of all data disks to the hadoop user

chown -R hadoop:hadoop /home/data*

4. Check and back up the disk mount info

Re-running the output of mount.sh restores the mounts in case some mounted filesystems are lost when the node restarts.

su root

--mount.sh--
n=1 ; for i in a b c d e f g h i j k l ; do a=`/sbin/blkid -s UUID | grep ^/dev/sd$i | awk '{print $2}'` ; echo mount $a /home/data$n ; n=`echo $n+1|bc` ; done

> bash mount.sh
mount UUID="09c42017-9308-45c3-9509-e77a2e99c732" /home/data1
mount UUID="72461da2-b0c0-432a-9b65-0ac5bc5bc69a" /home/data2
mount UUID="6d447f43-b2db-4f69-a3b2-a4f69f2544ea" /home/data3
mount UUID="37ca4fb8-377c-493d-9a4c-825f1500ae52" /home/data4
mount UUID="53334c93-13ff-41f5-8688-07023bd6f11a" /home/data5
mount UUID="10fa31f7-9c29-4190-8ecd-ec893d59634c" /home/data6
mount UUID="fe28b8dd-ff3b-49d9-87c6-6eee9f389966" /home/data7
mount UUID="5201d24b-9310-4cff-b3ad-5b09e47780a5" /home/data8
mount UUID="d3b85455-8b94-4817-b43e-69481f9c13c4" /home/data9
mount UUID="6f2630f1-7cfe-4cac-b52d-557f46779539" /home/data10
mount UUID="bafc742d-1477-439a-ade4-29711c5db840" /home/data11
mount UUID="bf6e36d8-1410-4547-853c-f541c9a07e52" /home/data12

We can append the output of mount.sh to /etc/rc.local so that the mount commands are run automatically every time the node starts up.

5. Check the operating system version

lsb_release -a

6. Optimize TCP parameters via sysctl

Append the following content to /etc/sysctl.conf:
fs.file-max = 800000
net.core.rmem_default = 12697600
net.core.wmem_default = 12697600
net.core.rmem_max = 873800000
net.core.wmem_max = 655360000
net.ipv4.tcp_rmem = 8192 262144 4096000
net.ipv4.tcp_wmem = 4096 262144 4096000
net.ipv4.tcp_mem = 196608 262144 3145728
net.ipv4.tcp_max_orphans = 300000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.ip_local_port_range = 1025 65535
net.ipv4.tcp_max_syn_backlog = 100000
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 1500
net.core.somaxconn=32768
vm.swappiness=0

Issue the following command to apply the settings.
/sbin/sysctl -p

# List all current parameters to double-check
/sbin/sysctl -a

7. Maximum number of open files

Check the current limit on open file descriptors:
ulimit -n

Append the following content to /etc/security/limits.conf to raise the limit to 100000.
*      soft    nofile  100000
*      hard    nofile  100000

8. Check and sync /etc/hosts

Take the current /etc/hosts from one of the existing nodes in the Hadoop cluster (the canonical node), append the newly-added node's entry, and synchronize the file to all nodes.

9. Set the locale to en_US.UTF-8

Appending the following to /etc/profile does the job:
export LANG=en_US.UTF-8
export LC_CTYPE=en_US.UTF-8
export LC_NUMERIC=en_US.UTF-8
export LC_TIME=en_US.UTF-8
export LC_COLLATE=en_US.UTF-8
export LC_MONETARY=en_US.UTF-8
export LC_MESSAGES=en_US.UTF-8
export LC_PAPER=en_US.UTF-8
export LC_NAME=en_US.UTF-8
export LC_ADDRESS=en_US.UTF-8
export LC_TELEPHONE=en_US.UTF-8
export LC_MEASUREMENT=en_US.UTF-8
export LC_IDENTIFICATION=en_US.UTF-8
export LC_ALL=en_US.UTF-8

10. Copy the .ssh directory to the newly-added node

Since all nodes share the same .ssh directory, passwordless ssh among them is simple to achieve: append id_rsa.pub to the current authorized_keys and distribute the .ssh directory to all nodes.

11. Deploy the Java & Hadoop environment

Copy the Java directory from the canonical node to the newly-added node and append the following environment variables to /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_11
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin

Likewise, copy the hadoop directory and append:
export HADOOP_HOME=/home/workspace/hadoop
export JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:$HADOOP_HOME/lib/native/
export PATH=$PATH:$HADOOP_HOME/bin

At the same time, create and set permissions on the paths configured by the dfs.datanode.data.dir parameter in hdfs-site.xml:
su hadoop
for i in {1..12}
do
  mkdir -p /home/data$i/hdfsdir/data
  chmod -R 755 /home/data$i/hdfsdir
done

Lastly, append the newly-added host to $HADOOP_HOME/etc/hadoop/slaves and synchronize the file to all nodes:
su hadoop
for i in $(cat $HADOOP_HOME/etc/hadoop/slaves  | grep -v "#")
do
 echo '';
 echo $i;
 scp $HADOOP_HOME/etc/hadoop/slaves hadoop@$i:/home/workspace/hadoop/etc/hadoop/;
done

12. Open firewall ports in iptables

This is well-explained in another post: Configure Firewall In Iptables For Hadoop Cluster.

13. Launch Hadoop services

Finally, we simply invoke the following two commands to start the DataNode and NodeManager services.
su hadoop
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager

After that, check that the two processes are actually running and look through the corresponding logs under $HADOOP_HOME/logs to double-check.


A Python script implementing all of the steps above can be found in my project on GitHub.



Tuesday, December 2, 2014

A Charset Encoding Problem Related With 'file.encoding' And 'sun.jnu.encoding' Parameters In Java When Executing Mapreduce Job On A Datanode

When running a MapReduce job, we found that some Chinese text in the logs was displayed as '??????', which eventually affected our final results.

We logged on to the suspect DataNode and ran the following Java program to check the runtime encoding-related parameters.
// --FileEncodingTest.java--

import java.net.URLDecoder;
import java.util.Properties;

public class FileEncodingTest {
    public static void main(String[] args) throws Exception {
        // Print every JVM system property, including the encoding-related ones.
        Properties properties = System.getProperties();
        for (Object key : properties.keySet()) {
            System.out.println(String.format("propertyName: %s, propertyValue: %s", key, properties.getProperty(key.toString())));
        }

        // A URL-encoded UTF-8 string; the decoded Chinese characters print as '??????'
        // whenever file.encoding cannot represent them.
        String originContent = "%E4%B9%90%E8%A7%86%E8%A7%86%E9%A2%91";
        System.out.println(URLDecoder.decode(originContent, "utf-8"));
    }
}

// --Shell Command--
$ javac FileEncodingTest.java
$ java FileEncodingTest | grep encod --color
propertyName: file.encoding.pkg, propertyValue: sun.io
propertyName: sun.jnu.encoding, propertyValue: ANSI_X3.4-1968
propertyName: file.encoding, propertyValue: ANSI_X3.4-1968
propertyName: sun.io.unicode.encoding, propertyValue: UnicodeLittle

We can see that 'sun.jnu.encoding' and 'file.encoding' default to "ANSI_X3.4-1968" (i.e. plain US-ASCII), which is not what we expect.

Curiously enough, both parameters defaulted to "UTF-8" when my colleague sshed to the same DataNode from his own machine, so the values depend on the configuration of the ssh client machine!

It turns out that the LC_* variables of the local machine are carried over to the remote node by ssh whenever they are not explicitly set on the remote side. After checking `locale` on the local machine, the trouble spot was pinpointed.
# --command invoked in local machine--
$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

# --command invoked in remote DataNode--
$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: ?????????
LANG=zh_CN.UTF-8
LC_CTYPE=UTF-8
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=

As we can see, LC_CTYPE on the local machine is set to "UTF-8", which is a perfectly valid value on Mac OS X. Once carried over to the remote DataNode, however, it is not recognized by CentOS, so "LC_CTYPE=UTF-8" has no effect there and the JVM falls back to 'ANSI_X3.4-1968'.

There are two ways to solve the problem; the first is less robust than the second:

#1. Set the two Java properties explicitly every time we run a Java program:
java -Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8 FileEncodingTest | grep encod --color

#2. Set the locale explicitly on the server so that it no longer depends on the ssh client's environment. All we need to do is append the following to /etc/profile:
$ su root
$ vim /etc/profile
export LANG=en_US.UTF-8
export LC_CTYPE=en_US.UTF-8
export LC_NUMERIC=en_US.UTF-8
export LC_TIME=en_US.UTF-8
export LC_COLLATE=en_US.UTF-8
export LC_MONETARY=en_US.UTF-8
export LC_MESSAGES=en_US.UTF-8
export LC_PAPER=en_US.UTF-8
export LC_NAME=en_US.UTF-8
export LC_ADDRESS=en_US.UTF-8
export LC_TELEPHONE=en_US.UTF-8
export LC_MEASUREMENT=en_US.UTF-8
export LC_IDENTIFICATION=en_US.UTF-8
export LC_ALL=en_US.UTF-8

This way, the two Java properties are always "UTF-8" when running Java programs, no matter which client we ssh to the DataNode from.
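
Independently of the environment fix, code can avoid depending on the platform default altogether by naming the charset explicitly wherever bytes and characters are converted. A minimal sketch, not from the original post (requires JDK 7 for StandardCharsets):

import java.io.PrintStream;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    public static void main(String[] args) throws Exception {
        // Decode with an explicit charset instead of relying on defaults.
        String text = URLDecoder.decode("%E4%B9%90%E8%A7%86%E8%A7%86%E9%A2%91", "UTF-8");

        // Wrap stdout in a UTF-8 PrintStream so the emitted bytes do not depend on file.encoding.
        PrintStream out = new PrintStream(System.out, true, StandardCharsets.UTF_8.name());
        out.println(text);
    }
}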


