Thursday, January 1, 2015

Create Custom Counter In MapReduce

Hadoop MapReduce counters provide a way to measure the progress of a MapReduce program or the number of operations that occur within it. The framework ships with a number of built-in counters for basic I/O, such as FILE_BYTES_READ, FILE_BYTES_WRITTEN, and the input/output record counts of the Map, Combine, and Reduce phases. These counters are especially useful when you evaluate a MapReduce program. In addition, you can define your own counters. Because the framework automatically aggregates counter values across all map and reduce tasks, counters are one of the easiest ways to investigate the internal behavior of a MapReduce program. In this post, I'm going to show how to use your own MapReduce counter.
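To make the aggregation idea concrete, here is a minimal plain-Java sketch of it. It is not Hadoop code and does not reflect how the framework is implemented internally; it only models the behavior described above, where each task keeps local tallies and the job-level value of a counter is the sum over all tasks. The names CounterAggregationSketch, DemoCounter, runTask and aggregate are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class CounterAggregationSketch {

    // Hypothetical counter names, used only for this illustration.
    public enum DemoCounter { RECORD_NUM, EMPTY_LINES }

    // Simulates one task: it tallies counters locally while processing its lines.
    public static Map<DemoCounter, Long> runTask(List<String> lines) {
        Map<DemoCounter, Long> local = new EnumMap<>(DemoCounter.class);
        for (String line : lines) {
            local.merge(DemoCounter.RECORD_NUM, 1L, Long::sum);
            if (line.isEmpty()) {
                local.merge(DemoCounter.EMPTY_LINES, 1L, Long::sum);
            }
        }
        return local;
    }

    // Job-level totals: the per-counter sum over all simulated tasks.
    public static Map<DemoCounter, Long> aggregate(List<Map<DemoCounter, Long>> perTask) {
        Map<DemoCounter, Long> total = new EnumMap<>(DemoCounter.class);
        for (Map<DemoCounter, Long> task : perTask) {
            task.forEach((name, count) -> total.merge(name, count, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<DemoCounter, Long>> perTask = new ArrayList<>();
        perTask.add(runTask(Arrays.asList("a", "", "b"))); // first simulated task
        perTask.add(runTask(Arrays.asList("c", "d")));     // second simulated task
        System.out.println(aggregate(perTask));
    }
}
```

With the two simulated tasks above, the aggregated RECORD_NUM is 5 (3 + 2) and EMPTY_LINES is 1. In a real job you never write this summing step yourself; the framework does it for you.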

To create your own MapReduce counter, first define an enum type as follows:
private static enum JasonZhuCounter {
    RECORD_NUM
}

Then, whenever you want to increment the counter, call its increment method as below:
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    // Count every input record this mapper sees.
    context.getCounter(JasonZhuCounter.RECORD_NUM).increment(1);
    context.write(new Text(value), NullWritable.get());
}

Finally, you can read the counter from a finished job:
Counter recordNumCounter = job.getCounters().findCounter(JasonZhuCounter.RECORD_NUM);
System.out.println("RecordNumCounter=["+recordNumCounter.getValue()+"]");
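The three snippets above can be assembled into a single job class. The following is a minimal sketch under some assumptions that are not in the original post: the class names RecordCountJob and RecordMapper, the job name "record-count", a map-only job (zero reducers), the default text input format, and input/output paths taken from the command line. It needs the Hadoop client libraries on the classpath to compile.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class name, chosen for this sketch.
public class RecordCountJob {

    // The custom counter from the post.
    public static enum JasonZhuCounter {
        RECORD_NUM
    }

    // Hypothetical mapper class: passes each line through and counts it.
    public static class RecordMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            context.getCounter(JasonZhuCounter.RECORD_NUM).increment(1);
            context.write(value, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0] is the input path, args[1] the output path.
        Job job = Job.getInstance(new Configuration(), "record-count");
        job.setJarByClass(RecordCountJob.class);
        job.setMapperClass(RecordMapper.class);
        job.setNumReduceTasks(0); // assumption: map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Block until the job finishes; counters are final only after that.
        boolean succeeded = job.waitForCompletion(true);

        Counter recordNumCounter =
                job.getCounters().findCounter(JasonZhuCounter.RECORD_NUM);
        System.out.println("RecordNumCounter=[" + recordNumCounter.getValue() + "]");
        System.exit(succeeded ? 0 : 1);
    }
}
```

The counter is read only after waitForCompletion returns, because that is the point at which the per-task values have been fully aggregated. Once packaged into a jar, a class like this would typically be launched with the hadoop jar command, passing the input and output paths as arguments.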

By the way, if you want to inspect your own counter while the MapReduce job is still running, you can check it on the YARN monitoring web page, under the 'Counter' sub-page of the specific task.



© 2014-2017 jason4zhu.blogspot.com All Rights Reserved 
If reposting, please credit the source: Jason4Zhu

1 comment:

  1. Hi Jashon,
    Thanks for such a nice explanation. Could you please help me out with the point below?
    Can we read a mapper's counter value in the reducer?

    Thanks,
    Chandra
