Monday, May 7, 2018

Troubleshooting a stuck or slow Spark job from the YARN monitoring page


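  • On the YARN-Stages tab, check how many executors are running the stage that is stuck or very slow. If there are only a few executors while the stage's shuffle read size or record count is large (Figure 1), the likely cause is that spark.dynamicAllocation.enabled is not turned on. Enable it with the following configuration: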

spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 1
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 300
spark.shuffle.service.enabled true
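  • If a slow or stuck stage has exactly 200 tasks (Figure 2), the task count most likely comes from spark.sql.shuffle.partitions, which defaults to 200; raise it to a larger value such as 2011. Likewise, if a stage has 12 tasks, the count probably comes from spark.default.parallelism.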

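  • Look at the Executors page. If the Task Time (GC Time) cell has a red background, GC is taking too long. To print GC logs, add set spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGC at startup, then read them from the stdout link at the end of each row in the executor list. For tuning, the Spark tuning guide suggests trying G1GC. If the problem persists with G1GC, too many tasks may be running concurrently in one executor (a task is itself an operator (a lambda), so its output can be much larger than its input). For example, with executor.memory at 12G and executor.cores at 4, four tasks run in parallel and each gets 3G of memory on average. Reducing the number of cores effectively raises the memory available to each task. In the case at hand, the GC log showed that the heap had already expanded dynamically to 12G, which means the tasks really do need a lot of memory, so the only option was to lower the number of cores to bring GC time down.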

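  • On the YARN-Jobs tab you can see the list of all stages, each with a Shuffle Read and a Shuffle Write value. The former is the amount of shuffle data read from the previous stage; the latter is the amount of shuffle data written out for the next stage. From these numbers you can roughly estimate how many tasks the current stage needs.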
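
The rough task-count estimate from Shuffle Read could be sketched as below. This is only an illustration: the function name and the 128 MiB target per task are assumptions, not values from the post or from Spark, and the right target depends on the workload.

```python
def estimate_shuffle_tasks(shuffle_read_bytes: int,
                           target_bytes_per_task: int = 128 * 1024**2) -> int:
    """Rough number of tasks so that each one reads about target_bytes_per_task.

    The 128 MiB default is an assumed rule of thumb, not a Spark setting.
    """
    if shuffle_read_bytes < 0 or target_bytes_per_task <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Integer ceiling division, with a floor of one task.
    return max(1, -(-shuffle_read_bytes // target_bytes_per_task))


if __name__ == "__main__":
    # A stage whose Shuffle Read column shows 100 GiB.
    print(estimate_shuffle_tasks(100 * 1024**3))
```

If the result is far above the 200 tasks the stage actually ran, that is a hint that spark.sql.shuffle.partitions is too low for the stage.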

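The memory-per-task arithmetic from the GC discussion (12G executor, 4 cores) can be written out as a small helper. It is a deliberately crude average, as in the post: it ignores memory that Spark and the JVM reserve for themselves, and the function name is made up for illustration.

```python
def avg_memory_per_task_gb(executor_memory_gb: float, executor_cores: int) -> float:
    """Average heap share per concurrently running task in one executor.

    Assumes one task per core and ignores reserved and overhead memory.
    """
    if executor_cores <= 0:
        raise ValueError("executor_cores must be positive")
    return executor_memory_gb / executor_cores


if __name__ == "__main__":
    # The example from the post: 12G and 4 cores.
    print(avg_memory_per_task_gb(12, 4))
    # Halving the cores doubles the average share per task.
    print(avg_memory_per_task_gb(12, 2))
```
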

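The dynamic allocation settings listed earlier can also be passed per job through spark-submit's --conf key=value flag instead of a defaults file. A minimal sketch that renders them as command-line arguments follows; the helper name and the your_job.py script name are placeholders, not part of the original setup.

```python
import shlex

# The settings from the post, as plain key/value pairs.
DYNAMIC_ALLOCATION_CONF = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.initialExecutors": "1",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "300",
    "spark.shuffle.service.enabled": "true",
}


def to_spark_submit_args(conf: dict) -> list:
    """Turn a settings dict into a flat list of --conf key=value arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # your_job.py is a hypothetical application script.
    command = ["spark-submit", *to_spark_submit_args(DYNAMIC_ALLOCATION_CONF), "your_job.py"]
    print(shlex.join(command))
```
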
