Summary of Spark Debugging Methods

1 Debugging with logs

Configure log4j.properties (show only warnings and above)

# log4j.rootCategory=INFO, console
log4j.rootCategory=WARN, console
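
To quiet only Spark's internal logging while keeping your own messages at INFO, log4j 1.x also accepts per-package levels. A minimal sketch (the package names below are the usual noisy ones and are my assumption, not part of the original config):

log4j.rootCategory=INFO, console
log4j.logger.org.apache.spark=WARN
log4j.logger.org.eclipse.jetty=WARN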

Use a logger in the program

import org.apache.log4j.Logger

object TestSpark extends App {
    val logger = Logger.getLogger("TestSpark")
    // With log4j.rootCategory=WARN, only the error message below is printed;
    // the debug and info messages are filtered out by the root level.
    logger.debug("This is a debug message.")
    logger.info("This is an info message.")
    logger.error("This is an error message.")
}
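
Log levels can also be switched at runtime rather than only through log4j.properties. A minimal sketch, assuming a local SparkContext (the app name and master URL here are illustrative, not from the original):

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object LogLevelDemo extends App {
    // Quiet Spark's own packages before the context starts logging.
    Logger.getLogger("org.apache.spark").setLevel(Level.WARN)

    val sc = new SparkContext(new SparkConf().setAppName("LogLevelDemo").setMaster("local[*]"))
    // SparkContext offers the same switch directly (Spark 1.4+).
    sc.setLogLevel("WARN")
    sc.stop()
}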

2 Breakpoint debugging

* Modify $SPARK_HOME/bin/spark-class (insert $JAVA_OPTS into the launcher invocation)

done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main $JAVA_OPTS "$@")
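
For reference, in Spark 1.x this is the closing line of the command-building loop near the end of spark-class; the only change from the stock script is the inserted $JAVA_OPTS. A sketch of the surrounding lines, which may differ slightly between Spark versions:

CMD=()
while IFS= read -d '' -r ARG; do
    CMD+=("$ARG")
done < <("$RUNNER" -cp "$LAUNCH_CLASSPATH" org.apache.spark.launcher.Main $JAVA_OPTS "$@")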

* Run the following on the master machine (it opens a TCP port for the debugger to attach to):

export JAVA_OPTS="$JAVA_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"

Here server=y makes the JVM listen for a debugger on port 5005, and suspend=y makes it pause at startup until one attaches.

* Configure the IDEA debug settings

Run -> Edit Configurations -> + -> Remote (set the host to the master machine and the port to 5005)

* Launch the job

spark-submit --master spark://localhost:7077 --class TestSpark --executor-memory 2g --num-executors 3 test.jar
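
If you would rather not edit spark-class, spark-submit's standard --driver-java-options flag can pass the same JDWP settings straight to the driver JVM. A sketch reusing the jar and class from above (note that --num-executors is a YARN option, so on a standalone master you would typically size the job with --total-executor-cores instead):

spark-submit --master spark://localhost:7077 --class TestSpark --driver-java-options "-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005" --executor-memory 2g test.jar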

* Start the Remote configuration in IDEA in debug mode. Because the JVM was started with suspend=y, the job stays blocked at the listen port until the debugger attaches, and only then does it proceed.




***************

The same JDWP approach works for debugging Flume:

1. Set the Flume startup options: open apache-flume-1.6.0-src\bin\flume-ng in a text editor and change the line

JAVA_OPTS="-Xmx20m"

to

JAVA_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5005"

2. Start the agent (a sketch of a typical launch command follows below), then attach from IDEA with the same Remote configuration (port 5005) used for Spark above.
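
With suspend=y in place, the Flume JVM stops and waits for the debugger as soon as it starts. A typical agent launch, assuming an agent named a1 defined in conf/example.conf (both names are illustrative):

bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1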
