How Spark reads and writes Parquet

SQLConf.scala

  // This is used to set the default data source
  val DEFAULT_DATA_SOURCE_NAME = buildConf("spark.sql.sources.default")
    .doc("The default data source to use in input/output.")
    .stringConf
    .createWithDefault("parquet")
  ...
  def defaultDataSourceName: String = getConf(DEFAULT_DATA_SOURCE_NAME)
  ...
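
Because defaultDataSourceName falls back to "parquet" unless the configuration is overridden, a read or write that never calls format() already goes through the Parquet data source. A minimal sketch of the equivalent forms (the path is a hypothetical example):

// Under the default configuration these two reads are equivalent.
val df1 = spark.read.load("/tmp/data")
val df2 = spark.read.format("parquet").load("/tmp/data")

// The default can be changed per session, e.g. to JSON:
spark.conf.set("spark.sql.sources.default", "json")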

Test case

// "parquet" is the short name registered for ParquetFileFormat; the bare
// class name alone will not resolve (the fully qualified name would).
val df = spark.read.format("parquet").load(parquetFile)
df.show()
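
To make the test self-contained, one can write a small DataFrame as Parquet first and then read it back; a minimal sketch, where the object name and output path are hypothetical:

import org.apache.spark.sql.SparkSession

object ParquetRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ParquetRoundTrip")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val parquetFile = "/tmp/parquet-demo"  // hypothetical output path

    // Write a small DataFrame as Parquet, then read it back.
    Seq((1, "a"), (2, "b")).toDF("id", "value")
      .write.mode("overwrite").parquet(parquetFile)

    spark.read.format("parquet").load(parquetFile).show()

    spark.stop()
  }
}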
Stepping through df.show() in a debugger shows the read eventually landing in FileScanRDD.
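
Even without a debugger, df.explain() confirms that the read is planned as a Parquet file scan; the output below is abbreviated, and the column list depends on the actual schema:

df.explain()
// == Physical Plan ==
// *(1) FileScan parquet [id#0,value#1] Batched: true, Format: Parquet, ...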

FileScanRDD.scala

  private def readCurrentFile(): Iterator[InternalRow] = {
    try {
      // readFunction maps the current PartitionedFile to an iterator of rows
      readFunction(currentFile)
    } catch {
      case e: FileNotFoundException =>
        throw new FileNotFoundException(e.getMessage + "\n" +
          "It is possible the underlying files have been updated. You can " +
          "explicitly invalidate the cache in Spark by running 'REFRESH TABLE " +
          "tableName' command in SQL or by recreating the Dataset/DataFrame involved.")
    }
  }
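
The rethrown message points at the usual remedy: when a cached file listing goes stale because the files changed underneath it, refresh the table or rebuild the DataFrame. A short sketch, where the table name "t" is hypothetical:

// Either invalidate the cached file listing via SQL:
spark.sql("REFRESH TABLE t")
// or recreate the DataFrame, which re-lists the files:
val fresh = spark.read.format("parquet").load(parquetFile)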
