Hive Serde 序列化与反序列化

Hive Serde 序列化与反序列化

hive Serde官方文档

RegEx 基于正则的匹配

CREATE TABLE apachelog (
  host STRING,
  identity STRING,
  user STRING,
  time STRING,
  request STRING,
  status STRING,
  size STRING,
  referer STRING,
  agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^]*) ([^]*) ([^]*) (-|\\[^\\]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?"
)
STORED AS TEXTFILE;

JSON数据结构

In some distributions, a reference to hive-hcatalog-core.jar is required.
ADD JAR /usr/lib/hive-hcatalog/lib/hive-hcatalog-core.jar;

从Hive 0.12版本开始
CREATE TABLE my_table(a string, b int)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;

Hive 3.0.0新的JsonSerDe,原来的JsonSerDe也可以用
CREATE TABLE my_table(a string, b int)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe'
STORED AS TEXTFILE;

Hive 4.0.0才支持JSONFILE
STORED AS JSONFILE

CSV/TSV格式

-- 所有的字段必须是string类型
CREATE TABLE my_table(a string, b string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = "\t",
   "quoteChar"     = "'",
   "escapeChar"    = "\\"
)  
STORED AS TEXTFILE;

你可能感兴趣的:(Hive系列,hive)