Impala Storage and Compression

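| File format | Compression codecs | Can Impala create the table directly? | Can Impala insert directly? |
|---|---|---|---|
| Parquet | Snappy (default), GZIP | Yes | Supported: CREATE TABLE, INSERT, queries |
| TextFile | LZO, gzip, bzip2, Snappy | Yes. A CREATE TABLE statement with no STORED AS clause defaults to uncompressed text | Supported: CREATE TABLE, INSERT, queries. With LZO compression, the table must be created and loaded in Hive |
| RCFile | Snappy, GZIP, deflate, BZIP2 | Yes | Supported: CREATE TABLE and queries. Data must be loaded in Hive |
| SequenceFile | Snappy, GZIP, deflate, BZIP2 | Yes | Supported: CREATE TABLE, INSERT, queries. INSERT requires a query option to be set first |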

Note: the Impala version used here does not support the ORC format.
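As a small illustration of the TextFile row, the sketch below creates a table with no STORED AS clause and then inspects it. The table name student_text is hypothetical and not part of the original walkthrough; the storage details reported by DESCRIBE FORMATTED should indicate a text format.

```sql
-- Hypothetical table, for illustration only.
-- No STORED AS clause, so Impala falls back to uncompressed text.
create table student_text(id int, name string);

-- Inspect the table metadata, including its input/output format.
describe formatted student_text;
```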

1. Create a table in Parquet format, insert data, and query it

[hadoop104:21000] > create table student2(id int, name string)
                  > row format delimited
                  > fields terminated by '\t'
                  > stored as PARQUET;

[hadoop104:21000] > insert into table student2 values(1001,'zhangsan');

[hadoop104:21000] > select * from student2;
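Parquet files written by Impala are Snappy-compressed by default, and GZIP is the other codec listed for this format. A minimal sketch of switching codecs for a single insert, assuming the COMPRESSION_CODEC query option is available in your Impala version; the row values are illustrative only.

```sql
-- Switch the codec used for Parquet files written in this session.
set COMPRESSION_CODEC=gzip;

-- Illustrative row; the data file written by this insert uses GZIP.
insert into table student2 values(1002,'lisi');

-- Restore the default codec for later inserts.
set COMPRESSION_CODEC=snappy;
```

The option only affects files written after it is set; existing data files keep the codec they were written with.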

2. Create a table in SequenceFile format; the insert initially fails with an error

[hadoop104:21000] > create table student3(id int, name string)
                  > row format delimited
                  > fields terminated by '\t'
                  > stored as sequenceFile;

[hadoop104:21000] > insert into table student3 values(1001,'zhangsan');

Query: insert into table student3 values(1001,'zhangsan')

Query submitted at: 2018-10-25 20:59:31 (Coordinator: http://hadoop104:25000)

Query progress can be monitored at: http://hadoop104:25000/query_plan?query_id=da4c59eb23481bdc:26f012ca00000000

WARNINGS: Writing to table format SEQUENCE_FILE is not supported. Use query option ALLOW_UNSUPPORTED_FORMATS to override.

[hadoop104:21000] > set ALLOW_UNSUPPORTED_FORMATS=true;

[hadoop104:21000] > insert into table student3 values(1001,'zhangsan');
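An alternative to overriding the option is to load the data through Hive, which is the route the table above prescribes for RCFile. The following is a sketch under assumptions: it presumes Hive shares the same metastore as Impala and that the Hive version supports INSERT ... VALUES; the row values are illustrative.

```sql
-- Run in Hive (not impala-shell): write a row into the SequenceFile table.
insert into table student3 values(1002,'lisi');

-- Run in impala-shell: pick up the data files Hive just wrote.
refresh student3;

-- The rows loaded through Hive should now be visible to Impala.
select * from student3;
```

Without the REFRESH, Impala may keep serving its cached file listing and miss the new rows.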

 
