Hive Programming (Part 1)

Database operations:

Creating a database:

create database if not exists hivedwh;

Creating a database with key-value properties:

create database hivedwh with dbproperties('owner'='itcast','data'='2021');

Listing databases:

show databases;

Showing detailed database information:

describe database extended hivedwh;

Switching to a database:

use hivedwh;

Modifying a database:

alter database hivedwh set dbproperties('createtime'='20210112');

Dropping a database:

drop database hivedwh;

Force-dropping a non-empty database (cascade also drops its tables):

drop database if exists hivedwh cascade;

Table operations:

Managed (internal) tables (dropping one deletes both the metadata and the data):

Creating a managed table:

create table table_1(id int,name string);

Inserting data into the table:

insert into table_1 values (1,"张三");

Creating a managed table with a specified field delimiter:

create table table_1(id int,name string) row format delimited fields terminated by '\t';

Creating a table at a specified HDFS location:

create table if not exists table_1(id int,name string) row format delimited fields terminated by '\t' location '/user/stu2';

Creating a table from a query result (CTAS):

create table if not exists table_1 as select * from stus;

Creating a new table with the schema of an existing table (structure only, no data):

create table table_1 like stu1;

Viewing detailed table information:

desc formatted table_1;

Dropping a table:

drop table table_1;

External tables (dropping one removes only the metadata; the data on HDFS is kept):

Creating an external table:

create external table if not exists table_1 (id int,name string) row format delimited fields terminated by '\t';
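
An external table typically points at an existing HDFS directory, so the data outlives the table definition. A minimal sketch, assuming a hypothetical directory '/hive/external/stu':

-- '/hive/external/stu' is a hypothetical HDFS path used for illustration
create external table if not exists table_2(id int,name string) row format delimited fields terminated by '\t' location '/hive/external/stu';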

Loading data:

load data local inpath "<local file path>" into table table_1;

Loading data and overwriting existing data:

load data local inpath "<local file path>" overwrite into table table_1;

Partitioned tables:

Creating a partitioned table:

create table if not exists table_1(id int,name string) partitioned by (month string) row format delimited fields terminated by '\t';

Creating a table with multiple partition columns:

create table if not exists table_1(id int,name string) partitioned by (month string,day string) row format delimited fields terminated by '\t';

Querying multiple partitions together with union all:

select * from table_1 where month = '2021' union all select * from table_1 where month = '2020';

Showing partitions:

show partitions table_1;

Adding a partition:

alter table table_1 add partition(month='2021');

Dropping a partition:

alter table table_1 drop partition(month='2021');
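
Loading data into a partitioned table requires naming the target partition explicitly. A minimal sketch, assuming a hypothetical local file '/path/to/data.txt':

-- '/path/to/data.txt' is a hypothetical path used for illustration
load data local inpath '/path/to/data.txt' into table table_1 partition(month='2021');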

Bucketed tables:

Enabling bucketing:

set hive.enforce.bucketing=true;

Setting the number of reducers:

set mapreduce.job.reduces=3;

Creating a bucketed table:

create table course (c_id string,c_name string,t_id string) clustered by(c_id) into 3 buckets row format delimited fields terminated by '\t';

To load data into a bucketed table, first create a regular staging table and load the file into it, then populate the bucketed table with insert overwrite ... select (load data alone does not distribute rows into buckets), as sketched below.
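
A minimal sketch of this workflow, assuming a hypothetical local file '/path/to/course.txt'; course_common is a staging table mirroring the bucketed table's schema:

-- staging table with the same columns as the bucketed table course
create table course_common (c_id string,c_name string,t_id string) row format delimited fields terminated by '\t';

-- '/path/to/course.txt' is a hypothetical path used for illustration
load data local inpath '/path/to/course.txt' into table course_common;

-- insert ... select lets Hive hash rows on c_id into the 3 buckets
insert overwrite table course select * from course_common cluster by(c_id);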
