Hive / Presto: Rows to Columns and Columns to Rows

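  • Rows to Columns
    • 1. Hive
    • 2. Presto
  • Columns to Rows
    • Hive
      • 1. split turns order_ids into an array; lateral view explode expands the array into rows
      • 2. explode + map
    • Presto
      • 1. split turns order_ids into an array; cross join unnest expands the array into rows
      • 2. explode + map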

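Rows to Columns

1. Hive:

collect_set gathers the values of each group into an array and removes duplicates; concat_ws then joins the array elements into a comma-separated string.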

select 
    fuid,
    concat_ws(',', collect_set(cast(fdeal_id as string))) as order_ids
from tmp.tmp_test
where dt = '2022-03-31'
    and event_type = 1
group by fuid
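
collect_set does not guarantee the order of the collected elements, so the resulting string can differ between runs. The sketch below is one way to make it deterministic by wrapping the array in sort_array. The inline rows are hypothetical sample data standing in for tmp.tmp_test, not values from the original table.

```sql
-- Hive: hypothetical inline rows replace tmp.tmp_test so the query is self-contained.
select
    fuid,
    -- sort_array fixes the element order; collect_set alone leaves it unspecified
    concat_ws(',', sort_array(collect_set(cast(fdeal_id as string)))) as order_ids
from (
    select 1 as fuid, 1001 as fdeal_id
    union all
    select 1 as fuid, 1002 as fdeal_id
    union all
    select 1 as fuid, 1001 as fdeal_id   -- duplicate, removed by collect_set
    union all
    select 2 as fuid, 2001 as fdeal_id
) t
group by fuid;
-- fuid = 1 gives '1001,1002'; fuid = 2 gives '2001'
```

Because the ids are cast to string before sorting, the order is lexicographic, which matters if the ids have different lengths.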

2. Presto:

array_agg gathers the values of each group into an array, array_distinct removes duplicates, and array_join joins the array elements into a comma-separated string.

select
    fuid,
    array_join(array_distinct(array_agg(cast(fdeal_id as varchar))), ',') as order_ids
from tmp.tmp_test
where dt = '2022-03-31'
    and event_type = 1
group by fuid
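
As a variation, Presto also accepts distinct inside array_agg, and array_sort can be added when a stable element order is wanted. This is a minimal sketch under the assumption that lexicographic ordering of the varchar ids is acceptable; the values clause holds hypothetical sample rows.

```sql
-- Presto: hypothetical inline rows replace tmp.tmp_test so the query is self-contained.
select
    fuid,
    -- distinct de-duplicates during aggregation; array_sort fixes the element order
    array_join(array_sort(array_agg(distinct cast(fdeal_id as varchar))), ',') as order_ids
from (
    values (1, 1001), (1, 1002), (1, 1001), (2, 2001)
) as t(fuid, fdeal_id)
group by fuid;
-- fuid = 1 gives '1001,1002'; fuid = 2 gives '2001'
```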

Columns to Rows

Hive

1. split turns order_ids into an array; lateral view explode expands the array into one row per element

select a.fuid
    , b.fdeal_id
from tmp.tmp_test a
lateral view explode(split(order_ids, ',')) b as fdeal_id
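
One thing to watch for: a plain lateral view drops source rows whose exploded array produces no rows, for example when order_ids is null. If those rows should be kept, Hive's lateral view outer is one option. The sketch below uses hypothetical inline rows to show the difference.

```sql
-- Hive: hypothetical inline rows; the second row has no order ids at all.
select a.fuid
    , b.fdeal_id
from (
    select 1 as fuid, '1001,1002' as order_ids
    union all
    select 2 as fuid, cast(null as string) as order_ids
) a
-- outer keeps fuid = 2 with a null fdeal_id instead of dropping the row
lateral view outer explode(split(a.order_ids, ',')) b as fdeal_id;
-- fuid = 1 yields two rows (1001 and 1002); fuid = 2 yields one row with fdeal_id = null
```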

2. explode + map

select  model_code,
        item_code,
        refer_enum,
        busi_cnt
from
(select model_code,
       item_code,
       count(distinct if(item_value >= 2 and item_value <= 5,business_id,null)) as cnt2,
       count(distinct if(item_value >= 6 and item_value <= 9,business_id,null)) as cnt3,
       count(distinct if(item_value >= 10 and item_value <= 12,business_id,null)) as cnt4
from tmp.tmp_test
where dt = '2021-05-24'
group by model_code,
         item_code) a
lateral view explode(map('2-5', cnt2,
                         '6-9', cnt3,
                         '10-12', cnt4)) b as refer_enum, busi_cnt
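
To see the shape of the result in isolation, here is a small sketch that applies the same explode(map(...)) step to a single hypothetical pre-aggregated row. The literal counts are made up for illustration.

```sql
-- Hive: one hypothetical aggregated row with three count columns.
select a.model_code
    , b.refer_enum
    , b.busi_cnt
from (
    select 'm1' as model_code, 3 as cnt2, 5 as cnt3, 0 as cnt4
) a
-- each map entry becomes its own row: the key lands in refer_enum, the value in busi_cnt
lateral view explode(map('2-5', a.cnt2,
                         '6-9', a.cnt3,
                         '10-12', a.cnt4)) b as refer_enum, busi_cnt;
-- the single input row becomes three rows, one per range label
```

All map values must share one type, so counts of different types would need a cast first.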

Presto

1. split turns order_ids into an array; cross join unnest expands the array into one row per element

select a.fuid
    , b.fdeal_id
from tmp.tmp_test a
cross join unnest(split(order_ids, ',')) as b(fdeal_id) 
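
If the position of each id within the original string matters, unnest can also emit an index column through with ordinality. A minimal sketch, with a hypothetical inline row and pos as an illustrative column name:

```sql
-- Presto: hypothetical inline row replaces tmp.tmp_test.
select a.fuid
    , b.fdeal_id
    , b.pos
from (
    values (1, '1001,1002,1003')
) as a(fuid, order_ids)
-- with ordinality appends a 1-based position column after the unnested value
cross join unnest(split(a.order_ids, ',')) with ordinality as b(fdeal_id, pos);
-- yields three rows: ('1001', 1), ('1002', 2), ('1003', 3)
```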

2. explode + map (here, unnest over two parallel arrays: one of label names, one of label values)

select
    t1.fuid,
    t2.label_name,
    t2.label_value
from (
        -- cast the non-varchar columns so every element of the value array has the same type
        select t1.fuid,
               cast(t1.bus_type as varchar) bus_type,
               t1.dept_code,
               t1.dept_name,
               cast(t1.black_gold as varchar) black_gold,
               cast(t1.chat_tag as varchar) chat_tag
          from tmp.tmp_test t1
         where t1.dt = '2021-06-30'
       ) t1
 cross join unnest (
  array['bus_type', 'dept_code', 'dept_name', 'black_gold', 'chat_tag'],
  array[bus_type, dept_code, dept_name, black_gold, chat_tag]
 ) t2 (label_name, label_value)
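
Presto's unnest also accepts a map directly and returns its keys and values as two columns, which is closer to the Hive explode(map(...)) form. The following is a sketch with a hypothetical inline row and only two label columns, not a replacement for the query above.

```sql
-- Presto: hypothetical inline row with two varchar attributes.
select t1.fuid
    , t2.label_name
    , t2.label_value
from (
    values (1, 'A01', 'Sales')
) as t1(fuid, dept_code, dept_name)
-- map(keys, values) pairs each label with its value; unnest emits one row per entry
cross join unnest(
    map(array['dept_code', 'dept_name'], array[t1.dept_code, t1.dept_name])
) as t2(label_name, label_value);
-- the single input row becomes two rows, one per label
```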
