HiveSQL: consecutive-active-user queries

1. Find the consecutively active users, their number of consecutive active days, and the average age of those users

Source table (tmp):

uid  age dt
0001 18 2021-02-25
0002 22 2021-02-25
0002 22 2021-02-26
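
To try the queries in this post, the sample rows can be loaded into a scratch table. This is only a minimal sketch: the table name tmp and the column names uid, age and dt are the ones the queries use, while the column types are an assumption.

```sql
-- Assumed schema: the types are a guess; dt is kept as a 'yyyy-MM-dd' string
create table if not exists tmp (
  uid string,
  age int,
  dt  string
);

-- The three sample rows listed above
insert into table tmp values
  ('0001', 18, '2021-02-25'),
  ('0002', 22, '2021-02-25'),
  ('0002', 22, '2021-02-26');
```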

Query:

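-- o: users whose streak includes 2021-02-26 and lasts at least 2 days
with o as (
  select
    uid,
    max(days) as days,
    max(age) as age
  from
  (
    select
      uid,
      age,
      dt,
      -- rows of one streak share the same (dt - rn) value, so counting
      -- rows per (uid, dt - rn) gives the streak length in days
      sum(1) over(partition by uid, date_sub(dt, rn)) as days
    from
    (
      -- rn is computed in its own subquery because standard SQL does not
      -- allow one window function to be nested inside another
      select
        uid,
        age,
        dt,
        row_number() over(partition by uid order by dt) as rn
      from tmp
    ) t
  ) tt
  where days >= 2 and dt = '2021-02-26'
  group by uid
)

select
  uid,
  days,
  avg_age
from o
-- the subquery returns a single row, so a cross join attaches it to every user
cross join
(
  select
    round(avg(age)) as avg_age
  from o
) a
;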
+-------+-------+----------+--+
| uid   | days  | avg_age  |
+-------+-------+----------+--+
| 0002  | 2     | 22       |
+-------+-------+----------+--+
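
The streak logic rests on one observation: within a run of consecutive days, dt and row_number() both grow by one per row, so their difference stays constant. The sketch below only exposes that intermediate key for the sample rows; the aliases rn and grp are illustrative names, not something the original query defines.

```sql
select
  uid,
  dt,
  rn,
  date_sub(dt, rn) as grp   -- constant within one run of consecutive days
from
(
  select
    uid,
    dt,
    row_number() over(partition by uid order by dt) as rn
  from tmp
) t
order by uid, dt;
-- For the three sample rows:
-- 0001  2021-02-25  1  2021-02-24
-- 0002  2021-02-25  1  2021-02-24
-- 0002  2021-02-26  2  2021-02-24
```

User 0002 has two rows sharing grp 2021-02-24, which is why days comes out as 2. User 0001 has a single row and no activity on 2021-02-26, so it is dropped by the filter.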

2. Average age of the consecutively active users

Query:

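select
  round(avg(age)) as `consecutive_active_avg_age`
from
(
  -- one row per user whose streak includes 2021-02-26 and lasts at least 2 days
  select
    uid,
    max(age) as age
  from
  (
    select
      uid,
      age,
      dt,
      -- rows of one streak share the same (dt - rn) value
      sum(1) over(partition by uid, date_sub(dt, rn)) as days
    from
    (
      -- rn is computed separately to avoid nesting window functions
      select
        uid,
        age,
        dt,
        row_number() over(partition by uid order by dt) as rn
      from tmp
    ) t
  ) tt
  where days >= 2 and dt = '2021-02-26'
  group by uid
) u   -- renamed from tmp so the alias no longer shadows the source table
;
+-----------------------------+--+
| consecutive_active_avg_age  |
+-----------------------------+--+
| 22                          |
+-----------------------------+--+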
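
Both queries count rows, so they implicitly assume at most one row per user per day. If tmp could hold several rows for the same user and day, a repeated dt would still get a new row_number(), which shifts the grouping key and breaks the streak. One possible guard, sketched here under that assumption, is to deduplicate before numbering the rows; the alias d is a hypothetical name.

```sql
select
  uid,
  age,
  dt,
  sum(1) over(partition by uid, date_sub(dt, rn)) as days
from
(
  select
    uid,
    age,
    dt,
    row_number() over(partition by uid order by dt) as rn
  from
  (
    -- keep one row per user per day; max(age) is an arbitrary tie-break
    select
      uid,
      max(age) as age,
      dt
    from tmp
    group by uid, dt
  ) d
) t;
```

On the sample data this returns the same days values as before, since the rows are already unique per user and day.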
