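In the internet industry, a user who starts using an app during some period and is still using it after a given interval is counted as a retained user.
The retention rate is the share of that period's new users who are retained; it is tracked at fixed intervals of one time unit (day, week, or month).
Retained users and the retention rate reflect how well a product's service keeps users coming back.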
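Next-day retention rate: (users who registered on a given day and logged in again 1 day after registering) / (total new users on that day);
3-day retention rate: (users who registered on a given day and logged in again 3 days after registering) / (total new users on that day);
7-day retention rate: (users who registered on a given day and logged in again 7 days after registering) / (total new users on that day);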
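The day counts in these definitions are calendar-day differences, not 24-hour windows. A minimal sketch of how MySQL's DATEDIFF behaves, using literal timestamps chosen only for illustration (the alias day_offset is a hypothetical name):

```sql
-- DATEDIFF(a, b) compares only the date parts of a and b and returns a - b in days.
-- Registered at 23:57, logged in about 12 hours later on the next calendar day:
select datediff('2020-01-02 11:45:00', '2020-01-01 23:57:00') as day_offset;  -- 1
-- Registered at 00:04, logged in almost 24 hours later but on the same calendar day:
select datediff('2020-01-01 23:59:00', '2020-01-01 00:04:00') as day_offset;  -- 0
```

So a user who registers late at night and logs in the following morning already counts toward next-day retention.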
-- User registration table
create table user_info(user_id varchar(10) primary key,reg_time datetime);
insert into user_info values
('u_01','2020-01-01 09:15:00'),
('u_02','2020-01-01 00:04:00'),
('u_03','2020-01-01 22:16:00'),
('u_04','2020-01-01 20:32:00'),
('u_05','2020-01-01 13:59:00'),
('u_06','2020-01-01 21:28:00'),
('u_07','2020-01-01 14:03:00'),
('u_08','2020-01-01 11:00:00'),
('u_09','2020-01-01 23:57:00'),
('u_10','2020-01-01 04:46:00'),
('u_11','2020-01-02 14:21:00'),
('u_12','2020-01-02 11:15:00'),
('u_13','2020-01-02 07:26:00'),
('u_14','2020-01-02 10:34:00'),
('u_15','2020-01-02 08:22:00'),
('u_16','2020-01-02 14:23:00'),
('u_17','2020-01-03 09:20:00'),
('u_18','2020-01-03 11:21:00'),
('u_19','2020-01-03 12:17:00'),
('u_20','2020-01-03 15:26:00');
-- Login log table
create table login_log(user_id varchar(10),login_time datetime,primary key(user_id,login_time));
insert into login_log values
('u_02','2020-01-02 00:14:00'),
('u_10','2020-01-02 08:32:00'),
('u_03','2020-01-02 09:20:00'),
('u_08','2020-01-02 10:07:00'),
('u_04','2020-01-02 10:29:00'),
('u_09','2020-01-02 11:45:00'),
('u_05','2020-01-02 12:19:00'),
('u_01','2020-01-02 14:29:00'),
('u_15','2020-01-03 00:26:00'),
('u_14','2020-01-03 11:18:00'),
('u_11','2020-01-03 13:18:00'),
('u_16','2020-01-03 14:33:00'),
('u_06','2020-01-04 07:51:00'),
('u_18','2020-01-04 08:11:00'),
('u_07','2020-01-04 09:27:00'),
('u_10','2020-01-04 10:59:00'),
('u_20','2020-01-04 11:51:00'),
('u_03','2020-01-04 12:37:00'),
('u_17','2020-01-04 15:07:00'),
('u_08','2020-01-04 16:35:00'),
('u_01','2020-01-04 19:29:00'),
('u_14','2020-01-05 08:03:00'),
('u_12','2020-01-05 10:27:00'),
('u_15','2020-01-05 16:33:00'),
('u_19','2020-01-06 09:03:00'),
('u_20','2020-01-06 15:26:00'),
('u_04','2020-01-08 11:03:00'),
('u_05','2020-01-08 12:54:00'),
('u_06','2020-01-08 19:22:00'),
('u_13','2020-01-09 10:20:00'),
('u_15','2020-01-09 16:40:00'),
('u_18','2020-01-10 21:34:00');
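-- Daily new users; next-day, 3-day, and 7-day retained users and retention rates.
-- count(distinct case ...) counts each user once, even if they log in several times on the same day.
select
    date(reg_time) dt,
    count(distinct user_info.user_id) 新增用户数,  -- new users
    count(distinct case when datediff(login_time,reg_time)=1 then user_info.user_id end) 次日留存用户数,  -- next-day retained users
    count(distinct case when datediff(login_time,reg_time)=3 then user_info.user_id end) 三日留存用户数,  -- 3-day retained users
    count(distinct case when datediff(login_time,reg_time)=7 then user_info.user_id end) 七日留存用户数,  -- 7-day retained users
    count(distinct case when datediff(login_time,reg_time)=1 then user_info.user_id end)/count(distinct user_info.user_id) 次日留存率,  -- next-day retention rate
    count(distinct case when datediff(login_time,reg_time)=3 then user_info.user_id end)/count(distinct user_info.user_id) 三日留存率,  -- 3-day retention rate
    count(distinct case when datediff(login_time,reg_time)=7 then user_info.user_id end)/count(distinct user_info.user_id) 七日留存率  -- 7-day retention rate
from user_info left join login_log on user_info.user_id=login_log.user_id
group by date(reg_time);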
+------------+------------+----------------+----------------+----------------+------------+------------+------------+
| dt         | 新增用户数 | 次日留存用户数 | 三日留存用户数 | 七日留存用户数 | 次日留存率 | 三日留存率 | 七日留存率 |
+------------+------------+----------------+----------------+----------------+------------+------------+------------+
| 2020-01-01 |         10 |              8 |              6 |              3 |     0.8000 |     0.6000 |     0.3000 |
| 2020-01-02 |          6 |              4 |              3 |              2 |     0.6667 |     0.5000 |     0.3333 |
| 2020-01-03 |          4 |              3 |              2 |              1 |     0.7500 |     0.5000 |     0.2500 |
+------------+------------+----------------+----------------+----------------+------------+------------+------------+
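The same counts can also be laid out one row per registration date and day offset, which may be easier to extend to other intervals. This is only a sketch against the user_info and login_log tables above; the aliases u, l, day_offset, and retained_users are hypothetical names introduced here.

```sql
-- Long-format retention: one row per (registration date, day offset).
-- Offsets are calendar-day differences between login and registration.
select
    date(u.reg_time) as dt,
    datediff(l.login_time, u.reg_time) as day_offset,
    count(distinct u.user_id) as retained_users  -- each user counted once per offset
from user_info u
join login_log l on l.user_id = u.user_id
where datediff(l.login_time, u.reg_time) in (1, 3, 7)
group by date(u.reg_time), datediff(l.login_time, u.reg_time)
order by dt, day_offset;
```

On the sample data this should reproduce the retained-user counts in the table above. Adding another interval only requires extending the in (...) list, at the cost of computing the rates in a separate step.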