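These problems come from Nowcoder (牛客网). I'm writing notes as I work through them to reinforce what I learn, both for my own review and to share. The basics set is an introductory tutorial, so I won't explain the really simple ones. Take it step by step, no rush, and keep an apprentice's mindset. (I've borrowed from many people's solutions and links; I'm only passing the knowledge along, and will remove anything on request.)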
~
A few ground rules for beginners
Punctuation in SQL must be English (half-width), never Chinese (full-width)!
~
Description
Problem: the operations team wants the gender, age and university that go with each user's device id. Retrieve that data.
select device_id, gender, age, university
from user_profile
select *
from user_profile
or
select id, device_id, gender, age, university, province
from user_profile
select DISTINCT university
from user_profile
The distinct keyword filters out duplicate records and keeps only one of each; think of it as a filter.
select device_id
from user_profile limit 2
Use limit to cap the number of rows the query returns.
select device_id as user_infos_example
from user_profile limit 2
Same as the previous problem, except the column is renamed with as.
select device_id, age
from user_profile
order by age ASC
order by column_name sorts by that column; asc/desc mean ascending/descending, so use whichever the problem asks for (asc is the default when omitted).
SELECT device_id,gpa,age from user_profile order by gpa,age;
SELECT device_id,gpa,age from user_profile order by gpa,age asc;
SELECT device_id,gpa,age from user_profile order by gpa asc,age asc;
select device_id, university
from user_profile
where university='北京大学'
where is followed by the filter condition.
select device_id, gender, age, university
from user_profile
where age > 24
select device_id, gender, age
from user_profile
where age >= 20 and age <= 23
and combines the two conditions.
select device_id, gender, age
from user_profile
where age BETWEEN 20 and 23
between ... and ... gives the same result, since both endpoints are inclusive.
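As a quick sanity check, the sketch below runs both filters against a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the two-column table is a simplified assumption, not the real user_profile schema.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, 19), (2, 20), (3, 22), (4, 23), (5, 24)],
)

def ids(where):
    """Return the device ids matching a WHERE clause, in id order."""
    sql = f"SELECT device_id FROM user_profile WHERE {where} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql)]

between_ids = ids("age BETWEEN 20 AND 23")
range_ids = ids("age >= 20 AND age <= 23")
print(between_ids, range_ids)
```

Both lists should come out identical, with the boundary ages 20 and 23 included.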
select device_id, gender, age, university
from user_profile
where university !='复旦大学'
!= means 'not equal'.
You can also write it like this:
select device_id, gender, age, university
from user_profile
where university not in ('复旦大学')
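column_name [NOT] IN (constant1, constant2, ..., constantn)
IN: true when the column value equals one of the constants in the list, so the row matches the query condition.
NOT IN: false when the column value equals any of the constants, so the row does not match the query condition.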
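One thing the rule above does not cover is a null column value: it matches neither IN nor NOT IN. Here is a small sketch of that, assuming Python's built-in sqlite3 as a stand-in for MySQL and a made-up two-column table.

```python
import sqlite3

# Hypothetical rows; device 3 has no university recorded (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, "复旦大学"), (2, "北京大学"), (3, None)],
)

def ids(where):
    """Return the device ids matching a WHERE clause, in id order."""
    sql = f"SELECT device_id FROM user_profile WHERE {where} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql)]

in_ids = ids("university IN ('复旦大学')")
not_in_ids = ids("university NOT IN ('复旦大学')")
print(in_ids, not_in_ids)  # device 3 shows up in neither list
```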
select device_id, gender, age, university
from user_profile
where age != ''
select device_id, gender, age, university
from user_profile
where age is not NULL
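This problem tests null values in the database; in MySQL that is null. Prefer the second form: is not NULL states the intent directly, while age != '' only happens to drop the null rows because any comparison with null evaluates to null rather than true.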
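To see why null needs its own operator, here is a minimal sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL (both follow the same rule for comparisons with null), and the rows are made up.

```python
import sqlite3

# Hypothetical rows; device 2 has no age recorded (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, 21), (2, None), (3, 25)],
)

def ids(where):
    """Return the device ids matching a WHERE clause, in id order."""
    sql = f"SELECT device_id FROM user_profile WHERE {where} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql)]

equals_null = ids("age = NULL")        # comparison yields NULL, never true
not_equals_null = ids("age != NULL")   # same here
is_not_null = ids("age IS NOT NULL")   # the dedicated null test
print(equals_null, not_equals_null, is_not_null)
```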
select device_id,gender,age,university,gpa from user_profile
where university in('北京大学','复旦大学','山东大学');
SELECT device_id, gender, age, university, gpa
FROM user_profile
WHERE (gpa > 3.5 AND university = '山东大学') or (gpa > 3.8 AND university = '复旦大学')
The two conditions are in an 'or' relationship, so join them with or. The parentheses are optional, because and binds more tightly than or.
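The precedence rule can be checked with a small sketch. It assumes Python's built-in sqlite3 as a stand-in for MySQL, with made-up rows; the third query is my own addition to show that grouping the conditions differently does change the result.

```python
import sqlite3

# Hypothetical rows: (device_id, gpa, university).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, gpa REAL, university TEXT)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, 3.6, "山东大学"), (2, 3.6, "复旦大学"), (3, 3.9, "复旦大学"), (4, 3.4, "山东大学")],
)

def ids(where):
    """Return the device ids matching a WHERE clause, in id order."""
    sql = f"SELECT device_id FROM user_profile WHERE {where} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql)]

with_parens = ids(
    "(gpa > 3.5 AND university = '山东大学') OR (gpa > 3.8 AND university = '复旦大学')"
)
without_parens = ids(
    "gpa > 3.5 AND university = '山东大学' OR gpa > 3.8 AND university = '复旦大学'"
)
# Parentheses placed around the OR instead give a different answer.
regrouped = ids(
    "gpa > 3.5 AND (university = '山东大学' OR gpa > 3.8) AND university = '复旦大学'"
)
print(with_parens, without_parens, regrouped)
```

Even though the parentheses are optional, keeping them makes the intent obvious to the next reader.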
select device_id, age, university
FROM user_profile
WHERE university like '%北京%'
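This one involves SQL wildcards and the like operator. In a like pattern, % matches any sequence of characters (including an empty one) and _ matches exactly one character.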
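Here is a small sketch of the two like wildcards, % and _. It assumes Python's built-in sqlite3 as a stand-in for MySQL, and the university names are made-up sample data.

```python
import sqlite3

# Hypothetical rows: (device_id, university).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, "北京大学"), (2, "北京师范大学"), (3, "复旦大学"), (4, "南京大学")],
)

def ids(where):
    """Return the device ids matching a WHERE clause, in id order."""
    sql = f"SELECT device_id FROM user_profile WHERE {where} ORDER BY device_id"
    return [row[0] for row in conn.execute(sql)]

# % matches any run of characters, so this finds 北京 anywhere in the name.
contains_beijing = ids("university LIKE '%北京%'")
# Each _ matches exactly one character: 北京 followed by exactly two more.
beijing_plus_two = ids("university LIKE '北京__'")
print(contains_beijing, beijing_plus_two)
```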
select gpa
from user_profile
WHERE university='复旦大学'
ORDER by gpa DESC LIMIT 1
select max(gpa) as gpa
from user_profile
where university='复旦大学'
Two approaches: sort the 复旦大学 gpa values in descending order and limit the output to one row, or simply use max().
select COUNT(1) as male_num, avg(gpa) as avg_gpa
from user_profile
where gender='male'
The query above is my own. count(1) here adds one for every row, so it returns the number of rows.
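The difference between counting rows and counting a column only shows up when the column contains null. A minimal sketch, assuming Python's built-in sqlite3 as a stand-in for MySQL and made-up rows where one male user has no gpa:

```python
import sqlite3

# Hypothetical rows: (device_id, gender, gpa); device 2 has a NULL gpa.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, gender TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "male", 3.0), (2, "male", None), (3, "male", 4.0), (4, "female", 3.5)],
)

row_count, gpa_count, avg_gpa = conn.execute(
    "SELECT COUNT(1), COUNT(gpa), AVG(gpa) FROM user_profile WHERE gender = 'male'"
).fetchone()

# COUNT(1) counts every row; COUNT(gpa) and AVG(gpa) skip the NULL.
print(row_count, gpa_count, avg_gpa)
```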
SELECT gender, university, COUNT(device_id) user_num,
avg(active_days_within_30) avg_active_day, avg(question_cnt) avg_question_cnt
from user_profile
GROUP BY gender, university
Apply the appropriate function to each column, and group as the problem requires.
SELECT university, avg(question_cnt) avg_question_cnt, avg(answer_cnt) avg_answer_cnt
from user_profile
GROUP BY university
HAVING avg_question_cnt < 5 OR avg_answer_cnt < 20
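Note that the grouping column is university and what we need is the average, avg.
having exists because the where keyword cannot be used with aggregate functions. having comes after group by and filters the grouped data.
Just remember: when you need to filter on an aggregate, use having.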
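The sketch below shows both halves of that rule: having filters the groups, and the same aggregate inside where is rejected. It assumes Python's built-in sqlite3 as a stand-in for MySQL (the exact error message differs between engines), and the rows are made up.

```python
import sqlite3

# Hypothetical rows: (university, question_cnt).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (university TEXT, question_cnt INTEGER)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [("山东大学", 2), ("山东大学", 4), ("复旦大学", 10), ("复旦大学", 20)],
)

# HAVING runs after GROUP BY, so it can see the per-group average.
having_rows = conn.execute(
    "SELECT university, AVG(question_cnt) FROM user_profile "
    "GROUP BY university HAVING AVG(question_cnt) < 5"
).fetchall()

# WHERE runs before grouping, so an aggregate there is an error.
try:
    conn.execute(
        "SELECT university FROM user_profile "
        "WHERE AVG(question_cnt) < 5 GROUP BY university"
    )
    where_rejected = False
except sqlite3.OperationalError:
    where_rejected = True

print(having_rows, where_rejected)
```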
SELECT university, avg(question_cnt) avg_question_cnt
from user_profile
GROUP BY university
ORDER BY avg_question_cnt
This tests combined use of the basics.
JOIN notes -> link
SELECT q.device_id, q.question_id, q.result
FROM question_practice_detail q INNER JOIN user_profile u
ON q.device_id=u.device_id
WHERE u.university='浙江大学'
SELECT u.university, COUNT(q.question_id) / COUNT(DISTINCT q.device_id) avg_answer_cnt
FROM user_profile u INNER JOIN question_practice_detail q
on u.device_id=q.device_id
GROUP BY u.university
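Difficulty: I didn't think of deduplicating, or of the formula.
Approach:
Three tables are given and the answer needs columns from each, so join them together.
select the required columns. Going by earlier problems, and on a careful look, the data contains duplicates, so deduplicate.
Work out the formula: number of answers ÷ number of users who answered.
I got lazy and left out round().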
SELECT
university,
difficult_level,
COUNT(t1.question_id) / COUNT(DISTINCT (t2.device_id)) avg_answer_cnt
FROM
question_detail t1
join question_practice_detail t2 ON t1.question_id=t2.question_id
join user_profile t3 ON t3.device_id=t2.device_id
GROUP BY university, difficult_level
Just add the 山东大学 condition to the previous problem.
SELECT
t1.university,
t3.difficult_level,
COUNT(t2.question_id)/COUNT(distinct(t2.device_id)) as avg_answer_cnt
FROM
user_profile t1,
question_practice_detail t2,
question_detail t3
WHERE
t1.device_id = t2.device_id
and
t2.question_id = t3.question_id
and
t1.university = '山东大学'
GROUP BY t1.university,t3.difficult_level;
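I got this wrong at first: using or directly deduplicates, because a row that satisfies both conditions is returned only once.
# wrong attempt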
SELECT device_id, gender, age, gpa
FROM user_profile
WHERE university='山东大学' OR gender='male'
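So we need a new keyword, union. Plain union removes duplicates as well, so use union all to keep them.
Link to the original union diagram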
SELECT device_id,gender,age,gpa FROM user_profile
WHERE university='山东大学'
UNION ALL
SELECT device_id,gender,age,gpa FROM user_profile
WHERE gender='male'
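To make the difference concrete, here is a sketch that runs or, union and union all over the same made-up rows. It assumes Python's built-in sqlite3 as a stand-in for MySQL and a trimmed-down table.

```python
import sqlite3

# Hypothetical rows; device 1 is both at 山东大学 and male.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "山东大学", "male"), (2, "山东大学", "female"),
     (3, "复旦大学", "male"), (4, "复旦大学", "female")],
)

def ids(sql):
    """Run a query and return the sorted device ids it yields."""
    return sorted(row[0] for row in conn.execute(sql))

by_university = "SELECT device_id FROM user_profile WHERE university = '山东大学'"
by_gender = "SELECT device_id FROM user_profile WHERE gender = 'male'"

or_ids = ids(by_university + " OR gender = 'male'")
union_ids = ids(by_university + " UNION " + by_gender)
union_all_ids = ids(by_university + " UNION ALL " + by_gender)
print(or_ids, union_ids, union_all_ids)
```

Only union all returns device 1 twice, once for each condition it satisfies.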
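Link to the borrowed solution
MySQL offers several ways to branch on a condition, including the IF(), IFNULL() and NULLIF() functions and the CASE expression.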
SELECT IF(age >= 25, '25岁及以上','25岁以下') age_cut, COUNT(*) number
FROM user_profile
GROUP BY age_cut
SELECT CASE WHEN age < 25 OR age IS NULL THEN '25岁以下'
WHEN age >= 25 THEN '25岁及以上'
END age_cut,COUNT(*) number
FROM user_profile
GROUP BY age_cut
Link to notes on the case expression
SELECT
device_id,
gender,
CASE
WHEN age < 20 THEN '20岁以下'
WHEN age >= 20 AND age <= 24 THEN '20-24岁'
WHEN age >= 25 THEN '25岁及以上'
WHEN age is NULL THEN '其他'
END age_cut
FROM user_profile
Don't forget the end keyword.
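A case expression checks its when branches from top to bottom and returns the first one that is true; a null age fails every comparison, so it needs either an explicit is NULL branch or an else. The sketch below uses else, which is my own variation on the query above, and assumes Python's built-in sqlite3 as a stand-in for MySQL with made-up rows.

```python
import sqlite3

# Hypothetical rows; device 4 has no age recorded (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, age INTEGER)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?)",
    [(1, 18), (2, 22), (3, 30), (4, None)],
)

age_cuts = conn.execute(
    """
    SELECT device_id,
           CASE
               WHEN age < 20 THEN '20岁以下'
               WHEN age >= 20 AND age <= 24 THEN '20-24岁'
               WHEN age >= 25 THEN '25岁及以上'
               ELSE '其他'  -- reached only when every comparison above is not true
           END AS age_cut
    FROM user_profile
    ORDER BY device_id
    """
).fetchall()
print(age_cuts)
```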
SELECT DAY(date) day,
COUNT(question_id) question_cnt
FROM question_practice_detail
WHERE YEAR(date)=2021 and MONTH(date)=8
GROUP BY day;
Second approach: solution link
SELECT DAY(date) as day,
COUNT(question_id) AS question_cnt
FROM question_practice_detail
WHERE SUBSTR(date,1,7)='2021-08'
group by day
Third approach: solution link
Fuzzy matching with the like keyword
SELECT DAY(date) day,
COUNT(question_id) question_cnt
FROM question_practice_detail
WHERE date like '%2021-08%'
GROUP BY day;
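where date like '%-08-%' also works on this data, and there are many other ways to write it. Note that it matches August of any year, and that like with a leading % cannot use an index on date, so it tends to be the more expensive option.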
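Yet another way to write the filter is a plain date range, which avoids both the wildcard and the function call on the column. The sketch below compares it with the substring filter and the looser like pattern. It assumes Python's built-in sqlite3 as a stand-in for MySQL, stores dates as 'YYYY-MM-DD' text, and uses made-up rows, including one from August 2020.

```python
import sqlite3

# Hypothetical rows: (question_id, date as 'YYYY-MM-DD' text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE question_practice_detail (question_id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO question_practice_detail VALUES (?, ?)",
    [(1, "2021-07-31"), (2, "2021-08-01"), (3, "2021-08-15"),
     (4, "2021-09-01"), (5, "2020-08-15")],
)

def count(where):
    """Count the rows matching a WHERE clause."""
    sql = f"SELECT COUNT(*) FROM question_practice_detail WHERE {where}"
    return conn.execute(sql).fetchone()[0]

substr_cnt = count("substr(date, 1, 7) = '2021-08'")
range_cnt = count("date >= '2021-08-01' AND date < '2021-09-01'")
loose_like_cnt = count("date LIKE '%-08-%'")  # also picks up 2020-08-15
print(substr_cnt, range_cnt, loose_like_cnt)
```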
More on date functions: link
SELECT
COUNT(q2.device_id) / COUNT(q1.device_id) AS avg_ret
FROM
(SELECT DISTINCT device_id, date FROM question_practice_detail)as q1
LEFT JOIN
(SELECT DISTINCT device_id, date FROM question_practice_detail) AS q2
ON q1.device_id = q2.device_id AND q2.date = DATE_ADD(q1.date, interval 1 day)
A small pitfall: when using like I forgot the ',' comma. '%male%' matches both female and male.
SELECT IF(profile LIKE '%,male%', 'male', 'female') gender, COUNT(*) number
FROM user_submit
GROUP BY gender
You can also fuzzy-match on female directly.
SELECT IF(profile LIKE '%female%', 'female', 'male') gender, COUNT(*) number
FROM user_submit
GROUP BY gender
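The pitfall is easy to reproduce. This sketch counts how many rows each pattern matches, assuming Python's built-in sqlite3 as a stand-in for MySQL; the profile strings are made up in the comma-separated shape the queries above expect.

```python
import sqlite3

# Hypothetical rows: two male profiles and one female profile.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_submit (device_id INTEGER, profile TEXT)")
conn.executemany(
    "INSERT INTO user_submit VALUES (?, ?)",
    [(1, "180cm,75kg,27,male"), (2, "165cm,45kg,26,female"), (3, "178cm,65kg,25,male")],
)

def count(pattern):
    """Count the profiles matching a LIKE pattern."""
    sql = "SELECT COUNT(*) FROM user_submit WHERE profile LIKE ?"
    return conn.execute(sql, (pattern,)).fetchone()[0]

loose_cnt = count("%male%")     # 'female' contains 'male', so every row matches
male_cnt = count("%,male%")     # the comma pins the match to the whole field
female_cnt = count("%female%")
print(loose_cnt, male_cnt, female_cnt)
```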
Another solution uses SUBSTRING_INDEX to slice the string. Click -> solution link
SELECT SUBSTRING_INDEX(profile,",",-1) gender,COUNT(*) number
FROM user_submit
GROUP BY gender;
The second solution uses comparatively more memory.
SELECT device_id, SUBSTRING_INDEX(blog_url, '/', -1) user_name
FROM user_submit
This uses the same string-slicing method as the previous problem.
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(profile, ',', -2),',', 1) age,
COUNT(device_id) number
FROM user_submit
GROUP BY age
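There are many ways to slice the string, but they all come down to nesting one more function call. This is a variation on the second problem. Done, next.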
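If the nesting is hard to picture, it may help to trace it outside the database. The helper below is my own Python imitation of how MySQL's SUBSTRING_INDEX(str, delim, count) behaves, not the real implementation, and the sample profile string and URL are made up.

```python
def substring_index(text, delim, count):
    """Rough Python imitation of MySQL's SUBSTRING_INDEX (an approximation).

    count > 0: everything left of the count-th delimiter, counted from the left.
    count < 0: everything right of the count-th delimiter, counted from the right.
    """
    if count == 0 or delim == "":
        return ""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])

profile = "180cm,75kg,27,male"  # hypothetical profile string

gender = substring_index(profile, ",", -1)
last_two = substring_index(profile, ",", -2)   # inner call of the nested query
age = substring_index(last_two, ",", 1)        # outer call keeps the first piece
user_name = substring_index("http://example.com/some_user", "/", -1)
print(gender, last_two, age, user_name)
```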
For the details on window functions, follow the link above.
Using what we already know, it can be done like this:
SELECT device_id, university, gpa
FROM user_profile
WHERE (university, gpa) in (SELECT university, MIN(gpa) FROM user_profile GROUP BY university)
ORDER BY university
Now for something new: window functions
SELECT device_id,university,gpa
FROM
(SELECT device_id,university,gpa,
RANK() over (PARTITION BY university order by gpa) as rk
FROM user_profile) a
WHERE a.rk=1
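Here is a sketch of what RANK() with PARTITION BY returns, next to a subquery that compares gpa alone against the per-university minimums. It assumes Python's built-in sqlite3 as a stand-in for MySQL (window functions need SQLite 3.25 or later, and MySQL 8.0 or later), and the rows are made up so that one university's minimum gpa also appears at the other university.

```python
import sqlite3

# Hypothetical rows: 3.5 is the minimum at 北京大学 but not at 复旦大学.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_profile (device_id INTEGER, university TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?)",
    [(1, "复旦大学", 3.2), (2, "复旦大学", 3.5),
     (3, "北京大学", 3.5), (4, "北京大学", 3.9), (5, "北京大学", 3.5)],
)

# RANK() restarts at 1 inside each university; tied rows share the same rank.
ranked_ids = [row[0] for row in conn.execute(
    """
    SELECT device_id FROM (
        SELECT device_id,
               RANK() OVER (PARTITION BY university ORDER BY gpa) AS rk
        FROM user_profile
    ) AS a
    WHERE a.rk = 1
    ORDER BY device_id
    """
)]

# Comparing gpa alone ignores which university each minimum belongs to.
gpa_only_ids = [row[0] for row in conn.execute(
    """
    SELECT device_id FROM user_profile
    WHERE gpa IN (SELECT MIN(gpa) FROM user_profile GROUP BY university)
    ORDER BY device_id
    """
)]
print(ranked_ids, gpa_only_ids)
```

Device 2 slips into the second result because its gpa happens to equal the minimum of a different university, which is why the minimum has to be matched together with its university.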
select up.device_id, university,
count(question_id) as question_cnt,
sum(if(qpd.result='right', 1, 0)) as right_question_cnt
from user_profile as up
left join question_practice_detail as qpd
on qpd.device_id = up.device_id and month(qpd.date) = 8
where up.university = '复旦大学'
group by up.device_id, up.university
SELECT
qd.difficult_level,
sum(if(qpd.result='right', 1, 0)) / count(qpd.question_id) as correct_rate
FROM user_profile up
JOIN
question_practice_detail qpd
ON up.device_id=qpd.device_id
JOIN
question_detail qd
ON qd.question_id=qpd.question_id
WHERE up.university='浙江大学'
group by qd.difficult_level
ORDER BY correct_rate
There really are experts out there; I was careless -> link
I forgot that avg works here too, which is more concise.
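The avg version works because the average of a 0/1 flag is exactly the share of ones. A minimal sketch of the equivalence, assuming Python's built-in sqlite3 as a stand-in for MySQL and a made-up, already-joined table (SQLite needs the 1.0 * to avoid integer division, which MySQL's / does not do):

```python
import sqlite3

# Hypothetical pre-joined rows: (difficult_level, result).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE practice (difficult_level TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO practice VALUES (?, ?)",
    [("hard", "right"), ("hard", "wrong"), ("hard", "right"),
     ("hard", "wrong"), ("easy", "right")],
)

flag = "CASE WHEN result = 'right' THEN 1 ELSE 0 END"

sum_over_count = conn.execute(
    f"SELECT difficult_level, 1.0 * SUM({flag}) / COUNT(*) "
    "FROM practice GROUP BY difficult_level ORDER BY difficult_level"
).fetchall()

avg_of_flag = conn.execute(
    f"SELECT difficult_level, AVG({flag}) "
    "FROM practice GROUP BY difficult_level ORDER BY difficult_level"
).fetchall()
print(sum_over_count, avg_of_flag)
```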
SELECT COUNT(DISTINCT device_id) did_cnt, COUNT(question_id) question_cnt
FROM question_practice_detail
WHERE MONTH(date)=8
~
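I've finally finished the basics set, in under two days, and rediscovered the fun of working through problems. I wish I had started sooner, but "now" is never too late. Once you actually get moving, a lot of the anxiety and confusion takes care of itself. As long as you're on the road, it's never too late; as long as you reach the goal, going a little slower is fine. Let's keep each other going. Peace!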