Advanced MySQL Multi-Table Queries in Practice: A Deep Dive into Joins and Subqueries

I. Core Principles of Multi-Table Joins

1. JOIN Operations

-- Create the test tables
CREATE TABLE departments (
    dept_id INT PRIMARY KEY,
    dept_name VARCHAR(50)
);

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(50),
    dept_id INT
);

-- 1. Inner join (INNER JOIN): only rows that have a match in both tables
SELECT e.emp_name, d.dept_name
FROM employees e
INNER JOIN departments d ON e.dept_id = d.dept_id;

-- 2. Left join (LEFT JOIN): every employee, with NULL where no department matches
SELECT e.emp_name, d.dept_name
FROM employees e
LEFT JOIN departments d ON e.dept_id = d.dept_id;

-- 3. Right join (RIGHT JOIN): every department, with NULL where no employee matches
SELECT e.emp_name, d.dept_name
FROM employees e
RIGHT JOIN departments d ON e.dept_id = d.dept_id;

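-- 4. Full outer join (FULL OUTER JOIN): MySQL has no such syntax, so emulate it with UNION ALL
SELECT e.emp_name, d.dept_name
FROM employees e LEFT JOIN departments d ON e.dept_id = d.dept_id
UNION ALL
-- Add only the departments that matched no employee (anti-join on the primary key)
SELECT e.emp_name, d.dept_name
FROM employees e RIGHT JOIN departments d ON e.dept_id = d.dept_id
WHERE e.emp_id IS NULL;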
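
To see which rows each join type keeps, the queries above can be run against a tiny data set. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a few hypothetical sample rows (one employee without a department, one department without employees). Older SQLite versions lack RIGHT JOIN, so the sketch expresses it as a LEFT JOIN with the tables swapped.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INT PRIMARY KEY, dept_name VARCHAR(50));
    CREATE TABLE employees (emp_id INT PRIMARY KEY, emp_name VARCHAR(50), dept_id INT);
    -- Hypothetical sample rows: Carol has no department, Sales has no employees.
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES (1, 'Alice', 1), (2, 'Bob', 1), (3, 'Carol', NULL);
""")


def rows(sql):
    """Run a query and return its rows in a stable order (NULL-safe sort via repr)."""
    return sorted(conn.execute(sql).fetchall(), key=repr)


COLUMNS = "SELECT e.emp_name, d.dept_name "
LEFT = COLUMNS + "FROM employees e LEFT JOIN departments d ON e.dept_id = d.dept_id"
# Same result as "employees e RIGHT JOIN departments d", written with the tables swapped.
RIGHT = COLUMNS + "FROM departments d LEFT JOIN employees e ON e.dept_id = d.dept_id"

inner = rows(COLUMNS + "FROM employees e INNER JOIN departments d ON e.dept_id = d.dept_id")
left = rows(LEFT)
right = rows(RIGHT)
# Full outer join: all left-join rows plus the departments that matched nothing.
full = rows(LEFT + " UNION ALL " + RIGHT + " WHERE e.emp_id IS NULL")

for name, result in [("inner", inner), ("left", left), ("right", right), ("full", full)]:
    print(name, result)
```

With this data the inner join returns two rows, the left and right joins three each, and the emulated full outer join four.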

II. Join Query Performance Optimization

1. Execution Plan Analysis

-- View the execution plan of a join query
EXPLAIN 
SELECT e.emp_name, d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id;

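Reading the key columns

  • type: eq_ref is the best access type for a joined table (only system and const are better)

  • key: confirms that the join column actually uses an index

  • rows: estimated rows examined per table; the smaller the product across all joined tables, the better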

2. Indexing Strategy

-- Create an index on the join column
ALTER TABLE employees ADD INDEX idx_dept (dept_id);

-- Composite index for a join on several columns
ALTER TABLE orders ADD INDEX idx_customer_product (customer_id, product_id);
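
The effect of an index on the join column can be observed in the query plan. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module, so CREATE INDEX and EXPLAIN QUERY PLAN stand in for MySQL's ALTER TABLE ... ADD INDEX and EXPLAIN, and the single-table lookup mimics the per-row probe that a nested-loop join performs on the inner table. The exact plan wording differs between SQLite versions, so only the index name is inspected.

```python
import sqlite3

# SQLite stands in for MySQL here; the plan text is SQLite-specific.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INT PRIMARY KEY, emp_name VARCHAR(50), dept_id INT)"
)


def plan(sql):
    """Return the detail column (last column) of SQLite's EXPLAIN QUERY PLAN output."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# The lookup a nested-loop join repeats for each department row.
lookup = "SELECT emp_name FROM employees WHERE dept_id = 5"

before = plan(lookup)  # no index on dept_id yet: full table scan
conn.execute("CREATE INDEX idx_dept ON employees (dept_id)")
after = plan(lookup)   # the planner can now search via idx_dept

print("before:", before)
print("after: ", after)
```

In MySQL the same check is done by running EXPLAIN before and after adding the index and comparing the type and key columns.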

3. Controlling Join Order

-- Force the join order with STRAIGHT_JOIN: tables are read in the order they appear in FROM
SELECT STRAIGHT_JOIN e.emp_name, d.dept_name
FROM employees e
JOIN departments d ON e.dept_id = d.dept_id;

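III. Advanced Subquery Techniques

1. Subquery Types Compared

Type                Example                                 Execution characteristic
Scalar subquery     SELECT (SELECT MAX(salary) FROM emp)    Returns a single value
Column subquery     WHERE id IN (SELECT id FROM temp)       Returns one column, multiple rows
Row subquery        WHERE (id,name) = (SELECT 1,'Alice')    Returns one row, multiple columns
EXISTS subquery     WHERE EXISTS (SELECT 1 FROM dept...)    Returns only a boolean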

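2. IN vs. EXISTS: Performance Showdown

-- IN subquery (suits a small subquery result set); assumes departments has a location column
SELECT * FROM employees
WHERE dept_id IN (SELECT dept_id FROM departments WHERE location = 'Beijing');

-- EXISTS subquery (suits correlating against a large table)
SELECT e.* FROM employees e
WHERE EXISTS (
    SELECT 1 FROM departments d
    WHERE d.dept_id = e.dept_id AND d.location = 'Beijing'
);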

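Recommendations

  • Large outer table, small inner (subquery) table: use IN

  • Small outer table, large inner (subquery) table: use EXISTS

  • MySQL 8.0+ can rewrite both forms as a semijoin (EXISTS since 8.0.16), so the two often end up with the same plan; confirm with EXPLAIN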
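
Whichever form the optimizer prefers, the two queries are meant to return the same rows. The following minimal sketch checks that on a small data set. It assumes Python's built-in sqlite3 module in place of MySQL, a location column added to departments, and hypothetical sample rows; it demonstrates equivalence of results only and says nothing about MySQL's performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: the article's tables plus a location column on departments.
    CREATE TABLE departments (dept_id INT PRIMARY KEY, dept_name VARCHAR(50), location VARCHAR(50));
    CREATE TABLE employees (emp_id INT PRIMARY KEY, emp_name VARCHAR(50), dept_id INT);
    INSERT INTO departments VALUES
        (1, 'Engineering', 'Beijing'), (2, 'Sales', 'Shanghai'), (3, 'Support', 'Beijing');
    INSERT INTO employees VALUES
        (1, 'Alice', 1), (2, 'Bob', 2), (3, 'Carol', 3), (4, 'Dave', NULL);
""")

# IN form: the subquery is evaluated independently of the outer row.
in_rows = conn.execute("""
    SELECT emp_name FROM employees
    WHERE dept_id IN (SELECT dept_id FROM departments WHERE location = 'Beijing')
    ORDER BY emp_id
""").fetchall()

# EXISTS form: the subquery is correlated with the outer row through e.dept_id.
exists_rows = conn.execute("""
    SELECT e.emp_name FROM employees e
    WHERE EXISTS (
        SELECT 1 FROM departments d
        WHERE d.dept_id = e.dept_id AND d.location = 'Beijing'
    )
    ORDER BY e.emp_id
""").fetchall()

print(in_rows)
print(exists_rows)
```

Both queries select Alice and Carol; Dave, whose dept_id is NULL, is excluded by either form.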

3. Derived Table Optimization

-- Original form: inefficient when the derived table is materialized instead of merged
SELECT * FROM (
    SELECT * FROM employees WHERE salary > 10000
) AS high_salary_emp
WHERE dept_id = 5;

-- Optimized form (condition pushed down into a single query)
SELECT * FROM employees 
WHERE salary > 10000 AND dept_id = 5;

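-- When a derived table is unavoidable, add an index hint
-- (the INDEX hint needs MySQL 8.0.20+; assumes an index idx_salary on employees.salary)
SELECT * FROM (
    SELECT /*+ INDEX(emp idx_salary) */ * 
    FROM employees emp
    WHERE salary > 10000
) AS emp_with_index;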

IV. Case Study: E-Commerce Data Analysis

1. Multi-Level Join Query

-- Find VIP users who bought products in the "Electronics" category in the last 30 days
SELECT DISTINCT u.user_name, u.phone
FROM users u
JOIN orders o ON u.user_id = o.user_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id
JOIN categories c ON p.category_id = c.category_id
WHERE u.is_vip = 1 
  AND c.category_name = 'Electronics'
  AND o.create_time > DATE_SUB(NOW(), INTERVAL 30 DAY);

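2. Complex Aggregation Analysis

-- Top 3 products by sales for each month
-- (window functions need MySQL 8.0+; RANK() keeps ties, so a month can return more than 3 rows)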
SELECT month, product_name, sales_amount FROM (
    SELECT 
        DATE_FORMAT(o.create_time, '%Y-%m') AS month,
        p.product_name,
        SUM(oi.quantity * oi.price) AS sales_amount,
        RANK() OVER (PARTITION BY DATE_FORMAT(o.create_time, '%Y-%m') 
            ORDER BY SUM(oi.quantity * oi.price) DESC) AS sales_rank
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_id
    JOIN products p ON oi.product_id = p.product_id
    GROUP BY month, p.product_id
) AS ranked_sales
WHERE sales_rank <= 3;
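
Because RANK() gives tied rows the same rank, filtering on sales_rank <= 3 can return more than three products for a month. The sketch below illustrates the difference from ROW_NUMBER(). It is a simplified example under assumptions: Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) stands in for MySQL, and a hypothetical pre-aggregated monthly_sales table replaces the orders, order_items, and products joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical pre-aggregated data; Tablet and Camera tie for third place.
    CREATE TABLE monthly_sales (month TEXT, product_name TEXT, sales_amount INT);
    INSERT INTO monthly_sales VALUES
        ('2024-01', 'Phone', 900), ('2024-01', 'Laptop', 700),
        ('2024-01', 'Tablet', 500), ('2024-01', 'Camera', 500),
        ('2024-01', 'Cable', 100);
""")

# {column} is replaced by the ranking column to filter on: rnk or rn.
query = """
    SELECT product_name, sales_amount, rnk, rn FROM (
        SELECT product_name, sales_amount,
               RANK() OVER (PARTITION BY month
                            ORDER BY sales_amount DESC) AS rnk,
               ROW_NUMBER() OVER (PARTITION BY month
                                  ORDER BY sales_amount DESC, product_name) AS rn
        FROM monthly_sales
    ) AS ranked
    WHERE {column} <= 3
    ORDER BY sales_amount DESC, product_name
"""

by_rank = conn.execute(query.format(column="rnk")).fetchall()
by_row_number = conn.execute(query.format(column="rn")).fetchall()

print([name for name, *_ in by_rank])        # four rows: the tie at rank 3 is kept
print([name for name, *_ in by_row_number])  # exactly three rows
```

RANK() suits reports where tied products should all appear; ROW_NUMBER() with an explicit tie-breaker suits cases that need exactly N rows per group.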

V. Performance Pitfalls and Solutions

1. The N+1 Query Problem

Anti-pattern

-- Executed in a loop by the application (pseudocode)
for dept in departments:
    emps = execute("SELECT * FROM employees WHERE dept_id = ?", dept.id)

Solution

-- A single JOIN query
SELECT d.dept_name, e.emp_name
FROM departments d
LEFT JOIN employees e ON d.dept_id = e.dept_id;
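
The difference in round trips can be counted directly. The following minimal sketch runs both approaches and compares the results. It assumes Python's built-in sqlite3 module in place of a MySQL driver, hypothetical sample rows, and an illustrative execute helper that counts each statement as one round trip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INT PRIMARY KEY, dept_name VARCHAR(50));
    CREATE TABLE employees (emp_id INT PRIMARY KEY, emp_name VARCHAR(50), dept_id INT);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales'), (3, 'Support');
    INSERT INTO employees VALUES (1, 'Alice', 1), (2, 'Bob', 1), (3, 'Carol', 2);
""")

counter = {"queries": 0}


def execute(sql, params=()):
    """Hypothetical helper: run one statement and count it as one round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Anti-pattern: 1 query for the departments, then 1 more query per department.
n_plus_one = {}
for dept_id, dept_name in execute("SELECT dept_id, dept_name FROM departments ORDER BY dept_id"):
    emps = execute(
        "SELECT emp_name FROM employees WHERE dept_id = ? ORDER BY emp_id", (dept_id,)
    )
    n_plus_one[dept_name] = [name for (name,) in emps]
n_plus_one_queries = counter["queries"]

# Fix: one LEFT JOIN, with the grouping done in application code.
counter["queries"] = 0
joined = {}
for dept_name, emp_name in execute("""
    SELECT d.dept_name, e.emp_name
    FROM departments d
    LEFT JOIN employees e ON d.dept_id = e.dept_id
    ORDER BY d.dept_id, e.emp_id
"""):
    emps = joined.setdefault(dept_name, [])  # keeps departments with no employees
    if emp_name is not None:
        emps.append(emp_name)
join_queries = counter["queries"]

print(n_plus_one_queries, join_queries)  # 4 1
```

With three departments the loop issues four statements and the join issues one; the gap grows linearly with the number of departments.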

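2. Oversized Temporary Tables

Symptoms

  • Using temporary appears in the Extra column of the EXPLAIN output

  • Query time degrades sharply as the data volume grows

Optimizations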

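-- Raise the in-memory temporary table limit (values are in bytes; SET does not accept the 256M suffix)
SET SESSION tmp_table_size = 256 * 1024 * 1024;
SET SESSION max_heap_table_size = 256 * 1024 * 1024;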

-- Keep the GROUP BY column order consistent with the index
SELECT dept_id, COUNT(*) 
FROM employees 
GROUP BY dept_id;  -- make sure dept_id is indexed
