悠悠楠杉
网站页面
正文:
在日常数据分析工作中,我们经常需要将当前行数据与前后行进行比较分析。比如计算环比增长率、识别数据趋势变化、查找连续登录用户等场景。SQL中的LEAD和LAG窗口函数正是为解决这类需求而设计的利器。
LEAD函数允许我们"向前看",获取当前行之后的指定偏移量的行数据;而LAG函数则让我们"向后看",获取当前行之前的行数据。这两个函数都属于SQL窗口函数,它们不会改变查询结果的行数,只是为每行附加额外的参考值。
基本语法结构:
LEAD(column_name, offset, default_value) OVER (PARTITION BY ... ORDER BY ...)
LAG(column_name, offset, default_value) OVER (PARTITION BY ... ORDER BY ...)
其中:
- columnname:要获取的目标列
- offset:偏移量(默认为1)
- defaultvalue:当无对应行时的默认值(默认为NULL)
场景1:计算月度销售额环比增长率
假设我们有月度销售表monthly_sales:
CREATE TABLE monthly_sales (
month DATE,
revenue DECIMAL(10,2)
);
计算环比增长率的SQL:
SELECT
month,
revenue,
LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
ROUND((revenue - LAG(revenue) OVER (ORDER BY month)) /
LAG(revenue) OVER (ORDER BY month) * 100, 2) AS growth_rate
FROM monthly_sales
ORDER BY month;
场景2:识别用户连续登录天数
用户登录记录表user_logins:
CREATE TABLE user_logins (
user_id INT,
login_date DATE
);
找出连续登录的用户:
WITH login_gaps AS (
SELECT
user_id,
login_date,
LAG(login_date) OVER (PARTITION BY user_id ORDER BY login_date) AS prev_login
FROM user_logins
)
SELECT
user_id,
login_date,
prev_login,
DATEDIFF(login_date, prev_login) AS days_since_last_login
FROM login_gaps
WHERE DATEDIFF(login_date, prev_login) = 1;
场景3:预测下一季度业绩
使用LEAD预测未来业绩:
SELECT
quarter,
actual_sales,
LEAD(actual_sales, 1) OVER (ORDER BY quarter) AS next_quarter_projection,
LEAD(actual_sales, 2) OVER (ORDER BY quarter) AS two_quarters_ahead
FROM quarterly_results
ORDER BY quarter;
SELECT
date,
temperature,
LAG(temperature) OVER (ORDER BY date) AS prev_temp,
humidity,
LAG(humidity) OVER (ORDER BY date) AS prev_humidity
FROM weather_data;
SELECT
student_id,
test_date,
score,
LAG(score, 3) OVER (PARTITION BY student_id ORDER BY test_date) AS three_tests_ago
FROM exam_results;
SELECT
transaction_id,
amount,
LAG(amount) OVER (ORDER BY transaction_time) AS prev_amount,
CASE
WHEN amount > 1.5 * LAG(amount) OVER (ORDER BY transaction_time)
THEN 'Large Increase'
WHEN amount < 0.7 * LAG(amount) OVER (ORDER BY transaction_time)
THEN 'Significant Drop'
ELSE 'Normal'
END AS trend
FROM transactions;