240523 Today I Learn
์ด๋ ํ๊ท ์ ์ฌ์ฉํ ๋ ์ง๋ณ ๋งค์ถ ์ถ์ด ๋ณด๊ธฐ
์ด๋ํ๊ท ์ด๋ ๋ฌด์์ผ๊น? ์ ์ฌ์ฉํ ๊น?
๐ก ์ด๋ํ๊ท ๋ฒ
์ด๋ ํ๊ท ๋ฒ์ ์ ์ฒด ๋ฐ์ดํฐ ์งํฉ์ ์ฌ๋ฌ ํ์์งํฉ์ ๋ํ ์ผ๋ จ์ ํ๊ท ์ ๋ง๋ค์ด ๋ฐ์ดํฐ ์์๋ฅผ ๋ถ์ํ๋ ๊ณ์ฐ์ด๋ค.
ROWS BETWEEN lower_bound AND upper_boundโ
- `UNBOUNDED PRECEDING/ FOLLOWING`- ํ์ฌ row ์/๋ค์ ๋ชจ๋ row
- `n PRECEDING/ FOLLOWING` - ํ์ฌ row ์/๋ค์ n๊ฐ์ row
- `CURRENT ROW' - ํ์ฌ row
์ด๋ํ๊ท ๊ตฌํ๊ธฐ - 7์ผ ์ด๋ํ๊ท ๊ตฌํ๊ธฐ
SELECT dt,
SUM(purchase_amount),
AVG(SUM(purchase_amount)) OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) seven_day_avg,
CASE WHEN 7 = AVG(COUNT(*)) OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
THEN AVG(SUM(purchase_amount)) OVER(ORDER BY dt ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
ELSE NULL END seven_day_avg_strict
FROM purchase_log
GROUP BY dt
ORDER BY dt;
- `seven_day_avg`: dt๋ฅผ ๊ธฐ์ค์ผ๋ก purchase amount์ ํฉ๊ณ์ 7์ผ๊ฐ์ ํ๊ท ๊ตฌํ๊ธฐ
→ ROWS BETWEEN 6 PRECEDING AND CURRENT ROW - `seven_day_avg_strict`: CASE WHEN ๊ตฌ๋ฌธ์ ์ด์ฉํด ROWS BETWEEN 6 PRECEDING AND CURRENT ROW์ ํด๋นํ๋ ROW๊ฐ 7๊ฐ์ผ๋๋ง 7์ผ๊ฐ์ ์ด๋ํ๊ท ์ ๊ตฌํ๊ณ , ๊ทธ ์ธ์๋ NULL๋ก ํ์
์ ์ฉ ๊ณผ์ - ์ด๋ ํ๊ท ๊ตฌํ๊ธฐ (ROLLING AVERAGE)
Given a table of tweet data over a specified time period, calculate the 3-day rolling average of tweets for each user. Output the user ID, tweet date, and rolling averages rounded to 2 decimal places.
- A rolling average, also known as a moving average or running mean is a time-series technique that examines trends in data over a specified period of time.
- In this case, we want to determine how the tweet count for each user changes over a 3-day period.
SELECT user_id,
tweet_date,
ROUND(AVG(tweet_count) OVER(PARTITION BY user_id ORDER BY tweet_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW),2) rolling_avg_3d
FROM tweets;
๋น์จ ๊ตฌํ๊ธฐ
๐ก CASE WHEN๋ฌธ์ ํ์ฉํด ์กฐ๊ฑด์ ๋ง๋ ์ ์ธ๊ธฐ
COUNT(CASE WHEN ์กฐ๊ฑด THEN 1 ELSE NULL END)โ
๋ณดํต ๋น์จ์ ๊ตฌํ๋ ๋ฐฉ๋ฒ์ `๋ด๊ฐ ์ํ๋ ์กฐ๊ฑด์ ๋ง๋ ๊ฐ / ์ ์ฒด`์ ํํ๋ก ๊ตฌํ๋ ๊ฒฝ์ฐ๊ฐ ๋ง๋ค. ์กฐ๊ฑด์ ๋ง๋ ๊ฐ์ด ๋ช๊ฐ์ธ์ง ์ ๋์๋ ๋ฐ๋ก COUNT๋ฌธ์์ CASE WHEN ๋ฌธ์ ๋ฃ์ด์ ์กฐ๊ฑด์ ๋ง๋ ๊ฒฝ์ฐ 1, ์๋ ๊ฒฝ์ฐ NULL์ ๋ฐํํ๊ฒ ํ๋ฉด ์กฐ๊ฑด์ ๋ง๋ ๊ฒฝ์ฐ์ ์๋ง 1๋ก ์ธ์ด์ค๋ค.
โ ๏ธ
์ด๋ ์ฃผ์ํด์ผ ํ ์ ์
COUNTํ ๊ฐ์ 0.1์ ๊ณฑํด์ค์ผํ๋ค๋ ๊ฒ์ด๋ค.
SQL์์๋ ์ ์๋ฅผ ๋๋๋ฉด ๊ฒฐ๊ณผ ์ญ์ ์ ์๋ก ๋ฐํ๋๋ค. ๋๋ฌธ์ 5/10์ ํ๋ ๊ฒฝ์ฐ์๋ 0.5(์ค์)๊ฐ ์๋ 0(์ ์)ํ์ผ๋ก ๋ฐํํ๊ธฐ ๋๋ฌธ์ด๋ค. ์ด ๋ฌธ์ ๋ COUNT๋ ๊ฐ์ 1.0์ ๊ณฑํด์ ๊ฐ๋จํ ํด๊ฒฐํ ์ ์๋ค. ์ด ๊ณผ์ ์์ ์ ์๋ ์ค์ํํ๋ก ๋ณํ๋๊ณ , ์ค์ํ์ผ๋ก ์ ์์ ์ผ๋ก ๊ณ์ฐ๋๋ค. ๋น์จ์ ๊ตฌํ ๋์๋ ์ด ์ ์ ํนํ ์ ์ํ๋๋ก ํ์.
Activation์ด๋?
- MAU (Monthly active users)
- WAU(Weekly active users)
- DAU(Daily active users)
์ ์ฉ ๊ณผ์ - ํ์ฑ ๋น์จ ๊ตฌํ๊ธฐ
New TikTok users sign up with their emails. They confirmed their signup by replying to the text confirmation to activate their accounts. Users may receive multiple text messages for account confirmation until they have confirmed their new account.
A senior analyst is interested to know the activation rate of specified users in the emails table. Write a query to find the activation rate. Round the percentage to 2 decimal places.
- emails table contain the information of user signup details.
- texts table contains the users' activation information.
Assumptions:
The analyst is interested in the activation rate of specific users in the emails table, which may not include all users that could potentially be found in the texts table. For example, user 123 in the emails table may not be in the texts table and vice versa.
SELECT ROUND(COUNT(CASE WHEN signup_action = 'Confirmed' THEN 1 ELSE NULL END)*1.0
/COUNT(DISTINCT user_id),2) as confirm_rate
FROM emails e LEFT JOIN texts t ON e.email_id = t.email_id;
์ ์ฉ ๊ณผ์ - ํผ์ผํธ ๊ตฌํ๊ธฐ
A phone call is considered an international call when the person calling is in a different country than the person receiving the call. What percentage of phone calls are international? Round the result to 1 decimal.
- Assumption: The caller_id in phone_info table refers to both the caller and receiver.
SELECT ROUND(COUNT(CASE WHEN p1_country <> p2_country THEN 1 ELSE NULL END)*1.0
/COUNT(*)*100, 1) international_calls_pct
FROM (
SELECT c.caller_id, c.receiver_id,
p1.country_id p1_country, p2.country_id p2_country
FROM phone_calls c
LEFT JOIN phone_info p1 ON c.caller_ID = p1.caller_ID
LEFT JOIN phone_info p2 ON c.receiver_id = p2.caller_id
)a ;
'๐ Today I Learn > ๐ My SQL' ์นดํ ๊ณ ๋ฆฌ์ ๋ค๋ฅธ ๊ธ
SQL Challenge ์ธ์ #4 ๋ฆฌํ ์ , ์ฝํธํธ ๋ถ์ (0) | 2024.05.30 |
---|---|
SQL Challenge ์ธ์ #3 ํผ๋(Funnel)๋ถ์ (0) | 2024.05.29 |
SQL Challenge ์ธ์ #1 ์๋์ฐ ํจ์, WITH์ (0) | 2024.05.27 |
Union๊ณผ Order by (0) | 2024.05.04 |
SQL ์ธ์ #1 (0) | 2024.05.04 |