Friday, April 12, 2024

2 Essential SQL CASE WHEN Examples in 2023


Let’s begin with a question that was asked by a Sports Analyst. Every now and then, their analytics team needs to create a points table based on the total matches played between different teams.

And that’s why they ask this question in every data analyst job interview.

It is a classic scenario of converting a table from a long form (number of rows > number of columns) to a wide form (number of columns > number of rows). This is also called Data Pivoting, which is an important use case of CASE WHEN in SQL.

In this scenario, you have a table containing the names of the teams who played each match and the winner. You need to create a points table that shows how many matches each team played, how many they won, how many they lost, and how many were drawn.

Type of example | Image by Author

Let’s see how to tackle this kind of question.

Here is the input table, where each row belongs to a match between two teams and the column winner indicates which team won. The value NULL in the winner column indicates that the match was a draw, i.e. neither team won the match.

Input data | Image by Author

Let’s divide this question into the following sub-tasks.

  1. Find the total number of matches each team won
  2. Find the total number of matches each team lost
  3. Find the total number of matches where neither team won
  4. Find the total number of matches each team played

To find out how many matches a team won, you need to determine, for each match, which team won it. You can do this by comparing the team_1 and team_2 columns with the winner column.

So, for a specific row, when the values in the columns team_1 and winner are equal, team_1 is the winner.

You can translate exactly the same logic using a CASE..WHEN..THEN statement in SQL as shown below.

SELECT team_1
, team_2
, winner
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups

As shown in the above query, you’ll create an additional column win_flag. When team_1 is the winner, you’ll assign the value 1 to this column. Similarly, if the winner column is NULL, you’ll assign the value 1 to the draw_flag column.

So the above query will create the following output for all the teams in the column team_1.

First partial output for team_1 | Image by Author
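One detail worth checking in the query above is how the draw rows behave. When winner is NULL, the comparison team_1 = winner evaluates to NULL rather than true, so the CASE falls through to ELSE 0 and only draw_flag is set. The following is a minimal sketch that runs the same CASE WHEN logic with Python's built-in sqlite3 module. The table name matches and the three sample rows are hypothetical stand-ins, not the article's actual input table.

```python
import sqlite3

# Hypothetical sample data: the table name `matches` and these rows are
# assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (team_1 TEXT, team_2 TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        ("India", "Australia", "India"),   # team_1 won
        ("India", "England", "England"),   # team_2 won
        ("Australia", "England", None),    # draw: winner is NULL
    ],
)

# Same CASE WHEN logic as the team_1 query above.
flag_rows = conn.execute(
    """
    SELECT team_1
         , winner
         , CASE WHEN team_1 = winner THEN 1 ELSE 0 END AS win_flag
         , CASE WHEN winner IS NULL THEN 1 ELSE 0 END AS draw_flag
    FROM matches
    ORDER BY rowid
    """
).fetchall()

for row in flag_rows:
    print(row)
```

In the draw row, win_flag stays 0 and draw_flag is 1, which is why the later aggregation can treat wins and draws as separate counts.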

Similarly, when the values in the columns team_2 and winner are equal, team_2 is the winner. So you can write exactly the same query for all the teams in team_2.

SELECT team_1
, team_2
, winner
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups

This gives you the following output for the values in team_2.

Partial output for team_2 | Image by Author

The above two queries are just for your understanding. In practice, you can create a single CTE that holds the win_flag and draw_flag of each team in the columns team_1 and team_2, as shown below.

WITH win_draw_flag AS
(
SELECT team_1 as workforce
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups
UNION ALL
SELECT team_2 as workforce
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups
)

This will create a CTE like this. I’ve shown it only for your understanding.

The output of CTE | Image by Author

Remember that you are still in the long form of the table, and you now have information on whether or not each team won the match.

Next, you simply need to aggregate the columns to get the total number of matches each team played, won, and lost. You can do that with the following query.

SELECT workforce
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM win_draw_flag
GROUP BY workforce
ORDER BY workforce

Here, COUNT(*) gives you the total number of times each team occurs in the CTE win_draw_flag, and subtracting the matches won and drawn from it gives you the total number of matches each team lost.

Final output: points table | Image by Author
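To see the whole pipeline work end to end, here is a minimal sketch that runs the CTE and the aggregation together using Python's built-in sqlite3 module. The table name matches, the column alias team, and the four sample rows are assumptions made for illustration, not the article's data.

```python
import sqlite3

# Hypothetical sample data: table name `matches` and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE matches (team_1 TEXT, team_2 TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        ("India", "Australia", "India"),
        ("India", "England", "England"),
        ("Australia", "England", None),   # draw
        ("England", "India", "India"),
    ],
)

# Long form (one row per team per match) built with UNION ALL,
# then aggregated into one row per team.
points_table = conn.execute(
    """
    WITH win_draw_flag AS
    (
    SELECT team_1 AS team
         , CASE WHEN team_1 = winner THEN 1 ELSE 0 END AS win_flag
         , CASE WHEN winner IS NULL THEN 1 ELSE 0 END AS draw_flag
    FROM matches
    UNION ALL
    SELECT team_2 AS team
         , CASE WHEN team_2 = winner THEN 1 ELSE 0 END AS win_flag
         , CASE WHEN winner IS NULL THEN 1 ELSE 0 END AS draw_flag
    FROM matches
    )
    SELECT team
         , COUNT(*) AS matches_played
         , SUM(win_flag) AS matches_won
         , COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
         , SUM(draw_flag) AS matches_draw
    FROM win_draw_flag
    GROUP BY team
    ORDER BY team
    """
).fetchall()

for row in points_table:
    print(row)
```

UNION ALL is used rather than UNION because UNION would remove duplicate rows, and two matches with the same team and the same flags must both be counted.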

Without creating a CTE separately, you can also write the query as follows and pass the entire CASE..WHEN query as a sub-query.

SELECT workforce
, COUNT(*) AS matches_played
, SUM(win_flag) AS matches_won
, COUNT(*) - SUM(win_flag) - SUM(draw_flag) AS matches_lost
, SUM(draw_flag) AS matches_draw
FROM
(
SELECT team_1 as workforce
, CASE WHEN team_1 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups
UNION ALL
SELECT team_2 as workforce
, CASE WHEN team_2 = winner THEN 1 ELSE 0 END as win_flag
, CASE WHEN winner IS NULL THEN 1 ELSE 0 END as draw_flag
FROM analyticswithsuraj.groups
) AS win_draw_flag
GROUP BY workforce
ORDER BY workforce

It will produce exactly the same output as shown above.

There can be multiple ways to solve this question; I found this approach the easiest. If you find another solution to this question, feel free to mention it in the comments.

The second example is a classic case of comparing dates and then implementing If..Else logic using a CASE..WHEN statement. You can encounter this type of problem in any company that deals with customers.

The scenario is this: you have an eCommerce website where customers visit and purchase products every day. Your task is to identify, for each day, how many customers were new and how many were repeat customers.

Type of question | Image by Author

Here is an input table, orders, where you can see that the customers with customer_id ABC101, BCD201, and ABD101 visited the website on multiple days and bought different products.

Input table with dummy data | Image by Author

Let’s break down the question into the following sub-tasks.

  1. Find the first time, i.e. the first date, when the customer visited the website
  2. Compare the first date with the order date to decide whether the customer is a repeat or first-time visitor

You can easily solve the first sub-task by using GROUP BY to group all the records by customer_id and find the minimum of the order_date, as shown below.

SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id

Minimum order date | Image by Author

This was simple!
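As a quick check of this first sub-task, the sketch below runs the GROUP BY query with Python's built-in sqlite3 module. The customer IDs come from the article, but the dates, the product column, and its values are hypothetical sample data. Dates are stored as ISO 8601 text (YYYY-MM-DD) so that MIN returns the earliest date.

```python
import sqlite3

# Hypothetical sample data: dates and products are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, order_date TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ABC101", "2023-01-01", "pen"),
        ("BCD201", "2023-01-01", "book"),
        ("ABC101", "2023-01-02", "bag"),
        ("ABD101", "2023-01-02", "lamp"),
        ("BCD201", "2023-01-03", "mug"),
        ("ABD101", "2023-01-03", "desk"),
    ],
)

# One row per customer with the earliest order date.
first_orders = conn.execute(
    """
    SELECT customer_id
         , MIN(order_date) AS first_order_date
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
    """
).fetchall()

for row in first_orders:
    print(row)
```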

Next, to compare the first_order_date with each order_date, you first need to bring both columns into a single table.

You can easily do that using a JOIN on customer_id, as shown below. Here you can wrap the above query in a CTE so that you get a temporary table to join with the input table.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)

SELECT t1.*
, t2.first_order_date
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id

Inner Join output | Image by Author

Now that you have both columns in a single table, you can compare order_date with first_order_date and implement the following If..Else logic.

  1. If first_order_date and order_date are the same, then the customer is a new customer
  2. If first_order_date and order_date are different, then the customer is a repeat customer

So ideally, you need to create two columns to implement the above two If..Else statements using CASE WHEN in SQL.

You don’t need to create a separate table; rather, you can add two more columns to the above query where you joined the two tables. Here is how it can be done.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
)

SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id

CASE WHEN output creating two new columns | Image by Author

As a result, the column new_customer_flag will be 1 when the columns first_order_date and order_date are equal. Similarly, the column repeat_customer_flag will be 1 when the columns first_order_date and order_date are different.

Now the last step is simply to group all the records by order date and sum up the columns new_customer_flag and repeat_customer_flag.

To do this, you’ll need the above table, which you can obtain by creating another CTE as shown below.

WITH first_orders AS
(
SELECT customer_id
, MIN(order_date) as first_order_date
FROM analyticswithsuraj.orders
GROUP BY customer_id
),

prospects AS
(
SELECT t1.*
, t2.first_order_date
, CASE WHEN t1.order_date=t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
, CASE WHEN t1.order_date!=t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
FROM analyticswithsuraj.orders AS t1
INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
)

SELECT order_date
, SUM(new_customer_flag) AS number_of_new_customers
, SUM(repeat_customer_flag) AS number_of_repeat_customers
FROM prospects
GROUP BY order_date
ORDER BY order_date

Number of new and repeat customers each day | Image by Author

This is how you’ll get the required output. You can cross-check the results by comparing them with the input table.
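One way to do that cross-check is sketched below with Python's built-in sqlite3 module. It runs the two-CTE query on a small table and verifies that the new-customer counts add up to the number of distinct customers. The dates, the product column, and the CTE name customers are assumptions made for illustration; only the customer IDs come from the article.

```python
import sqlite3

# Hypothetical sample data: dates and products are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, order_date TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ABC101", "2023-01-01", "pen"),
        ("BCD201", "2023-01-01", "book"),
        ("ABC101", "2023-01-02", "bag"),
        ("ABD101", "2023-01-02", "lamp"),
        ("BCD201", "2023-01-03", "mug"),
        ("ABD101", "2023-01-03", "desk"),
    ],
)

daily_counts = conn.execute(
    """
    WITH first_orders AS
    (
    SELECT customer_id
         , MIN(order_date) AS first_order_date
    FROM orders
    GROUP BY customer_id
    ),
    customers AS
    (
    SELECT t1.order_date
         , CASE WHEN t1.order_date = t2.first_order_date THEN 1 ELSE 0 END AS new_customer_flag
         , CASE WHEN t1.order_date != t2.first_order_date THEN 1 ELSE 0 END AS repeat_customer_flag
    FROM orders AS t1
    INNER JOIN first_orders AS t2 ON t1.customer_id = t2.customer_id
    )
    SELECT order_date
         , SUM(new_customer_flag) AS number_of_new_customers
         , SUM(repeat_customer_flag) AS number_of_repeat_customers
    FROM customers
    GROUP BY order_date
    ORDER BY order_date
    """
).fetchall()

# Cross-check: every customer is "new" exactly once in this sample.
distinct_customers = conn.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders"
).fetchone()[0]
total_new = sum(new for _, new, _ in daily_counts)

for row in daily_counts:
    print(row)
print(total_new == distinct_customers)
```

Note that this check relies on each customer having at most one row per day. If a customer can place several orders on the same date, SUM counts order rows rather than customers, and counting distinct customer_id values per flag would likely be the safer choice.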

Again, you may have a different approach to solving this question; this is the simplest approach I found. Don’t forget to mention your approach in the comments below.
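As one example of a different approach (not the method used in this article), the first order date can be computed with a window function instead of a separate CTE and JOIN. The sketch below assumes a database that supports window functions (SQLite 3.25 or later, run here through Python's built-in sqlite3 module) and uses hypothetical dates and products.

```python
import sqlite3

# Hypothetical sample data: dates and products are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, order_date TEXT, product TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ABC101", "2023-01-01", "pen"),
        ("BCD201", "2023-01-01", "book"),
        ("ABC101", "2023-01-02", "bag"),
        ("ABD101", "2023-01-02", "lamp"),
        ("BCD201", "2023-01-03", "mug"),
        ("ABD101", "2023-01-03", "desk"),
    ],
)

# MIN(...) OVER (PARTITION BY customer_id) attaches each customer's first
# order date to every one of their rows, so no JOIN is needed.
window_counts = conn.execute(
    """
    WITH flagged AS
    (
    SELECT order_date
         , CASE WHEN order_date = MIN(order_date) OVER (PARTITION BY customer_id)
                THEN 1 ELSE 0 END AS new_customer_flag
    FROM orders
    )
    SELECT order_date
         , SUM(new_customer_flag) AS number_of_new_customers
         , SUM(1 - new_customer_flag) AS number_of_repeat_customers
    FROM flagged
    GROUP BY order_date
    ORDER BY order_date
    """
).fetchall()

for row in window_counts:
    print(row)
```

Because the two flags are complementary, the repeat count is derived here as 1 - new_customer_flag rather than with a second CASE WHEN.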


