Hacker Rank SQL Challenges Advanced Join
Problems
SQL Project Planning
Problem -> URL
Difficulty Level: Medium
Pseudocode:
If taskstart_date
not inend_date
then taskstart_date
will be considered as projectstart_date
and similary the if taskend_date
not instart_date
then it will be considered projectend_date
. New project will not be started until existing project completes.
- We will find all the projects
start_date
and assignrow_number
to it.- We will find all the projects
end_date
and assignrow_number
to it.- Join result of step-1 and step-2 on
row_num
column.- Order the result by
end_date
-start_date
andstart_date
in ascending order.Solution:
WITH project_start AS ( SELECT start_date, ROW_NUMBER() OVER(ORDER BY start_date) AS row_num FROM projects WHERE start_date NOT IN ( SELECT end_date FROM projects GROUP BY end_date ) ), project_end AS ( SELECT end_date, ROW_NUMBER() OVER(ORDER BY start_date) AS row_num FROM projects WHERE end_date NOT IN ( SELECT start_date FROM projects GROUP BY start_date ) ) SELECT s.start_date, e.end_date FROM project_start s INNER JOIN project_end e ON s.row_num = e.row_num ORDER BY e.end_date - s.start_date, s.start_date;
Placements
Problem -> URL
Difficulty Level: Medium
Pseudocode:
- Join
friends
withpackages
onid
column to get the student salary.- Join
friends
withpackages
onf.friend_id = p.id
to get the student best friend salary.- Apply filter friend_salary greater than student_salary.
- Join the
result
withstudents
onid
to get the name of the student.- Order the
result
byfriend_salary
.Solution:
WITH ids AS ( SELECT f.id, f.friend_id, student_package.salary as student_salary, friend_package.salary as friend_salary FROM friends f INNER JOIN packages student_package ON f.id = student_package.id INNER JOIN packages friend_package ON f.friend_id = friend_package.id WHERE friend_package.salary > student_package.salary ) SELECT s.name FROM students s INNER JOIN ids i ON s.id = i.id ORDER BY i.friend_salary;
Symmetric Pairs
Problem -> URL
Difficulty Level: Medium
Pseudocode:
- We will assign row number for each row. So, while joining records we will not join records with same row number.
- Perform non equi self join with symmetric pair condition and
t1.row_num != t2.row_num
- We should only return rows
x <= y
appling filter conditionx <= y
.- Remove duplicates by applying
GROUP BY x, y
- Order the
result
byx
.Solution:
WITH pairs AS ( SELECT x, y, ROW_NUMBER() OVER(ORDER BY 1) row_num FROM functions ) SELECT t1.x, t1.y FROM pairs t1 INNER JOIN pairs t2 ON t1.x = t2.y AND t2.x = t1.y AND t1.row_num != t2.row_num WHERE t1.x <= t1.y GROUP BY t1.x, t1.y ORDER BY t1.x;
Interviews
Problem -> URL
Difficulty Level: Hard
Pseudocode:
sum_of_views on college_id
- Join
challenges
withviews_stats
onchallenge_id
.- GROUP BY
college_id
.- Aggregate
SUM
oftotal_views
andtotal_unique_views
.sum_of_submissions on college_id
- Join
challenges
withsubmission_stats
onchallenge_id
.- GROUP BY
college_id
.- Aggregate
SUM
oftotal_submissions
andtotal_accepted_submissions
.contest_sum on contest_id
- Join
colleges
withsum_of_views
oncollege_id
.- Join
colleges
withsum_of_submissions
oncollege_id
.- GROUP BY
contest_id
.- Aggregate
SUM
oftotal_views
,total_unique_views
,total_submissions
andtotal_accepted_submissions
.result
- Join
contests
withcontest_sum
oncontest_id
.- Apply filter sum of
total_submissions_sum
,total_accepted_submissions_sum
,total_views_sum
andtotal_unique_views_sum
not equal to 0.- Order the result by
contest_id
.Solution:
WITH sum_of_views AS ( SELECT ch.college_id, SUM(vs.total_views) AS total_views_sum, SUM(vs.total_unique_views) AS total_unique_views_sum FROM challenges ch INNER JOIN view_stats vs ON ch.challenge_id = vs.challenge_id GROUP BY ch.college_id ), sum_of_submissions AS ( SELECT ch.college_id, SUM(ss.total_submissions) AS total_submissions_sum, SUM(ss.total_accepted_submissions) AS total_accepted_submissions_sum FROM challenges ch INNER JOIN submission_stats ss ON ch.challenge_id = ss.challenge_id GROUP BY ch.college_id ), contest_sum AS ( SELECT ch.contest_id, SUM(ss.total_submissions_sum) AS total_submissions_sum, SUM(ss.total_accepted_submissions_sum) AS total_accepted_submissions_sum, SUM(sv.total_views_sum) AS total_views_sum, SUM(sv.total_unique_views_sum) AS total_unique_views_sum FROM colleges ch INNER JOIN sum_of_views sv ON ch.college_id = sv.college_id INNER JOIN sum_of_submissions ss ON ch.college_id = ss.college_id GROUP BY ch.contest_id ), result AS ( SELECT cn.contest_id, cn.hacker_id, cn.name, cs.total_submissions_sum, cs.total_accepted_submissions_sum, cs.total_views_sum, cs.total_unique_views_sum FROM contests cn INNER JOIN contest_sum cs ON cn.contest_id = cs.contest_id WHERE total_submissions_sum + total_accepted_submissions_sum + total_views_sum + total_unique_views_sum <> 0 ORDER BY cn.contest_id ) SELECT * FROM result;
15 Days of Learning SQL
Problem -> URL
Difficulty Level: Medium
Pseudocode:Finding unique hackers who submitted for consecutive days
- submissions_rank
- We will assign
RANK
for each day usingDENSE_RANK
analytical function by ordering the dataset bysubmission_date
.- We will assign
RANK
for hacker’s each day submission usingDENSE_RANK
analytical function Partitioning the dataset byhacker_id
and ordering the partitioned dataset bysubmission_date
.- By assiging
RANK
we will get know if the hacker has submitted for consecutive days.- total_unique_hackers
- From
submissions_rank
, If thecurrent_day
and hacker’s consecutive daysdays_submitted
rank matches then we will consider thosehacker_id
to be included in our count.- Apply filter,
current_day
equal todays_submitted
.- GROUP BY
submission_date
.- Aggregate, Count distinct of
hacker_id
.Finding hacker_id with highest submission for each day
- hacker_submission_count
- From
submissions
, Selectsubmission_date
andhacker_id
.- GROUP BY
submission_date
andhacker_id
.- Aggregate, count distinct of
submission_id
.- with_highest_submissions
- From
hacker_submission_count
, we will assignROW_NUMBER
usingROW_NUMBER
analytical function by partitioning the dataset bysubmission_date
order the partitioned dataset bysubmission_count
in descending andhacker_id
in ascending order.Result
- Join
total_unique_hackers
withwith_highest_submissions
onsubmission_date
.- Join
with_highest_submissions
withhackers
onhacker_id
to get the name of the hacker.- Filter,
row_num
equal to 1 to get only the hackers with highest per day submissions.- Order the result by
submission_date
.Solution:
WITH submissions_rank AS ( SELECT submission_date, hacker_id, DENSE_RANK() OVER(ORDER BY submission_date) AS current_day, DENSE_RANK() OVER(PARTITION BY hacker_id ORDER BY submission_date) AS days_submitted FROM submissions ), total_unique_hackers AS ( SELECT submission_date, COUNT(DISTINCT hacker_id) AS distinct_hacker_count FROM submissions_rank WHERE current_day = days_submitted GROUP BY submission_date ), hacker_submission_count AS ( SELECT submission_date, hacker_id, COUNT(DISTINCT submission_id) AS submission_count FROM submissions GROUP BY submission_date, hacker_id ), with_highest_submissions AS ( SELECT submission_date, hacker_id, ROW_NUMBER() OVER(PARTITION BY submission_date ORDER BY submission_count DESC, hacker_id ASC) AS row_num FROM hacker_submission_count ), result AS ( SELECT uh.submission_date, uh.distinct_hacker_count, hs.hacker_id, h.name FROM total_unique_hackers uh INNER JOIN with_highest_submissions hs ON uh.submission_date = hs.submission_date INNER JOIN hackers h on hs.hacker_id = h.hacker_id WHERE hs.row_num = 1 ORDER BY uh.submission_date ) SELECT * FROM result;