SQL for Time-Series: How to JOIN Tables on Overlapping Date Ranges
Most SQL JOINs use the `=` sign (equi-joins). However, real-world data, especially time-series and event data, requires a **non-equi JOIN**—a join based on comparison operators like $>, <, \ge, \le$ or $BETWEEN$. This is crucial for matching a single point in time to a larger date range.
A classic use case is matching a transaction date to an active promotion period, or an employee's task to their scheduled shift time.
The Non-Equi JOIN Syntax
To join two tables, `Events` (with a single event date) and `Periods` (with a start and end date), the `ON` clause must check if the event date is within the period's range.
Example: Event within a Period
We want to find which promotional period (P) was active when a customer placed an order (E):
SELECT
E.OrderID,
E.OrderDate,
P.PromoName
FROM
Orders E
INNER JOIN
Promotions P
ON
E.OrderDate BETWEEN P.StartDate AND P.EndDate;
Alternatively, the `BETWEEN` can be written using two standard comparison operators (which is sometimes more performant):
ON
E.OrderDate >= P.StartDate AND E.OrderDate <= P.EndDate;
Handling Overlapping Ranges (The Complex Join)
What if you want to join two tables where both contain date *ranges* (e.g., matching a room booking to a cleaning schedule)? You need to check for any overlap.
The simplest way to check for overlap is to ensure that **neither range starts after the other one ends**.
SELECT
B.BookingID,
C.CleanerID
FROM
Bookings B
INNER JOIN
CleaningSchedules C
ON
B.StartDate <= C.EndDate AND C.StartDate <= B.EndDate;
If the booking start date is before the cleaning end date AND the cleaning start date is before the booking end date, an overlap is guaranteed.
Performance Consideration
Non-equi joins can be resource-intensive, especially on large datasets. Always ensure your date columns are properly **indexed** to allow the database to quickly narrow down the matching records.

Comments
Post a Comment