Skip to main content

📝 Latest Blog Post

SQL for Time-Series: How to JOIN Tables on Overlapping Date Ranges

SQL for Time-Series: How to JOIN Tables on Overlapping Date Ranges

SQL for Time-Series: How to JOIN Tables on Overlapping Date Ranges

Most SQL JOINs use the `=` sign (equi-joins). However, real-world data, especially time-series and event data, requires a **non-equi JOIN**—a join based on comparison operators like $>, <, \ge, \le$ or $BETWEEN$. This is crucial for matching a single point in time to a larger date range.

A classic use case is matching a transaction date to an active promotion period, or an employee's task to their scheduled shift time.

The Non-Equi JOIN Syntax

To join two tables, `Events` (with a single event date) and `Periods` (with a start and end date), the `ON` clause must check if the event date is within the period's range.

Example: Event within a Period

We want to find which promotional period (P) was active when a customer placed an order (E):

SELECT
  E.OrderID,
  E.OrderDate,
  P.PromoName
FROM
  Orders E
INNER JOIN
  Promotions P
ON
  E.OrderDate BETWEEN P.StartDate AND P.EndDate;

Alternatively, the `BETWEEN` can be written using two standard comparison operators (which is sometimes more performant):

ON
  E.OrderDate >= P.StartDate AND E.OrderDate <= P.EndDate;

Handling Overlapping Ranges (The Complex Join)

What if you want to join two tables where both contain date *ranges* (e.g., matching a room booking to a cleaning schedule)? You need to check for any overlap.

The simplest way to check for overlap is to ensure that **neither range starts after the other one ends**.

SELECT
  B.BookingID,
  C.CleanerID
FROM
  Bookings B
INNER JOIN
  CleaningSchedules C
ON
  B.StartDate <= C.EndDate AND C.StartDate <= B.EndDate;

If the booking start date is before the cleaning end date AND the cleaning start date is before the booking end date, an overlap is guaranteed.

Performance Consideration

Non-equi joins can be resource-intensive, especially on large datasets. Always ensure your date columns are properly **indexed** to allow the database to quickly narrow down the matching records.

Mastering the non-equi join on date ranges is an essential step for any analyst working with real-world time-series data.

Comments

🔗 Related Blog Post

🌟 Popular Blog Post