Window Functions#
Window functions in Daft SQL allow you to perform calculations across a set of rows that are related to the current row, similar to aggregate functions but without collapsing the result into a single row.
Warning
Window function support in Daft SQL is currently limited. Full SQL window function support is under development.
Basic Syntax#
The general syntax for window functions in Daft is:
1 2 3 4 5 |
|
Where:
function_name
is the name of the window functionPARTITION BY
divides the result set into partitions to which the window function is appliedORDER BY
defines the logical order of rows within each partition- Note: NULL values are positioned at the end for ascending order (default) and at the beginning for descending order
frame_clause
defines a subset of rows in the current partition (called the window frame)
Supported Window Functions#
The following window functions are currently supported:
Ranking Functions#
-
ROW_NUMBER()
: Returns the sequential row number starting from 1 within the partition.1 2 3 4 5
SELECT category, value, ROW_NUMBER() OVER (PARTITION BY category ORDER BY value) as row_num FROM sales
-
RANK()
: Returns the rank of the current row within a partition, with gaps in the ranking sequence when there are ties.1 2 3 4 5
SELECT category, value, RANK() OVER (PARTITION BY category ORDER BY value) as rank FROM sales
-
DENSE_RANK()
: Returns the rank of the current row within a partition, without gaps in the ranking sequence when there are ties.1 2 3 4 5
SELECT category, value, DENSE_RANK() OVER (PARTITION BY category ORDER BY value) as dense_rank FROM sales
Offset Functions#
-
LAG(value [, offset [, default]])
: Returns the value from a row that is offset rows before the current row. If no such row exists, returns the default value. The offset parameter defaults to 1 if not specified.1 2 3 4 5
SELECT date, value, LAG(value, 1, 0) OVER (ORDER BY date) as previous_value FROM time_series
-
LEAD(value [, offset [, default]])
: Returns the value from a row that is offset rows after the current row. If no such row exists, returns the default value. The offset parameter defaults to 1 if not specified.1 2 3 4 5
SELECT date, value, LEAD(value, 1, 0) OVER (ORDER BY date) as next_value FROM time_series
Aggregate Functions#
All Daft aggregate functions can be used as window functions. Common examples include:
SUM([expr])
: Returns the sum of expression values.AVG([expr])
: Returns the average of expression values.COUNT([expr])
: Returns the count of non-null expression values.MIN([expr])
: Returns the minimum expression value.MAX([expr])
: Returns the maximum expression value.
Example:
1 2 3 4 5 6 |
|
Note
When using aggregate functions with both PARTITION BY
and ORDER BY
, the default window frame includes all rows from the start of the partition up to the current row — equivalent to ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
.
Window Frame Specification#
When using aggregate functions as window functions, you can specify a window frame to define which rows to include in the aggregation:
1 2 3 4 5 6 |
|
Where:
ROWS
indicates that the frame is defined by physical row countRANGE
indicates that the frame is defined by logical value (not fully supported yet)-
frame_start
andframe_end
can be one of:UNBOUNDED PRECEDING
: All rows before the current row (only valid forframe_start
)n PRECEDING
: n rows before the current rowCURRENT ROW
: The current rown FOLLOWING
: n rows after the current rowUNBOUNDED FOLLOWING
: All rows after the current row (only valid forframe_end
)
Examples:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
Limitations#
- Global partitions (window functions without
PARTITION BY
) are not yet supported - Named window specifications (
WINDOW
clause) are not supported IGNORE NULLS
andRESPECT NULLS
options are not supported