diff --git a/documentation/query/functions/window-functions/overview.md b/documentation/query/functions/window-functions/overview.md index 3a533344d..8e023e213 100644 --- a/documentation/query/functions/window-functions/overview.md +++ b/documentation/query/functions/window-functions/overview.md @@ -2,7 +2,7 @@ title: Window Functions Overview sidebar_label: Overview description: Introduction to window functions in QuestDB - perform calculations across related rows without collapsing results. -keywords: [window functions, over, partition by, moving average, running total, rank, dense_rank, percent_rank, row_number, lag, lead, analytics, ema, vwema, ksum, exponential moving average, stddev, variance, covariance, correlation] +keywords: [window functions, over, partition by, moving average, running total, rank, dense_rank, percent_rank, ntile, cume_dist, row_number, lag, lead, nth_value, analytics, ema, vwema, ksum, exponential moving average, stddev, variance, covariance, correlation] --- Window functions perform calculations across sets of table rows related to the current row. Unlike aggregate functions that return a single result for a group of rows, window functions return a value for **every row** while considering a "window" of related rows defined by the `OVER` clause. @@ -44,23 +44,26 @@ Arithmetic operations on window functions (e.g., `sum(...) OVER (...) / sum(...) 
| Function | Description | Respects Frame | |----------|-------------|----------------| | [`avg()`](reference.md#avg) | Average value in window (also supports EMA and VWEMA) | Yes (standard) / No (EMA/VWEMA) | +| [`corr()`](reference.md#corr) | Pearson correlation coefficient | Yes | | [`count()`](reference.md#count) | Count rows or non-null values | Yes | -| [`sum()`](reference.md#sum) | Sum of values in window | Yes | -| [`ksum()`](reference.md#ksum) | Sum with Kahan precision | Yes | -| [`min()`](reference.md#min) | Minimum value in window | Yes | -| [`max()`](reference.md#max) | Maximum value in window | Yes | -| [`stddev_pop()` / `stddev_samp()` / `stddev()`](reference.md#stddev) | Standard deviation (population or sample) | Yes | -| [`var_pop()` / `var_samp()` / `variance()`](reference.md#variance) | Variance (population or sample) | Yes | | [`covar_pop()` / `covar_samp()`](reference.md#covariance) | Covariance between two columns | Yes | -| [`corr()`](reference.md#corr) | Pearson correlation coefficient | Yes | -| [`first_value()`](reference.md#first_value) | First value in window | Yes | -| [`last_value()`](reference.md#last_value) | Last value in window | Yes | -| [`row_number()`](reference.md#row_number) | Sequential row number | No | -| [`rank()`](reference.md#rank) | Rank with gaps for ties | No | +| [`cume_dist()`](reference.md#cume_dist) | Cumulative distribution (0 to 1] | No | | [`dense_rank()`](reference.md#dense_rank) | Rank without gaps | No | -| [`percent_rank()`](reference.md#percent_rank) | Relative rank (0 to 1) | No | +| [`first_value()`](reference.md#first_value) | First value in window | Yes | +| [`ksum()`](reference.md#ksum) | Sum with Kahan precision | Yes | | [`lag()`](reference.md#lag) | Value from previous row | No | +| [`last_value()`](reference.md#last_value) | Last value in window | Yes | | [`lead()`](reference.md#lead) | Value from following row | No | +| [`max()`](reference.md#max) | Maximum value in window | Yes | +| 
[`min()`](reference.md#min) | Minimum value in window | Yes | +| [`nth_value()`](reference.md#nth_value) | N-th value in window (`double` only) | Yes | +| [`ntile()`](reference.md#ntile) | Bucket number from 1 to N | No | +| [`percent_rank()`](reference.md#percent_rank) | Relative rank (0 to 1) | No | +| [`rank()`](reference.md#rank) | Rank with gaps for ties | No | +| [`row_number()`](reference.md#row_number) | Sequential row number | No | +| [`stddev_pop()` / `stddev_samp()` / `stddev()`](reference.md#stddev) | Standard deviation (population or sample) | Yes | +| [`sum()`](reference.md#sum) | Sum of values in window | Yes | +| [`var_pop()` / `var_samp()` / `variance()`](reference.md#variance) | Variance (population or sample) | Yes | **Respects Frame**: Functions marked "Yes" use the frame clause (`ROWS`/`RANGE BETWEEN`). Functions marked "No" operate on the entire partition regardless of frame specification. diff --git a/documentation/query/functions/window-functions/reference.md b/documentation/query/functions/window-functions/reference.md index 39e13be6d..00e3b57f0 100644 --- a/documentation/query/functions/window-functions/reference.md +++ b/documentation/query/functions/window-functions/reference.md @@ -1,8 +1,8 @@ --- title: Window Functions Reference sidebar_label: Function Reference -description: Complete reference for all window functions in QuestDB including avg, sum, ksum, count, stddev, variance, covariance, correlation, rank, dense_rank, percent_rank, row_number, lag, lead, EMA, VWEMA, and more. 
-keywords: [window functions, avg, sum, ksum, count, stddev, stddev_pop, stddev_samp, var_pop, var_samp, variance, covar_pop, covar_samp, corr, correlation, rank, dense_rank, percent_rank, row_number, lag, lead, first_value, last_value, min, max, ema, vwema, exponential moving average] +description: Complete reference for all window functions in QuestDB including avg, sum, ksum, count, stddev, variance, covariance, correlation, rank, dense_rank, percent_rank, ntile, cume_dist, row_number, lag, lead, nth_value, EMA, VWEMA, and more. +keywords: [window functions, avg, sum, ksum, count, stddev, stddev_pop, stddev_samp, var_pop, var_samp, variance, covar_pop, covar_samp, corr, correlation, rank, dense_rank, percent_rank, ntile, cume_dist, row_number, lag, lead, first_value, last_value, nth_value, min, max, ema, vwema, exponential moving average] --- This page provides detailed documentation for each window function. For an introduction to window functions and how they work, see the [Overview](overview.md). For syntax details on the `OVER` clause, see [OVER Clause Syntax](syntax.md). @@ -29,12 +29,12 @@ avg(value, kind, param, volume) OVER (window_definition) **Arguments:** - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) to calculate the average of -- `kind` (EMA/VWEMA): Smoothing mode - `'alpha'`, `'period'`, or a time unit (`'second'`, `'minute'`, `'hour'`, `'day'`, `'week'`) +- `kind` (EMA/VWEMA): Smoothing mode. One of `'alpha'`, `'period'`, or a time unit (`'second'`, `'minute'`, `'hour'`, `'day'`, `'week'`) - `param` (EMA/VWEMA): Parameter for the smoothing mode (see below) - `volume` (VWEMA only): Numeric column representing volume weights **Return value:** -- `double` - The average of `value` for rows in the window frame +- `double`. 
The average of `value` for rows in the window frame **Description:** @@ -59,7 +59,7 @@ SELECT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS moving_avg FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` #### Exponential Moving Average (EMA) @@ -88,7 +88,7 @@ SELECT ORDER BY timestamp ) AS ema_alpha FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="10-period EMA" demo @@ -101,7 +101,7 @@ SELECT ORDER BY timestamp ) AS ema_10 FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="Time-weighted EMA with 5-minute decay" demo @@ -114,7 +114,7 @@ SELECT ORDER BY timestamp ) AS ema_5min FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` :::note EMA behavior @@ -148,7 +148,7 @@ SELECT ORDER BY timestamp ) AS vwema_alpha FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="10-period VWEMA" demo @@ -161,7 +161,7 @@ SELECT ORDER BY timestamp ) AS vwema_10 FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="Time-weighted VWEMA with 1-hour decay" demo @@ -174,7 +174,7 @@ SELECT ORDER BY timestamp ) AS vwema_1h FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` :::note VWEMA behavior @@ -195,13 +195,13 @@ corr(y, x) OVER (window_definition) ``` **Arguments:** -- `y`: Numeric column - the dependent variable -- `x`: Numeric column - the independent variable +- `y`: Numeric column, the dependent variable +- `x`: Numeric column, the independent variable Rows where either `x` or `y` is `NULL` are excluded from the computation. **Return value:** -- `double` - The Pearson correlation coefficient. Returns `NULL` when there are fewer than 2 valid pairs, or when either variable has zero variance (all values identical). +- `double`. The Pearson correlation coefficient. 
Returns `NULL` when there are fewer than 2 valid pairs, or when either variable has zero variance (all values identical). **Description:** @@ -254,7 +254,7 @@ count(value) OVER (window_definition) - `value`: Counts non-null values only **Return value:** -- `long` - Number of rows or non-null values in the window frame +- `long`. Number of rows or non-null values in the window frame **Description:** @@ -276,7 +276,7 @@ SELECT RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW ) AS trades_last_second FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -292,13 +292,13 @@ covar_samp(y, x) OVER (window_definition) ``` **Arguments:** -- `y`: Numeric column - the dependent variable -- `x`: Numeric column - the independent variable +- `y`: Numeric column, the dependent variable +- `x`: Numeric column, the independent variable Rows where either `x` or `y` is `NULL` are excluded from the computation. **Return value:** -- `double` - The covariance of `y` and `x` for rows in the window frame. Returns `NULL` when there are fewer than 1 (pop) or 2 (samp) valid pairs. +- `double`. The covariance of `y` and `x` for rows in the window frame. Returns `NULL` when there are fewer than 1 (pop) or 2 (samp) valid pairs. **Description:** @@ -341,7 +341,7 @@ OVER ([PARTITION BY partition_expression] - `RESPECT NULLS` (default): Include null values **Return value:** -- Same type as input - The first value in the window frame (or first non-null with `IGNORE NULLS`) +- Same type as input. 
The first value in the window frame (or first non-null with `IGNORE NULLS`) **Description:** @@ -367,7 +367,7 @@ SELECT ORDER BY timestamp ) AS first_non_null_price FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -385,7 +385,7 @@ ksum(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) to sum **Return value:** -- `double` - The sum of `value` for rows in the window frame with improved precision +- `double`. The sum of `value` for rows in the window frame with improved precision **Description:** @@ -410,7 +410,7 @@ SELECT ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS cumulative_price FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="Sliding window sum with precision" demo @@ -423,7 +423,7 @@ SELECT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS rolling_sum FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -446,7 +446,7 @@ OVER ([PARTITION BY partition_expression] - `RESPECT NULLS` (default): Include null values **Return value:** -- Same type as input - The last value in the window frame (or last non-null with `IGNORE NULLS`) +- Same type as input. The last value in the window frame (or last non-null with `IGNORE NULLS`) **Description:** @@ -476,7 +476,7 @@ SELECT ORDER BY timestamp ) AS last_non_null_price FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` This example: @@ -499,7 +499,7 @@ max(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) **Return value:** -- Same type as input - The maximum value (excluding null) in the window frame +- Same type as input. 
The maximum value (excluding null) in the window frame **Description:** @@ -522,7 +522,7 @@ SELECT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS highest_price FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -540,7 +540,7 @@ min(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) **Return value:** -- Same type as input - The minimum value (excluding null) in the window frame +- Same type as input. The minimum value (excluding null) in the window frame **Description:** @@ -563,11 +563,77 @@ SELECT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS lowest_price FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- +### nth_value() {#nth_value} + +Returns the `n`-th value (1-based) within the current window frame. + +**Syntax:** +```questdb-sql +nth_value(value, n) OVER (window_definition) +``` + +**Arguments:** +- `value`: `double` column or expression to retrieve +- `n`: Positive integer constant, the 1-based position within the frame + +**Return value:** +- `double`. The `n`-th value in the window frame, or `NULL` when the frame contains fewer than `n` rows + +**Description:** + +`nth_value()` is similar to `lag()`, but while `lag()` counts the offset relative to the current row, `nth_value()` counts from the start of the frame. For each row, it looks at the rows currently in the frame and returns the `n`-th one. When the frame is smaller than `n` (e.g. `n = 3` but only 2 rows are in scope), the result is `NULL`. + +Common use cases include: + +- **Reference value within a window**: Compare the current row to a fixed slot in the window (e.g. the third price in the last 10 trades) +- **Anchor points**: Pick out a specific row from each partition, such as the second observation in a session +- **Quantile-style spot checks**: Combine with frame clauses to read a specific position in a sliding range + +**Behavior:** +- `n` must be a compile-time constant. 
A non-constant expression for `n` is rejected at parse time +- `n = 1` returns the same value as `first_value(value)` for the same frame +- `IGNORE NULLS` / `RESPECT NULLS` are not supported +- Supports both `ROWS` and `RANGE` frames, bounded and unbounded +- For `RANGE` frames, the query must be ordered by the designated timestamp + +**Example:** +```questdb-sql title="3rd most recent price in 5-row window" demo +SELECT + symbol, + price, + timestamp, + nth_value(price, 3) OVER ( + PARTITION BY symbol + ORDER BY timestamp + ROWS 4 PRECEDING + ) AS third_price +FROM trades +WHERE timestamp IN '$today'; +``` + +```questdb-sql title="Compare nth_value with first_value" demo +SELECT + symbol, + price, + timestamp, + first_value(price) OVER w AS first_price, + nth_value(price, 1) OVER w AS nth_1, + nth_value(price, 2) OVER w AS nth_2, + nth_value(price, 3) OVER w AS nth_3 +FROM trades +WHERE timestamp IN '$today' AND symbol = 'BTC-USDT' +WINDOW w AS (ORDER BY timestamp ROWS 2 PRECEDING); +``` + +With a 3-row frame, `nth_1` always equals `first_price`. `nth_2` is `NULL` on the first row and `nth_3` is `NULL` on the first two rows, because the frame does not yet hold enough rows. Once the frame is full, `nth_3` equals the current row's `price` because it is the last position in the frame. + +--- + ### stddev_pop() / stddev_samp() / stddev() {#stddev} Calculates the standard deviation of values over the window frame. `stddev_pop()` computes population standard deviation (divides by N), `stddev_samp()` computes sample standard deviation (divides by N-1). `stddev()` is an alias for `stddev_samp()`. @@ -583,7 +649,7 @@ stddev(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) **Return value:** -- `double` - The standard deviation of `value` for rows in the window frame. Returns `NULL` when there are no values (or no non-null values). `stddev_samp()` and `stddev()` also return `NULL` when there is only one value (since N-1 = 0). +- `double`. The standard deviation of `value` for rows in the window frame. 
Returns `NULL` when there are no values (or no non-null values). `stddev_samp()` and `stddev()` also return `NULL` when there is only one value (since N-1 = 0). **Description:** @@ -612,7 +678,7 @@ SELECT ROWS BETWEEN 19 PRECEDING AND CURRENT ROW ) AS volatility_20 FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="Z-score via subquery" @@ -642,7 +708,7 @@ sum(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) **Return value:** -- `double` - The sum of `value` for rows in the window frame +- `double`. The sum of `value` for rows in the window frame **Description:** @@ -665,7 +731,7 @@ SELECT ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS cumulative_amount FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -685,7 +751,7 @@ variance(value) OVER (window_definition) - `value`: Numeric column (`short`, `int`, `long`, `float`, `double`) **Return value:** -- `double` - The variance of `value` for rows in the window frame. Returns `NULL` when there are no values. `var_samp()` and `variance()` also return `NULL` for a single value. +- `double`. The variance of `value` for rows in the window frame. Returns `NULL` when there are no values. `var_samp()` and `variance()` also return `NULL` for a single value. **Description:** @@ -707,14 +773,79 @@ SELECT ROWS BETWEEN 9 PRECEDING AND CURRENT ROW ) AS price_variance FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- ## Ranking functions -These functions assign ranks or row numbers. They ignore the frame clause and operate on the entire partition. +These functions assign ranks, row numbers, or partition-scoped distribution values. They ignore the frame clause and operate on the entire partition. 
+ +### cume_dist() {#cume_dist} + +Returns the cumulative distribution: the number of rows at or before the current row (including peers) divided by the total number of rows in the partition. The result lies in the range (0, 1]. + +**Syntax:** +```questdb-sql +cume_dist() OVER (window_definition) +``` + +**Arguments:** +- None required + +**Return value:** +- `double`. The cumulative distribution value for the current row's peer group + +**Description:** + +`cume_dist()` is closely related to `percent_rank()`. Where `percent_rank()` reports relative position using `(rank - 1) / (total_rows - 1)`, `cume_dist()` reports the fraction of rows with `ORDER BY` values *at or before* the current row's value. All peer rows (rows with identical `ORDER BY` values) receive the same `cume_dist`, equal to the position of the last peer divided by total rows. + +Use `cume_dist()` to express thresholds in terms of how much of the partition has been seen so far. Common use cases include: + +- **Top/bottom percentile filters**: Keep only rows with `cume_dist <= 0.1` to grab the bottom 10% of a distribution +- **Histogram bucketing**: Group rows by `cume_dist` ranges to build empirical CDFs +- **Anomaly thresholds**: Flag rows that fall outside the bulk of the partition's distribution + +**Behavior:** +- Without `ORDER BY`, all rows are peers and `cume_dist` is `1.0` for every row +- The last peer group in a partition always evaluates to `1.0` +- `ROWS`, `RANGE`, and `EXCLUDE` are rejected. `cume_dist` is always partition-scoped + +**Example:** +```questdb-sql title="Cumulative distribution by price" demo +SELECT + symbol, + price, + timestamp, + cume_dist() OVER ( + PARTITION BY symbol + ORDER BY price + ) AS price_cdf +FROM trades +WHERE timestamp IN '$today' AND symbol = 'BTC-USDT' +ORDER BY price DESC; +``` + +The highest prices appear first with `price_cdf = 1.0`, meaning 100% of trades are at or below that price. 
As you scroll down, the value decreases, showing what fraction of the partition each price level covers. + +```questdb-sql title="cume_dist with peer rows" +SELECT ts, val, + cume_dist() OVER (ORDER BY val) AS cd +FROM tab; +``` + +| ts | val | cd | +|----|-----|-----| +| 2026-05-08T09:30:00.000000Z | 1 | 0.4 | +| 2026-05-08T09:30:01.000000Z | 1 | 0.4 | +| 2026-05-08T09:30:02.000000Z | 2 | 0.8 | +| 2026-05-08T09:30:03.000000Z | 2 | 0.8 | +| 2026-05-08T09:30:04.000000Z | 3 | 1.0 | + +The two rows with `val = 1` are peers, so they share `cume_dist = 2 / 5 = 0.4`. Likewise the rows with `val = 2` share `cume_dist = 4 / 5 = 0.8`. + +--- ### dense_rank() @@ -753,11 +884,75 @@ SELECT ORDER BY price DESC ) AS price_rank FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- +### ntile() {#ntile} + +Distributes the rows of an ordered partition into `n` approximately equal buckets and returns the 1-based bucket number for each row. + +**Syntax:** +```questdb-sql +ntile(n) OVER (window_definition) +``` + +**Arguments:** +- `n`: Positive integer constant, the number of buckets + +**Return value:** +- `long`. Bucket number from `1` to `n` + +**Description:** + +When the partition row count divides evenly by `n`, every bucket has the same size. When it doesn't, the larger buckets come first: with 10 rows and `n = 3`, the buckets contain 4, 3, and 3 rows. + +Use `ntile()` to build distribution-based groupings. Common use cases include: + +- **Quartiles, deciles, percentiles**: Use `ntile(4)`, `ntile(10)`, or `ntile(100)` to bucket rows by an ordered measure +- **Even-sized batches**: Split a partition into `n` worker batches without writing manual range logic +- **Tiered classification**: Assign records to numbered tiers (top tier, middle tier, bottom tier) by some ranked metric + +**Behavior:** +- `n` must be a compile-time constant. 
A non-constant expression is rejected at parse time +- `n` must be a positive integer; `0`, negative values, or `NULL` are rejected +- Without `ORDER BY`, rows are bucketed in table-scan order +- When `n` exceeds the partition row count, each row gets its own bucket (numbered `1` through row count) and the higher bucket numbers are unused +- `ROWS`, `RANGE`, and `EXCLUDE` are rejected. `ntile` is always partition-scoped + +**Example:** +```questdb-sql title="Quartiles per symbol" demo +SELECT + symbol, + price, + timestamp, + ntile(4) OVER ( + PARTITION BY symbol + ORDER BY price + ) AS price_quartile +FROM trades +WHERE timestamp IN '$today'; +``` + +```questdb-sql title="ntile with uneven distribution" +SELECT ts, val, + ntile(3) OVER (ORDER BY ts) AS bucket +FROM tab; +``` + +| ts | val | bucket | +|----|-----|--------| +| 2026-05-08T09:30:00.000000Z | 10.0 | 1 | +| 2026-05-08T09:30:01.000000Z | 20.0 | 1 | +| 2026-05-08T09:30:02.000000Z | 30.0 | 2 | +| 2026-05-08T09:30:03.000000Z | 40.0 | 2 | +| 2026-05-08T09:30:04.000000Z | 50.0 | 3 | + +With 5 rows and `n = 3`, the leading buckets (1 and 2) get an extra row each. + +--- + ### percent_rank() Returns the relative rank of the current row within its partition as a value between 0 and 1. 
@@ -805,7 +1000,7 @@ SELECT ORDER BY price DESC ) AS price_percentile FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ```questdb-sql title="Compare rank functions" demo @@ -816,7 +1011,7 @@ SELECT rank() OVER (ORDER BY price DESC) AS rank, percent_rank() OVER (ORDER BY price DESC) AS percent_rank FROM trades -WHERE timestamp IN '[$today]' +WHERE timestamp IN '$today' AND symbol = 'BTC-USDT'; ``` @@ -869,7 +1064,7 @@ SELECT ORDER BY price DESC ) AS price_rank FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -911,7 +1106,7 @@ SELECT ORDER BY timestamp ) AS trade_number FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` --- @@ -958,7 +1153,7 @@ OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression]) - `RESPECT NULLS` (default): Include null values in offset counting **Return value:** -- Same type as input - Value from the specified previous row +- Same type as input. Value from the specified previous row **Description:** @@ -988,7 +1183,7 @@ SELECT ORDER BY timestamp ) AS price_two_rows_back FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` This example: @@ -1016,7 +1211,7 @@ OVER ([PARTITION BY partition_expression] [ORDER BY sort_expression]) - `RESPECT NULLS` (default): Include null values in offset counting **Return value:** -- Same type as input - Value from the specified following row +- Same type as input. 
Value from the specified following row **Description:** @@ -1046,7 +1241,7 @@ SELECT ORDER BY timestamp ) AS price_after_next FROM trades -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` This example: @@ -1072,7 +1267,7 @@ SELECT ROWS BETWEEN 3 PRECEDING AND CURRENT ROW ) AS bid_moving_avg FROM market_data -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ### Cumulative bid size @@ -1090,7 +1285,7 @@ SELECT ROWS BETWEEN 5 PRECEDING AND CURRENT ROW ) AS bid_volume_l1_5rows FROM market_data -WHERE timestamp IN '[$today]'; +WHERE timestamp IN '$today'; ``` ### Time-based rolling sum @@ -1106,7 +1301,7 @@ SELECT RANGE BETWEEN '1' MINUTE PRECEDING AND CURRENT ROW ) AS bid_volume_1min FROM market_data -WHERE timestamp IN '[$today]' AND symbol = 'GBPUSD'; +WHERE timestamp IN '$today' AND symbol = 'GBPUSD'; ``` ### Trade frequency analysis @@ -1121,7 +1316,7 @@ SELECT COUNT(CASE WHEN side = 'buy' THEN 1 END) OVER w AS buys_per_minute, COUNT(CASE WHEN side = 'sell' THEN 1 END) OVER w AS sells_per_minute FROM trades -WHERE timestamp IN '[$today]' AND symbol = 'BTC-USDT' +WHERE timestamp IN '$today' AND symbol = 'BTC-USDT' WINDOW w AS (ORDER BY timestamp RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW); ``` @@ -1130,7 +1325,6 @@ WINDOW w AS (ORDER BY timestamp RANGE BETWEEN 60000000 PRECEDING AND CURRENT ROW ## Notes - The order of rows in the result set is not guaranteed to be consistent across query executions. Use an `ORDER BY` clause outside the `OVER` clause to ensure consistent ordering. -- Ranking functions (`row_number`, `rank`, `dense_rank`, `percent_rank`) and offset functions (`lag`, `lead`) ignore frame specifications. +- Ranking functions (`row_number`, `rank`, `dense_rank`, `percent_rank`, `cume_dist`, `ntile`) and offset functions (`lag`, `lead`) ignore frame specifications. - For time-based calculations, consider using `RANGE` frames with timestamp columns. 
- Aggregate window functions (`avg`, `sum`, `ksum`, `count`, `min`, `max`) support numeric types: `short`, `int`, `long`, `float`, `double`. The `decimal` type is not supported. -- `ntile()` and `cume_dist()` are not currently supported.
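The bucket-size, peer-group, and frame-position rules that this change documents for `ntile()`, `cume_dist()`, and `nth_value()` can be sanity-checked with a small model. The Python sketch below is not QuestDB code. The helper names (`ntile_buckets`, `cume_dist`, `nth_value_rows`) are hypothetical and only mirror the semantics as worded in the added reference text, assuming a single partition and a `ROWS k PRECEDING` frame.

```python
from bisect import bisect_right


def ntile_buckets(row_count, n):
    """1-based bucket number per ordered row; larger buckets come first."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(row_count, n)
    buckets = []
    for bucket in range(1, n + 1):
        # The first `extra` buckets each take one additional row.
        size = base + (1 if bucket <= extra else 0)
        buckets.extend([bucket] * size)
    return buckets


def cume_dist(values):
    """Fraction of rows at or before each value, in ascending order.

    Peers (equal values) share the position of the last peer.
    """
    ordered = sorted(values)
    total = len(ordered)
    # bisect_right counts how many values are <= v.
    return [bisect_right(ordered, v) / total for v in ordered]


def nth_value_rows(values, n, preceding):
    """n-th value (1-based) of a ROWS `preceding` PRECEDING frame per row.

    Returns None where the frame holds fewer than n rows.
    """
    out = []
    for i in range(len(values)):
        frame = values[max(0, i - preceding): i + 1]
        out.append(frame[n - 1] if len(frame) >= n else None)
    return out


print(ntile_buckets(5, 3))                           # [1, 1, 2, 2, 3]
print(cume_dist([1, 1, 2, 2, 3]))                    # [0.4, 0.4, 0.8, 0.8, 1.0]
print(nth_value_rows([10.0, 20.0, 30.0, 40.0], 3, 2))  # [None, None, 30.0, 40.0]
```

Under these assumptions the model reproduces the `ntile(3)` and `cume_dist()` example tables in the reference page, and shows that with `ROWS 2 PRECEDING` the third frame position stays empty until the frame holds three rows.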