When does AI correctly suggest rewriting a correlated subquery as a window function?

The item asks the candidate to recognize when an AI assistant correctly identifies a correlated subquery that should be rewritten as a window function, and when the suggested rewrite is misguided. The competence under test is not “can you write a window function” but rather “can you read AI-suggested SQL and tell whether the rewrite preserves semantics, improves the query plan, or quietly introduces a regression.” That distinction matters because generative tools propose window-function rewrites aggressively and often correctly — but the failure modes, when they occur, are silent and easy to merge.

What this question tests

The concept being tested is the equivalence between row-by-row correlated subqueries and partitioned window functions, and the conditions under which the rewrite is semantically sound and plan-equivalent. Most modern relational engines — PostgreSQL, SQL Server, Oracle, BigQuery, Snowflake — implement window functions with a single partition-and-sort pass, while a correlated subquery is typically executed once per outer row unless the optimizer can detect and rewrite the pattern itself. AI assistants are confidently aware of the generic pattern; they are inconsistent at recognizing the preconditions under which the rewrite is safe.

The probe also tests whether the candidate has a working mental model of partition boundaries, frame clauses, and ordering stability — the three places where the naive rewrite goes wrong.

Why this is the right answer

Consider a classic correlated-subquery anti-pattern: for each order, fetch the running total of order amounts for the same customer up to and including the current order date.

-- Correlated subquery: O(N^2) without optimizer rewriting
SELECT
  o.order_id,
  o.customer_id,
  o.order_date,
  o.amount,
  (SELECT SUM(o2.amount)
   FROM orders o2
   WHERE o2.customer_id = o.customer_id
     AND o2.order_date <= o.order_date) AS running_total
FROM orders o;

A correctly-suggested AI rewrite produces:

-- Window function: single partition-and-sort pass
SELECT
  order_id,
  customer_id,
  order_date,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total
FROM orders;

The two queries return the same rows when (customer_id, order_date) is unique. When ties exist on order_date, the correlated subquery includes every tied row in the running total at every tied position, while the default window frame (RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) does the same — but ROWS framing does not, and AI tools sometimes emit ROWS without recognizing the tie semantics. The right answer documents both that the rewrite is preferred and that the framing clause is the load-bearing detail.

A second correctness condition: the correlated subquery returns NULL for outer rows where no inner rows match. The window-function rewrite can produce 0 or NULL depending on whether SUM is applied over an empty frame. When the correlated subquery’s outer predicate filters down to a subset where some customers have no qualifying history, the rewrite must either preserve the filter or wrap the window function in a COALESCE to match the original semantics.

What the wrong answers reveal

The three incorrect options each map to a common but mistaken mental model:

“Always prefer the window function — it’s faster in every case.” This is the AI-confidence failure mode. Window functions usually win on large tables, but on tiny tables or when the correlated subquery hits a covering index that the window plan can’t use, the correlated form is competitive or faster. Respondents picking this option have absorbed the generic “window functions are better” rule without the query-plan caveat.
“Never trust the AI rewrite — correlated subqueries always produce different results than window functions.” This is the over-cautious failure mode. The two forms are semantically equivalent under specific conditions; the candidate’s job is to verify those conditions, not to reject the rewrite categorically. Respondents picking this option likely have been burned by a bad AI rewrite once and over- generalize from that experience.
“The rewrite is safe whenever the inner query has the same WHERE predicate as the outer query.” This conflates predicate equivalence with frame equivalence. Two queries can share a WHERE clause and still differ on tie-handling, empty-frame behavior, and ordering stability. Respondents picking this option have a partial mental model that misses the frame-clause subtleties.

The wrong-answer patterns cluster into AI over-trust, AI under-trust, and partial pattern-matching — three distinct gaps that the full assessment can score separately.

How the sample test scores you

In the AIEH 5-question AI-Augmented SQL sample, this item contributes one of five datapoints aggregated into a single ai_augmented_sql score via the W3.2 normalize-by-count threshold. Binary scoring per item: 5 for the correct option, 1 for any of the three wrong options. With 5 binary items, the average ranges 1–5 and the level threshold maps avg ≤ 2 to low, ≤ 4 to mid, > 4 to high.

Data Notice: Sample-test results are directional indicators only. A 5-question sample can’t reliably distinguish “reads AI SQL critically” from “got lucky on these specific items”; for a verified Skills Passport credential, take the full 50-question assessment.

The full assessment probes correlated-subquery rewrites, implicit join detection, NULL handling, GROUP BY semantics, and index-hint review at depth across PostgreSQL, MySQL, SQL Server, and BigQuery dialects. See the scoring methodology for how AI-Augmented SQL scores map onto the AIEH 300–850 Skills Passport scale.

Optimizer-driven subquery decorrelation. Modern PostgreSQL and SQL Server can sometimes decorrelate correlated subqueries automatically, producing a plan equivalent to the window-function form. The AI assistant’s rewrite suggestion is therefore not always a performance win — sometimes the optimizer was already going to do the same thing. Reading the EXPLAIN plan for both forms is the only reliable way to distinguish “AI found a real win” from “AI found a stylistic preference.”
Frame clause defaults across dialects. PostgreSQL and SQL Server default to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW when no frame is specified; MySQL 8 follows the same default. This default differs from ROWS framing in the presence of ties — the difference that AI-suggested rewrites most commonly get wrong.
LATERAL joins as a third path. When the correlated subquery is too complex for a window-function rewrite (e.g., it returns multiple columns or applies non-aggregable logic), a LATERAL join (PostgreSQL, SQL Server, MySQL 8.0.14+) is often the right intermediate form. AI assistants frequently miss this option entirely, jumping from correlated subquery to window function without considering LATERAL.
Cost-based optimizer hints and plan stability. Even when the rewrite is semantically correct, plan stability matters for production workloads. A query that ran in 50ms with the correlated form may run in 50ms with the window form too — but on a different plan that responds differently to statistics drift.

For the broader AI-Augmented SQL lineup including the full 50-question assessment when it ships, see the tests catalog. For how SQL fluency relates to other hiring signals, see skills-based hiring evidence and the data engineering interview prep guide.

Sources

ISO/IEC 9075-2:2023. Information technology — Database languages — SQL — Part 2: Foundation (SQL/Foundation). International Organization for Standardization. — Window function semantics defined in clause 4.16 and 7.15.
PostgreSQL Global Development Group. (2024). PostgreSQL Documentation: Window Functions. https://www.postgresql.org/docs/current/tutorial-window.html
Microsoft. (2024). SQL Server Documentation: SELECT — OVER Clause (Transact-SQL). https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262–274. — Job-knowledge tests (which technical SQL items operationalize) show validity coefficients in the .48–.51 range for predicting job performance.

What this question tests

Why this is the right answer

What the wrong answers reveal

How the sample test scores you

Related concepts

Sources

Try the question yourself