@@ -253,9 +253,11 @@ Containers: **Products** (flat), **Orders** (nested + arrays), **Events** (spars
| ID | Query |
| ----- | -------------------------------------------------------------------------------------------------------- |
| SQ-01 | `SELECT c.id, ARRAY(SELECT VALUE i.name FROM i IN c.items) AS itemNames FROM c` |
| SQ-02 | `SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS mostExpensive FROM c` |
| SQ-02 | `SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items) AS firstItem FROM c` |
| SQ-03 | `SELECT c.id, LAST(SELECT VALUE i FROM i IN c.items) AS lastItem FROM c` |
| SQ-04 | `SELECT c.id, (SELECT VALUE COUNT(1) FROM i IN c.items) AS itemCount FROM c` |
| SQ-05 | `SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS mostExpensive FROM c` |
| SQ-06 | `SELECT c.id, LAST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS cheapest FROM c` |

---

29 changes: 20 additions & 9 deletions packages/nosql-language-service/docs/test-suite.md
@@ -91,22 +91,33 @@ node scripts/import-seed.mjs --all --endpoint https://localhost:8081

Some fixtures are marked with `knownLimitation` in their definition. These tests **still run** against the emulator, but a failure is reported via `console.warn` instead of failing the test. The parser correctly accepts all of these queries; the limitation is in the emulator only.

| ID | Query feature | Reason |
| ------ | ------------------ | ------------------------------------------------------------------- |
| STR-12 | `TRIM()` | Not implemented in vnext-preview |
| M-07 | `LOG(0)` | Produces `-Infinity` → JSON error 4001 |
| M-13 | `LOG10(0)` | Produces `-Infinity` → JSON error 4001 |
| SQ-02 | `FIRST()` subquery | Not supported in vnext-preview |
| UDF-01 | UDF in SELECT | "Server-side scripts are not supported in this emulator" (HTTP 400) |
| UDF-02 | UDF in WHERE | "Server-side scripts are not supported in this emulator" (HTTP 400) |
| UDF-03 | UDF multiple args | "Server-side scripts are not supported in this emulator" (HTTP 400) |
| ID | Query feature | Reason |
| ------ | ----------------- | ------------------------------------------------------------------- |
| STR-12 | `TRIM()` | Not implemented in vnext-preview |
| M-07 | `LOG(0)` | Produces `-Infinity` → JSON error 4001 |
| M-13 | `LOG10(0)` | Produces `-Infinity` → JSON error 4001 |
| UDF-01 | UDF in SELECT | "Server-side scripts are not supported in this emulator" (HTTP 400) |
| UDF-02 | UDF in WHERE | "Server-side scripts are not supported in this emulator" (HTTP 400) |
| UDF-03 | UDF multiple args | "Server-side scripts are not supported in this emulator" (HTTP 400) |

> **Note:** The vnext-preview Linux emulator (PGSQL backend) does not support any server-side scripts: UDFs, stored procedures, and triggers all return HTTP 400 with `"Server-side scripts are not supported in this emulator"`. The UDF registration step in `import-seed.mjs` is kept for use against production Azure Cosmos DB or a future emulator version.

When Microsoft ships a stable Linux emulator that supports these features, remove the `knownLimitation` field from the corresponding fixture.
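
The warn-instead-of-fail behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the suite's actual runner: the `Fixture` shape, `runFixture`, and the synchronous `execute` callback are hypothetical names introduced here, and only the `knownLimitation` field comes from the text.

```typescript
// Hypothetical fixture shape; only `knownLimitation` is named in the docs above.
interface Fixture {
    id: string;
    query: string;
    knownLimitation?: string;
}

type Outcome = 'passed' | 'warned' | 'failed';

// `execute` is a synchronous stand-in for the real emulator call (assumption,
// to keep the sketch short). It is expected to throw when the query is rejected.
function runFixture(
    fixture: Fixture,
    execute: (query: string) => void,
    warn: (message: string) => void = console.warn,
): Outcome {
    try {
        execute(fixture.query);
        return 'passed';
    } catch (error) {
        const reason = error instanceof Error ? error.message : String(error);
        if (fixture.knownLimitation) {
            // The fixture still ran, but its failure is downgraded to a warning.
            warn(`[${fixture.id}] known limitation (${fixture.knownLimitation}): ${reason}`);
            return 'warned';
        }
        return 'failed';
    }
}
```

Under this sketch, removing the `knownLimitation` field from a fixture is enough to turn its emulator failure back into a hard failure.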

---

## Cosmos DB language limitations (not emulator-specific)

These fixtures parse successfully (the native `sql.y` grammar accepts them) but are rejected by **both** the emulator and production Azure Cosmos DB with HTTP 400. They are **not** emulator gaps, so they will not be fixed by a future emulator — the language service flags them statically instead.

| ID | Query feature | Reason |
| ------------- | ------------------------ | -------------------------------------------------------------------------------------------------------------------- |
| SQ-05 / SQ-06 | `ORDER BY` in a subquery | Cosmos DB does not support `ORDER BY` inside any subquery (`FIRST`/`LAST`/`ARRAY`/`EXISTS`/`(SELECT …)`/`FROM (…)`). |

> **`ORDER BY` in subqueries:** the scalar subquery expressions `FIRST()`, `LAST()`, and `ARRAY()` work (SQ-01…SQ-04 pass) — but a nested `ORDER BY` inside any subquery is invalid. This was originally mis-reported upstream as "`FIRST()` unsupported" ([Azure/azure-cosmos-db-emulator-docker#311](https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/311)); the actual discriminator is the subquery `ORDER BY`. Top-level `ORDER BY` (O/P series) is fully supported. The grammar permits the construct, so the language service surfaces it as the `ORDER_BY_IN_SUBQUERY` diagnostic (severity Error) — see `src/diagnostics/orderByInSubquery.ts`.
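
Since the subquery itself cannot sort, one possible workaround for SQ-05/SQ-06-style queries is to fetch the unsorted array and pick the extreme element on the client. The sketch below assumes the caller can post-process results; `OrderItem`, `mostExpensive`, and `cheapest` are hypothetical names, and the item shape only mirrors the `name` and `unitPrice` fields used in the SQ fixtures.

```typescript
// Hypothetical item shape, mirroring the fields used in the SQ fixtures.
interface OrderItem {
    name: string;
    unitPrice: number;
}

// A query of the accepted form: an ARRAY() subquery with no ORDER BY.
const itemsQuery = 'SELECT c.id, ARRAY(SELECT VALUE i FROM i IN c.items) AS items FROM c';

// Pick the item with the highest unitPrice, or undefined for an empty array.
function mostExpensive(items: readonly OrderItem[]): OrderItem | undefined {
    return items.reduce<OrderItem | undefined>(
        (best, item) => (best === undefined || item.unitPrice > best.unitPrice ? item : best),
        undefined,
    );
}

// Pick the item with the lowest unitPrice, or undefined for an empty array.
function cheapest(items: readonly OrderItem[]): OrderItem | undefined {
    return items.reduce<OrderItem | undefined>(
        (best, item) => (best === undefined || item.unitPrice < best.unitPrice ? item : best),
        undefined,
    );
}
```

A single pass with `reduce` avoids sorting the whole array when only one element is needed.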

---

## Seed data

| Container | Documents | Size |
@@ -0,0 +1,127 @@
/*---------------------------------------------------------------------------------------------
* Copyright (c) Microsoft Corporation. All rights reserved.
* Licensed under the MIT License. See License.txt in the project root for license information.
*--------------------------------------------------------------------------------------------*/

import { describe, expect, it } from 'vitest';
import { parse } from '../index.js';
import { SqlLanguageService } from '../services/SqlLanguageService.js';
import { DiagnosticSeverity } from '../services/types.js';
import { detectOrderByInSubquery } from './orderByInSubquery.js';

// ========================== detectOrderByInSubquery unit tests ================

function detect(query: string) {
    return detectOrderByInSubquery(parse(query).ast);
}

describe('detectOrderByInSubquery', () => {
    // ─── Should flag ─────────────────────────────────────────────────

    it('flags ORDER BY inside a FIRST subquery', () => {
        const errors = detect(
            'SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS m FROM c',
        );
        expect(errors).toHaveLength(1);
        expect(errors[0].message).toContain('ORDER BY');
    });

    it('flags ORDER BY inside a LAST subquery', () => {
        const errors = detect(
            'SELECT c.id, LAST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS m FROM c',
        );
        expect(errors).toHaveLength(1);
    });

    it('flags ORDER BY inside an ARRAY subquery', () => {
        const errors = detect('SELECT c.id, ARRAY(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice) AS r FROM c');
        expect(errors).toHaveLength(1);
    });

    it('flags ORDER BY inside an EXISTS subquery', () => {
        const errors = detect(
            'SELECT c.id FROM c WHERE EXISTS(SELECT VALUE i FROM i IN c.items WHERE i.unitPrice > 100 ORDER BY i.unitPrice DESC)',
        );
        expect(errors).toHaveLength(1);
    });

    it('flags ORDER BY inside a scalar (SELECT …) subquery', () => {
        const errors = detect(
            'SELECT c.id, (SELECT VALUE COUNT(1) FROM i IN c.items ORDER BY i.unitPrice) AS n FROM c',
        );
        expect(errors).toHaveLength(1);
    });

    it('flags ORDER BY inside a FROM-clause subquery', () => {
        const errors = detect('SELECT s.id FROM (SELECT VALUE c FROM c ORDER BY c.id) AS s');
        expect(errors).toHaveLength(1);
    });

    it('flags multiple offending subqueries independently', () => {
        const errors = detect(
            'SELECT FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.p) AS a, ' +
                'LAST(SELECT VALUE j FROM j IN c.items ORDER BY j.p) AS b FROM c',
        );
        expect(errors).toHaveLength(2);
    });

    it('points the range at the inner ORDER BY, not the whole query', () => {
        const query = 'SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS m FROM c';
        const [err] = detect(query);
        expect(query.slice(err.range.start.offset, err.range.end.offset)).toContain('ORDER BY');
    });

    // ─── Should NOT flag ─────────────────────────────────────────────

    it('does not flag top-level ORDER BY', () => {
        expect(detect('SELECT * FROM c ORDER BY c.price DESC')).toHaveLength(0);
    });

    it('does not flag subqueries without ORDER BY', () => {
        expect(detect('SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items) AS m FROM c')).toHaveLength(0);
        expect(detect('SELECT c.id, ARRAY(SELECT VALUE i FROM i IN c.items) AS r FROM c')).toHaveLength(0);
        expect(detect('SELECT c.id FROM c WHERE EXISTS(SELECT VALUE i FROM i IN c.items WHERE i.p > 1)')).toHaveLength(
            0,
        );
    });

    it('flags only the inner ORDER BY when the outer query also sorts', () => {
        const errors = detect(
            'SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.p) AS m FROM c ORDER BY c.id',
        );
        expect(errors).toHaveLength(1);
    });

    it('returns empty for an unparseable query (parser reports it instead)', () => {
        expect(detect('SELECT FROM')).toHaveLength(0);
    });
});

// ========================== Service integration ===============================

describe('SqlLanguageService.getDiagnostics — ORDER BY in subquery', () => {
    it('emits an Error diagnostic with the ORDER_BY_IN_SUBQUERY code', () => {
        const service = new SqlLanguageService();
        const diags = service.getDiagnostics(
            'SELECT c.id, FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.unitPrice DESC) AS m FROM c',
        );
        const subqueryErrors = diags.filter((d) => d.code === 'ORDER_BY_IN_SUBQUERY');
        expect(subqueryErrors).toHaveLength(1);
        expect(subqueryErrors[0].severity).toBe(DiagnosticSeverity.Error);
    });

    it('does not emit for a valid top-level ORDER BY query', () => {
        const service = new SqlLanguageService();
        const diags = service.getDiagnostics('SELECT * FROM c ORDER BY c.price DESC');
        expect(diags.filter((d) => d.code === 'ORDER_BY_IN_SUBQUERY')).toHaveLength(0);
    });

    it('reports document-level offsets in multi-query mode', () => {
        const service = new SqlLanguageService({ multiQuery: true });
        const text = 'SELECT * FROM c;\nSELECT FIRST(SELECT VALUE i FROM i IN c.items ORDER BY i.p) AS m FROM c';
        const diags = service.getDiagnostics(text).filter((d) => d.code === 'ORDER_BY_IN_SUBQUERY');
        expect(diags).toHaveLength(1);
        // Offset must land within the second statement, after the newline.
        expect(diags[0].range.startOffset).toBeGreaterThan(text.indexOf('\n'));
    });
});
@@ -0,0 +1,88 @@
/*---------------------------------------------------------------------------------------------
* Copyright (c) Microsoft Corporation. All rights reserved.
* Licensed under the MIT License. See License.txt in the project root for license information.
*--------------------------------------------------------------------------------------------*/

// ---------------------------------------------------------------------------
// ORDER BY inside a subquery detection.
//
// Azure Cosmos DB NoSQL does **not** support `ORDER BY` inside any subquery —
// scalar subqueries (`ARRAY`, `FIRST`, `LAST`, `(SELECT …)`), `EXISTS`, and
// subqueries in the `FROM` clause. The backend rejects such queries at
// execution time with HTTP 400, even though the grammar (the native C++
// `sql.y`) accepts them syntactically: every subquery form embeds a full
// `sql_query`, which carries an optional `opt_orderby_clause`. The restriction
// is therefore semantic, not grammatical, and is not documented in the
// language reference — so we surface it as a static diagnostic instead.
//
// See: https://github.com/Azure/azure-cosmos-db-emulator-docker/issues/311
//
// Only the **outer** query may use `ORDER BY`; any `ORDER BY` on a nested
// query is flagged as an error.
// ---------------------------------------------------------------------------

import { type SqlProgram, type SqlQuery } from '../ast/nodes.js';
import { type SourceRange } from '../errors/SqlError.js';

// ========================== Public types ======================================

export interface OrderByInSubqueryError {
/** Source range of the offending `ORDER BY` clause. */
range: SourceRange;
/** Human-readable error message. */
message: string;
}

export const ORDER_BY_IN_SUBQUERY_MESSAGE =
'ORDER BY is not supported inside a subquery in Azure Cosmos DB NoSQL. ' +
'Remove it, or move the ordering to the outermost query.';

// ========================== Main entry point ==================================

/**
 * Walk a parsed AST and report every `ORDER BY` clause that appears inside a
 * subquery (i.e. on any query other than the outermost one).
 *
 * Takes the already-parsed {@link SqlProgram} rather than the query string so
 * callers can reuse their existing parse result (and to avoid an import cycle
 * with the `parse` entry point). Returns an empty array when `ast` is
 * undefined or contains no nested `ORDER BY`.
 */
export function detectOrderByInSubquery(ast: SqlProgram | undefined): OrderByInSubqueryError[] {
    if (!ast) return [];

    const errors: OrderByInSubqueryError[] = [];

    // Generic AST walk. The root query (`ast.query`) is allowed to use ORDER BY;
    // every other `Query` node is, by the grammar, reachable only through a
    // subquery construct, so its ORDER BY is illegal. `isRoot` flips to false as
    // soon as we descend past the outermost query.
    const walk = (value: unknown, isRoot: boolean): void => {
        if (Array.isArray(value)) {
            for (const item of value) walk(item, isRoot);
            return;
        }
        if (!value || typeof value !== 'object') return;

        const node = value as Record<string, unknown>;
        if (typeof node.kind !== 'string') return; // e.g. a SourceRange — no children of interest

        if (node.kind === 'Query' && !isRoot) {
            const query = node as unknown as SqlQuery;
            if (query.orderBy) {
                const range = query.orderBy.range ?? query.range;
                if (range) {
                    errors.push({ range, message: ORDER_BY_IN_SUBQUERY_MESSAGE });
                }
            }
        }

        for (const key of Object.keys(node)) {
            if (key === 'range') continue; // SourceRange holds no AST children
            walk(node[key], false);
        }
    };

    walk(ast.query, true);
    return errors;
}
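
To see the `isRoot` walk in isolation, here is a self-contained sketch that applies the same traversal to a toy AST. The node shapes (`ToyNode`, the `Call` nodes, the `select` and `arg` keys) and the function name `countNestedOrderBy` are assumptions made for illustration; the real node types live in `../ast/nodes.js` and may look different.

```typescript
// Toy node shape (assumption): any object with a string `kind` is an AST node.
interface ToyNode {
    kind: string;
    [key: string]: unknown;
}

// Count ORDER BY clauses attached to any `Query` node other than the root.
function countNestedOrderBy(root: ToyNode): number {
    let count = 0;

    const walk = (value: unknown, isRoot: boolean): void => {
        if (Array.isArray(value)) {
            for (const item of value) walk(item, isRoot);
            return;
        }
        if (!value || typeof value !== 'object') return;

        const node = value as Record<string, unknown>;
        if (typeof node.kind !== 'string') return; // not an AST node

        if (node.kind === 'Query' && !isRoot && node.orderBy) count++;

        // Everything below the node we started from is, by definition, nested.
        for (const key of Object.keys(node)) walk(node[key], false);
    };

    walk(root, true);
    return count;
}

// Outer query sorts (allowed); FIRST(...) sorts (flagged); ARRAY(...) does not sort.
const toyAst: ToyNode = {
    kind: 'Query',
    orderBy: { kind: 'OrderBy' },
    select: [
        { kind: 'Call', name: 'FIRST', arg: { kind: 'Query', orderBy: { kind: 'OrderBy' } } },
        { kind: 'Call', name: 'ARRAY', arg: { kind: 'Query' } },
    ],
};

console.log(countNestedOrderBy(toyAst)); // prints 1
```

Walking every key generically means a new subquery construct is covered without touching the detector, at the cost of visiting nodes that can never contain a query.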
@@ -13,6 +13,10 @@ ORDER BY RANK score_function(...)

- Default sort order is ascending (ASC).
- `ORDER BY RANK` is used with full-text and vector search scoring functions.
- **Not allowed inside a subquery.** Azure Cosmos DB rejects `ORDER BY` within any
  subquery — `FIRST(…)`, `LAST(…)`, `ARRAY(…)`, `EXISTS(…)`, `(SELECT …)`, and
  `FROM (SELECT …)`. Only the outermost query may sort. (The grammar accepts it, but
  the engine returns HTTP 400.)

---

2 changes: 2 additions & 0 deletions packages/nosql-language-service/src/index.ts
@@ -39,6 +39,8 @@ export { getCompletions } from './completion/SqlCompletion.js';
export type { CompletionItem, CompletionItemKind, CompletionRequest, JSONSchema } from './completion/SqlCompletion.js';
export { detectTypos } from './diagnostics/typoDetection.js';
export type { TypoWarning } from './diagnostics/typoDetection.js';
export { detectOrderByInSubquery, ORDER_BY_IN_SUBQUERY_MESSAGE } from './diagnostics/orderByInSubquery.js';
export type { OrderByInSubqueryError } from './diagnostics/orderByInSubquery.js';
export * from './errors/SqlError.js';
export { sqlToString } from './printer/SqlPrinter.js';
export * from './visitor/SqlVisitor.js';
@@ -14,6 +14,7 @@
import { type IToken, type TokenType } from 'chevrotain';
import { getCompletions, type CompletionItem, type JSONSchema } from '../completion/SqlCompletion.js';
import { detectBetweenAmbiguity } from '../diagnostics/betweenAmbiguity.js';
import { detectOrderByInSubquery } from '../diagnostics/orderByInSubquery.js';
import { detectTypos } from '../diagnostics/typoDetection.js';
import { parse, type ParseResult } from '../index.js';
import { SqlLexer } from '../lexer/SqlLexer.js';
@@ -186,7 +187,7 @@ export class SqlLanguageService {
    }

    private getSingleQueryDiagnostics(query: string): Diagnostic[] {
        const { errors } = parse(query);
        const { ast, errors } = parse(query);
        const diagnostics: Diagnostic[] = errors.map((e) => ({
            range: {
                startOffset: e.range.start.offset,
@@ -238,6 +239,24 @@ export class SqlLanguageService {
            });
        }

        // Append ORDER BY-in-subquery errors (semantic; backend rejects with HTTP 400)
        for (const e of detectOrderByInSubquery(ast)) {
            diagnostics.push({
                range: {
                    startOffset: e.range.start.offset,
                    endOffset: e.range.end.offset,
                    startLine: e.range.start.line,
                    startColumn: e.range.start.col,
                    endLine: e.range.end.line,
                    endColumn: e.range.end.col,
                },
                message: e.message,
                severity: DiagnosticSeverity.Error,
                code: 'ORDER_BY_IN_SUBQUERY',
                source: 'cosmosdb-sql',
            });
        }

        return this.filterSchemaDiagnostics(diagnostics);
    }

@@ -317,6 +336,29 @@ export class SqlLanguageService {
                    source: 'cosmosdb-sql',
                });
            }

            // ORDER BY-in-subquery errors for this region
            for (const e of detectOrderByInSubquery(region.parseResult.ast)) {
                const docStartOffset = region.startOffset + e.range.start.offset;
                const docEndOffset = region.startOffset + e.range.end.offset;
                const { line: startLine, col: startColumn } = offsetToLineCol(query, docStartOffset);
                const { line: endLine, col: endColumn } = offsetToLineCol(query, docEndOffset);

                diagnostics.push({
                    range: {
                        startOffset: docStartOffset,
                        endOffset: docEndOffset,
                        startLine,
                        startColumn,
                        endLine,
                        endColumn,
                    },
                    message: e.message,
                    severity: DiagnosticSeverity.Error,
                    code: 'ORDER_BY_IN_SUBQUERY',
                    source: 'cosmosdb-sql',
                });
            }
        }

        return this.filterSchemaDiagnostics(diagnostics);
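
The multi-query branch above shifts region-local offsets into document coordinates and then recomputes line and column from the full text. A minimal sketch of that translation follows. The `offsetToLineCol` here is a stand-in written for this example and assumes 1-based lines and columns; the service's own helper is not shown in this diff and may use a different convention. `toDocumentRange` is likewise a hypothetical name.

```typescript
interface LineCol {
    line: number;
    col: number;
}

// Stand-in helper (assumption: 1-based line and column, 0-based offset).
function offsetToLineCol(text: string, offset: number): LineCol {
    const clamped = Math.min(Math.max(offset, 0), text.length);
    let line = 1;
    let lineStart = 0;
    for (let i = 0; i < clamped; i++) {
        if (text[i] === '\n') {
            line++;
            lineStart = i + 1;
        }
    }
    return { line, col: clamped - lineStart + 1 };
}

// Translate a range expressed relative to one statement into document coordinates.
function toDocumentRange(text: string, regionStartOffset: number, localStart: number, localEnd: number) {
    const startOffset = regionStartOffset + localStart;
    const endOffset = regionStartOffset + localEnd;
    const start = offsetToLineCol(text, startOffset);
    const end = offsetToLineCol(text, endOffset);
    return {
        startOffset,
        endOffset,
        startLine: start.line,
        startColumn: start.col,
        endLine: end.line,
        endColumn: end.col,
    };
}
```

Recomputing line and column from the document offset, rather than adding line deltas, keeps the result correct even when a statement starts partway through a line.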