-
Notifications
You must be signed in to change notification settings - Fork 1
Fix all remaining tests #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…itions When a parenthesized UNION query contains a DISTINCT->ALL mode transition, the explain output should group the DISTINCT portion into a nested SelectWithUnionQuery and lift the remaining selects to the outer level. Changes: - Parser now keeps parenthesized unions as nested SelectWithUnionQuery - Added expandNestedUnions() to flatten/expand nested unions appropriately: - Single-select nested unions are flattened - All-ALL mode nested unions are fully flattened - Nested unions with DISTINCT->ALL transitions are expanded to grouped results - Updated groupSelectsByUnionMode() to find the last non-ALL->ALL transition - Applied expansion logic to both regular and inherited-WITH explain paths Fixes stmt13 and stmt28 in 01529_union_distinct_and_setting_union_default_mode Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The parser was not handling the UUID clause in CREATE MATERIALIZED VIEW statements, causing the rest of the statement to be skipped. Added UUID handling similar to how CREATE TABLE handles it. Fixes: - 00510_materizlized_view_and_deduplication_zookeeper stmt9, stmt10 - 00609_mv_index_in_in stmt7 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When COMMENT comes before SETTINGS in a CREATE TABLE statement, SETTINGS should be output at the CreateQuery level (outside Storage definition). When SETTINGS comes before COMMENT, it stays inside Storage definition. Added SettingsBeforeComment field to track the order in the AST. Fixes: - 03234_enable_secure_identifiers stmt11, stmt14 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for parsing APPLY(quantiles(0.5)) and similar parameterized function calls within column transformers. Previously, the parser only handled simple function names like APPLY(sum), but not functions with parameters. Changes: - Add ApplyParams field to ColumnTransformer struct in AST - Update parseColumnsApply and parseAsteriskApply to handle nested parentheses for parameterized functions - Fixes 01470_columns_transformers stmt41, stmt42 - Also fixes 01710_projection_with_column_transformers Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Changes: - Add parsing for ORDER BY clause in CREATE DATABASE statements - Add QuerySettings field to CreateQuery AST for second SETTINGS clause - Update parser to store second SETTINGS in QuerySettings - Update explain to output QuerySettings as Set at CreateQuery level - Fixes 02184_default_table_engine stmt56 (CREATE DATABASE ORDER BY) - Fixes 02184_default_table_engine stmt107 (multiple SETTINGS) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add parsing and explain support for RENAME DATABASE statements. Changes: - Add RenameDatabase field to RenameQuery AST - Update parser to handle RENAME DATABASE syntax - Update explain to output correct format for database renames - Fixes 01155_rename_move_materialized_view stmt44, stmt52 - Also fixes 02096_rename_atomic_hang stmt14 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add proper parsing for SETTINGS clause in CREATE DICTIONARY statements. Handle SETTINGS as a keyword token (not IDENT). Changes: - Add SETTINGS keyword handling in dictionary definition parsing - Parse settings with or without parentheses - Output as "Dictionary settings" (not "Set") in explain - Update condition to include Settings in dictionary definition check - Fixes 01268_dictionary_direct_layout stmt25, stmt26 - Also fixes: 01259_dictionary_custom_settings_ddl stmt6, 01676_range_hashed_dictionary stmt5, 01681_cache_dictionary_simple_key stmt7, 01760_polygon_dictionaries stmt17, 01765_hashed_dictionary_simple_key stmt7 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When an IN expression has a trailing comma like `IN (2,)`, it should be represented as a Function tuple with one element, not as a plain literal. - Add TrailingComma field to InExpr to track trailing comma presence - Add parseInList helper to detect trailing commas during parsing - Update explainInExpr and explainInExprWithAlias to wrap single elements with trailing comma in Function tuple Fixes stmt45 and stmt46 in 01756_optimize_skip_unused_shards_rewrite_in and stmt5 in 01757_optimize_skip_unused_shards_limit. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When parsing negative numbers with :: cast like `-0.11::Float32`, the expression parsing loop in parseUnaryMinus was using LOWEST precedence, which incorrectly consumed the `and` keyword from BETWEEN expressions. Fix by using MUL_PREC as the threshold, which allows casts (::) and member access (.) but stops before operators like AND. Fixes stmt42 and stmt43 in 02892_orc_filter_pushdown and stmt26 in 02841_parquet_filter_pushdown. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added support for underscore digit separators in binary literals (0b0010_0100_0111) and octal literals (0o755_644) to match ClickHouse's behavior for numeric literals with underscores. Updated both readNumber and readNumberOrIdent functions in the lexer to handle underscores in binary and octal number formats. Fixes stmt6 in 02354_numeric_literals_with_underscores. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added case for TernaryExpr in explainWithElement to properly output the alias from WITH clause. Ternary expressions (? :) become Function if with the alias from the WITH element. Fixes stmt1 in 03254_uniq_exact_two_level_negative_zero. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added case for "CHANGED" keyword in parseShow to handle the SHOW CHANGED SETTINGS query form which filters settings to only show those that have been modified from their defaults. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Test 03254_test_alter_user_no_changes has stmt2 with clientError SYNTAX_ERROR which means ClickHouse expects it to fail parsing. There's no explain_2.txt since ClickHouse can't produce EXPLAIN output for a syntax error. Our parser is more permissive and parses it anyway, but there's nothing to compare against. Removing stmt2 from explain_todo since: - No explain_2.txt exists to compare against - Test already skips due to missing file - Parser being more permissive is acceptable behavior Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse supports short notations for interval units like "INTERVAL 4 h" for hours. Added support for: - h -> hour - m -> minute - s -> second - d -> day - w -> week - ms -> millisecond - us -> microsecond - ns -> nanosecond Updated both the parser's intervalUnits map and the explain code's normalizeIntervalUnit functions to handle these short forms. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
In ClickHouse, the ternary operator (? :) has very low precedence, lower than AND and OR. This means "0 AND id ? 1 : 2" parses as "(0 AND id) ? 1 : 2" not "0 AND (id ? 1 : 2)". Added TERNARY_PREC precedence level between ALIAS_PREC and OR_PREC to ensure correct parsing of expressions with ternary operators. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse sometimes appends "OK" as a success indicator after EXPLAIN AST output. This is not part of the actual AST and should be stripped when comparing expected vs actual output. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
In ClickHouse, unary plus (+x) is a no-op and doesn't appear in the EXPLAIN AST output. Updated parseUnaryPlus to simply return the operand without wrapping it in a UnaryExpr. This fixes parsing of expressions like (+c0.2) which should produce just tupleElement, not Function + wrapping tupleElement. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added parsing and explain support for the ALTER TABLE MODIFY QUERY statement which modifies the SELECT query of a materialized view. Changes: - Added AlterModifyQuery command type to ast/ast.go - Added Query field to AlterCommand struct - Added parsing logic for MODIFY QUERY in parser.go - Added explain output handling in statements.go Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add escapeFunctionAlias function that only escapes single quotes (not backslashes) for function aliases. This differs from escapeAlias (used for column aliases) which also escapes backslashes. ClickHouse EXPLAIN AST preserves backslashes in function aliases but requires single quotes to be escaped. Fixes test: 02915_analyzer_fuzz_1/stmt2 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When ClickHouse errors at runtime (e.g., BAD_ARGUMENTS for invalid settings),
it doesn't produce EXPLAIN output. These statements have empty expected files
and the --{clientError annotation in the SQL.
Skip these tests since the parser correctly parses the valid SQL but ClickHouse
would error at runtime. Also update explain_todo when -check-explain finds
these cases.
Fixes test: 03001_max_parallel_replicas_zero_value/stmt3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…UTF-8 1. Lexer: Process escape sequences (\xFF, \0, etc.) in backtick-quoted identifiers, matching the behavior of string literals. 2. Explain output: - Replace invalid UTF-8 bytes with replacement character (U+FFFD) - Display null bytes as escape sequence \0 - Escape backslashes and single quotes in identifier/alias output 3. Apply sanitization to: - Column declarations - Identifier names - Function aliases - Storage definition ORDER BY identifiers This matches ClickHouse's EXPLAIN AST behavior for handling special characters in identifiers. Fixes test: 03356_tables_with_binary_identifiers_invalid_utf8/stmt2 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove the condition that skipped quoted aliases in CASE expressions. ClickHouse EXPLAIN AST shows the alias regardless of whether it was quoted in the original SQL. Fixes test: 02244_casewithexpression_return_type/stmt1 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reorder column declaration children to match ClickHouse's EXPLAIN AST: 1. Type 2. Settings (SETTINGS clause) 3. Default value 4. TTL 5. Codec 6. Statistics 7. Comment Previously Settings was output after TTL and Statistics, but ClickHouse outputs it right after the type. Fixes test: 03270_fix_column_modifier_write_order/stmt3 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add check for PARTITION ID 'value' syntax when parsing ALTER TABLE APPLY DELETED MASK IN PARTITION clause. This sets the PartitionIsID flag so the explain output correctly shows Partition_ID instead of Partition. Fixes test: 03743_fix_estimator_crash/stmt6 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
FORCE is equivalent to FINAL in OPTIMIZE TABLE statements - both force a merge. Add parsing support for FORCE to set the Final flag. Fixes test: 03306_optimize_table_force_keyword/stmt4 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When arrays contain parenthesized literals like [[((NULL))]], they should be rendered as nested Function array calls, not as Literal Array format. Update containsNonLiteralExpressions to detect parenthesized literals so nested arrays containing them use the correct Function array format. Fixes tests: - 01621_summap_check_types/stmt4 - 01635_sum_map_fuzz/stmt4 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Skip optional DATABASE keyword in USE statements (USE DATABASE d1 == USE d1) - Only skip DATABASE if followed by an identifier (not semicolon/EOF) - Track clientError annotations in splitStatements for proper handling - Handle clientError for statements with no explain file (runtime errors) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse outputs WINDOW definitions before QUALIFY in the EXPLAIN AST. Fixed the ordering in both explainSelectQuery and explainSelectQueryWithInheritedWith. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When outputting types like AggregateFunction(1, sumMapFiltered([1, 2]), ...), properly format nested function calls and array literals instead of using raw Go struct representation. Added formatFunctionCallForType and formatExprForType to handle nested expressions in type parameters. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When a QueryParameter (e.g., {param:Type}) has an alias (AS name),
output it as: QueryParameter param:Type (alias name)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse EXPLAIN AST preserves the original case of function names as written in the SQL source (e.g., CEIL stays CEIL, COALESCE stays COALESCE). Only a few functions are normalized to specific canonical forms (e.g., DATEDIFF→dateDiff, POSITION→position, SUBSTRING→substring). This fixes issues where special parser functions (parseIfFunction, parseExtract, parseSubstring, parseArrayConstructor) were hardcoding lowercase names instead of preserving the original case from the token. Also fixes parseKeywordAsFunction which was incorrectly lowercasing all keyword-based function names. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
For parametric aggregate functions like groupArraySample(5, 11111)(DISTINCT x), the DISTINCT modifier was being parsed as a column name instead of being recognized as a modifier. This adds DISTINCT/ALL handling to parseParametricFunctionCall matching the existing logic in parseFunctionCall. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Projection column lists like `SELECT name, max(frequency) max_frequency` need to handle implicit aliases where an identifier follows an expression without the AS keyword. This adds parseImplicitAlias call after parsing each column expression in projections. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse EXPLAIN AST normalizes CLEAR PROJECTION to DROP_PROJECTION, similar to how CLEAR_COLUMN→DROP_COLUMN and CLEAR_INDEX→DROP_INDEX are already mapped. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…N AST When array literals contain nested arrays with non-literal expressions (like identifiers or binary operations), they should be output as nested Function array calls, not as a single Literal node. This adds a recursive check containsNonLiteralExpressionsRecursive that properly detects expressions like [[[number]],[[number + 1],...]] at any nesting depth. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse allows ON CLUSTER to appear either before or after column definitions in CREATE MATERIALIZED VIEW statements. The parser was only checking after column definitions, causing parsing failures for syntax like: CREATE MATERIALIZED VIEW v ON CLUSTER c (x Int) ENGINE=... Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
COMMENT was being parsed as token.COMMENT but the dictionary parsing loop only handled PRIMARY and SETTINGS as keyword tokens. Added handling for token.COMMENT in the loop and output the comment as a Literal in the EXPLAIN AST. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
MOVE PARTITION can target a disk or volume in addition to a table. Added parsing for TO DISK 'disk_name' and TO VOLUME 'volume_name' syntax which were previously causing parse errors. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Parse and explain SETTINGS for SYSTEM commands like "SYSTEM FLUSH DISTRIBUTED ... SETTINGS ...". Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
SYSTEM LOAD PRIMARY KEY and SYSTEM UNLOAD PRIMARY KEY commands need the table name to appear twice in EXPLAIN output, similar to RELOAD DICTIONARY. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Parse TRUNCATE DATABASE differently from TRUNCATE TABLE and format with the correct spacing in EXPLAIN output. ClickHouse uses different spacing conventions for these two variants. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Parse ALTER TABLE ... REMOVE TTL command and output the correct REMOVE_TTL command type in EXPLAIN output. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add Settings field to KillQuery struct - Parse SETTINGS clause for KILL QUERY/MUTATION statements - Map comparison operators to function names in EXPLAIN output (e.g., = becomes equals, != becomes notEquals) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
concat_ws should NOT be normalized to concat - ClickHouse preserves the function name as concat_ws in EXPLAIN AST output. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
RENAME COLUMN IF EXISTS was not being parsed correctly - the IF token was being treated as the column name. This fix adds proper IF EXISTS handling for RENAME COLUMN, similar to DROP COLUMN. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Track whether tuples have spaces after commas in the source code and preserve this formatting when outputting EXPLAIN AST. This is needed for Ring/Polygon/MultiPolygon type literals to match ClickHouse output. Changes: - Add formatTupleAsStringFromLiteral to respect SpacedCommas flag - Track spacedCommas when parsing tuples in parseGroupedOrTuple - Fixes 00727_concat/stmt44 and 02935_format_with_arbitrary_types/stmt44 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When an IN expression contains only NULL literals (e.g., `IN (NULL, NULL)`), format them as a tuple literal `Tuple_(NULL, NULL)` instead of `Function tuple`. This matches ClickHouse's EXPLAIN AST output. The fix adds an `allNull` flag to track when all items are NULL and allows tuple literal formatting in this case. Applied to both explainInExpr and explainInExprWithAlias functions. Fixes 01558_transform_null_in/stmt21 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…sions (#117) Fixes ORDER BY expressions like `(a + b) * c` being truncated to just `(a + b)`. - Add isBinaryOperatorToken() to detect binary operators after parenthesized expressions - Add parseExpressionFrom() to continue Pratt parsing from an existing left operand - Check for binary operators after parsing parenthesized expressions in ORDER BY Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ClickHouse has special handling where INSERT VALUES followed by SELECT on the same line outputs the INSERT AST and then executes the SELECT, printing its result. - Add ExplainStatements() function to handle multi-statement explain output - When first statement is INSERT and subsequent are SELECT with simple literals, append the literal values to match ClickHouse behavior - Update test to use ExplainStatements() for all explain output Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ens (#119) When detecting SET continuation vs new TTL after comma, use peek/peekPeek to check the pattern (IDENT EQ) without consuming tokens. This avoids losing lexer state when restoring position. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use AND_PREC instead of ALIAS_PREC when parsing INTERVAL values to prevent consuming subsequent AND expressions as part of the interval. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Parse and explain REFRESH AFTER/EVERY interval APPEND TO syntax: - Add REFRESH-related fields to CreateQuery AST - Parse REFRESH type (AFTER/EVERY), interval, unit, APPEND TO, and EMPTY - Output "Refresh strategy definition" and "TimeInterval" in explain Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…122) The Settings column in system tables was being lexed as token.SETTINGS and incorrectly terminating function argument parsing. Now check for array access syntax (followed by [) before treating SETTINGS as a clause keyword. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When GROUPING SETS contains ((a,b,c)) with double parentheses, the inner tuple should be output as Function tuple, not unwrapped. Use the Parenthesized flag on tuple literals to detect double-paren cases. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use different precedence based on the first token of the interval value: - String literals like '1 day' have embedded units, so use ADD_PREC to stop before arithmetic operators - Other expressions need arithmetic included before the unit, so use LOWEST precedence This fixes `interval '1 day' - interval '1 hour'` (two separate intervals) while still handling `INTERVAL number - 15 MONTH` correctly (arithmetic expression with separate unit). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add WindowView and InnerEngine fields to CreateQuery AST - Handle CREATE WINDOW VIEW syntax with TO clause and INNER ENGINE - Parse INNER ENGINE clause for window views (storage for internal data) - Update EXPLAIN output to show ViewTargets with Storage definition - ORDER BY for window views goes inside ViewTargets, not separate Storage Fixes 01049_window_view_window_functions/stmt41. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
….) AS name - Remove special case for `(SELECT ...) AS name` in parseWithClause, letting it go through the expression parser which handles binary expressions - Add ScalarWith flag to WithElement to distinguish between: - "name AS (SELECT ...)" - standard CTE syntax - "(SELECT ...) AS name" - ClickHouse scalar WITH syntax - Update EXPLAIN output to use correct format based on ScalarWith flag Fixes 03212_variant_dynamic_cast_or_default/stmt51. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When parsing CAST(x, type) with comma syntax, check if the type string is followed by an operator (CONCAT, PLUS, MINUS) before consuming it. If so, parse it as a full expression to handle cases like: CAST(123, 'Str'||'ing') Fixes 03011_definitive_guide_to_cast/stmt36. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When parsing `SELECT * REPLACE expr AS name, other_col` without parentheses, the REPLACE parser was consuming the comma before breaking, preventing the caller from seeing there's another select item. Now the comma is only consumed when inside parentheses. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
…itions
When a parenthesized UNION query contains a DISTINCT->ALL mode transition, the explain output should group the DISTINCT portion into a nested SelectWithUnionQuery and lift the remaining selects to the outer level.
Changes:
Fixes stmt13 and stmt28 in 01529_union_distinct_and_setting_union_default_mode