forked from apache/datafusion
Timestamp 17998 proposal #31
Open
Omega359 wants to merge 502 commits into kosiew:timestamp-17998 from Omega359:timestamp-17998
…che#19062) Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.62.60 to 2.62.61. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's releases</a>.</em></p> <blockquote> <h2>2.62.61</h2> <ul> <li> <p>Update <code>cargo-deny@latest</code> to 0.18.7.</p> </li> <li> <p>Update <code>cargo-careful@latest</code> to 0.4.9.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.14.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.4.</p> </li> <li> <p>Update <code>cargo-valgrind@latest</code> to 2.4.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.11.11.</p> </li> </ul> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <p>All notable changes to this project will be documented in this file.</p> <p>This project adheres to <a href="https://semver.org">Semantic Versioning</a>.</p> <!-- raw HTML omitted --> <h2>[Unreleased]</h2> <ul> <li> <p>Update <code>uv@latest</code> to 0.9.15.</p> </li> <li> <p>Update <code>knope@latest</code> to 0.21.6.</p> </li> </ul> <h2>[2.62.61] - 2025-12-02</h2> <ul> <li> <p>Update <code>cargo-deny@latest</code> to 0.18.7.</p> </li> <li> <p>Update <code>cargo-careful@latest</code> to 0.4.9.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.14.</p> </li> <li> <p>Update <code>vacuum@latest</code> to 0.20.4.</p> </li> <li> <p>Update <code>cargo-valgrind@latest</code> to 2.4.0.</p> </li> <li> <p>Update <code>mise@latest</code> to 2025.11.11.</p> </li> </ul> <h2>[2.62.60] - 2025-11-30</h2> <ul> <li> <p>Update <code>zizmor@latest</code> to 1.18.0.</p> </li> <li> <p>Update <code>cargo-shear@latest</code> to 1.7.0.</p> </li> <li> <p>Update <code>wasm-bindgen@latest</code> to 0.2.106.</p> </li> </ul> <h2>[2.62.59] - 
2025-11-28</h2> <ul> <li> <p>Update <code>mise@latest</code> to 2025.11.10.</p> </li> <li> <p>Update <code>uv@latest</code> to 0.9.13.</p> </li> <li> <p>Update <code>typos@latest</code> to 1.40.0.</p> </li> </ul> <h2>[2.62.58] - 2025-11-26</h2> <ul> <li>Update <code>cargo-shear@latest</code> to 1.6.6.</li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/taiki-e/install-action/commit/92e6dd1c202153a204d471a3c509bf1e03dcfa44"><code>92e6dd1</code></a> Release 2.62.61</li> <li><a href="https://github.com/taiki-e/install-action/commit/0ab43d9e3d56e39b5c2f425526fd91644dbc77f6"><code>0ab43d9</code></a> Update <code>cargo-deny@latest</code> to 0.18.7</li> <li><a href="https://github.com/taiki-e/install-action/commit/2a6eb2213fcb7d9b30d8a539b6757eaa134156cb"><code>2a6eb22</code></a> Update knope manifest</li> <li><a href="https://github.com/taiki-e/install-action/commit/cde677a057268f8ce93925a716f687229a3a7aa7"><code>cde677a</code></a> Update <code>cargo-careful@latest</code> to 0.4.9</li> <li><a href="https://github.com/taiki-e/install-action/commit/6bce10ece58b44687950a92c8919157bbf8d32e8"><code>6bce10e</code></a> Update <code>uv@latest</code> to 0.9.14</li> <li><a href="https://github.com/taiki-e/install-action/commit/91601689b69d386be18bd2947c95e88f3d45e41c"><code>9160168</code></a> Update <code>vacuum@latest</code> to 0.20.4</li> <li><a href="https://github.com/taiki-e/install-action/commit/5caeef472993585f5693d3d71ff4168b463f8bc1"><code>5caeef4</code></a> Update <code>cargo-valgrind@latest</code> to 2.4.0</li> <li><a href="https://github.com/taiki-e/install-action/commit/6eea626a2b58146fe84c8a6be6a1cf01deb99f5e"><code>6eea626</code></a> ci: Test ubuntu-slim</li> <li><a href="https://github.com/taiki-e/install-action/commit/1184949f42d77acf1129cbce8f83e2ce951d883e"><code>1184949</code></a> Apply zizmor and update scripts</li> <li><a 
href="https://github.com/taiki-e/install-action/commit/8a3e6f31fc2e7f008e86c046a3881d190b260785"><code>8a3e6f3</code></a> Update <code>mise@latest</code> to 2025.11.11</li> <li>See full diff in <a href="https://github.com/taiki-e/install-action/compare/3575e532701a5fc614b0c842e4119af4cc5fd16d...92e6dd1c202153a204d471a3c509bf1e03dcfa44">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. 
You can achieve the same result by closing it manually - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/stale](https://github.com/actions/stale) from 10.1.0 to 10.1.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/stale/releases">actions/stale's releases</a>.</em></p> <blockquote> <h2>v10.1.1</h2> <h2>What's Changed</h2> <h3>Bug Fix</h3> <ul> <li>Add Missing Input Reading for <code>only-issue-types</code> by <a href="https://github.com/Bibo-Joshi"><code>@Bibo-Joshi</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1298">actions/stale#1298</a></li> </ul> <h3>Improvement</h3> <ul> <li>Improves error handling when rate limiting is disabled on GHES. by <a href="https://github.com/chiranjib-swain"><code>@chiranjib-swain</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li> </ul> <h3>Dependency Upgrades</h3> <ul> <li>Upgrade eslint-config-prettier from 8.10.0 to 10.1.8 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1276">actions/stale#1276</a></li> <li>Upgrade <code>@types/node</code> from 20.10.3 to 24.2.0 and document breaking changes in v10 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1280">actions/stale#1280</a></li> <li>Upgrade actions/publish-action from 0.3.0 to 0.4.0 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1291">actions/stale#1291</a></li> <li>Upgrade actions/checkout from 4 to 6 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/stale/pull/1306">actions/stale#1306</a></li> </ul> <h2>New Contributors</h2> <ul> <li><a href="https://github.com/chiranjib-swain"><code>@chiranjib-swain</code></a> made their first contribution in <a 
href="https://redirect.github.com/actions/stale/pull/1300">actions/stale#1300</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/stale/compare/v10...v10.1.1">https://github.com/actions/stale/compare/v10...v10.1.1</a></p> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/actions/stale/commit/997185467fa4f803885201cee163a9f38240193d"><code>9971854</code></a> build(deps): bump actions/checkout from 4 to 6 (<a href="https://redirect.github.com/actions/stale/issues/1306">#1306</a>)</li> <li><a href="https://github.com/actions/stale/commit/5611b9defa6b7799a950489b00163db69f7a3ece"><code>5611b9d</code></a> build(deps): bump actions/publish-action from 0.3.0 to 0.4.0 (<a href="https://redirect.github.com/actions/stale/issues/1291">#1291</a>)</li> <li><a href="https://github.com/actions/stale/commit/fad0de84e50d1aba7b0236cdaf0ea98a43286849"><code>fad0de8</code></a> Improves error handling when rate limiting is disabled on GHES. 
(<a href="https://redirect.github.com/actions/stale/issues/1300">#1300</a>)</li> <li><a href="https://github.com/actions/stale/commit/39bea7de61dd70ce4705a976f904f33d5e1e0f49"><code>39bea7d</code></a> Add Missing Input Reading for <code>only-issue-types</code> (<a href="https://redirect.github.com/actions/stale/issues/1298">#1298</a>)</li> <li><a href="https://github.com/actions/stale/commit/e46bbabb3ede15841d25946157759558dd16306e"><code>e46bbab</code></a> build(deps-dev): bump <code>@types/node</code> from 20.10.3 to 24.2.0 and document breakin...</li> <li><a href="https://github.com/actions/stale/commit/65d1d4804d3060875fff9f9fa8a49e27f71ce7f0"><code>65d1d48</code></a> build(deps-dev): bump eslint-config-prettier from 8.10.0 to 10.1.8 (<a href="https://redirect.github.com/actions/stale/issues/1276">#1276</a>)</li> <li>See full diff in <a href="https://github.com/actions/stale/compare/5f858e3efba33a5ca4407a664cc011ad407f2008...997185467fa4f803885201cee163a9f38240193d">compare view</a></li> </ul> </details> <br /> [](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

Part of apache#17964.

## Rationale for this change

## What changes are included in this PR?

- Fix the return type of the Spark `array` function when the data type is null so that it matches what Spark returns.
- Reuse functions shared by both `make_array` and Spark `array`.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No
Bumps [actions/checkout](https://github.com/actions/checkout) from 5.0.0 to 6.0.1. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/releases">actions/checkout's releases</a>.</em></p> <blockquote> <h2>v6.0.1</h2> <h2>What's Changed</h2> <ul> <li>Update all references from v5 and v4 to v6 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2314">actions/checkout#2314</a></li> <li>Add worktree support for persist-credentials includeIf by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2327">actions/checkout#2327</a></li> <li>Clarify v6 README by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2328">actions/checkout#2328</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v6...v6.0.1">https://github.com/actions/checkout/compare/v6...v6.0.1</a></p> <h2>v6.0.0</h2> <h2>What's Changed</h2> <ul> <li>Update README to include Node.js 24 support details and requirements by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li> <li>Persist creds to a separate file by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li> <li>v6-beta by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2298">actions/checkout#2298</a></li> <li>update readme/changelog for v6 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2311">actions/checkout#2311</a></li> 
</ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v5.0.0...v6.0.0">https://github.com/actions/checkout/compare/v5.0.0...v6.0.0</a></p> <h2>v6-beta</h2> <h2>What's Changed</h2> <p>Updated persist-credentials to store the credentials under <code>$RUNNER_TEMP</code> instead of directly in the local git config.</p> <p>This requires a minimum Actions Runner version of <a href="https://github.com/actions/runner/releases/tag/v2.329.0">v2.329.0</a> to access the persisted credentials for <a href="https://docs.github.com/en/actions/tutorials/use-containerized-services/create-a-docker-container-action">Docker container action</a> scenarios.</p> <h2>v5.0.1</h2> <h2>What's Changed</h2> <ul> <li>Port v6 cleanup to v5 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li> </ul> <p><strong>Full Changelog</strong>: <a href="https://github.com/actions/checkout/compare/v5...v5.0.1">https://github.com/actions/checkout/compare/v5...v5.0.1</a></p> </blockquote> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's changelog</a>.</em></p> <blockquote> <h1>Changelog</h1> <h2>v6.0.0</h2> <ul> <li>Persist creds to a separate file by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2286">actions/checkout#2286</a></li> <li>Update README to include Node.js 24 support details and requirements by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2248">actions/checkout#2248</a></li> </ul> <h2>v5.0.1</h2> <ul> <li>Port v6 cleanup to v5 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/2301">actions/checkout#2301</a></li> </ul> <h2>v5.0.0</h2> <ul> <li>Update actions checkout to use node 24 by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li> </ul> <h2>v4.3.1</h2> <ul> <li>Port v6 cleanup to v4 by <a href="https://github.com/ericsciple"><code>@ericsciple</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2305">actions/checkout#2305</a></li> </ul> <h2>v4.3.0</h2> <ul> <li>docs: update README.md by <a href="https://github.com/motss"><code>@motss</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li> <li>Add internal repos for checking out multiple repositories by <a href="https://github.com/mouismail"><code>@mouismail</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li> <li>Documentation update - add recommended permissions to Readme by <a href="https://github.com/benwells"><code>@benwells</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li> <li>Adjust positioning of user email note and permissions heading by <a href="https://github.com/joshmgross"><code>@joshmgross</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li> <li>Update README.md by <a href="https://github.com/nebuk89"><code>@nebuk89</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li> <li>Update CODEOWNERS for actions by <a href="https://github.com/TingluoHuang"><code>@TingluoHuang</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li> <li>Update package dependencies by <a href="https://github.com/salmanmkc"><code>@salmanmkc</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li> </ul> <h2>v4.2.2</h2> <ul> <li><code>url-helper.ts</code> now leverages well-known environment variables by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li> <li>Expand unit test coverage for <code>isGhes</code> by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li> </ul> <h2>v4.2.1</h2> <ul> <li>Check out other refs/* by commit if provided, fall back to ref by <a href="https://github.com/orhantoy"><code>@orhantoy</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li> </ul> <h2>v4.2.0</h2> <ul> <li>Add Ref and Commit outputs by <a href="https://github.com/lucacome"><code>@lucacome</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li> <li>Dependency updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a>- <a href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>, <a href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li> </ul> <h2>v4.1.7</h2> <ul> <li>Bump the minor-npm-dependencies group across 1 directory with 4 updates by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li> <li>Bump actions/checkout from 3 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li> <li>Check out other refs/* by commit by <a href="https://github.com/orhantoy"><code>@orhantoy</code></a> in <a 
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li> <li>Pin actions/checkout's own workflows to a known, good, stable version. by <a href="https://github.com/jww3"><code>@jww3</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li> </ul> <h2>v4.1.6</h2> <ul> <li>Check platform to set archive extension appropriately by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li> </ul> <h2>v4.1.5</h2> <ul> <li>Update NPM dependencies by <a href="https://github.com/cory-miller"><code>@cory-miller</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li> <li>Bump github/codeql-action from 2 to 3 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li> <li>Bump actions/setup-node from 1 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li> <li>Bump actions/upload-artifact from 2 to 4 by <a href="https://github.com/dependabot"><code>@dependabot</code></a> in <a href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... 
(truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/actions/checkout/commit/8e8c483db84b4bee98b60c0593521ed34d9990e8"><code>8e8c483</code></a> Clarify v6 README (<a href="https://redirect.github.com/actions/checkout/issues/2328">#2328</a>)</li> <li><a href="https://github.com/actions/checkout/commit/033fa0dc0b82693d8986f1016a0ec2c5e7d9cbb1"><code>033fa0d</code></a> Add worktree support for persist-credentials includeIf (<a href="https://redirect.github.com/actions/checkout/issues/2327">#2327</a>)</li> <li><a href="https://github.com/actions/checkout/commit/c2d88d3ecc89a9ef08eebf45d9637801dcee7eb5"><code>c2d88d3</code></a> Update all references from v5 and v4 to v6 (<a href="https://redirect.github.com/actions/checkout/issues/2314">#2314</a>)</li> <li><a href="https://github.com/actions/checkout/commit/1af3b93b6815bc44a9784bd300feb67ff0d1eeb3"><code>1af3b93</code></a> update readme/changelog for v6 (<a href="https://redirect.github.com/actions/checkout/issues/2311">#2311</a>)</li> <li><a href="https://github.com/actions/checkout/commit/71cf2267d89c5cb81562390fa70a37fa40b1305e"><code>71cf226</code></a> v6-beta (<a href="https://redirect.github.com/actions/checkout/issues/2298">#2298</a>)</li> <li><a href="https://github.com/actions/checkout/commit/069c6959146423d11cd0184e6accf28f9d45f06e"><code>069c695</code></a> Persist creds to a separate file (<a href="https://redirect.github.com/actions/checkout/issues/2286">#2286</a>)</li> <li><a href="https://github.com/actions/checkout/commit/ff7abcd0c3c05ccf6adc123a8cd1fd4fb30fb493"><code>ff7abcd</code></a> Update README to include Node.js 24 support details and requirements (<a href="https://redirect.github.com/actions/checkout/issues/2248">#2248</a>)</li> <li>See full diff in <a href="https://github.com/actions/checkout/compare/v5...8e8c483db84b4bee98b60c0593521ed34d9990e8">compare view</a></li> </ul> </details> <br /> 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

## Which issue does this PR close?

- Follow on to apache#18923

## Rationale for this change

I was confused about what the arguments to `PartitionPruningStatistics` mean, so let's add an example.

## What changes are included in this PR?

Add a doc example

## Are these changes tested?

By CI

## Are there any user-facing changes?

New docs

---------

Co-authored-by: Oleks V <comphead@users.noreply.github.com>
…prior to parquet 57.1.0 upgrade (apache#19003)

~Draft until apache#18820 is merged~

## Which issue does this PR close?

- Follow on to apache#18820

## Rationale for this change

The parquet 57.1.0 upgrade includes a new adaptive filter from @hhhizzz:

- apache/arrow-rs#8733

Our testing shows this is faster in all cases, but I want to have an escape valve for people to turn it off if they hit some issue.

I had originally included this in apache#18820, but @rluvaton suggested it would be easier to understand as its own PR in apache#18820 (review)

## What changes are included in this PR?

1. Add a `force_filter_selections` config setting
2. Add configuration guide
3. Add tests

## Are these changes tested?

Yes

## Are there any user-facing changes?

A new boolean flag
…#19047)

## Which issue does this PR close?

This addresses part of apache#15804 but does not close it.

## Rationale for this change

Now that we are on MSRV 1.88 we can use Rust edition 2024, which brings let chains and other nice features. It also improves `unsafe` checking. To introduce these changes gradually, rather than in one massive PR that would be too difficult to manage, we are updating a few crates at a time.

## What changes are included in this PR?

Updates 2 crates to edition 2024:

- expr
- execution

## Are these changes tested?

Existing unit tests. There are no functional code changes.

## Are there any user-facing changes?

None.

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ema (apache#19070)

## Which issue does this PR close?

- Closes apache#19069

## Rationale for this change

Differences in physical/logical schema metadata can cause aggregate physical planning to fail, but these differences are not shown in the error output.

### Previous message

```
Error while planning query: Internal error: Physical input schema should be the same as the one converted from logical input schema. Differences: .`
```

### Example of updated message

```
Physical input schema should be the same as the one converted from logical input schema. Differences:
  - field metadata at index 0 [usage_idle]: (physical) {"iox::column::type": "iox::column_type::field::float"} vs (logical) {}
  - field metadata at index 1 [usage_system]: (physical) {"iox::column::type": "iox::column_type::field::float"} vs (logical) {}.
```

## What changes are included in this PR?

## Are these changes tested?

Yes

## Are there any user-facing changes?

Minor improvements to error messages.
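The kind of per-field comparison that could produce a message like the updated one can be sketched as follows. This is only an illustration under stated assumptions, not the code from the PR: `Field` and `schema_differences` are hypothetical stand-ins written against the standard library, whereas the real planner works with Arrow schema types.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an Arrow field: a name plus string metadata.
#[derive(Debug, Clone)]
struct Field {
    name: String,
    metadata: BTreeMap<String, String>,
}

impl Field {
    fn new(name: &str, metadata: &[(&str, &str)]) -> Self {
        Field {
            name: name.to_string(),
            metadata: metadata
                .iter()
                .map(|(k, v)| (k.to_string(), v.to_string()))
                .collect(),
        }
    }
}

/// Collect one human-readable line per difference between two field lists.
/// (Assumed helper name; the wording mirrors the example message above.)
fn schema_differences(physical: &[Field], logical: &[Field]) -> Vec<String> {
    let mut diffs = Vec::new();
    if physical.len() != logical.len() {
        diffs.push(format!(
            "field count: (physical) {} vs (logical) {}",
            physical.len(),
            logical.len()
        ));
    }
    // `zip` stops at the shorter list, so only shared positions are compared.
    for (i, (p, l)) in physical.iter().zip(logical.iter()).enumerate() {
        if p.name != l.name {
            diffs.push(format!(
                "field name at index {i}: (physical) {} vs (logical) {}",
                p.name, l.name
            ));
        }
        if p.metadata != l.metadata {
            diffs.push(format!(
                "field metadata at index {i} [{}]: (physical) {:?} vs (logical) {:?}",
                p.name, p.metadata, l.metadata
            ));
        }
    }
    diffs
}

fn main() {
    let float_type = [("iox::column::type", "iox::column_type::field::float")];
    let physical = vec![
        Field::new("usage_idle", &float_type),
        Field::new("usage_system", &float_type),
    ];
    let logical = vec![
        Field::new("usage_idle", &[]),
        Field::new("usage_system", &[]),
    ];

    // Print each difference as a bullet, similar to the updated error text.
    for line in schema_differences(&physical, &logical) {
        println!("  - {line}");
    }
}
```

Reporting every mismatch, rather than stopping at the first, gives the reader of the error the full picture in one planning attempt.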
Refactor shared helpers to ensure naïve strings are interpreted using the configured execution zone. Extend unit tests to cover naïve and formatted inputs respecting non-UTC execution timezones.
…ts (apache#19389)

## Summary

This PR extends `get_field` to accept multiple field name arguments for nested struct/map access, enabling `get_field(col, 'a', 'b', 'c')` as equivalent to `col['a']['b']['c']`.

**The primary motivation is to make it easier for downstream optimizations to match on and optimize struct/map field access patterns.** By representing `col['a']['b']['c']` as a single `get_field(col, 'a', 'b', 'c')` call rather than nested `get_field(get_field(get_field(col, 'a'), 'b'), 'c')` calls, optimization rules can more easily identify and transform field access patterns.

This is related to, and possibly prep work for, apache#19387, but I think it is a good improvement in its own right.

## Changes

- **Variadic signature**: `get_field` now accepts 2+ arguments (base + one or more field names)
- **Type validation at planning time**: Accessing a field on a non-struct/map type (e.g., `get_field({a: 1}, 'a', 'b')`) fails during planning with a clear error message indicating which argument position caused the failure
- **Bracket syntax optimization**: The `FieldAccessPlanner` now merges consecutive bracket accesses into a single `get_field` call (e.g., `s['a']['b']` → `get_field(s, 'a', 'b')`)
- **Mixed access handling**: Array index access correctly breaks the batching (e.g., `s['a'][0]['b']` → `get_field(array_element(get_field(s, 'a'), 0), 'b')`)

## Example

```sql
-- Direct function call with nested access
SELECT get_field(my_struct, 'outer', 'inner', 'value');

-- Equivalent bracket syntax (now optimized to single get_field)
SELECT my_struct['outer']['inner']['value'];

-- EXPLAIN shows single get_field call
EXPLAIN SELECT s['a']['b'] FROM t;
-- Projection: get_field(t.s, Utf8("a"), Utf8("b"))
```

## Backwards Compatibility

- The original 2-argument form `get_field(struct, 'field')` continues to work unchanged
- Existing queries using bracket syntax will automatically benefit from the optimization

## Test plan

- [x] Backwards compatibility test for 2-argument form
- [x] Multi-level get_field with 2, 3, and 5 levels of nesting
- [x] Type validation error tests at argument positions 2, 3, 4
- [x] Non-existent field error tests
- [x] Null handling (null at base, null in middle of chain)
- [x] Mixed array/struct access (verifies array index breaks batching)
- [x] Nullable parent propagation
- [x] EXPLAIN test verifying single get_field call for bracket syntax
- [x] Minimum argument validation (0 and 1 argument cases)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
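The batching rule described under "Bracket syntax optimization" and "Mixed access handling" can be illustrated with a toy planner. This is a sketch under assumptions: `Expr`, `Access`, and `plan_accesses` are hypothetical names invented for illustration and are not DataFusion's `Expr` or `FieldAccessPlanner`.

```rust
/// Toy expression tree (illustrative only, not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    /// Variadic field access: a base plus one or more field names.
    GetField(Box<Expr>, Vec<String>),
    /// Array index access such as `arr[0]`.
    ArrayElement(Box<Expr>, i64),
}

/// One bracket access, in source order.
#[derive(Debug, Clone)]
enum Access {
    Field(String),
    Index(i64),
}

/// Folds a chain of bracket accesses onto `base`. Consecutive field
/// accesses are merged into a single `GetField`; an array index wraps the
/// expression built so far, so the next field access starts a new `GetField`.
fn plan_accesses(base: Expr, accesses: &[Access]) -> Expr {
    let mut current = base;
    for access in accesses {
        current = match (current, access) {
            // The previous step was a field access: extend its name list.
            (Expr::GetField(inner, mut names), Access::Field(name)) => {
                names.push(name.clone());
                Expr::GetField(inner, names)
            }
            // First field access after a column or an array index.
            (other, Access::Field(name)) => {
                Expr::GetField(Box::new(other), vec![name.clone()])
            }
            // An array index always breaks the batch.
            (other, Access::Index(i)) => Expr::ArrayElement(Box::new(other), *i),
        };
    }
    current
}
```

With this shape, `s['a']['b']` becomes one `GetField` node carrying two names, while `s['a'][0]['b']` becomes a `GetField` over an `ArrayElement` over a `GetField`, mirroring the two examples in the change list.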
…e#19474)

Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.65.1 to 2.65.2.

Release notes (sourced from [taiki-e/install-action's releases](https://github.com/taiki-e/install-action/releases)):

- **2.65.2**
  - Update `prek@latest` to 0.2.24.
  - Update `wasmtime@latest` to 40.0.0.
  - Update `vacuum@latest` to 0.21.7.
  - Update `tombi@latest` to 0.7.10.
  - Update `syft@latest` to 1.39.0.
  - Update `cargo-binstall@latest` to 1.16.5.

Changelog (sourced from [taiki-e/install-action's changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)):

- **[2.65.2] - 2025-12-23**
  - Same entries as the 2.65.2 release notes.
- **[2.65.1] - 2025-12-21**
  - Update `tombi@latest` to 0.7.9.
  - Update `vacuum@latest` to 0.21.6.
  - Update `prek@latest` to 0.2.23.
- **[2.65.0] - 2025-12-20**
  - Support `cargo-insta`. ([#1372](https://redirect.github.com/taiki-e/install-action/pull/1372), thanks [`@CommanderStorm`](https://github.com/CommanderStorm))
  - Update `vacuum@latest` to 0.21.2.
- **[2.64.2] - 2025-12-19**
  - Update `zizmor@latest` to 1.19.0.
  - Update `mise@latest` to 2025.12.12.
- **[2.64.1] - 2025-12-18**
  - Update `tombi@latest` to 0.7.8.
- ... (truncated)

Commits:

- [`50cee16`](https://github.com/taiki-e/install-action/commit/50cee16bd6b97b2579572f83cfa1c0a721b1e336) Release 2.65.2
- [`71c43df`](https://github.com/taiki-e/install-action/commit/71c43df374deb4e987a853401d56672726b34ecd) Update `prek@latest` to 0.2.24
- [`73bd9d0`](https://github.com/taiki-e/install-action/commit/73bd9d0e1c3d9775f7f4b673d4ac89c3cc914b14) Update `wasmtime@latest` to 40.0.0
- [`072fd7e`](https://github.com/taiki-e/install-action/commit/072fd7e631ab33f76ce13784f12153625b1ddde3) Update `vacuum@latest` to 0.21.7
- [`7d7e3b7`](https://github.com/taiki-e/install-action/commit/7d7e3b737d71ae6fb6ebcc91171e20c79054fddb) Update `tombi@latest` to 0.7.10
- [`4574e21`](https://github.com/taiki-e/install-action/commit/4574e21caf851d43909442ad8c4f79678e4261b4) Update `syft@latest` to 1.39.0
- [`300b834`](https://github.com/taiki-e/install-action/commit/300b834288f5053ff7f6c56a2318db756a3a8bcd) Update `cargo-binstall@latest` to 1.16.5
- See full diff in [compare view](https://github.com/taiki-e/install-action/compare/b9c5db3aef04caffaf95a1d03931de10fb2a140f...50cee16bd6b97b2579572f83cfa1c0a721b1e336)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ic SELECT list support (apache#19221)

## Which issue does this PR close?

* Closes apache#18991.

## Rationale for this change

The current unparser behavior materializes an explicit `1` literal for empty projection lists, generating SQL of the form `SELECT 1 FROM ...` even for dialects (such as PostgreSQL and DataFusion) that support `SELECT FROM ...` with an empty select list. For external or federated sources, this can lead to:

* Mismatches between the logical plan schema (empty projection) and the physical schema produced by the generated SQL (single `1` column), which then becomes confusing when converting to Arrow.
* Misleading semantics in downstream consumers (e.g. plans that logically represent "no columns" suddenly gain a synthetic column).
* Unnecessary data movement / computation when the intent is to operate only on row counts or existence checks.

This PR updates the unparser to:

* Preserve the empty projection semantics for dialects that support `SELECT FROM ...`, and
* Provide a dialect hook so that other backends can continue to use a compatible fallback such as `SELECT 1 FROM ...`.

This aligns the generated SQL more closely with the logical plan, improves compatibility with PostgreSQL, and reduces surprises around schema shape for aggregate-style queries over external data sources.

## What changes are included in this PR?

This PR makes the following changes:

1. **SelectBuilder semantics for projections**
   * Change `SelectBuilder.projection` from `Vec<ast::SelectItem>` to `Option<Vec<ast::SelectItem>>` to distinguish:
     * `None`: projection has not yet been set,
     * `Some(vec![])`: explicitly empty projection, and
     * `Some(vec![...])`: non-empty projection.
   * Update `projection()` to set `Some(value)` and `pop_projections()` to `take()` the projection (returning an empty vec by default).
   * Redefine `already_projected()` to return `true` whenever the projection has been explicitly set (including the empty case), by checking `projection.is_some()`.
   * Adjust `build()` and `Default` to work with the new `Option`-typed projection (defaulting to `None` and using `unwrap_or_default()` when building the AST).
2. **Dialect capability: empty select list support**
   * Extend the `Dialect` trait with a new method:
     * `fn supports_empty_select_list(&self) -> bool { false }`
   * Document the intended semantics and behavior across common SQL engines, with the default returning `false` for maximum compatibility.
   * Override this method in `PostgreSqlDialect` to return `true`, allowing `SELECT FROM ...` to be generated.
3. **Unparser handling of empty projections**
   * Add a helper on `Unparser`:
     * `fn empty_projection_fallback(&self) -> Vec<Expr>`
     * Returns an empty vec if `supports_empty_select_list()` is `true`.
     * Returns `vec![Expr::Literal(ScalarValue::Int64(Some(1)), None)]` otherwise.
   * Update `unparse_table_scan_pushdown` to:
     * Take `&self` instead of being a purely static helper, so it can consult the dialect.
     * When encountering a `TableScan` with `Some(vec![])` as projection and `already_projected == false`, use `self.empty_projection_fallback()` instead of hard-coding a `1` literal.
   * Update the few call sites of `unparse_table_scan_pushdown` to call the instance method (`self.unparse_table_scan_pushdown(...)`).
4. **Tests**
   * Add snapshot tests covering both PostgreSQL and the default dialect for empty projection table scans:
     * `test_table_scan_with_empty_projection_in_plan_to_sql_postgres`
       * Asserts `SELECT FROM "table"` for `UnparserPostgreSqlDialect`.
     * `test_table_scan_with_empty_projection_in_plan_to_sql_default_dialect`
       * Asserts `SELECT 1 FROM "table"` for `UnparserDefaultDialect`.
   * Add tests for empty projection with filters:
     * `test_table_scan_with_empty_projection_and_filter_postgres`
       * Asserts `SELECT FROM "table" WHERE ("table"."id" > 10)`.
     * `test_table_scan_with_empty_projection_and_filter_default_dialect`
       * Asserts `SELECT 1 FROM "table" WHERE ("table".id > 10)`.
   * These tests complement the existing `table_scan_with_empty_projection_in_plan_to_sql_*` coverage to exercise both dialect-specific behavior and interaction with filters.

## Are these changes tested?

Running the [reproducer case](apache@ccdda46) in apache#18991 with `cargo run --example empty_select`:

```
use datafusion::error::Result;
use datafusion::prelude::SessionContext;
use datafusion::sql::unparser::{self, Unparser};

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    ctx.sql("create table t (k int, v int)")
        .await?
        .collect()
        .await?;

    let df = ctx.sql("select from t").await?;
    let plan = df.into_optimized_plan()?;
    println!("{}", plan.display_indent());
    let sql = Unparser::new(&unparser::dialect::PostgreSqlDialect {}).plan_to_sql(&plan)?;
    println!("{sql}");

    Ok(())
}
```

```
TableScan: t projection=[]
SELECT FROM "t"
```

Yes.

* New snapshot tests have been added in `plan_to_sql.rs` to cover:
  * Empty projections for both the PostgreSQL and default dialects.
  * Empty projections combined with a filter predicate.
* Existing `plan_to_sql` tests continue to pass, ensuring that behavior for non-empty projections and other dialect features is unchanged.

## Are there any user-facing changes?

Yes, for users of the SQL unparser:

* For dialects that support empty select lists (currently PostgreSQL via `PostgreSqlDialect`):
  * Logical plans with an explicitly empty projection will now unparse to `SELECT FROM ...` instead of `SELECT 1 FROM ...`.
  * This more accurately reflects the logical schema (no columns) and avoids introducing a synthetic literal column.
* For dialects that do **not** support empty select lists:
  * The behavior remains effectively the same: the unparser still emits a non-empty projection (currently `SELECT 1 FROM ...`).
  * The behavior is now routed through the new `supports_empty_select_list` hook, so dialects can opt into different fallbacks in the future if needed.

The new `supports_empty_select_list` method on `Dialect` has a default implementation, so existing dialect implementations remain source-compatible and do not require changes.

## LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.
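The interaction between the tri-state projection and the dialect hook can be sketched in isolation. The following is a simplified model, assuming string select items instead of `ast::SelectItem`; `DefaultDialect`, `PostgresLikeDialect`, and `unparse_empty_scan` are hypothetical names for illustration, and only `supports_empty_select_list`, `already_projected`, and `empty_projection_fallback` mirror names from the description above.

```rust
/// Simplified stand-in for the unparser `Dialect` trait: only the new hook.
trait Dialect {
    /// Defaults to `false` so existing dialects keep the `SELECT 1` fallback.
    fn supports_empty_select_list(&self) -> bool {
        false
    }
}

struct DefaultDialect;
impl Dialect for DefaultDialect {}

struct PostgresLikeDialect;
impl Dialect for PostgresLikeDialect {
    fn supports_empty_select_list(&self) -> bool {
        true
    }
}

/// Toy builder: `None` = not yet projected, `Some(vec![])` = explicitly
/// empty projection, `Some(items)` = non-empty projection.
#[derive(Default)]
struct SelectBuilder {
    projection: Option<Vec<String>>,
}

impl SelectBuilder {
    fn projection(&mut self, items: Vec<String>) {
        self.projection = Some(items);
    }

    fn already_projected(&self) -> bool {
        self.projection.is_some()
    }

    fn build(&self, table: &str) -> String {
        let items = self.projection.clone().unwrap_or_default();
        if items.is_empty() {
            format!("SELECT FROM {table}")
        } else {
            format!("SELECT {} FROM {table}", items.join(", "))
        }
    }
}

/// Empty select list if the dialect allows it, otherwise a literal `1`.
fn empty_projection_fallback(dialect: &dyn Dialect) -> Vec<String> {
    if dialect.supports_empty_select_list() {
        vec![]
    } else {
        vec!["1".to_string()]
    }
}

/// Unparses a table scan whose projection is explicitly empty.
fn unparse_empty_scan(dialect: &dyn Dialect, table: &str) -> String {
    let mut builder = SelectBuilder::default();
    if !builder.already_projected() {
        builder.projection(empty_projection_fallback(dialect));
    }
    builder.build(table)
}
```

The `Option` wrapper is what lets `already_projected()` treat an explicitly empty projection as "set"; with a bare `Vec`, an empty list would be indistinguishable from "not yet projected".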
…19383)

## Which issue does this PR close?

Related to apache#16756

## Rationale for this change

The existing `sql_dialect.rs` example demonstrates `COPY ... STORED AS ...`, which is actually already fully supported by the standard `DFParser`. This PR replaces it with the example from apache#16756: `CREATE EXTERNAL CATALOG ... STORED AS ... LOCATION ...` with automatic table discovery.

## What changes are included in this PR?

The first commit updates `dialect.rs` to show that `DFParser` already handles `COPY ... STORED AS`, making it clear this syntax doesn't need customization. Example output from `cargo run --example sql_ops -- dialect`:

```
Query: COPY source_table TO 'file.fasta' STORED AS FASTA
--- Parsing without extension ---
Standard DFParser: Parsed as Statement::CopyTo: COPY source_table TO file.fasta STORED AS FASTA
--- Parsing with extension ---
Custom MyParser: Parsed as MyStatement::MyCopyTo: COPY source_table TO 'file.fasta' STORED AS FASTA
```

The second commit adds a new `custom_sql_parser.rs` example that implements `CREATE EXTERNAL CATALOG my_catalog STORED AS <format> LOCATION '<url>'` with automatic table discovery from object storage. It also removes the old `dialect.rs` example.

## Are these changes tested?

Yes, the new example is runnable with `cargo run --example sql_ops -- custom_sql_parser` and demonstrates the full flow from parsing custom DDL through registering the catalog to querying discovered tables. Example output:

```
=== Part 1: Standard DataFusion Parser ===
Parsing: CREATE EXTERNAL CATALOG parquet_testing STORED AS parquet LOCATION 'local://workspace/parquet-testing/data' OPTIONS ( 'schema_name' = 'staged_data', 'format.pruning' = 'true' )
Error: SQL error: ParserError("Expected: TABLE, found: CATALOG at Line: 1, Column: 17")

=== Part 2: Custom Parser ===
Parsing: CREATE EXTERNAL CATALOG parquet_testing STORED AS parquet LOCATION 'local://workspace/parquet-testing/data' OPTIONS ( 'schema_name' = 'staged_data', 'format.pruning' = 'true' )
Target Catalog: parquet_testing
Data Location: local://workspace/parquet-testing/data
Resolved Schema: staged_data
Registered 69 tables into schema: staged_data
Executing: SELECT id, bool_col, tinyint_col FROM parquet_testing.staged_data.alltypes_plain LIMIT 5
+----+----------+-------------+
| id | bool_col | tinyint_col |
+----+----------+-------------+
| 4  | true     | 0           |
| 5  | false    | 1           |
| 6  | true     | 0           |
| 7  | false    | 1           |
| 2  | true     | 0           |
+----+----------+-------------+
```

## Are there any user-facing changes?

Documentation only. I replaced the `sql_dialect.rs` example with `custom_sql_parser.rs` and updated the README. No API changes.
) ## Which issue does this PR close? - Closes apache#19423. ## Rationale for this change The functions `arrow_select::merge::merge` and `arrow_select::merge::merge_n` were first implemented for DataFusion in `case.rs`. They have since been generalised and moved to `arrow-rs`. Now that an `arrow-rs` is available that contains these functions, DataFusion should make use of them. ## What changes are included in this PR? - Remove `merge` and `merge_n` from `case.rs` along with the unit tests for those functions - Adapt code for their equivalents from `arrow-rs` ## Are these changes tested? Covered by existing unit tests and SLTs ## Are there any user-facing changes? No
It gives a name (the table name) to each `WorkTable`. This way `WorkTableExec` can recognize its own `WorkTable`.

Note that it doesn't allow multiple occurrences of the same CTE name: it's not possible to implement things like "join with itself" correctly with only the work table.

## Which issue does this PR close?

- Closes apache#18955.

## Rationale for this change

Support nested recursive CTEs without co-recursion. This is useful, for example, to implement SPARQL or other graph query languages.

## What changes are included in this PR?

## Are these changes tested?

Yes! There is a nested recursive query in the test file.

## Are there any user-facing changes?

Nested recursive queries are now allowed instead of failing with a "not implemented" error.
Fixes apache/datafusion apache#19162

The SparkAbs UDF was using the default is_nullable=true for all outputs, even when inputs were non-nullable. This commit implements return_field_from_args to properly propagate nullability from input arguments.

Changes:

- Add return_field_from_args implementation to SparkAbs
- Output nullability now matches input nullability
- Handle edge case where scalar argument is explicitly null
- Add tests for nullability behavior

## Which issue does this PR close?

Closes apache#19162

## Rationale for this change

`SparkAbs` (`datafusion/spark/src/function/math/abs.rs`) was always returning `nullable=true` even for non-nullable inputs.

## What changes are included in this PR?

Implement `return_field_from_args` (declared in `datafusion/expr/src/udf.rs`) to propagate nullability from input arguments.

## Are these changes tested?

Yes, added 2 tests for nullability behavior.

## Are there any user-facing changes?

No.

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
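The nullability rule described in this commit (output nullable only if the input is nullable or the argument is a literal NULL) can be modelled in a few lines. This is a sketch under assumptions: `Field`, `Arg`, and `abs_return_field` are hypothetical simplifications, not the DataFusion `ReturnFieldArgs` API, and the scalar is restricted to `i64` for brevity.

```rust
/// Simplified field description (illustrative, not Arrow's `Field`).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Hypothetical argument description: the argument's field plus its value
/// when it is a scalar known at planning time.
/// `None` = not a scalar (a column), `Some(None)` = literal NULL.
struct Arg {
    field: Field,
    scalar: Option<Option<i64>>,
}

/// Derives the output field for an `abs`-like function: nullable only if
/// the input field is nullable or the argument is an explicit NULL literal.
fn abs_return_field(arg: &Arg) -> Field {
    let is_null_literal = matches!(arg.scalar, Some(None));
    Field {
        name: "abs".to_string(),
        nullable: arg.field.nullable || is_null_literal,
    }
}
```

Propagating nullability this way lets downstream planning treat `abs(non_null_col)` as non-nullable, which the default of always returning `nullable=true` would prevent.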
## Which issue does this PR close? - Closes apache#19173 ## What changes are included in this PR? - includes custom nullability for `format_string`. --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
…ment types (apache#19442)

## Which issue does this PR close?

- Closes apache#19119

## Rationale for this change

`to_unixtime` lacks support for several data types.

## What changes are included in this PR?

Expanded `to_unixtime` support to all signed ints (Int8/16/32/64), all unsigned ints (UInt8/16/32/64), all floats (Float16/32/64), and all UTF8 variants (Utf8/Utf8View/LargeUtf8).

## Are these changes tested?

Added sqllogictest coverage.

## Are there any user-facing changes?

---------

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
## Which issue does this PR close?

- Closes apache#19465.

## Rationale for this change

Add `f16` support to the `ScalarValue` APIs that deal with Pi values.

## What changes are included in this PR?

Added `half::f16` support to the `ScalarValue` mathematical APIs.

## Are these changes tested?

Yes

## Are there any user-facing changes?

Yes
…che#19480)

## Which issue does this PR close?

Part of apache#18881

## Rationale for this change

Enforce the `clippy::allow_attributes` lint in the `datafusion-ffi` crate.

## What changes are included in this PR?

The `datafusion-ffi` crate is modified to comply with the lint.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No
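For context, `clippy::allow_attributes` flags `#[allow(...)]` attributes and suggests `#[expect(...)]` instead, which warns if the suppressed lint stops firing. The snippet below is a generic illustration, not code from the `datafusion-ffi` crate; the function names are hypothetical, and it assumes a toolchain where `#[expect]` is stable (Rust 1.81 or later).

```rust
// Before: `#[allow(dead_code)]` stays silent forever, even after the
// function starts being used and the suppression is no longer needed.
//
// After: `#[expect(dead_code)]` suppresses the warning today and produces
// an `unfulfilled_lint_expectations` warning once the function is used,
// prompting removal of the attribute.
#[expect(dead_code, reason = "kept as scaffolding; not called yet")]
fn helper_not_yet_used() -> u32 {
    42
}

/// A function that is actually called, so it needs no suppression.
fn answer() -> u32 {
    7
}
```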
…apache#19294)

## Which issue does this PR close?

- Part of apache#19294.

## Rationale for this change

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?

---------

Co-authored-by: Sergey Zhukov <szhukov@aligntech.com>
## Which issue does this PR close?

N/A

## Rationale for this change

| Benchmark        | Original (main) | After reusable buffer | After no-builder | Total Improvement |
|------------------|-----------------|-----------------------|------------------|-------------------|
| **size=1024**    |                 |                       |                  |                   |
| i32_random       | 15.53 µs        | 7.32 µs               | 6.41 µs          | -58.7%            |
| i64_random       | 16.37 µs        | 8.01 µs               | 6.92 µs          | -57.7%            |
| i64_large_values | 15.65 µs        | 8.34 µs               | 7.06 µs          | -54.9%            |
| **size=4096**    |                 |                       |                  |                   |
| i32_random       | 57.11 µs        | 28.39 µs              | 24.66 µs         | -56.8%            |
| i64_random       | 62.19 µs        | 31.10 µs              | 27.62 µs         | -55.6%            |
| i64_large_values | 61.10 µs        | 30.60 µs              | 26.98 µs         | -55.8%            |
| **size=8192**    |                 |                       |                  |                   |
| i32_random       | 118.71 µs       | 67.62 µs              | 51.45 µs         | -56.7%            |
| i64_random       | 141.29 µs       | 62.20 µs              | 54.19 µs         | -61.6%            |
| i64_large_values | 123.05 µs       | 60.14 µs              | 53.45 µs         | -56.6%            |

## What changes are included in this PR?

Avoid string allocations and re-use a mutable buffer.

## Are these changes tested?

## Are there any user-facing changes?
## Which issue does this PR close? N/A ## Rationale for this change A legacy inline insta snapshot was using an outdated format that does not start and end with a newline. Newer versions of insta warn about this format and will fail to match such snapshots in the future. Updating the snapshot to the modern multiline format ensures forward compatibility and avoids future snapshot failures. ## What changes are included in this PR? - Converted a legacy inline insta snapshot to the modern multiline raw snapshot format. - Added leading and trailing newlines as required by current insta snapshot conventions. - No changes were made to the underlying logic or behavior. ## Are these changes tested? Yes ## Are there any user-facing changes? No
## Which issue does this PR close?

N/A

## Rationale for this change

Use a re-usable string buffer instead of allocating a new string for each input value.

| Benchmark                | Main (µs) | Optimized (µs) | Improvement     |
|--------------------------|-----------|----------------|-----------------|
| **size=1024, repeat=3**  |           |                |                 |
| repeat_string_view       | 76.51     | 70.14          | -8.3%           |
| repeat_string            | 78.63     | 71.41          | -9.2%           |
| repeat_large_string      | 76.40     | 71.08          | -7.0%           |
| **size=1024, repeat=30** |           |                |                 |
| repeat_string_view       | 109.02    | 93.51          | -14.2%          |
| repeat_string            | 108.46    | 92.12          | -15.1%          |
| repeat_large_string      | 105.99    | 91.66          | -13.5%          |
| **size=4096, repeat=3**  |           |                |                 |
| repeat_string_view       | 139.44    | 113.95         | -18.3%          |
| repeat_string            | 133.62    | 112.25         | -16.0%          |
| repeat_large_string      | 131.94    | 108.41         | -17.8%          |
| **size=4096, repeat=30** |           |                |                 |
| repeat_string_view       | 251.77    | 193.95         | -23.0%          |
| repeat_string            | 250.58    | 191.86         | -23.4%          |
| repeat_large_string      | 248.88    | 188.43         | -24.3%          |
| **overflow tests**       |           |                |                 |
| size=1024                | 58.14     | 58.02          | ~0% (no change) |
| size=4096                | 58.26     | 58.08          | ~0% (no change) |

## What changes are included in this PR?

## Are these changes tested?

## Are there any user-facing changes?
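The idea of re-using one string buffer across rows, rather than allocating a fresh `String` per input value, can be sketched as below. This is an illustrative model only: `StringColumn` and `repeat_column` are hypothetical names, and the flat values-plus-offsets layout is a simplification loosely inspired by Arrow string arrays, not the actual builder API used in the PR.

```rust
/// Minimal columnar string storage: all values concatenated in one buffer,
/// with `offsets[i]..offsets[i + 1]` delimiting row `i`.
struct StringColumn {
    values: String,
    offsets: Vec<usize>,
}

impl StringColumn {
    /// Returns the string stored at row `i`.
    fn get(&self, i: usize) -> &str {
        &self.values[self.offsets[i]..self.offsets[i + 1]]
    }
}

/// Repeats every input `n` times. A single scratch buffer is cleared and
/// refilled for each row, so its allocation is reused instead of creating
/// a new `String` per value.
fn repeat_column(inputs: &[&str], n: usize) -> StringColumn {
    let mut scratch = String::new();
    let mut values = String::new();
    let mut offsets = vec![0];
    for s in inputs {
        scratch.clear(); // keeps the capacity from earlier rows
        for _ in 0..n {
            scratch.push_str(s);
        }
        values.push_str(&scratch);
        offsets.push(values.len());
    }
    StringColumn { values, offsets }
}
```

Because `clear()` retains capacity, the scratch buffer stops reallocating once it has grown to the largest row, which fits the pattern in the table above where the gain increases with input size and repeat count.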
…e#19499)

Bumps [taiki-e/install-action](https://github.com/taiki-e/install-action) from 2.65.2 to 2.65.3.

Release notes (sourced from [taiki-e/install-action's releases](https://github.com/taiki-e/install-action/releases)):

- **2.65.3**
  - Update `tombi@latest` to 0.7.11.

Changelog (sourced from [taiki-e/install-action's changelog](https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md)):

- **[2.65.3] - 2025-12-26**
  - Update `tombi@latest` to 0.7.11.
- **[2.65.2] - 2025-12-23**
  - Update `prek@latest` to 0.2.24.
  - Update `wasmtime@latest` to 40.0.0.
  - Update `vacuum@latest` to 0.21.7.
  - Update `tombi@latest` to 0.7.10.
  - Update `syft@latest` to 1.39.0.
  - Update `cargo-binstall@latest` to 1.16.5.
- **[2.65.1] - 2025-12-21**
  - Update `tombi@latest` to 0.7.9.
  - Update `vacuum@latest` to 0.21.6.
  - Update `prek@latest` to 0.2.23.
- **[2.65.0] - 2025-12-20**
  - Support `cargo-insta`. ([#1372](https://redirect.github.com/taiki-e/install-action/pull/1372), thanks [`@CommanderStorm`](https://github.com/CommanderStorm))
  - Update `vacuum@latest` to 0.21.2.
- **[2.64.2] - 2025-12-19**
  - Update `zizmor@latest` to 1.19.0.
  - Update `mise@latest` to 2025.12.12.
- ... (truncated)

Commits:

- [`de7896b`](https://github.com/taiki-e/install-action/commit/de7896b7cd1c7d181266425abbe571b5a8c757bc) Release 2.65.3
- [`6737b09`](https://github.com/taiki-e/install-action/commit/6737b0942ddf3d98c8e52b05b0d91e9aec8cd440) Update `tombi@latest` to 0.7.11
- See full diff in [compare view](https://github.com/taiki-e/install-action/compare/50cee16bd6b97b2579572f83cfa1c0a721b1e336...de7896b7cd1c7d181266425abbe571b5a8c757bc)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
## Which issue does this PR close?

Closes N/A (no existing issue)

This change addresses a compiler warning observed during local test runs and does not correspond to a tracked bug or feature request.

## Rationale for this change

During local test runs, the helper function `ctx_and_codec` in `datafusion/ffi/tests/utils` triggered a `dead_code` warning. While the function is currently unused, it encapsulates non-trivial setup logic for FFI integration tests and is expected to be reused as FFI test coverage expands.

This change documents the intent of the helper and explicitly marks it as intentionally retained scaffolding, improving code clarity.

## What changes are included in this PR?

- Added documentation explaining the purpose of the `ctx_and_codec` helper
- Explicitly marked the helper as intentionally unused to silence `dead_code` warnings

## Are these changes tested?

Yes.

- This change is documentation-only and does not affect runtime behavior.
- Existing test coverage remains unchanged and continues to pass.

## Are there any user-facing changes?

No. This PR only affects internal test utilities and does not impact public APIs or user-facing behavior.
…ache#19516)

## Which issue does this PR close?

- Closes #.

## Rationale for this change

- Scalar argument optimization delivers a 3.6x-8x speedup for the common case of `starts_with(column, 'literal')` or `ends_with(column, 'literal')`
- StringViewArray benefits even more (~6-8x) than StringArray (~3.6-3.8x)
- The optimization uses Arrow's `Scalar` wrapper to avoid broadcasting scalar values to full arrays

### starts_with

| Benchmark                | Before   | After    | Speedup |
|--------------------------|----------|----------|---------|
| StringArray + scalar     | 32.38 µs | 8.49 µs  | 3.8x    |
| StringViewArray + scalar | 78.15 µs | 9.82 µs  | 8.0x    |

### ends_with

| Benchmark                | Before   | After    | Speedup |
|--------------------------|----------|----------|---------|
| StringArray + scalar     | 32.76 µs | 9.06 µs  | 3.6x    |
| StringViewArray + scalar | 76.44 µs | 12.04 µs | 6.4x    |

## What changes are included in this PR?

Handle all combinations of array and scalar arguments without converting scalars to arrays

## Are these changes tested?

Yes, new unit tests added in this PR.

## Are there any user-facing changes?

No, just faster performance.
## Which issue does this PR close?

- Closes apache#19022
- Closes apache#19038
- Closes apache#19024
- Closes apache#19457

## Rationale for this change

Improve coverage of date / time / interval arithmetic operations

## What changes are included in this PR?

Type coercion improvements, numerous slt tests.

## Are these changes tested?

Yes

## Are there any user-facing changes?

Additional arithmetic support.

Thanks to @foskey51 for the timestamp + duration fix, @pepijnve for the initial set of .slt tests.

---------

Co-authored-by: Pepijn Van Eeckhoudt <pepijn@vaneeckhoudt.net>
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
## Which issue does this PR close?

- N/A.

## Rationale for this change

Implement `ExecutionPlan::reset_state` for `LazyMemoryExec` (used in e.g. `generate_series`) so it can be reused across executions.

## What changes are included in this PR?

- Implemented `ExecutionPlan::reset_state` for `LazyMemoryExec`.
- Added `reset_state` to the `LazyBatchGenerator` trait and implemented it for the structs that implement the trait.
- Added unit tests.

## Are these changes tested?

Yes.

## Are there any user-facing changes?

Yes, new API method in the `LazyBatchGenerator` trait.
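To illustrate why a resettable generator matters for plan reuse, here is a minimal sketch in plain Rust. The `BatchGenerator` trait, `SeriesGenerator` struct, and `drain` helper are hypothetical simplifications invented for this example; they are not the real `LazyBatchGenerator` API, whose method signatures may differ.

```rust
/// Hypothetical, simplified stand-in for a lazy batch generator.
/// The real `LazyBatchGenerator` trait in DataFusion has a different shape.
trait BatchGenerator {
    /// Produces the next batch, or `None` once the generator is exhausted.
    fn next_batch(&mut self) -> Option<Vec<i64>>;
    /// Rewinds the generator so a second execution starts from the beginning.
    fn reset_state(&mut self);
}

/// Generates the inclusive range `start..=end` in batches of `batch_size`.
struct SeriesGenerator {
    start: i64,
    end: i64,
    batch_size: usize,
    current: i64,
}

impl SeriesGenerator {
    fn new(start: i64, end: i64, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self { start, end, batch_size, current: start }
    }
}

impl BatchGenerator for SeriesGenerator {
    fn next_batch(&mut self) -> Option<Vec<i64>> {
        if self.current > self.end {
            return None;
        }
        // Last value of this batch, clamped to the end of the series.
        let last = self.end.min(self.current + self.batch_size as i64 - 1);
        let batch: Vec<i64> = (self.current..=last).collect();
        self.current = last + 1;
        Some(batch)
    }

    fn reset_state(&mut self) {
        // Only the cursor is execution state; the range itself is configuration.
        self.current = self.start;
    }
}

/// Runs the generator to exhaustion and concatenates every batch.
fn drain(generator: &mut dyn BatchGenerator) -> Vec<i64> {
    let mut out = Vec::new();
    while let Some(batch) = generator.next_batch() {
        out.extend(batch);
    }
    out
}
```

Without the reset, draining the same generator a second time yields nothing, which is the reuse problem the change above addresses for `LazyMemoryExec`.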
…che#19483) Because of apache#15886 a parse -> unparse -> parse loop changed the query so that it would give incorrect results.
…ment (apache#18993)

It was previously ignored

## Which issue does this PR close?

- Closes apache#18992.

## Rationale for this change

All `TableProvider` implementations must support the `projection` argument of the `scan` method. This was not the case in `CteWorkTable`.

## What changes are included in this PR?

Minimal implementation of the projection support. The projection is applied just before the plan node returns its results. It might be nice to push it further inside the recursion implementation to reduce memory consumption, but I preferred to keep the fix minimal.

## Are these changes tested?

I have not yet figured out a nice SQL query that triggers an error without this change. Some existing queries in `cte.slt` set a projection (i.e. not `None`), so the code is very likely working. There is also a test on the projection itself in `WorkTableExec`.
## Which issue does this PR close?

- Closes apache#19322

## Rationale for this change

When calculating the median of an even-length array of integers, averaging the two middle values using `add_wrapping` causes incorrect results due to integer overflow.

For example, with Int8 values -85 and -56:

```
Expected: (-85 + -56) / 2 = -70
Actual: -85 + -56 = -141 wraps to 115, then 115 / 2 = 57
```

## What changes are included in this PR?

- Fix `calculate_median`: use `add_checked` to detect overflow, and fall back to a safe midpoint formula `a/2 + b/2 + ((a%2 + b%2) / 2)` when overflow occurs.
- Add tests

## Are these changes tested?

- All previous tests pass
- Added new tests

## Are there any user-facing changes?
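The overflow and the fallback formula described above can be reproduced with plain `i8` arithmetic. This is a minimal sketch using only the Rust standard library; `midpoint_i8` and `midpoint_wrapping_i8` are hypothetical helper names for illustration and are not the `calculate_median` code from the PR, which works on Arrow native types.

```rust
/// Average of two `i8` values, truncating toward zero like integer division.
/// Hypothetical helper that mirrors the checked-add-then-fallback idea.
fn midpoint_i8(a: i8, b: i8) -> i8 {
    match a.checked_add(b) {
        // The sum fits in the type, so the plain average is safe.
        Some(sum) => sum / 2,
        // The sum overflowed: halve each operand first, then add back the
        // half contributed by the two remainders.
        None => a / 2 + b / 2 + (a % 2 + b % 2) / 2,
    }
}

/// The behaviour described as buggy: the sum wraps around before the division.
fn midpoint_wrapping_i8(a: i8, b: i8) -> i8 {
    a.wrapping_add(b) / 2
}
```

For the values from the description, `midpoint_wrapping_i8(-85, -56)` gives 57, while `midpoint_i8(-85, -56)` takes the fallback branch and computes `-42 + -28 + 0 = -70`.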
## Which issue does this PR close?

- Closes apache#19403

## Rationale for this change

After apache#19064 was completed, I found it couldn't meet our internal project requirements:

1. **Redesign try_reverse_output using EquivalenceProperties**
   - The previous implementation used a simple `is_reverse` method that could only handle basic reverse matching
   - Leveraging `EquivalenceProperties` now handles more complex scenarios, including constant column elimination and monotonic functions
2. **Switch to ordering_satisfy method for ordering matching**
   - This method internally:
     - Normalizes orderings (removes constant columns)
     - Checks monotonic functions (like `date_trunc`, `CAST`, `CEIL`)
     - Handles prefix matching
3. **Extend sort pushdown support to more operators**
   - Added `try_pushdown_sort` implementation for `ProjectionExec`, `FilterExec`, `CooperativeExec`
   - These operators can now pass sort requests down to their children

## What changes are included in this PR?

### Core Changes:

1. **ParquetSource::try_reverse_output** (datasource-parquet/src/source.rs)
   - Added `eq_properties` parameter
   - Reverses all orderings in equivalence properties
   - Uses `ordering_satisfy` to check if reversed ordering satisfies the request
   - Removed `file_ordering` field and `with_file_ordering_info` method
2. **FileSource trait** (datasource/src/file.rs)
   - Updated `try_reverse_output` signature with `eq_properties` parameter
   - Added detailed documentation explaining parameter usage and examples
3. **FileScanConfig::try_pushdown_sort** (datasource/src/file_scan_config.rs)
   - Simplified logic to directly call `file_source.try_reverse_output`
   - No longer needs to pre-check ordering satisfaction or set file ordering info
4. **New operator support**
   - `FilterExec::try_pushdown_sort` - Pushes sort below filters
   - `ProjectionExec::try_pushdown_sort` - Pushes sort below projections
   - `CooperativeExec::try_pushdown_sort` - Supports sort pushdown in cooperative execution
5. **Removed obsolete methods**
   - Deleted `LexOrdering::is_reverse` - replaced by `ordering_satisfy`

### Feature Enhancements:

**Supported optimization scenarios:**

1. **Constant column elimination** (Test 7)

```sql
-- File ordering: [timeframe ASC, period_end ASC]
-- Query: WHERE timeframe = 'quarterly' ORDER BY period_end DESC
-- Effect: After timeframe becomes constant, reverse scan is enabled
```

2. **Monotonic function support** (Test 8)

```sql
-- File ordering: [ts ASC]
-- Query: ORDER BY date_trunc('month', ts) DESC
-- Effect: date_trunc is monotonic, reverse scan satisfies the request
```

## Are these changes tested?

Yes, comprehensive tests have been added:

- **Test 7 (237 lines)**: Constant column elimination scenarios
  - Single constant column filter
  - Multi-value IN clauses (doesn't trigger optimization)
  - Literal constants in sort expressions
  - Non-leading column filters (edge cases)
- **Test 8 (355 lines)**: Monotonic function scenarios
  - `date_trunc` (date truncation)
  - `CAST` (type conversion)
  - `CEIL` (ceiling)
  - `ABS` (negative case - not monotonic over mixed positive/negative range)

All tests verify:

- Presence of `reverse_row_groups=true` in physical plans
- Correctness of query results

## Are there any user-facing changes?

**API Changes:**

- `FileSource::try_reverse_output` signature changed (added `eq_properties` parameter)
- Removed `FileSource::with_file_ordering_info` method
- Removed `LexOrdering::is_reverse` public method

**User-visible improvements:**

- More queries can leverage reverse row group scanning for optimization
- Especially queries with `WHERE` clauses that make certain columns constant
- Queries using monotonic functions (like date functions, type conversions)

**Note:** This PR returns `Inexact` results because only row group order is reversed, not row order within row groups. Future enhancements could include:

- File reordering based on statistics (returning `Exact`)
- Partial sort pushdown for prefix matches

---------

Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
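The constant-column and prefix cases described above can be modelled with a toy check. This is only a sketch under simplifying assumptions: the `Dir` enum, `SortKey` alias, and the three functions are hypothetical and stand in for the idea behind `ordering_satisfy`; they are not DataFusion's `EquivalenceProperties` API, and monotonic functions are not modelled at all.

```rust
use std::collections::HashSet;

/// Sort direction of a single sort key (hypothetical, for illustration only).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Dir {
    Asc,
    Desc,
}

impl Dir {
    fn flip(self) -> Dir {
        match self {
            Dir::Asc => Dir::Desc,
            Dir::Desc => Dir::Asc,
        }
    }
}

/// A column name paired with its direction.
type SortKey = (&'static str, Dir);

/// Ordering produced when the file is read back to front.
fn reverse(ordering: &[SortKey]) -> Vec<SortKey> {
    ordering.iter().map(|&(col, dir)| (col, dir.flip())).collect()
}

/// Drops columns pinned to a single value (for example by `WHERE col = 'x'`),
/// since a constant column cannot affect the relative order of rows.
fn strip_constants(ordering: &[SortKey], constants: &HashSet<&'static str>) -> Vec<SortKey> {
    ordering
        .iter()
        .copied()
        .filter(|(col, _)| !constants.contains(col))
        .collect()
}

/// True when a reversed scan would yield rows in the requested order,
/// treating the request as a prefix of the normalized reversed file ordering.
fn reverse_scan_satisfies(
    file_ordering: &[SortKey],
    requested: &[SortKey],
    constants: &HashSet<&'static str>,
) -> bool {
    let available = strip_constants(&reverse(file_ordering), constants);
    let wanted = strip_constants(requested, constants);
    available.starts_with(&wanted)
}
```

With file ordering `[timeframe ASC, period_end ASC]` and `timeframe` constant, a request for `period_end DESC` is satisfied; without the constant it is not, because the reversed ordering starts with `timeframe DESC`.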
…apache#19378)

# Relates to: apache#17379

## Rationale for this change

Named parameter matching failed when function signatures used non-lowercase parameter names (e.g., `startTime`, `SYMBOL`). The SQL parser normalizes unquoted argument names to lowercase, but signature parameter names were not normalized during lookup, causing mismatches.

Example that failed before this fix:

```sql
-- Function signature: ["startTime", "endTime"]
SELECT func(starttime => 0); -- ERROR: Unknown parameter 'starttime'
```

## What changes are included in this PR?

**Commit 1: Support case-insensitive named parameters**

- Modified `resolve_function_arguments` to normalize signature parameter names during lookup
- Fixes matching for signatures with camelCase or uppercase parameter names
- Added unit test `test_case_insensitive_parameter_matching`

**Commit 2: Require case-sensitive matching for quoted identifiers**

- Added `ArgumentName` struct to preserve quote information from SQL parser
- Implements SQL standard: quoted identifiers require exact case match
- Unquoted identifiers remain case-insensitive
- Added unit test `test_quoted_parameter_case_sensitive`

## Are these changes tested?

Yes:

- Unit tests verify case-insensitive matching for unquoted identifiers
- Unit tests verify case-sensitive matching for quoted identifiers
- Tests include mixed-case signature parameters (`["prefix", "startPos", "LENGTH"]`)
- Existing sqllogictests validate end-to-end behavior

## Are there any user-facing changes?

Yes - this is a bug fix and enhancement:

- **Bug fix:** Functions with non-lowercase parameter names now work correctly with any-case unquoted arguments
- **Enhancement:** Quoted identifiers now follow SQL standards (case-sensitive matching)
- No breaking changes for existing queries
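The matching rule from the two commits above (unquoted names match case-insensitively, quoted names match exactly) can be sketched as follows. The `ArgumentName` fields and the `find_parameter` function are assumptions made for this example; the PR names the struct but its actual layout and the body of `resolve_function_arguments` are not shown here. The sketch also compares with ASCII case folding rather than lowercasing both sides, which is equivalent for ASCII names.

```rust
/// Hypothetical stand-in for the `ArgumentName` struct mentioned above.
/// The field names are assumptions made for this sketch.
struct ArgumentName {
    /// The name exactly as written in the SQL text.
    value: String,
    /// Whether the name was written as a quoted identifier.
    is_quoted: bool,
}

/// Returns the position of the signature parameter that `arg` refers to.
/// Quoted names must match exactly; unquoted names ignore ASCII case.
fn find_parameter(signature: &[&str], arg: &ArgumentName) -> Option<usize> {
    signature.iter().position(|param| {
        if arg.is_quoted {
            *param == arg.value
        } else {
            param.eq_ignore_ascii_case(&arg.value)
        }
    })
}
```

With the mixed-case signature `["prefix", "startPos", "LENGTH"]`, an unquoted `startpos` resolves to index 1, while a quoted `"startpos"` finds no match and a quoted `"startPos"` resolves to index 1.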
)

## Which issue does this PR close?

- Part of apache#12725

## Rationale for this change

The goal stated in the issue is to avoid using `user_defined` wherever a more specific signature can be used instead.

## What changes are included in this PR?

- Refactored Spark `ascii` to not use `user_defined`.
- Checked the Spark code to make sure it follows the same convention

## Are these changes tested?

- This is a refactoring, so no new tests were added, but all previous tests pass.

## Are there any user-facing changes?
…pache#19447)

## Which issue does this PR close?

Part of apache#18456

## Rationale for this change

See the issue for the rationale; this PR implements the first part of the issue.

Demo in `datafusion-cli`:

```
DataFusion CLI v51.0.0
> explain analyze select a+1, pow(a,2) from generate_series(1,1000000) as t1(a);
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type         | plan |
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | ProjectionExec: expr=[a@0 + 1 as t1.a + Int64(1), power(CAST(a@0 AS Float64), 2) as pow(t1.a,Int64(2))], metrics=[output_rows=1.00 M, elapsed_compute=32.37ms, output_bytes=15.3 MB, output_batches=123, expr_eval_time_0=8.27ms, expr_eval_time_1=23.82ms] |
|                   |   RepartitionExec: partitioning=RoundRobinBatch(14), input_partitions=1, metrics=[output_rows=1.00 M, elapsed_compute=780.87µs, output_bytes=7.7 MB, output_batches=123, spill_count=0, spilled_bytes=0.0 B, spilled_rows=0, fetch_time=1.19ms, repartition_time=1ns, send_time=186.54µs] |
|                   |     ProjectionExec: expr=[value@0 as a], metrics=[output_rows=1.00 M, elapsed_compute=148.62µs, output_bytes=7.7 MB, output_batches=123, expr_eval_time_0=4.12µs] |
|                   |       LazyMemoryExec: partitions=1, batch_generators=[generate_series: start=1, end=1000000, batch_size=8192], metrics=[output_rows=1.00 M, elapsed_compute=1.01ms, output_bytes=7.7 MB, output_batches=123] |
|                   | |
+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
```

Each expression's evaluation time is tracked individually in `expr_eval_time_*`. `expr_eval_time_0` is for `a+1`, and `expr_eval_time_1` is for `pow(a,2)`. I chose a numeric index because the pretty formatting for expressions is still a bit verbose (like `[a@0 + 1 as t1.a + Int64(1)`).

## What changes are included in this PR?

1. Moved the metrics module from the `datafusion-execution` crate to `datafusion-physical-expr-common`. (I had optimistically thought that moving it to the `execution` crate earlier would be enough; however, many execution utilities live in `physical-expr-common`, so it has to move further down the dependency tree.)
2. Added a struct to hold each expression's evaluation time: `ExpressionEvaluatorMetrics`
3. Put `Option<ExpressionEvaluatorMetrics>` inside `Projector`, and changed the execution accordingly.

## Are these changes tested?

Yes, added several slts

## Are there any user-facing changes?

No
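The idea of tracking one timer per projected expression, keyed by its index, can be sketched with the standard library alone. `ExprTimings` and `project` are hypothetical names invented for this example; they only imitate the role described for `ExpressionEvaluatorMetrics` and do not reflect its real fields or how `Projector` uses it.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-expression timer, one slot per projected expression.
struct ExprTimings {
    elapsed: Vec<Duration>,
}

impl ExprTimings {
    fn new(num_exprs: usize) -> Self {
        Self { elapsed: vec![Duration::ZERO; num_exprs] }
    }

    /// Runs `eval` and adds its wall-clock time to the slot at `index`.
    fn time<T>(&mut self, index: usize, eval: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = eval();
        self.elapsed[index] += start.elapsed();
        out
    }

    /// Renders the timers using the numeric naming scheme from the demo output.
    fn report(&self) -> Vec<String> {
        self.elapsed
            .iter()
            .enumerate()
            .map(|(i, d)| format!("expr_eval_time_{i}={d:?}"))
            .collect()
    }
}

/// Evaluates `a + 1` and `pow(a, 2)` over one batch, timing each separately.
fn project(batch: &[i64], timings: &mut ExprTimings) -> (Vec<i64>, Vec<f64>) {
    let plus_one: Vec<i64> = timings.time(0, || batch.iter().map(|a| a + 1).collect());
    let squared: Vec<f64> =
        timings.time(1, || batch.iter().map(|a| (*a as f64).powi(2)).collect());
    (plus_one, squared)
}
```

Because the slots accumulate across calls, invoking `project` once per batch yields a total per expression for the whole query, which matches how the demo reports a single `expr_eval_time_0` and `expr_eval_time_1` for all 123 batches.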
)

Closes apache#19169

## Rationale for this change:

The current implementation of the `SparkAscii` UDF uses the default `is_nullable`, which always returns true. This is incorrect because the output should only be nullable if the input argument is nullable. This change implements proper null propagation behavior by using `return_field_from_args`.

## Changes in PR:

- Implemented `return_field_from_args` for `SparkAscii` to properly compute output nullability based on input argument nullability
- Changed `return_type` to return `internal_err!` since `return_field_from_args` is now used (following the pattern used by other Spark functions like ilike, concat, elt)
- Added unit tests verifying the nullability behavior:
  - Output is nullable when input is nullable
  - Output is non-nullable when input is non-nullable

## Test Coverage:

Yes, tests are included to verify the change.

## User-facing Changes:

No user-facing changes.

Co-authored-by: Jefffrey <jeffrey.vo.australia@gmail.com>
# Conflicts: # datafusion/functions/src/datetime/to_unixtime.rs
Labels
catalog
common
core
datasource
development-process
documentation
Improvements or additions to documentation
execution
ffi
functions
logical-expr
optimizer
physical-expr
physical-plan
proto
spark
sql
sqllogictest
substrait
PR to refactor and update your upstream PR. I took the liberty of merging up to latest main. Let me know what you think.
The biggest changes are
All my changes are in abbf107