
Conversation


@HarshMN2345 (Member) commented Jan 20, 2026

What does this PR do?

(Provide a description of what this PR does.)

Test Plan

(Write your test plan here. If you changed any code, please provide us with clear instructions on how you verified your changes work.)

Related PRs and Issues

(If this PR is related to any other PR, resolves any issue, or relates to any issue, link all related PRs and issues here.)

Have you read the Contributing Guidelines on issues?

(Write your answer here.)

Summary by CodeRabbit

  • Chores
    • Canonical docs version standardized to the cloud variant; canonicalization logic simplified.
  • SEO / Indexing
    • Added noindex handling for internal, staging, and non-canonical doc versions via response headers and meta tags.
  • Sitemap & robots.txt
    • Tightened sitemap exclusions and expanded robots rules to block internal/auth routes, tracking parameters, and non-cloud docs references while exposing canonical content.



coderabbitai bot commented Jan 20, 2026

Walkthrough

Adds SEO controls across the site (see the middleware sketch after this list):
  • A new seoOptimization middleware, run after initSession, injects an x-robots-tag: noindex, nofollow header for configured internal paths and staging hosts.
  • Sitemap generation adds an INTERNAL_PATHS list and excludes internal/auth routes, /threads/*, .json/.xml files, and docs references not under /docs/references/cloud/.
  • CANONICAL_VERSION was removed; normalizeDocsVersion now treats 'cloud' as the canonical version.
  • The docs references layout emits a meta robots noindex tag for non-canonical versions.
  • robots.txt rules were expanded to cover tracking parameters, internal/auth pages, and versioned docs references.
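A minimal sketch of what such a hook could look like in a SvelteKit hooks file is shown below. The INTERNAL_PATHS and NOINDEX_HOSTS patterns here are placeholders for illustration, not the PR's actual lists, and composing the hook after initSession (e.g. via sequence) is assumed rather than shown:

import type { Handle } from '@sveltejs/kit';

// Placeholder patterns; the real lists are defined in the PR.
const INTERNAL_PATHS = [/^\/console\//, /^\/v1\//];
const NOINDEX_HOSTS = [/^stage\./i];

export const seoOptimization: Handle = async ({ event, resolve }) => {
    const response = await resolve(event);

    const internalPath = INTERNAL_PATHS.some((pattern) => pattern.test(event.url.pathname));
    const stagingHost = NOINDEX_HOSTS.some((pattern) => pattern.test(event.url.hostname));

    // Ask crawlers not to index internal paths or anything served from staging hosts.
    if (internalPath || stagingHost) {
        response.headers.set('x-robots-tag', 'noindex, nofollow');
    }

    return response;
};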

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 66.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title 'fix: Improve SEO canonicalization and block non-indexable pages' clearly and concisely summarizes the main changes across all modified files.




coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `src/routes/robots.txt/+server.ts`:
- Line 26: Remove the unnecessary backslashes used to escape double quotes
inside the template literal that builds the robots.txt content (the literal that
blocks versioned docs and references "cloud"); change occurrences of \"cloud\"
(and any other \"...\" instances) to plain "cloud" (and "..." respectively) so
the template uses normal double quotes within the template literal, then re-run
linting to ensure ESLint warnings are resolved.
🧹 Nitpick comments (1)
src/routes/robots.txt/+server.ts (1)

27-33: Consider future-proofing version blocking.

The current approach explicitly blocks versions starting with 0, 1, 2, 3. If new major versions (e.g., 4.x, 5.x) are released in the future, additional Disallow rules would need to be added manually.

An alternative approach would be to invert the logic—disallow all of /docs/references/ and then explicitly allow only /docs/references/cloud/. However, robots.txt precedence rules vary by crawler (Google uses most-specific-wins, others may use first-match), so the current explicit approach is safer and more predictable.
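One way to keep the explicit (and, per the above, more predictable) block list while reducing the manual upkeep would be to generate the version rules from an array inside the endpoint. A hypothetical sketch of src/routes/robots.txt/+server.ts along those lines, not the PR's actual code (note the plain, unescaped quotes in the comment):

import type { RequestHandler } from '@sveltejs/kit';

// Adding a future major version (e.g. '4') becomes a one-entry change.
const BLOCKED_REFERENCE_MAJORS = ['0', '1', '2', '3'];

const versionRules = BLOCKED_REFERENCE_MAJORS
    .map((major) => `Disallow: /docs/references/${major}`)
    .join('\n');

export const GET: RequestHandler = () => {
    const body = `User-agent: *
# Block all versioned docs references; only "cloud" should be indexable
${versionRules}
# Allow canonical cloud docs
Allow: /docs/references/cloud/`;

    return new Response(body, { headers: { 'Content-Type': 'text/plain' } });
};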

Comment on line 26
Disallow: /console/register
Disallow: /v1/
# Block all versioned docs references; only \"cloud\" should be indexable

⚠️ Potential issue | 🟡 Minor

Remove unnecessary escape characters in template literal.

In JavaScript template literals, double quotes don't require escaping. The backslashes before the quotes are unnecessary and trigger ESLint errors.

Proposed fix
-# Block all versioned docs references; only \"cloud\" should be indexable
+# Block all versioned docs references; only "cloud" should be indexable
🧰 Tools
🪛 ESLint

[error] 26-26: Unnecessary escape character: ".

(no-useless-escape)


[error] 26-26: Unnecessary escape character: ".

(no-useless-escape)


Comment on lines 27 to 33
Disallow: /docs/references/0
Disallow: /docs/references/1
Disallow: /docs/references/2
Disallow: /docs/references/3
# Allow canonical cloud docs
Allow: /docs/references/cloud/`;
Member commented:

why not just -

# Block all versioned docs references
Disallow: /docs/references/*/

# Allow canonical cloud docs
Allow: /docs/references/cloud/

/^\/v1\//
];

const NOINDEX_HOSTS = [/^stage\./i, /^fra\./i, /^internal\./i];
Member commented:

where did internal come from? what about other regions? maybe any *.cloud.appwrite.io makes more sense?
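A rough sketch of the broader pattern being suggested, matching any regional subdomain of cloud.appwrite.io (the exact regex, and keeping the stage. rule alongside it, are assumptions):

// Matches e.g. fra.cloud.appwrite.io or any other *.cloud.appwrite.io host.
const NOINDEX_HOSTS = [/^stage\./i, /\.cloud\.appwrite\.io$/i];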

Disallow: /*&ref=
Disallow: /*&adobe_mc=
Disallow: /*&trk=
Member commented:

wildcards should work here -

# Block tracking parameters
Disallow: /*?utm_
Disallow: /*&utm_
Disallow: /*?ref=
Disallow: /*&ref=
Disallow: /*?trk=
Disallow: /*&trk=
Disallow: /*?adobe_mc=
Disallow: /*&adobe_mc=
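If the list of tracking parameters keeps growing, the ?/& pairs could also be generated rather than written by hand. A hypothetical helper for the same robots.txt endpoint (parameter names taken from the comment above, everything else assumed):

// Each parameter needs two rules: one for when it starts the query string
// (/*?utm_) and one for when it follows another parameter (/*&utm_).
const TRACKING_PARAMS = ['utm_', 'ref=', 'trk=', 'adobe_mc='];

const trackingRules = TRACKING_PARAMS
    .flatMap((param) => [`Disallow: /*?${param}`, `Disallow: /*&${param}`])
    .join('\n');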
