Split metrics documentation table by source #1025

kaessert · 2025-11-26T10:33:51Z

The metrics documentation previously mixed metrics from different sources in a single table, making it unclear where each metric originates.

Split into four sections:

- Crossplane core metrics: function runner, response cache, circuit breaker, and engine metrics emitted by the Crossplane pod
- Provider metrics: crossplane_managed_resource_* metrics from crossplane-runtime, emitted by all providers
- Upjet provider metrics: upjet_resource_* metrics only from Upjet-based providers (provider-aws, provider-azure, provider-gcp)
- Controller-runtime and Kubernetes client metrics: external dependency metrics emitted by both Crossplane and providers
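
As a rough illustration of the split above, here is a minimal Python sketch that groups metric names by prefix. The prefix list and the `section_for` helper are assumptions made for illustration (in particular the `upjet_`, `rest_client_`, and `workqueue_` prefixes are not spelled out in this PR); the tables in the docs remain the source of truth.

```python
# Hypothetical prefix-to-section mapping, for illustration only.
# The prefixes are assumptions based on the metric names discussed in this PR.
SECTIONS = [
    ("crossplane_managed_resource_", "Provider metrics"),
    ("upjet_", "Upjet provider metrics"),
    ("controller_runtime_", "Controller-runtime and Kubernetes client metrics"),
    ("rest_client_", "Controller-runtime and Kubernetes client metrics"),
    ("workqueue_", "Controller-runtime and Kubernetes client metrics"),
    ("function_run_function_", "Crossplane core metrics"),
    ("circuit_breaker_", "Crossplane core metrics"),
    ("engine_", "Crossplane core metrics"),
]


def section_for(metric_name: str) -> str:
    """Return the docs section a metric name would fall under, or "unknown"."""
    for prefix, section in SECTIONS:
        if metric_name.startswith(prefix):
            return section
    return "unknown"
```

For example, `section_for("crossplane_managed_resource_drift_seconds")` falls under "Provider metrics", while `section_for("circuit_breaker_opens_total")` falls under "Crossplane core metrics".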

Additional changes:

- Fixed metric name composition_run_function_seconds to function_run_function_seconds (matching actual code)
- Added missing metrics: cache metrics, engine metrics, crossplane_managed_resource_drift_seconds
- Added missing upjet metrics: cli_duration, active_cli_invocations, running_processes
- Removed _bucket suffix from histogram metric names (added by Prometheus)

Applied to all doc versions: v1.20, v2.0-preview, v2.0, v2.1, master

netlify · 2025-11-26T10:33:56Z

✅ Deploy Preview for crossplane ready!

Name	Link
🔨 Latest commit	`a46d215`
🔍 Latest deploy log	https://app.netlify.com/projects/crossplane/deploys/692f4ac7c870fe00084454cf
😎 Deploy Preview	https://deploy-preview-1025--crossplane.netlify.app
Lighthouse	1 paths audited Performance: 95 (🟢 up 1 from production) Accessibility: 90 (🔴 down 2 from production) Best Practices: 92 (no change from production) SEO: 100 (no change from production) PWA: 70 (no change from production) View the detailed breakdown and full score reports

haarchri · 2025-11-26T11:04:27Z

content/master/guides/metrics.md

+
+## Upjet provider metrics
+
+These metrics are only emitted by Upjet-based providers (such as provider-aws, provider-azure, provider-gcp).


these are provider-upjet-aws, provider-upjet-azure, and provider-upjet-gcp - wonder if we want to use the repo links here

haarchri · 2025-11-26T11:10:51Z

content/master/guides/metrics.md

 prometheus.io/scrape: "true"
 ```    

+## Crossplane core metrics


at the beginning of the document we describe how to enable metrics in crossplane core and add the prometheus specifics to enable scraping

wonder if it makes sense to describe the same for the providers - per default we have the http-prom port and you need a PodMonitor or the prometheus annotations..

that does sound useful - looks like we do have a bit of that already in the provider section, e.g.:

Providers expose metrics on the metrics port (default 8080). To scrape these metrics, configure a PodMonitor or add Prometheus annotations to the provider's DeploymentRuntimeConfig.

haarchri · 2025-11-26T11:24:49Z

thanks for making this more clear - my approval is non-binding, we need @jbw976 for an additional look

jbw976

Awesome @kaessert, thank you for taking the initiative to make these metrics easier to understand! I do have a couple comments for potential further improvements, let me know what you think of them. Feel free to ping directly so we can iterate fast too 😇

jbw976 · 2025-12-02T01:32:09Z

content/master/guides/metrics.md

+These metrics are emitted by the Crossplane pod itself.
+
+{{< table "table table-hover table-striped table-sm">}}
+| Metric Name | Description | Further Explanation |


the "Further Explanation" column has no entries in it, so it's just an empty column and it causes some values from the previous column to wrap unnecessarily - should the column header just be removed?

you're right, i dropped it :)

jbw976 · 2025-12-02T01:34:02Z

content/master/guides/metrics.md

+Providers expose metrics on the `metrics` port (default `8080`). To scrape these metrics, configure a `PodMonitor` or add Prometheus annotations to the provider's `DeploymentRuntimeConfig`.
+
+{{< table "table table-hover table-striped table-sm">}}
+| Metric Name | Description | Further Explanation |


same here, the "further explanation" column is empty and can be removed...

jbw976 · 2025-12-02T01:35:20Z

content/master/guides/metrics.md

+These metrics come from the controller-runtime framework and Kubernetes client libraries. They are emitted by both Crossplane and providers.
+
 {{< table "table table-hover table-striped table-sm">}}
 | Metric Name | Description | Further Explanation |


actually, even in a table that is using the "further explanation" column, i'm a bit skeptical on the value it's providing. It seems like most of the information there could just be included in the description column. What do you think, do you believe a dedicated column for "further explanation" is the right approach?

Merged into description

jbw976 · 2025-12-02T01:41:36Z

content/master/guides/metrics.md

 prometheus.io/scrape: "true"
 ```    

+## Crossplane core metrics


that does sound useful - looks like we do have a bit of that already in the provider section, e.g.:

Providers expose metrics on the metrics port (default 8080). To scrape these metrics, configure a PodMonitor or add Prometheus annotations to the provider's DeploymentRuntimeConfig.

jbw976 · 2025-12-02T01:56:30Z

content/master/guides/metrics.md

+| {{<hover label="circuit_breaker_opens_total" line="6">}}circuit_breaker_opens_total{{</hover>}} | Number of times the XR circuit breaker transitioned from closed to open |  |
+| {{<hover label="circuit_breaker_closes_total" line="7">}}circuit_breaker_closes_total{{</hover>}} | Number of times the XR circuit breaker transitioned from open to closed |  |
+| {{<hover label="circuit_breaker_events_total" line="8">}}circuit_breaker_events_total{{</hover>}} | Number of XR watch events handled by the circuit breaker, labelled by outcome |  |
+| {{<hover label="engine_controllers_started_total" line="9">}}engine_controllers_started_total{{</hover>}} | Total number of controllers started |  |


good find on getting these names correct! i screwed this up in the release notes for v2.0, so i just retroactively went back and updated that now. Thank you! 😇

https://github.com/crossplane/crossplane/releases/tag/v2.0.0

Welcome 😊

jbw976 · 2025-12-02T02:00:43Z

content/master/guides/metrics.md

+| {{<hover label="function_run_function_request_total" line="1">}}function_run_function_request_total{{</hover>}} | Total number of RunFunctionRequests sent |  |
+| {{<hover label="function_run_function_response_total" line="2">}}function_run_function_response_total{{</hover>}} | Total number of RunFunctionResponses received |  |
+| {{<hover label="function_run_function_seconds" line="3">}}function_run_function_seconds{{</hover>}} | Histogram of RunFunctionResponse latency (seconds) |  |
+| {{<hover label="function_run_function_response_cache_hits_total" line="4">}}function_run_function_response_cache_hits_total{{</hover>}} | Total number of RunFunctionResponse cache hits |  |


there are some more cache related metrics that are not listed here but are in the code in https://github.com/crossplane/crossplane/blob/release-2.1/internal/xfn/cached/cached_runner_metrics.go#L76-L134 - consider adding those too. did they show up for you in your testing?

jbw976 · 2025-12-02T02:03:11Z

content/master/guides/metrics.md

+| {{<hover label="controller_runtime_active_workers" line="3">}}controller_runtime_active_workers{{</hover>}} | Number of used workers per controller | The number of threads processing jobs from the work queue. |
+| {{<hover label="controller_runtime_max_concurrent_reconciles" line="4">}}controller_runtime_max_concurrent_reconciles{{</hover>}} | Maximum number of concurrent reconciles per controller | Describes how reconciles can happen in parallel. |
+| {{<hover label="controller_runtime_reconcile_errors_total" line="5">}}controller_runtime_reconcile_errors_total{{</hover>}} | Total number of reconciliation errors per controller | A counter that counts reconcile errors. Sharp or non stop rising of this metric might be a problem. |
+| {{<hover label="controller_runtime_reconcile_time_seconds_bucket" line="6">}}controller_runtime_reconcile_time_seconds_bucket{{</hover>}} | Length of time per reconciliation per controller |  |


did you want to remove these _bucket suffixes also, to match what was in the PR body? (there's like 4 of them I think)

Removed _bucket suffix from histogram metric names (added by Prometheus)

Thank you! Removed
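
For context on why the suffix does not belong in the docs: a Prometheus histogram with base name `X` is exposed as the series `X_bucket`, `X_sum`, and `X_count`, so the documented name is the base name. The sketch below is a minimal illustration of mapping an exposed series name back to its base name; the `base_metric_name` helper is hypothetical and not part of Crossplane or Prometheus.

```python
# Suffixes that Prometheus appends to a histogram's base name when exposing it.
HISTOGRAM_SUFFIXES = ("_bucket", "_sum", "_count")


def base_metric_name(series_name: str) -> str:
    """Strip a histogram series suffix, if present, to get the base metric name.

    Caveat: this is purely name-based, so a non-histogram metric whose name
    happens to end in one of these suffixes would be stripped as well.
    """
    for suffix in HISTOGRAM_SUFFIXES:
        if series_name.endswith(suffix):
            return series_name[: -len(suffix)]
    return series_name
```

With this, `controller_runtime_reconcile_time_seconds_bucket` maps to `controller_runtime_reconcile_time_seconds`, and a counter such as `circuit_breaker_opens_total` is returned unchanged.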

kaessert · 2025-12-02T08:42:07Z

@jbw976 that does sound useful - looks like we do have a bit of that already in the provider section - doesn't hurt to mention it here though or wdyt? 🤔

jbw976 · 2025-12-02T16:28:54Z

doesn't hurt to mention it here though or wdyt? 🤔

yeah for sure, it wouldn't hurt to mention it in the top section as well!

jbw976

Awesome @kaessert, this is looking really good now! just another comment though on the older versions and how best we can approach those. What do you think?

jbw976 · 2025-12-02T16:39:34Z

content/v1.20/guides/metrics.md

 prometheus.io/scrape: "true"
 ```    

+## Crossplane core metrics


come to think of it, i don't think we should update old versions like v1.20 and v2.0-preview, because a lot of these values had different names or didn't even exist in those previous versions. but let's also try not to overcomplicate it either, so what do you think of the following?

- don't bother updating v2.0-preview, that release is a bit of a one-off and those docs aren't even exposed in the version picker anymore
- don't bother updating v1.20, unless you really want to get every old name correct for crossplane_composition_watches_* etc. as shown in https://github.com/crossplane/crossplane/releases/tag/v2.0.0
- remove circuit_breaker_* metrics from v2.0, those were added in v2.1
- keep v2.1 and master as is 🎉

Great points!

- v2.0-preview changes i threw away
- put an effort to get every v1.20 old name correct -> we already went too far for shortcuts 😅
- circuit_breaker_* was removed from everywhere except v2.1
- v2.1 master kept as is ;)

jbw976 · 2025-12-02T18:44:54Z

oh also looks like there are a number of Vale issues to fix for this PR too 😇
https://github.com/crossplane/docs/actions/runs/19853004247/job/56928239618?pr=1025

thanks for continuing to make progress @kaessert!

The metrics documentation previously mixed metrics from different sources in a single table, making it unclear where each metric originates.

Split into four sections:
- Crossplane core metrics: function runner, response cache, circuit breaker, and engine metrics emitted by the Crossplane pod
- Provider metrics: crossplane_managed_resource_* metrics from crossplane-runtime, emitted by all providers
- Upjet provider metrics: upjet_resource_* metrics only from Upjet-based providers (provider-aws, provider-azure, provider-gcp)
- Controller-runtime and Kubernetes client metrics: external dependency metrics emitted by both Crossplane and providers

Additional changes:
- Fixed metric name composition_run_function_seconds to function_run_function_seconds (matching actual code)
- Added missing metrics: cache metrics, engine metrics, crossplane_managed_resource_drift_seconds
- Added missing upjet metrics: cli_duration, active_cli_invocations, running_processes
- Removed _bucket suffix from histogram metric names (added by Prometheus)

Applied to all doc versions: v1.20, v2.0-preview, v2.0, v2.1, master

Signed-off-by: Tobias Kässer <tobias.kasser@upbound.io>

kaessert · 2025-12-02T20:24:02Z

@jbw976 I think we should be good now. Everything should be addressed and vale is passing for me at least locally too :)

jbw976

we already went too far for shortcuts

i love this sentiment 😂

this looks great to me @kaessert, thanks for getting this to the finish line!! 🏁

kaessert force-pushed the fix-metrics-docs branch from 01af529 to 27e0e50 Compare November 26, 2025 10:35

haarchri reviewed Nov 26, 2025

kaessert force-pushed the fix-metrics-docs branch from 27e0e50 to 510678d Compare November 26, 2025 11:17

haarchri approved these changes Nov 26, 2025

jbw976 reviewed Dec 2, 2025

kaessert force-pushed the fix-metrics-docs branch 3 times, most recently from 905ca02 to 39bc1b0 Compare December 2, 2025 09:04

jbw976 reviewed Dec 2, 2025

kaessert force-pushed the fix-metrics-docs branch from 39bc1b0 to 9e7eecf Compare December 2, 2025 20:17

kaessert force-pushed the fix-metrics-docs branch from 9e7eecf to a46d215 Compare December 2, 2025 20:23

jbw976 approved these changes Dec 2, 2025

jbw976 merged commit da99598 into crossplane:master Dec 2, 2025
7 checks passed

stevendborrelli mentioned this pull request Dec 8, 2025

MEMBERSHIP: kaessert crossplane/org#107

Closed

6 tasks

