Skip to content

Conversation

@danielmellado
Copy link
Contributor

@danielmellado danielmellado commented Dec 17, 2025

This commit introduces a new API to be introduced as a part of the
migration in CMO from ConfigMap to CRDs

Signed-off-by: Mario Fernandez mariofer@redhat.com
Signed-off-by: Daniel Mellado dmellado@fedoraproject.org

@openshift-ci-robot
Copy link

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 17, 2025

Hello @danielmellado! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@coderabbitai
Copy link

coderabbitai bot commented Dec 17, 2025

📝 Walkthrough

Walkthrough

Adds spec.prometheusConfig to ClusterMonitoring with extensive Prometheus and Alertmanager settings (scheduling, storage, retention, remoteWrite, TLS, auth, tolerations, topologySpreadConstraints, externalLabels, queryLogFile, etc.). Replaces Alertmanager/Prometheus volumeClaimTemplate PVC schema with sizeInGiB and storageClassName. Introduces many v1alpha1 types (e.g., PrometheusConfig, AdditionalAlertmanagerConfig, RemoteWriteSpec, TLSConfig, AuthorizationConfig, VolumeClaimConfig, RelabelConfig, Retention, Label) and updates deepcopy, OpenAPI, Swagger, and CRD manifests plus new validation tests.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add prometheusConfig API' clearly and concisely describes the main change in the pull request.
Description check ✅ Passed The description accurately relates to the changeset, mentioning the new API introduction and the migration context from ConfigMap to CRDs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: build linters: unable to load custom analyzer "kubeapilinter": tools/_output/bin/kube-api-linter.so, plugin: not implemented
The command is terminated due to an error: build linters: unable to load custom analyzer "kubeapilinter": tools/_output/bin/kube-api-linter.so, plugin: not implemented


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 17, 2025
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign everettraven for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2)

453-485: Volume storage schema matches feature‑gated CRD

The Alertmanager volumeClaimTemplate schema here matches the feature‑gated CRD (sizeInGiB + storageClassName, with sane bounds and DNS validation). Keeping these in sync via generation is good for maintainability.


895-1876: TechPreview prometheusConfig matches feature‑gated schema

This prometheusConfig definition mirrors the feature‑gated manifest: same enums, limits, CEL rules, and nested structures. That consistency across variants is important for tooling and user documentation; no additional issues spotted.

🧹 Nitpick comments (2)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)

354-776: Add negative tests for relabel targetLabel requirement and staticConfigs validation

The new prometheusConfig tests cover most structural and cardinality constraints well. Once you tighten the CRD validation for writeRelabelConfigs.targetLabel, it would be worthwhile to add a negative case where action: Replace (or HashMod) is set without targetLabel and assert it is rejected. Similarly, a test that feeds a malformed staticConfigs entry (e.g. missing :<port>) would guard the host:port CEL rule from regressions.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (1)

1063-1198: TLS config allows potentially incomplete mTLS configuration.

The tlsConfig section has minProperties: 1 but doesn't enforce that cert and key are provided together for mutual TLS. A user could specify cert without key or vice versa, which would result in an incomplete mTLS configuration. Consider adding a validation rule to ensure both are present or both are absent.

Example CEL rule to add:

x-kubernetes-validations:
- message: cert and key must both be specified for mutual TLS, or both omitted
  rule: '(has(self.cert) && has(self.key)) || (!has(self.cert) && !has(self.key))'
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between bfa868a and 43ae844.

📒 Files selected for processing (12)
  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1 hunks)
  • config/v1alpha1/types_cluster_monitoring.go (4 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.deepcopy.go (10 hunks)
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (2 hunks)
  • config/v1alpha1/zz_generated.swagger_doc_generated.go (7 hunks)
  • openapi/generated_openapi/zz_generated.openapi.go (13 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.swagger_doc_generated.go
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • config/v1alpha1/types_cluster_monitoring.go
  • config/v1alpha1/zz_generated.deepcopy.go
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • openapi/generated_openapi/zz_generated.openapi.go
🧬 Code graph analysis (1)
config/v1alpha1/zz_generated.swagger_doc_generated.go (1)
config/v1alpha1/types_cluster_monitoring.go (10)
  • AdditionalAlertmanagerConfig (610-671)
  • AuthorizationConfig (857-870)
  • Label (674-688)
  • PrometheusConfig (434-597)
  • RelabelConfig (731-797)
  • RemoteWriteSpec (692-727)
  • Retention (916-936)
  • SecretKeySelector (874-891)
  • TLSConfig (802-830)
  • VolumeClaimConfig (895-912)
🔇 Additional comments (39)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (1)

453-485: Alertmanager PVC schema simplification looks good

The new volumeClaimTemplate schema (required sizeInGiB with sane bounds + optional storageClassName with DNS-1123 validation) is clear, constrained, and much easier to reason about than a full PVC spec. No issues from a correctness or maintainability standpoint.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2)

453-485: Alertmanager PVC schema change is consistent and maintainable

The simplified volumeClaimTemplate for Alertmanager here matches the CustomNoUpgrade CRD and keeps the schema tight (bounded sizeInGiB, validated storageClassName). This looks correct and easier to support across payload variants.


895-1876: The prometheusConfig block does not duplicate an earlier schema. The writeRelabelConfigs section appears exactly once in this configuration, properly nested within remoteWrite (a property of prometheusConfig). The staticConfigs are in additionalAlertmanagerConfigs, a separate configuration section. The CEL validation rule self.action in [''Replace'', ''HashMod''] ? self.targetLabel != '''' : true correctly enforces the documented requirement that targetLabel is mandatory for Replace and HashMod actions. No consistency issues exist between TechPreviewNoUpgrade and CustomNoUpgrade variants for these fields.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2)

453-485: LGTM: volumeClaimTemplate API simplification improves usability.

The simplified volumeClaimTemplate structure (sizeInGiB + storageClassName) is clearer and more maintainable than exposing the full PVC spec. The required sizeInGiB field ensures explicit storage allocation, while optional storageClassName allows platform defaults. Validation is comprehensive with proper bounds and DNS subdomain checking.


895-1876: LGTM: Comprehensive prometheusConfig API with thorough validation.

The new prometheusConfig field provides extensive configuration surface for platform Prometheus with well-designed validation:

  • Proper security defaults (certificateVerification: "Verify", documented HTTP/HTTPS scheme choice)
  • Reasonable bounds on all numeric constraints (timeouts, storage limits, array sizes)
  • Comprehensive CEL validation rules ensuring API correctness
  • List-map configurations supporting server-side apply workflows
  • Clear documentation throughout the nested structure

The API design enables the ConfigMap-to-CRD migration mentioned in the PR description while maintaining security and usability.

config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (1)

1-1909: LGTM: DevPreviewNoUpgrade CRD variant is consistent with CustomNoUpgrade.

This CRD manifest correctly mirrors the CustomNoUpgrade variant (first file) with only the expected difference in feature-set annotation (line 10). The volumeClaimTemplate simplification and prometheusConfig additions are identical, ensuring consistency across release channels. The zz_generated file prefix correctly indicates auto-generation from Go types.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (4)

453-485: LGTM on volumeClaimTemplate structure.

The simplified volumeClaimTemplate schema with sizeInGiB (required, 1-16384 range) and storageClassName (optional, DNS subdomain validated) is well-designed. The minProperties: 1 constraint combined with required: [sizeInGiB] ensures valid configurations.


895-908: Well-structured prometheusConfig section.

The top-level prometheusConfig field is comprehensive and follows the same patterns as other config sections in this CRD. The minProperties: 1 constraint ensures the field isn't set to an empty object, which is appropriate.


1843-1875: LGTM on Prometheus volumeClaimTemplate.

The volumeClaimTemplate for Prometheus is consistent with the Alertmanager volumeClaimTemplate structure, ensuring API consistency across the CRD.


1476-1480: No action required—targetLabel validation rule is correct.

The CEL validation rule correctly enforces that targetLabel is required for Replace and HashMod actions. When targetLabel is omitted (as an optional field), it evaluates to an empty string in CEL, causing self.targetLabel != '' to evaluate to false and reject the validation for Replace/HashMod actions. For other actions, the rule correctly permits omitting targetLabel. The minLength: 1 constraint ensures non-empty values when the field is present.

Likely an incorrect or invalid review comment.

config/v1alpha1/zz_generated.swagger_doc_generated.go (3)

121-134: LGTM on AdditionalAlertmanagerConfig Swagger documentation.

The Swagger documentation accurately describes all fields of the AdditionalAlertmanagerConfig type, including name, authorization, pathPrefix, scheme, staticConfigs, timeoutSeconds, and tlsConfig.


256-275: LGTM on PrometheusConfig Swagger documentation.

The PrometheusConfig SwaggerDoc map comprehensively documents all configuration options including additionalAlertmanagerConfigs, enforcedBodySizeLimitBytes, externalLabels, logLevel, nodeSelector, queryLogFile, remoteWrite, resources, retention, tolerations, topologySpreadConstraints, collectionProfile, and volumeClaimTemplate.


346-354: LGTM on VolumeClaimConfig Swagger documentation.

The VolumeClaimConfig documentation correctly describes the storageClassName and sizeInGiB fields with their constraints and default behaviors.

config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (2)

453-485: Alertmanager volume storage schema looks consistent and constrained

The simplified volumeClaimTemplate (sizeInGiB + storageClassName with DNS validation and minProperties/required) aligns with the new Prometheus storage shape and keeps user input tightly bounded. No issues from a schema or UX perspective.


895-1876: PrometheusConfig surface and validations look well‑designed

The new prometheusConfig block provides a rich but bounded configuration surface (limits on list sizes, strong enums, URL and path validation, DNS validation for names, etc.), and follows existing patterns used elsewhere in this CRD. I don’t see correctness, security, or maintainability problems in this schema addition.

config/v1alpha1/zz_generated.deepcopy.go (3)

55-91: VolumeClaimTemplate deepcopy usage assumes value types (looks correct)

Both AlertmanagerCustomConfig.DeepCopyInto and PrometheusConfig.DeepCopyInto call in.VolumeClaimTemplate.DeepCopyInto(&out.VolumeClaimTemplate) without a nil check. This is appropriate as long as VolumeClaimTemplate/VolumeClaimConfig are value fields (not pointers), which matches the generated VolumeClaimConfig.DeepCopyInto and the simplified CRD schema. I don’t see a risk of nil dereference given that setup.

Also applies to: 982-1035


14-35: Deepcopy semantics for new helper types are safe and non‑aliasing

For AdditionalAlertmanagerConfig, AuthorizationConfig, Label, RelabelConfig, RemoteWriteSpec, Retention, SecretKeySelector, TLSConfig, and VolumeClaimConfig, the generated DeepCopy* methods correctly:

  • Deep‑copy slices (e.g., StaticConfigs, SourceLabels, WriteRelabelConfigs).
  • Allocate new pointers where needed (e.g., Replacement, StorageClassName).
  • Use value copy for structs that only contain value fields.

This avoids shared mutable state between copies and aligns with Kubernetes codegen patterns.

Also applies to: 120-135, 722-736, 1073-1120, 1099-1120, 1122-1136, 1196-1210, 1233-1250, 1268-1287


399-405: ClusterMonitoringSpec now deep‑copies PrometheusConfig correctly

ClusterMonitoringSpec.DeepCopyInto now invokes in.PrometheusConfig.DeepCopyInto(&out.PrometheusConfig), and PrometheusConfig.DeepCopyInto recursively handles all nested slices, maps, and structs (alertmanager configs, externalLabels, nodeSelector, remoteWrite, resources, retention, tolerations, topologySpreadConstraints, volumeClaimTemplate). This ensures controllers manipulating the spec won’t accidentally share internal state across copies.

Also applies to: 982-1035

openapi/generated_openapi/zz_generated.openapi.go (13)

424-477: LGTM - Type mappings are correctly registered.

The new v1alpha1 types are properly added to the OpenAPI definitions map in alphabetical order, with each type having a corresponding schema builder function.


21229-21305: LGTM - AdditionalAlertmanagerConfig schema is well-defined.

The schema correctly requires name and staticConfigs, uses appropriate list type annotations, and properly declares dependencies on AuthorizationConfig and TLSConfig.


21451-21463: LGTM - volumeClaimTemplate correctly refactored to use shared type.

The change to reference VolumeClaimConfig instead of an inline definition improves type reuse consistency across AlertmanagerConfig and PrometheusConfig.


21487-21527: LGTM - AuthorizationConfig uses proper discriminated union pattern.

The schema correctly implements a discriminated union with type as the discriminator, and securely references secrets via SecretKeySelector rather than inline values.


21964-21982: LGTM - prometheusConfig field properly integrated into ClusterMonitoringSpec.

The field is appropriately optional, well-documented, and dependencies are correctly declared.


22470-22496: LGTM - Label schema is clean and minimal.


22938-23139: LGTM - PrometheusConfig provides comprehensive and well-structured configuration surface.

Key observations:

  • All fields are optional, enabling incremental adoption
  • List types use appropriate x-kubernetes-list-type annotations (map for uniqueness by key, atomic for tolerations)
  • Map keys are correctly specified for server-side apply merging
  • Dependencies are complete and accurate

23169-23243: LGTM - RelabelConfig schema correctly models Prometheus relabeling rules.

The required fields (name, action) make sense for identifying and executing relabel rules, while source/target configuration remains optional based on action type.


23245-23302: LGTM - RemoteWriteSpec provides clean remote write configuration.

The url is correctly required, and writeRelabelConfigs properly uses map-type list with name as the key.


23353-23399: LGTM - RetentionPolicy correctly implements discriminated union.

The schema properly uses x-kubernetes-unions to discriminate between RetentionNumber and RetentionSize configurations.


23425-23456: LGTM - SecretKeySelector is correctly designed.

The atomic map type is appropriate for immutable secret references, and the namespace scoping to openshift-monitoring is clearly documented.


23488-23536: LGTM - TLSConfig schema is well-designed.

The schema correctly references secrets via SecretKeySelector for all certificate/key fields. The certificateVerification field appropriately documents the security implications of SkipVerify. The "at least one option must be specified" constraint is likely enforced via CEL validation in the source type definition.


23561-23587: LGTM - VolumeClaimConfig is clean and practical.

Requiring sizeInGiB while making storageClassName optional allows users to rely on the cluster's default StorageClass, which is a good default experience.

config/v1alpha1/types_cluster_monitoring.go (8)

92-104: LGTM!

The prometheusConfig field is well-documented with clear explanations of what it configures and follows the established pattern of other fields in ClusterMonitoringSpec.


260-265: LGTM!

The change to VolumeClaimConfig simplifies the API by exposing only the necessary storage configuration fields rather than the full PersistentVolumeClaim spec. This is appropriate for a v1alpha1 API.


430-597: Well-structured configuration type.

The PrometheusConfig struct provides comprehensive configuration options with appropriate validation constraints, good documentation, and proper list type annotations for server-side apply support.


762-769: Regex syntax is not validated at admission time.

The regex field documentation states it must be a valid RE2 regular expression, but there's no CEL validation to enforce this. Invalid regex patterns will only fail at runtime when Prometheus attempts to compile them, potentially causing configuration reload failures.

CEL doesn't provide a built-in function to validate regex syntax. Consider either:

  1. Documenting clearly that regex syntax errors will be caught at runtime by the operator
  2. Adding operator-level admission validation to test-compile the regex before accepting the CR

If this is intentionally deferred to runtime validation by CMO, please confirm this is the expected behavior.


607-671: LGTM!

The AdditionalAlertmanagerConfig struct is well-designed with appropriate validation. The clever use of URL parsing in the staticConfigs validation (prepending http:// to validate host:port format) is a good approach.


893-936: LGTM!

Both VolumeClaimConfig and Retention structs have sensible validation bounds and clear documentation. The MinProperties=1 constraint ensures meaningful configuration when these types are used.


854-870: LGTM!

The AuthorizationConfig correctly implements the Kubernetes union pattern with a discriminator field and appropriate XValidation to enforce the union semantics.


673-688: LGTM!

The supporting types (Label, RemoteWriteSpec, SecretKeySelector, RelabelAction, CollectionProfile) are well-defined with appropriate validation constraints, clear documentation, and proper enum handling.

Also applies to: 690-727, 872-891, 938-969

@danielmellado danielmellado force-pushed the prometheus_k8s_config_api branch from 43ae844 to da18977 Compare December 17, 2025 11:57
@danielmellado
Copy link
Contributor Author

The verify-crd-schema and verify-crdify jobs are failing because we intentionally changed the volumeClaimTemplate field from the full corev1.PersistentVolumeClaim type to a simpler VolumeClaimConfig struct with just sizeInGiB and storageClassName.

This was done based on reviewer feedback to provide better admission-time validation and avoid exposing ~40 PVC fields that don't make sense for this use case.

Since this is a TechPreview API (v1alpha1), breaking changes are allowed, and both tools explicitly say it's appropriate to override when you're making intentional changes rather than introducing regressions.

@everettraven do you have any another proposal for this? Besides that, my plan is to split this into smaller changes to make it easier to work/review if that's ok for you.

cc @marioferh

@danielmellado danielmellado force-pushed the prometheus_k8s_config_api branch from da18977 to b001760 Compare December 17, 2025 13:36
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (3)
config/v1alpha1/types_cluster_monitoring.go (1)

610-621: Consider using format validation for consistency.

The name field uses a manual regex pattern for DNS subdomain validation. For consistency with other fields in this file (e.g., SecretKeySelector.Name at line 883, TLSConfig.ServerName at line 824), consider using the format.dns1123Subdomain() validation approach:

-// +kubebuilder:validation:XValidation:rule="self.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$')",message="must be a valid DNS subdomain: lowercase alphanumeric, hyphens, or periods"
+// +kubebuilder:validation:XValidation:rule="!format.dns1123Subdomain().validate(self).hasValue()",message="must be a valid DNS subdomain: lowercase alphanumeric, hyphens, or periods"

This leverages the built-in Kubernetes format validator and ensures consistency across the API.

config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)

895-1882: LGTM! Comprehensive Prometheus configuration with strong validations.

The new prometheusConfig field provides extensive configuration options with appropriate validation:

✅ Security considerations are well-handled:

  • Secrets referenced by name/key (not inline)
  • TLS with proper cert/key mutual validation (lines 1201-1204)
  • Authorization with type-conditional bearerToken validation (lines 983-986)
  • Secure defaults (certificateVerification: Verify)

✅ Cross-field validations are comprehensive:

  • targetLabel required for Replace/HashMod actions (lines 1483-1486)
  • Mutual TLS cert/key must be specified together (lines 1201-1204)

✅ Ranges and constraints are reasonable:

  • Storage: 1 GiB - 16 TiB
  • Retention: 1-365 days
  • Timeouts: 1-600 seconds
  • Body size limit: 1 byte - 1 TB

Optional: Consider stricter host:port validation.

Lines 1042-1049 validate staticConfigs using isURL('http://' + self) to parse host:port. While functional, this accepts entries like "host:port/path" or "host:port?query" since URL parsing includes path/query components. The description states only "host:port" format is expected.

However, Prometheus will reject invalid endpoints at runtime, so this is low-severity validation drift rather than a security or correctness issue.

config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (1)

453-485: Consider consolidating duplicated volumeClaimTemplate schema.

The volumeClaimTemplate schema is duplicated in both alertmanagerConfig.customConfig (lines 453-485) and prometheusConfig (lines 1849-1881) with identical structure and validation rules. This duplication creates maintenance risk—future updates might inadvertently change only one instance, causing API surface inconsistency.

Since this is generated code, verify that the Go type definition properly reuses VolumeClaimConfig for both fields. If the type is already shared in types_cluster_monitoring.go, the duplication here is unavoidable but acceptable. Otherwise, refactor the Go types to share a common struct.

Also applies to: 1849-1881

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between da18977 and b001760.

📒 Files selected for processing (12)
  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1 hunks)
  • config/v1alpha1/types_cluster_monitoring.go (4 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2 hunks)
  • config/v1alpha1/zz_generated.deepcopy.go (10 hunks)
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (2 hunks)
  • config/v1alpha1/zz_generated.swagger_doc_generated.go (7 hunks)
  • openapi/generated_openapi/zz_generated.openapi.go (13 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2 hunks)
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
  • openapi/generated_openapi/zz_generated.openapi.go
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.swagger_doc_generated.go
  • config/v1alpha1/zz_generated.deepcopy.go
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • config/v1alpha1/types_cluster_monitoring.go
🧬 Code graph analysis (1)
config/v1alpha1/zz_generated.deepcopy.go (1)
config/v1alpha1/types_cluster_monitoring.go (11)
  • AdditionalAlertmanagerConfig (610-671)
  • TLSConfig (804-832)
  • AuthorizationConfig (859-872)
  • PrometheusConfig (434-597)
  • Label (674-688)
  • RemoteWriteSpec (692-727)
  • ContainerResource (305-335)
  • Retention (918-938)
  • RelabelConfig (731-797)
  • SecretKeySelector (876-893)
  • VolumeClaimConfig (897-914)
🔇 Additional comments (29)
config/v1alpha1/types_cluster_monitoring.go (10)

92-104: LGTM!

The prometheusConfig field is well-documented with clear scope (platform Prometheus only) and appropriate optional semantics.


260-265: LGTM!

The simplified VolumeClaimConfig type provides better API ergonomics by exposing only the essential storage configuration fields (size and storage class) rather than the full PVC specification.


430-597: LGTM!

The PrometheusConfig type is well-designed with:

  • Comprehensive configuration options for platform Prometheus
  • Appropriate validation constraints (min/max items, value ranges)
  • Clear documentation for each field
  • Proper use of list types (map, atomic, set) for different scenarios
  • MinProperties validation to prevent empty objects

691-727: LGTM!

The RemoteWriteSpec type is well-designed with appropriate validations for URLs, timeouts, and relabel configurations.


729-797: LGTM!

The RelabelConfig type demonstrates excellent API design with:

  • Struct-level validation enforcing action-specific requirements
  • Pointer type for replacement allowing distinction between omitted and empty values
  • Comprehensive support for Prometheus relabeling actions
  • Appropriate field length limits for practical use cases

799-832: LGTM!

The TLSConfig type is well-designed with appropriate mTLS validation (cert and key must be paired) and secure defaults (Verify for certificate verification).


856-872: LGTM!

The AuthorizationConfig properly implements a discriminated union with validation ensuring bearerToken is provided when type is BearerToken.


874-893: LGTM!

The SecretKeySelector type is well-designed with appropriate validations and the correct structType=atomic marker for ensuring it's treated as an indivisible unit.


895-914: LGTM!

The VolumeClaimConfig type provides a simplified, focused API for storage configuration, exposing only the essential fields (size and storage class) needed for this use case.


916-938: LGTM!

The Retention type is well-designed with reasonable limits (up to 365 days and 16 TiB) and proper validation ensuring at least one retention policy is specified.

config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (2)

354-565: LGTM!

The positive test cases provide good coverage of valid PrometheusConfig scenarios, including minimal configuration, scheduling options (nodeSelector, tolerations, topologySpreadConstraints), resource allocation, and external integrations (additionalAlertmanagerConfigs, remoteWrite, externalLabels).


566-776: LGTM!

The negative test cases comprehensively verify validation constraints, including:

  • Empty object rejection (minProperties validation)
  • Maximum item limits across all list fields
  • Minimum item requirements for arrays
  • Error messages properly formatted to match OpenAPI validation output
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2)

453-485: LGTM!

The volumeClaimTemplate schema correctly reflects the simplified VolumeClaimConfig structure with appropriate validations for size and storage class name.


895-1882: LGTM!

The prometheusConfig schema correctly represents the comprehensive PrometheusConfig API with:

  • Proper validation constraints (min/max items, value ranges)
  • Correct list types (map, atomic, set) for different fields
  • Complex validations (TLS cert/key pairing, relabel action requirements)
  • Consistent with the Go type definitions
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (1)

1385-1486: CEL validation for writeRelabelConfigs.targetLabel correctly enforces presence across all CRD variants

The CEL rule now uses has(self.targetLabel) to ensure the target label field exists when action is Replace or HashMod, preventing configs from bypassing validation with null values. This stricter validation is consistently applied across all three generated ClusterMonitoring CRD variants (CustomNoUpgrade, DevPreviewNoUpgrade, TechPreviewNoUpgrade).

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (1)

453-485: Alertmanager volumeClaimTemplate schema is properly defined

The new simplified volumeClaimTemplate schema looks correct: sizeInGiB is a required int32 with proper bounds (1–16384 GiB), and storageClassName is optional with DNS1123 subdomain validation. The kubebuilder validation tags in the Go type definition align with the CRD schema.

Note: Controller updates needed for this schema change will be in the config-operator repository, not in this API definitions repo.

config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)

453-485: LGTM! Simplified storage configuration is appropriate for v1alpha1.

The refactoring of volumeClaimTemplate from a full PersistentVolumeClaim spec to a focused sizeInGiB + storageClassName configuration is well-implemented:

  • Proper validation constraints (1-16384 GiB range, DNS subdomain validation)
  • Required field enforcement via required: [sizeInGiB]
  • Clear admission-time validation

This breaking change is acceptable for v1alpha1 (compatibility level 4) and aligns with the PR's goal of providing better validation while avoiding exposing ~40 PVC fields not relevant to this use case.

config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (1)

1-1915: Schema changes are identical to ClusterMonitoringConfig.yaml.

This CRD manifest (DevPreviewNoUpgrade feature set) contains the same schema changes as reviewed in ClusterMonitoringConfig.yaml:

  1. Simplified volumeClaimTemplate (lines 453-485)
  2. New prometheusConfig field (lines 895-1882)

All validation rules, security considerations, and implementation details are identical. The only difference is the feature gate annotation (line 10: DevPreviewNoUpgrade vs ClusterMonitoringConfig).

Refer to the review comments on the first file for detailed analysis.

config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (1)

453-485: LGTM! Simplified volumeClaimTemplate schema is appropriate for TechPreview.

The refactored schema reduces the API surface from ~40 PVC fields to just the essential sizeInGiB and storageClassName. This aligns with the PR objectives to provide better admission-time validation while keeping the v1alpha1 API focused and maintainable.

config/v1alpha1/zz_generated.swagger_doc_generated.go (3)

121-134: LGTM! Comprehensive Swagger documentation for AdditionalAlertmanagerConfig.

The documentation clearly describes each field's purpose, constraints, and provides helpful examples (e.g., endpoint formats for staticConfigs). This will aid API consumers in understanding the configuration options.


256-275: LGTM! Well-structured Swagger documentation for PrometheusConfig.

The documentation effectively covers all PrometheusConfig fields with clear explanations of purpose, constraints, and use cases. The descriptions for complex fields like additionalAlertmanagerConfigs and remoteWrite provide helpful context for API consumers.


346-354: LGTM! Swagger documentation accurately reflects the simplified VolumeClaimConfig.

The documentation clearly describes the two fields (storageClassName and sizeInGiB) with appropriate constraints and validation details, matching the simplified API surface introduced in this PR.

config/v1alpha1/zz_generated.deepcopy.go (3)

983-1035: LGTM! PrometheusConfig.DeepCopyInto correctly handles all nested structures.

The implementation properly deep-copies all slice and map fields, and correctly handles nested structs:

  • Slice fields (AdditionalAlertmanagerConfigs, RemoteWrite, Resources, etc.) are deep-copied with iteration
  • Map field (NodeSelector) is deep-copied
  • Simple value types (Retention) use shallow copy, which is safe
  • VolumeClaimTemplate properly calls DeepCopyInto

1269-1277: LGTM! VolumeClaimConfig.DeepCopyInto correctly handles pointer field.

The implementation properly handles the optional StorageClassName pointer field:

  • Line 1270 performs shallow copy of the struct
  • Lines 1271-1275 conditionally deep-copy the pointer field when non-nil

This follows the correct pattern for deep-copying structs with optional pointer fields.


14-35: Verify TLSConfig structure for pointer/slice fields.

Lines 17 and 23 perform shallow copies of Authorization and TLSConfig. We confirmed Authorization (DelegatedAuthorization) contains only a Disabled bool field, making the shallow copy safe. However, the TLSConfig struct definition in the openshift/api config/v1alpha1 package could not be definitively located to verify whether it contains pointer or slice fields that require deep copying.

Additionally, these assignments are redundant—line 15's *out = *in already copies all fields shallowly. If deep copies are needed for nested types, the pattern should be refactored to call DeepCopyInto on those fields explicitly.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2)

1345-1494: Verify authentication support for remote write endpoints.

The remoteWrite configuration appears to lack explicit authentication fields (e.g., basic auth, bearer token, OAuth2, or TLS client certificates). Sending metrics to remote endpoints without authentication could expose sensitive monitoring data.

Please verify whether:

  1. Authentication for remote write is handled through a separate mechanism (e.g., service accounts, network policies)
  2. This is a deliberate simplification for the v1alpha1 API with plans to add authentication in future versions
  3. Users are expected to use authenticated endpoints that don't require explicit credentials in the config

If authentication is not supported, consider this a security gap that should be addressed before promoting beyond TechPreview.


1849-1881: Good: Consistent volume configuration across components.

The volumeClaimTemplate structure for Prometheus matches the simplified design used in alertmanagerConfig (lines 453-485), ensuring API consistency. The validation rules (1-16384 GiB, DNS subdomain for storage class) are appropriate.

openapi/generated_openapi/zz_generated.openapi.go (2)

21451-21462: LGTM: VolumeClaimConfig simplification.

The change from corev1.PersistentVolumeClaim to VolumeClaimConfig is intentional per the PR description and provides better admission-time validation by exposing only the relevant fields (sizeInGiB and storageClassName) rather than ~40 PVC fields.


23523-23529: The generated OpenAPI schema is missing the Enum constraint for certificateVerification.

The source Go code (config/v1alpha1/types_cluster_monitoring.go) defines CertificateVerificationType with +kubebuilder:validation:Enum=Verify;SkipVerify, but the generated OpenAPI schema at line 23523-23529 omits this constraint. The generated schema should include Enum: []interface{}{"Verify", "SkipVerify"} to reflect the allowed values, ensuring API clients have accurate schema validation information.

Comment on lines +21494 to +22389
"type": {
SchemaProps: spec.SchemaProps{
Description: "type specifies the authentication type to use. Valid values are \"None\" (no authentication) and \"BearerToken\" (bearer token authentication). When set to None, no authentication credentials are sent. When set to BearerToken, the bearerToken field must be specified.",
Type: []string{"string"},
Format: "",
},
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find AuthorizationConfig struct definition
find . -type f -name "*.go" -not -path "*/vendor/*" -not -path "*/zz_generated*" | xargs grep -l "type AuthorizationConfig struct" | head -5

Repository: openshift/api

Length of output: 102


🏁 Script executed:

#!/bin/bash
# Search for AuthorizationConfig and its Type field with validation markers
rg "type AuthorizationConfig struct" -A 20 --type=go | head -60

Repository: openshift/api

Length of output: 2064


🏁 Script executed:

#!/bin/bash
# Check the generated schema around the type field for Enum constraint
sed -n '21490,21510p' openapi/generated_openapi/zz_generated.openapi.go

Repository: openshift/api

Length of output: 1163


🏁 Script executed:

#!/bin/bash
# Check what AuthorizationType is defined as
rg "type AuthorizationType" -A 10 --type=go

Repository: openshift/api

Length of output: 1966


🏁 Script executed:

#!/bin/bash
# Check if there are any validation tags or const definitions for AuthorizationType
rg "AuthorizationType" --type=go -B 2 -A 2 | grep -E "(const|Enum|None|BearerToken)" | head -20

Repository: openshift/api

Length of output: 1101


The Enum constraint is missing from the generated schema despite validation existing in the source.

The source code defines AuthorizationType with +kubebuilder:validation:Enum=None;BearerToken, and the type field in AuthorizationConfig struct uses this type. However, the generated OpenAPI schema (lines 21494-21500) omits the Enum constraint. The schema description documents the valid values but provides no machine-readable Enum validation. Add the validation marker directly to the struct field to ensure it propagates to the generated schema:

Type AuthorizationType `json:"type,omitempty" kubebuilder:validation:Enum=None;BearerToken`

Comment on lines 459 to 460
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=1099511627776
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these reasonable?

When might someone want to reasonably set this to 1 byte? When might they want to set this 1 terabyte?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think practical values would be:
a minimum of 10 KB,
and anything over 1 GB doesn’t make sense.
But I’m not sure about futurecases, special or corner cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we find it relatively safe to increase maximums and decrease minimums if absolutely necessary - especially while this API is still a v1alpha1 and only present in TechPreview clusters.

I'd start with a range that is practical + some reasonable buffer for potential unknown cases. If 10kB - 1GB is practical let's go with that.

// Maximum length is 253 characters.
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:XValidation:rule="self.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$')",message="must be a valid DNS subdomain: lowercase alphanumeric, hyphens, or periods"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an existing format utility for DNS 1123 subdomain names.

Example:

// +kubebuilder:validation:XValidation:rule="!format.dns1123Subdomain().validate(self).hasValue()",message="a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thx

// instances.
// Possible values are `HTTP` or `HTTPS`.
// When omitted, defaults to `HTTP`.
// +kubebuilder:validation:Enum=HTTP;HTTPS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use a type alias, we generally prefer that this marker is placed on the type alias.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// +kubebuilder:validation:MaxItems=10
// +listType=map
// +listMapKey=name
WriteRelabelConfigs []RelabelConfig `json:"writeRelabelConfigs,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minimum length of 1 on this if [] has the same semantic meaning as omitted please.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then should I remove minimum length 1 of every [] field?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm asking to add a minimum length of 1 to this field if writeRelabelConfigs: [] is the same as omitting the field (which it reads that way based on the GoDoc).

// +kubebuilder:validation:MaxLength=1000
Regex string `json:"regex,omitempty"`
// targetLabel is the target label name where the result is written.
// Required for Replace and HashMod actions (enforced by validation).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to mention that this is enforced by validation

Suggested change
// Required for Replace and HashMod actions (enforced by validation).
// Required for Replace and HashMod actions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// VolumeClaimConfig defines the configuration for a PersistentVolumeClaim used for storage.
// +kubebuilder:validation:MinProperties=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary because you already have one field marked as required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Prometheus automatically deletes data older than this duration.
// When omitted, the default is 15 days.
// Minimum value is 1 day.
// Maximum value is 365 days (1 year).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any use cases to retain metrics for longer than a year?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be, but prometheus is not done for long term retention.

Comment on lines -254 to +265
VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"`
VolumeClaimTemplate VolumeClaimConfig `json:"volumeClaimTemplate,omitempty,omitzero"`
Copy link
Contributor

@everettraven everettraven Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be cautious of changing this.

If this changes the serialization of the type you'll need to upversion this to v1alpha2.

I had forgotten that *v1.PersistentVolumeClaim had been used in the past so it may be OK to stay consistent. Before promoting to a v1 though I'd recommend converting to a custom type so you can enforce better validations at admission time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the same here:

VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"`

and discussed before

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I had forgotten about that when I had originally suggested moving to a custom type. I do think it is still better to move to a custom type over time so you can enforce admission time validations that are not present on the *v1.PersistentVolumeClaim type but that is out of the scope of this PR.

I'd agree with reverting this back to *v1.PersistentVolumeClaim for the v1alpha1 version of the API.

Copy link
Contributor

@everettraven everettraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the structure of this seems to be shaping up nicely.

I've got a handful more questions/comments on the API changes as-is. Once those are addressed I can do another comprehensive review to look at the tests and things like that.

Thanks @danielmellado for opening this new PR - GitHub is much happier loading this one now :P

@everettraven
Copy link
Contributor

The verify-crd-schema and verify-crdify jobs are failing because we intentionally changed the volumeClaimTemplate field from the full corev1.PersistentVolumeClaim type to a simpler VolumeClaimConfig struct with just sizeInGiB and storageClassName.

This was done based on reviewer feedback to provide better admission-time validation and avoid exposing ~40 PVC fields that don't make sense for this use case.

Since this is a TechPreview API (v1alpha1), breaking changes are allowed, and both tools explicitly say it's appropriate to override when you're making intentional changes rather than introducing regressions.

@everettraven do you have any another proposal for this? Besides that, my plan is to split this into smaller changes to make it easier to work/review if that's ok for you.

I left a comment on this but I'll explicitly reply here as well - if if changes the serialization of the type it is not OK to do in v1alpha1 if this has shipped at all. Older clients that relied on that field being the previous type would fail to serialize a valid object into a request body for the updated version of the resource.

The correct way to make this change is to up-version from v1alpha1 to v1alpha2.

Re: splitting into smaller changes, I'll leave that up to you. As-is this PR is pretty big but I think fairly manageable now that we've gotten a large majority of the structural changes for this API sorted out. The previous PR was just getting unwieldy due to the large number of review comments on it.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 25, 2025
This commit introduces a new API to be introduced as a part of the
migration in CMO from ConfigMap to CRDs

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Signed-off-by: Daniel Mellado <dmellado@fedoraproject.org>
@marioferh marioferh force-pushed the prometheus_k8s_config_api branch from b001760 to f218565 Compare January 9, 2026 08:24
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
In @openapi/generated_openapi/zz_generated.openapi.go:
- Around line 24450-24477: The generated OpenAPI definition returned by
schema_openshift_api_config_v1alpha1_VolumeClaimConfig is missing schema
constraints: add numeric bounds for the integer property "sizeInGiB" by setting
Minimum=1 and Maximum=16384 on that spec.Schema, and add validation for
"storageClassName" by setting MaxLength=253 and a Pattern matching RFC1123 DNS
subdomain on that spec.Schema (e.g. the existing RFC1123 regex used elsewhere in
your codegen). Update the SchemaProps for the "sizeInGiB" and "storageClassName"
entries accordingly so the OpenAPI schema enforces the same kubebuilder
constraints as the source struct.
- Around line 24193-24219: The generated OpenAPI schema for
schema_openshift_api_config_v1alpha1_Retention is missing SchemaProps.Minimum
and SchemaProps.Maximum for the integer properties durationInDays and sizeInGiB;
update the generator to map the kubebuilder validation markers
(+kubebuilder:validation:Minimum and :Maximum) into spec.SchemaProps.Minimum and
spec.SchemaProps.Maximum (they are *float64 fields) for those property entries
(durationInDays and sizeInGiB) so the SchemaProps include the numeric
constraints (e.g., Minimum=1, Maximum=365 for durationInDays and Minimum=1,
Maximum=16384 for sizeInGiB) while preserving Type and Format.
- Around line 24058-24133: The generated OpenAPI definition returned by
schema_openshift_api_config_v1alpha1_RelabelConfig is missing the enum on the
"action" property and the x-kubernetes-validations enforcing that "targetLabel"
is required when action is Replace or HashMod; update the OpenAPIDefinition
construction for schema_openshift_api_config_v1alpha1_RelabelConfig to add an
Enum array
["Replace","Keep","Drop","HashMod","LabelMap","LabelDrop","LabelKeep"] to the
action SchemaProps and add a VendorExtensible Extensions entry
"x-kubernetes-validations" (with a rule checking .action in (Replace,HashMod) ->
require targetLabel) at the schema level or appropriate property so the
generator emits the conditional validation; locate and modify the action and
targetLabel entries inside schema_openshift_api_config_v1alpha1_RelabelConfig to
include these metadata keys.
- Around line 23855-24056: The generated OpenAPI for
schema_openshift_api_config_v1alpha1_PrometheusConfig is missing the kubebuilder
validation constraints; update the SchemaProps for the listed properties to
include the exact constraints from the Go tags: add Enum values
["Error","Warn","Info","Debug"] for "logLevel" and ["Full","Minimal"] for
"collectionProfile"; add MinLength:1 and MaxLength:255 (and the
x-kubernetes-validations entry or equivalent pattern/validation) for
"queryLogFile"; add Minimum:1 and Maximum:1099511627776 for
"enforcedBodySizeLimitBytes"; and add MaxProperties:10 for "nodeSelector" (set
SchemaProps.AdditionalProperties or SchemaProps.MaxProperties appropriately).
Ensure these changes are made in
schema_openshift_api_config_v1alpha1_PrometheusConfig so the OpenAPIDefinition
Properties for "logLevel", "collectionProfile", "queryLogFile",
"enforcedBodySizeLimitBytes", and "nodeSelector" include the corresponding
constraint fields and any x-kubernetes validation extensions required for
server-side validation.
🧹 Nitpick comments (3)
openapi/openapi.json (1)

13146-13154: Clarify tolerations minimum length constraint.

The description states "Minimum length for this list is 1" but the field is optional. Consider clarifying that the minimum applies only when the field is explicitly set (e.g., "When specified, minimum length is 1").

📝 Suggested description update
-          "description": "tolerations defines tolerations for the pods. tolerations is optional.\n\nWhen omitted, this means the user has no opinion and the platform is left to choose reasonable defaults. These defaults are subject to change over time. Defaults are empty/unset. Maximum length for this list is 10 Minimum length for this list is 1",
+          "description": "tolerations defines tolerations for the pods. tolerations is optional.\n\nWhen omitted, this means the user has no opinion and the platform is left to choose reasonable defaults. These defaults are subject to change over time. Defaults are empty/unset. When specified, minimum length is 1 and maximum length is 10.",
openapi/generated_openapi/zz_generated.openapi.go (1)

22853-22859: spec.prometheusConfig is integrated cleanly; consider whether defaulting to {} is intended.

Default: map[string]interface{}{} makes the default an explicit empty object in the schema; if you want “unset vs set empty” to remain distinguishable in persisted objects/SSA, consider omitting the default (in the source type) and regenerating.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (1)

1425-1433: Document regex validation limitation.

The regex field description states "Must be a valid RE2 regular expression" (line 1428), but there is no CEL validation to enforce RE2 syntax at admission time. Invalid regex expressions will only be detected at runtime when Prometheus attempts to compile them, which could lead to configuration errors being discovered late.

While validating regex syntax in CEL is not feasible, consider adding a note in the description that regex syntax validation occurs at runtime, and users should test their configurations.

📝 Suggested documentation improvement
                              regex:
                                description: |-
                                  regex is the regular expression to match against the concatenated source label values.
                                  Must be a valid RE2 regular expression (https://github.com/google/re2/wiki/Syntax).
+                                 Invalid regex syntax will be rejected at runtime by Prometheus, not at admission time.
                                  When omitted, defaults to "(.*)" (matches everything).
                                  Must be between 1 and 1000 characters in length when specified.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between b001760 and f218565.

⛔ Files ignored due to path filters (4)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (9)
  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • config/v1alpha1/types_cluster_monitoring.go
  • config/v1alpha1/zz_generated.deepcopy.go
  • config/v1alpha1/zz_generated.swagger_doc_generated.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • config/v1alpha1/zz_generated.deepcopy.go
🧰 Additional context used
🧬 Code graph analysis (1)
config/v1alpha1/types_cluster_monitoring.go (3)
config/v1alpha1/zz_generated.swagger_doc_generated.go (11)
  • PrometheusConfig (273-275)
  • VolumeClaimConfig (352-354)
  • AdditionalAlertmanagerConfig (132-134)
  • Label (238-240)
  • RemoteWriteSpec (300-302)
  • ContainerResource (228-230)
  • Retention (310-312)
  • AuthorizationConfig (176-178)
  • TLSConfig (333-335)
  • RelabelConfig (288-290)
  • SecretKeySelector (320-322)
operator/v1/types.go (1)
  • LogLevel (94-94)
console/v1/types_console_plugin.go (1)
  • AuthorizationType (282-282)
🔇 Additional comments (34)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)

354-776: Comprehensive test coverage for PrometheusConfig API.

The test cases thoroughly validate both positive and negative scenarios for the new PrometheusConfig API, including:

  • Minimal valid configurations
  • Valid configurations with various field combinations
  • Rejection of empty objects and empty arrays
  • Enforcement of max item constraints (10 items for most lists)

The test structure is consistent and aligns well with the API constraints defined in the type definitions.

config/v1alpha1/types_cluster_monitoring.go (5)

92-104: Well-documented PrometheusConfig field addition.

The new prometheusConfig field is appropriately scoped to the platform Prometheus instance with clear documentation distinguishing it from user-workload Prometheus configuration.


430-597: Well-structured PrometheusConfig type with comprehensive validation.

The PrometheusConfig type provides extensive configuration options with appropriate validation constraints:

  • Min/max item limits for arrays (typically 1-10)
  • String length validations
  • CEL validation rules for complex constraints
  • Clear documentation for each field

The minProperties: 1 validation ensures users must specify at least one configuration option, preventing empty objects.

Note: The volumeClaimTemplate field at line 596 is subject to the versioning concern raised separately.


600-893: Well-designed supporting types with strong validation.

The supporting types demonstrate good API design practices:

  1. TLSConfig (lines 799-832): Enforces mTLS pairing constraint via CEL validation - both cert and key must be specified together
  2. AuthorizationConfig (lines 856-872): Uses discriminated union pattern with CEL validation to enforce bearerToken presence based on type
  3. RelabelConfig (lines 729-797): Validates that targetLabel is required for Replace/HashMod actions
  4. SecretKeySelector (lines 874-893): Proper validation for secret names and keys with DNS subdomain constraints

The consistent use of CEL validation (x-kubernetes-validations) for cross-field constraints is a best practice for declarative validation.


895-914: VolumeClaimConfig simplification is well-designed but part of breaking change.

The VolumeClaimConfig type provides a focused, simplified interface with:

  • storageClassName: Optional storage class with DNS subdomain validation
  • sizeInGiB: Required size with reasonable bounds (1 GiB - 16 TiB)

This simplified API is cleaner than exposing the full PersistentVolumeClaim spec (~40 fields) and enables better declarative validation. However, this change is part of the breaking serialization change discussed in the separate review comment for line 265.

Note: Subject to the API versioning concern raised separately.


260-266: Incorrect: This is a new API introduction, not a breaking change.

The VolumeClaimTemplate field using VolumeClaimConfig type was present in the original commit that introduced config/v1alpha1/types_cluster_monitoring.go (commit f218565). This is a new API being introduced for the Cluster Monitoring Operator migration from ConfigMap to CRDs, not a modification of an existing shipped API. There is no prior version of this field using *v1.PersistentVolumeClaim to break compatibility with.

The design choice to use a simplified VolumeClaimConfig struct instead of the full PersistentVolumeClaim type is intentional for this new API, but does not constitute a breaking change to a previously shipped interface.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (1)

895-1882: Comprehensive CRD schema for prometheusConfig matches Go types.

The CRD schema for prometheusConfig accurately reflects the Go type definitions with:

  • All nested types properly expanded (AdditionalAlertmanagerConfig, RemoteWriteSpec, RelabelConfig, TLSConfig, etc.)
  • Validation constraints correctly translated to OpenAPI schema (minItems, maxItems, enum values, CEL validations)
  • Appropriate list types (map, set, atomic) based on the Go type markers
  • Required fields properly marked

The schema is extensive (~1000 lines) but appears consistent and complete.

Note: The volumeClaimTemplate within prometheusConfig (lines 1849-1881) is subject to the API versioning concern raised separately.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (3)

895-1882: New prometheusConfig field structure looks comprehensive.

The new prometheusConfig field provides extensive configuration options for the platform Prometheus instance. The structure is well-documented with clear descriptions, appropriate validation rules, and reasonable defaults.

The field includes the same simplified volumeClaimTemplate structure (lines 1849-1881) as alertmanagerConfig, which maintains consistency but is subject to the same versioning concern noted separately.


1849-1881: volumeClaimTemplate in prometheusConfig follows same pattern.

This volumeClaimTemplate structure is consistent with the one in alertmanagerConfig. While the simplified design (only sizeInGiB and storageClassName) is cleaner and avoids exposing ~40 unnecessary PVC fields, it's part of the same breaking change that requires a version bump to v1alpha2.


453-485: No action needed. This is a new API being introduced for the first time, not a breaking change to an existing API. The entire CRD file was added in commit f218565, and the volumeClaimTemplate structure with sizeInGiB and storageClassName fields is the initial design for the new prometheusConfig API. Since no previous version shipped, there is no existing code to break backward compatibility with. The API is marked as TechPreviewNoUpgrade with compatibility level 4, which explicitly states "the API can change at any point for any reason," so versioning rules for stable APIs do not apply.

Likely an incorrect or invalid review comment.

openapi/openapi.json (9)

12049-12094: Well-structured Alertmanager configuration type.

The AdditionalAlertmanagerConfig schema correctly uses x-kubernetes-list-type: set for staticConfigs to ensure unique endpoints, and the required fields (name, staticConfigs) are appropriately chosen for a meaningful configuration entry.


12192-12217: Good use of discriminated union pattern.

The x-kubernetes-unions configuration correctly implements a discriminated union with type as the discriminator. This ensures proper validation: when type is "BearerToken", the bearerToken field becomes required.


12472-12476: Clean integration of prometheusConfig.

The field is appropriately added as optional with clear documentation of its purpose and scope (platform Prometheus only, not user-workload).


12824-12841: LGTM!

Simple and well-documented label type with appropriate constraints.


13064-13145: Well-designed PrometheusConfig schema.

The configuration comprehensively covers Prometheus deployment options with appropriate:

  • List types (map with keys, set, atomic) for server-side apply compatibility
  • Sensible defaults documented in descriptions
  • Proper use of int64 for enforcedBodySizeLimitBytes to support large values

Also applies to: 13155-13267


13333-13351: LGTM!

Clean secret selector implementation with appropriate atomic map type and namespace scoping.


13417-13434: Clean simplified volume claim configuration.

The VolumeClaimConfig type appropriately exposes only the essential fields (sizeInGiB, storageClassName) relevant to the monitoring use case, avoiding the complexity of the full PersistentVolumeClaim spec.


13370-13398: Remove this review comment - validation constraints are enforced.

The TLSConfig validation constraints are already enforced in the codebase via kubebuilder validation markers in config/v1alpha1/types_cluster_monitoring.go:

  • +kubebuilder:validation:MinProperties=1 enforces at least one TLS configuration option is specified
  • +kubebuilder:validation:XValidation:rule="(has(self.cert) && has(self.key)) || (!has(self.cert) && !has(self.key))" enforces cert and key must be specified together or both omitted

These markers are processed at API server admission time and validate all requests. The constraints are not merely documented—they are actively enforced.

Likely an incorrect or invalid review comment.


12172-12176: This review concern is not applicable — ClusterMonitoring v1alpha1 is a new API being introduced for the first time.

The ClusterMonitoring v1alpha1 API is newly created in this commit and does not have existing shipped versions. Additionally, the API is marked with Compatibility level 4, explicitly stating "No compatibility is provided, the API can change at any point for any reason." The use of VolumeClaimConfig instead of PersistentVolumeClaim is an intentional design choice for the new simplified API, not a breaking change to an existing one.

config/v1alpha1/zz_generated.swagger_doc_generated.go (1)

121-354: LGTM! Comprehensive swagger documentation for new Prometheus configuration types.

The generated swagger documentation is well-structured and comprehensive. All new types follow consistent patterns with:

  • Clear descriptions with practical examples
  • Well-documented validation constraints (min/max values, lengths, formats)
  • Appropriate defaults with caveats about being subject to change
  • Clear distinction between required and optional fields

Since this is an auto-generated file (indicated by the zz_generated prefix), any documentation improvements should be made in the source type definitions in types_cluster_monitoring.go, then this file should be regenerated.

openapi/generated_openapi/zz_generated.openapi.go (8)

442-452: New v1alpha1 schema definitions are wired correctly into the OpenAPI map.

The added entries appear consistent with the existing schema_openshift_api_config_v1alpha1_* pattern.


474-480: Label schema registration looks consistent.


484-501: PrometheusConfig/TLS/etc registrations look consistent; watch the v1alpha1 surface area expansion.


22870-22872: Dependency updates look correct.


24377-24426: The source TLSConfig type already has the necessary validation rules.

Both invariants are already enforced in config/v1alpha1/types_cluster_monitoring.go:

  • Line 802: +kubebuilder:validation:MinProperties=1 enforces "at least one field set"
  • Line 803: +kubebuilder:validation:XValidation:rule="(has(self.cert) && has(self.key)) || (!has(self.cert) && !has(self.key))",message="cert and key must both be specified together for mutual TLS, or both be omitted" enforces the cert/key pairing requirement

No action needed.

Likely an incorrect or invalid review comment.


24314-24346: Validation markers already exist in source definition.

The SecretKeySelector struct in config/v1alpha1/types_cluster_monitoring.go (lines 876–893) already includes comprehensive validation markers:

  • +kubebuilder:validation:MinLength=1 and +kubebuilder:validation:MaxLength=253 on both name and key fields
  • XValidation rules enforcing DNS1123 subdomain format for name and alphanumeric/hyphen/underscore/dot pattern for key

No additional pattern or length markers need to be added to the source type.

Likely an incorrect or invalid review comment.


22376-22416: The review comment is inaccurate. The AuthorizationConfig struct in config/v1alpha1/types_cluster_monitoring.go already has an explicit +kubebuilder:validation:XValidation rule that enforces the required/forbidden semantics:

+kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'BearerToken' ? has(self.bearerToken) : !has(self.bearerToken)",message="bearerToken is required when type is BearerToken, and forbidden otherwise"

This validation uses CEL (Common Expression Language) to enforce exactly the semantics needed—the bearerToken field is required when type=="BearerToken" and forbidden otherwise. This is enforceable at the Kubernetes API level, not merely a documentation hint. The concern raised in the review comment has already been addressed in the source code.

Likely an incorrect or invalid review comment.


23454-23480: Generated OpenAPI schema is missing length validation constraints from source type.

The source code (config/v1alpha1/types_cluster_monitoring.go) correctly defines +kubebuilder:validation:MinLength=1 and +kubebuilder:validation:MaxLength=128 on both the key and value fields. However, the generated OpenAPI schema at lines 23454-23480 only shows Type: []string{"string"} with no minLength or maxLength properties. The schema generator should translate these validation tags into corresponding OpenAPI constraint properties.

Note: Prometheus does support arbitrary UTF-8 in label names, and OpenShift's decision to allow UTF-8 (as documented in the source comments) is consistent with Prometheus technical capabilities.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (6)

895-908: LGTM! Well-documented and properly structured API field.

The prometheusConfig field is well-designed with:

  • Clear documentation explaining scope (platform Prometheus only, not user-workload)
  • Proper minProperties: 1 validation ensuring meaningful configuration when specified
  • Sensible default behavior (platform-managed defaults when omitted)

1046-1049: Clever validation approach for host:port format.

The staticConfigs validation cleverly prepends http:// to validate the host:port format using URL parsing functions. This ensures proper hostname/IP and port range validation without reimplementing complex parsing logic.


1849-1881: LGTM! Consistent VolumeClaimConfig structure.

The Prometheus volumeClaimTemplate uses the same VolumeClaimConfig structure as the Alertmanager volumeClaimTemplate (lines 453-485), which provides excellent API consistency. The validation rules are appropriate:

  • Required sizeInGiB ensures meaningful configuration
  • Storage limits (1 GiB - 16 TiB) are reasonable
  • Optional storageClassName with DNS1123 validation is correct

1501-1576: LGTM! Standard Kubernetes types properly reused.

The resources, tolerations, and topologySpreadConstraints fields correctly reuse standard Kubernetes structures with appropriate constraints:

  • Resources: proper quantity validation with cross-field checks for request/limit relationships
  • Tolerations: standard Kubernetes toleration structure
  • TopologySpreadConstraints: standard Kubernetes topology spread with proper list-type (map) and key definitions

The consistent application of these fields across Alertmanager, MetricsServer, and Prometheus components demonstrates good API design.

Also applies to: 1607-1656, 1657-1848


1141-1149: Good security default for certificate verification.

The certificateVerification field correctly defaults to "Verify" (line 1141), ensuring secure TLS certificate verification by default. Users must explicitly set "SkipVerify" to bypass certificate validation, which is the appropriate behavior for security-sensitive configurations.


453-485: No action required. The review comment's premise is incorrect.

This is the initial introduction of the ClusterMonitoring v1alpha1 API—not a modification of an existing released schema. The repository contains only one commit introducing this entire API with volumeClaimTemplate as part of its design. There is no prior released version of ClusterMonitoring that would be broken by this change, and therefore no version bump to v1alpha2 is needed. The volumeClaimTemplate field (using VolumeClaimConfig) is part of the new API's first release.

Likely an incorrect or invalid review comment.

Comment on lines 22118 to 22194
func schema_openshift_api_config_v1alpha1_AdditionalAlertmanagerConfig(ref common.ReferenceCallback) common.OpenAPIDefinition {
return common.OpenAPIDefinition{
Schema: spec.Schema{
SchemaProps: spec.SchemaProps{
Description: "AdditionalAlertmanagerConfig represents configuration for additional Alertmanager instances. The `AdditionalAlertmanagerConfig` resource defines settings for how a component communicates with additional Alertmanager instances.",
Type: []string{"object"},
Properties: map[string]spec.Schema{
"name": {
SchemaProps: spec.SchemaProps{
Description: "name is a unique identifier for this Alertmanager configuration entry. This field is used as the map key for server-side apply, allowing multiple controllers (e.g., ACM, cluster admins) to independently manage their own entries. The name must be a valid DNS subdomain (RFC 1123): lowercase alphanumeric characters, hyphens, or periods, and must start and end with an alphanumeric character. Maximum length is 253 characters.",
Type: []string{"string"},
Format: "",
},
},
"authorization": {
SchemaProps: spec.SchemaProps{
Description: "authorization configures the authentication method for Alertmanager connections. Supports bearer token authentication. When omitted, no authentication is used.",
Default: map[string]interface{}{},
Ref: ref("github.com/openshift/api/config/v1alpha1.AuthorizationConfig"),
},
},
"pathPrefix": {
SchemaProps: spec.SchemaProps{
Description: "pathPrefix defines an optional URL path prefix to prepend to the Alertmanager API endpoints. For example, if your Alertmanager is behind a reverse proxy at \"/alertmanager/\", set this to \"/alertmanager\" so requests go to \"/alertmanager/api/v1/alerts\" instead of \"/api/v1/alerts\". This is commonly needed when Alertmanager is deployed behind ingress controllers or load balancers. Must start with \"/\" and not end with \"/\" (unless it is the root path \"/\"). Must not contain query strings (\"?\") or fragments (\"#\").",
Type: []string{"string"},
Format: "",
},
},
"scheme": {
SchemaProps: spec.SchemaProps{
Description: "scheme defines the URL scheme to use when communicating with Alertmanager instances. Possible values are `HTTP` or `HTTPS`. When omitted, defaults to `HTTP`.",
Type: []string{"string"},
Format: "",
},
},
"staticConfigs": {
VendorExtensible: spec.VendorExtensible{
Extensions: spec.Extensions{
"x-kubernetes-list-type": "set",
},
},
SchemaProps: spec.SchemaProps{
Description: "staticConfigs is a list of statically configured Alertmanager endpoints in the form of `<host>:<port>`. Each entry must be a valid hostname, IPv4 address, or IPv6 address (in brackets) followed by a colon and a valid port number (1-65535). Examples: \"alertmanager.example.com:9093\", \"192.168.1.100:9093\", \"[::1]:9093\" At least one endpoint must be specified (minimum 1, maximum 10 endpoints).",
Type: []string{"array"},
Items: &spec.SchemaOrArray{
Schema: &spec.Schema{
SchemaProps: spec.SchemaProps{
Default: "",
Type: []string{"string"},
Format: "",
},
},
},
},
},
"timeoutSeconds": {
SchemaProps: spec.SchemaProps{
Description: "timeoutSeconds defines the timeout in seconds for requests to Alertmanager. When omitted, the default is 10 seconds. Minimum value is 1 second. Maximum value is 600 seconds (10 minutes).",
Type: []string{"integer"},
Format: "int32",
},
},
"tlsConfig": {
SchemaProps: spec.SchemaProps{
Description: "tlsConfig defines the TLS settings to use for Alertmanager connections. When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.",
Default: map[string]interface{}{},
Ref: ref("github.com/openshift/api/config/v1alpha1.TLSConfig"),
},
},
},
Required: []string{"name", "staticConfigs"},
},
},
Dependencies: []string{
"github.com/openshift/api/config/v1alpha1.AuthorizationConfig", "github.com/openshift/api/config/v1alpha1.TLSConfig"},
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

❓ Verification inconclusive

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# First, let's find where the source types are defined
echo "=== Searching for AdditionalAlertmanagerConfig type definition ==="
rg -n --hidden -S 'type\s+AdditionalAlertmanagerConfig\s+struct' -C2 --max-count=5

Repository: openshift/api

Length of output: 170


🌐 Web query:

AdditionalAlertmanagerConfig kubebuilder validation openshift api

💡 Result:

Short answer: use kubebuilder validation markers on the AdditionalAlertmanagerConfig Go type (and its fields) to generate the CRD OpenAPI schema — e.g. +kubebuilder:validation:Enum, +kubebuilder:validation:Pattern, +kubebuilder:validation:Maximum/Minimum, +kubebuilder:validation:Optional/Required, and for conditional rules use +kubebuilder:validation:XValidation. OpenShift’s monitoring types already use these markers in their API package as examples. Generate manifests with controller-gen / make manifests and the validation appears under spec.validation in the CRD. [1][2][3]

Quick example (sketch):

  • on the struct field:
    // +kubebuilder:validation:Enum=v1;v2
    APIVersion string json:"apiVersion"
  • for timeout string pattern:
    // +kubebuilder:validation:Pattern=^([0-9]+(s|m|h))$
    Timeout string json:"timeout,omitempty"
  • conditional rule example (use XValidation for complex constraints):
    // +kubebuilder:validation:XValidation:rule="has(self.timeout) -> (self.timeout =~ '^[0-9]+s$')",message="timeout must be seconds"

References / examples:

  • Kubebuilder CRD validation markers (list and XValidation usage). [1]
  • OpenShift docs describing AdditionalAlertmanagerConfig fields (what to validate). [2]
  • OpenShift API Go package showing real-world use of kubebuilder validation/XValidation. [3]

If you want, I can:

  • produce a full Go type for AdditionalAlertmanagerConfig with kubebuilder markers based on the OpenShift spec, or
  • show the exact controller-gen command and CRD snippet that will be generated. Which would you prefer?

🌐 Web query:

site:github.com openshift/api AdditionalAlertmanagerConfig type definition kubebuilder validation

💡 Result:

I couldn't find a direct GitHub file named AdditionalAlertmanagerConfig with explicit kubebuilder tags in openshift/api, but the type and its OpenAPI/kubebuilder validation are published in the OpenShift API package docs and the Cluster Monitoring Operator repo/docs:

  • OpenShift monitoring API (type definitions / kubebuilder/OpenAPI fields for monitoring v1). Shows AdditionalAlertmanagerConfig fields and generated validation. [1]
  • Cluster Monitoring Operator repo (where AdditionalAlertmanagerConfig is used/configured via cluster-monitoring-config). [2]
  • OpenShift/OKD configmap reference docs describing AdditionalAlertmanagerConfig fields and allowed values. [3]

References

  1. github.com/openshift/api — monitoring v1 (API/type docs). [pkg.go.dev view]. [1]
  2. github.com/openshift/cluster-monitoring-operator — operator repository (uses additionalAlertmanagerConfigs). [2]
  3. OpenShift/OKD documentation — "AdditionalAlertmanagerConfig" configmap reference. [3]

If you want, I can open the specific Go file in the openshift/api repo (or show the exact struct and generated kubebuilder tags) and paste the relevant lines. Which would you prefer?


🌐 Web query:

github.com/openshift/api config/v1alpha1 AdditionalAlertmanagerConfig struct source code

💡 Result:

Here's the AdditionalAlertmanagerConfig Go struct (based on the OpenShift/OKD config reference fields):

type AdditionalAlertmanagerConfig struct {
APIVersion string json:"apiVersion"
BearerToken *v1.SecretKeySelector json:"bearerToken,omitempty"
PathPrefix string json:"pathPrefix,omitempty"
Scheme string json:"scheme,omitempty"
StaticConfigs []string json:"staticConfigs,omitempty"
Timeout *string json:"timeout,omitempty"
TLSConfig TLSConfig json:"tlsConfig,omitempty"
}

Sources (field list and descriptions): OpenShift/OKD config-map reference for AdditionalAlertmanagerConfig. [1] [2]

References:

  • [1] Red Hat / OpenShift docs — Config map reference (AdditionalAlertmanagerConfig). link
  • [2] OKD docs — Config map reference (AdditionalAlertmanagerConfig). link

If you want, I can locate the exact source file in the openshift/api repo and give a direct link.

Citations:


🌐 Web query:

"AdditionalAlertmanagerConfig" "kubebuilder:validation" site:github.com/openshift/api

💡 Result:

I couldn't find any occurrences of "AdditionalAlertmanagerConfig" with kubebuilder:validation markers in the openshift/api GitHub repo. Relevant references show the type documented in OpenShift/OKD docs and implemented in the Cluster Monitoring Operator instead:

  • OpenShift docs (AdditionalAlertmanagerConfig config-map reference). [1]
  • openshift/cluster-monitoring-operator repository (operator that manages the monitoring stack where AdditionalAlertmanagerConfig is used). [2]

References:
[1] Red Hat / OpenShift documentation — AdditionalAlertmanagerConfig (config map reference). (docs.redhat.com)
[2] GitHub — openshift/cluster-monitoring-operator. (github.com)

If you want, I can search the openshift/api repo more thoroughly (file-by-file) or look for kubebuilder markers in the cluster-monitoring-operator types. Which would you prefer?

Citations:


🌐 Web query:

site:github.com/openshift/api/blob "AdditionalAlertmanagerConfig" config v1alpha1

💡 Result:

I couldn't find any file named AdditionalAlertmanagerConfig under github.com/openshift/api/blob. The type is documented in OpenShift/OKD monitoring docs (describes the same fields used in cluster-monitoring-config): apiVersion, bearerToken, pathPrefix, scheme, staticConfigs, timeout, tlsConfig. See documentation: [1], [2].

Sources

  • OKD / Cluster Monitoring Operator config-map reference (AdditionalAlertmanagerConfig). [1]
  • Red Hat / Monitoring stack for OpenShift — AdditionalAlertmanagerConfig. [2]

Schema docs describe constraints that aren't enforced by the shown structural schema (risk: invalid configs admitted).

Examples in this segment: name RFC1123 constraints, scheme allowed values, pathPrefix shape, staticConfigs min/max, timeoutSeconds bounds—none appear as Pattern/Enum/MinItems/MaxItems/Minimum/Maximum in the shown schema.

Since this file is generated, please ensure the source Go types in config/v1alpha1 include kubebuilder validations (or XValidation) and then regenerate.

Comment on lines 24058 to 24133
func schema_openshift_api_config_v1alpha1_RelabelConfig(ref common.ReferenceCallback) common.OpenAPIDefinition {
return common.OpenAPIDefinition{
Schema: spec.Schema{
SchemaProps: spec.SchemaProps{
Description: "RelabelConfig represents a relabeling rule.",
Type: []string{"object"},
Properties: map[string]spec.Schema{
"name": {
SchemaProps: spec.SchemaProps{
Description: "name is a unique identifier for this relabel configuration. Must contain only alphanumeric characters, hyphens, and underscores. Must be between 1 and 63 characters in length.",
Type: []string{"string"},
Format: "",
},
},
"sourceLabels": {
VendorExtensible: spec.VendorExtensible{
Extensions: spec.Extensions{
"x-kubernetes-list-type": "set",
},
},
SchemaProps: spec.SchemaProps{
Description: "sourceLabels specifies which label names to extract from each series for this relabeling rule. Each entry must be a valid label name (non-empty). The values of these labels are joined together using the configured separator, and the resulting string is then matched against the regular expression for the replace, keep, or drop actions. If a referenced label does not exist on a series, Prometheus substitutes an empty string. When omitted, the rule operates without extracting source labels (useful for actions like labelmap). Maximum of 10 source labels can be specified, each between 1 and 128 characters.",
Type: []string{"array"},
Items: &spec.SchemaOrArray{
Schema: &spec.Schema{
SchemaProps: spec.SchemaProps{
Default: "",
Type: []string{"string"},
Format: "",
},
},
},
},
},
"separator": {
SchemaProps: spec.SchemaProps{
Description: "separator is the character sequence used to join source label values. Common examples: \";\" (default), \",\", \"::\", \"|||\". When omitted, defaults to \";\" (semicolon). Must be between 1 and 5 characters in length when specified.",
Type: []string{"string"},
Format: "",
},
},
"regex": {
SchemaProps: spec.SchemaProps{
Description: "regex is the regular expression to match against the concatenated source label values. Must be a valid RE2 regular expression (https://github.com/google/re2/wiki/Syntax). When omitted, defaults to \"(.*)\" (matches everything). Must be between 1 and 1000 characters in length when specified.",
Type: []string{"string"},
Format: "",
},
},
"targetLabel": {
SchemaProps: spec.SchemaProps{
Description: "targetLabel is the target label name where the result is written. Required for Replace and HashMod actions (enforced by validation). When omitted for other actions (Keep, Drop, LabelMap, LabelDrop, LabelKeep), no target label is set. Must be between 1 and 128 characters in length when specified.",
Type: []string{"string"},
Format: "",
},
},
"replacement": {
SchemaProps: spec.SchemaProps{
Description: "replacement is the value against which a regex replace is performed if the regular expression matches. Regex capture groups are available (e.g., $1, $2). When omitted, defaults to \"$1\" (the first capture group). Setting to an empty string (\"\") explicitly clears the target label value. Must be at most 255 characters in length.",
Type: []string{"string"},
Format: "",
},
},
"action": {
SchemaProps: spec.SchemaProps{
Description: "action is the action to perform on the matched labels. Valid actions are:\n - Replace: Replaces the value of targetLabel with replacement, using regex capture groups.\n - Keep: Keeps only metrics where regex matches the source labels.\n - Drop: Drops metrics where regex matches the source labels.\n - HashMod: Sets targetLabel to the hash modulus of the source labels.\n - LabelMap: Copies labels matching regex to new label names derived from replacement.\n - LabelDrop: Drops labels matching regex.\n - LabelKeep: Keeps only labels matching regex.",
Type: []string{"string"},
Format: "",
},
},
},
Required: []string{"name", "action"},
},
},
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for RelabelConfig struct definition
echo "=== Searching for RelabelConfig struct definition ==="
rg -n --hidden -S 'type\s+RelabelConfig\s+struct' -C5 --type go

echo ""
echo "=== Searching for kubebuilder validation rules on RelabelConfig ==="
rg -n --hidden -S '\+kubebuilder:validation' --type go -B2 -A2 | grep -A5 -B5 'RelabelConfig' || echo "No validation rules found near RelabelConfig"

echo ""
echo "=== Searching for action field validation in context ==="
rg -n --hidden -S 'Action\s+' --type go -B3 -A3 | grep -A5 -B5 'RelabelConfig\|type.*Action' | head -50

Repository: openshift/api

Length of output: 11169


🏁 Script executed:

#!/bin/bash
# Find the file containing RelabelConfig definition
find . -name "*.go" -type f | xargs grep -l "type RelabelConfig struct" | head -5

Repository: openshift/api

Length of output: 127


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the complete RelabelConfig struct from config/v1alpha1
echo "=== Complete RelabelConfig struct from config/v1alpha1 ==="
sed -n '729,800p' ./config/v1alpha1/types_cluster_monitoring.go

echo ""
echo "=== Check for action field Enum constraint ==="
rg -n --hidden -S -A50 'type RelabelConfig struct' ./config/v1alpha1/types_cluster_monitoring.go | head -80

Repository: openshift/api

Length of output: 7467


🏁 Script executed:

#!/bin/bash
# Check if action field has Enum or validation annotations
rg -n --hidden -S 'Action.*string|+kubebuilder:validation' ./config/v1alpha1/types_cluster_monitoring.go -A2 -B2 | grep -A10 -B10 'RelabelConfig\|Action\s'

Repository: openshift/api

Length of output: 185


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for RelabelAction type definition
echo "=== RelabelAction type definition ==="
rg -n --hidden -S 'type RelabelAction' ./config/v1alpha1/types_cluster_monitoring.go -A20

echo ""
echo "=== Verify if RelabelAction is a string enum with +kubebuilder:validation:Enum ==="
rg -n --hidden -S 'RelabelAction|+kubebuilder:validation:Enum' ./config/v1alpha1/types_cluster_monitoring.go -B2 -A5

Repository: openshift/api

Length of output: 1396


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for the generated schema function for RelabelConfig to see what it generates
echo "=== Search for generated OpenAPI schema function ==="
rg -n 'schema_openshift_api_config_v1alpha1_RelabelConfig' ./openapi/generated_openapi/zz_generated.openapi.go -A5

echo ""
echo "=== Check if there are any x-kubernetes-validations in the generated schema ==="
rg -n 'x-kubernetes-validations' ./openapi/generated_openapi/zz_generated.openapi.go | head -20

echo ""
echo "=== Check how RelabelAction enum is handled in generated schema ==="
rg -n 'RelabelAction|action.*enum' ./openapi/generated_openapi/zz_generated.openapi.go -i | head -30

Repository: openshift/api

Length of output: 1538


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the complete schema function
echo "=== Complete RelabelConfig schema function ==="
sed -n '24058,24135p' ./openapi/generated_openapi/zz_generated.openapi.go

echo ""
echo "=== Check action field details in generated schema ==="
sed -n '24058,24135p' ./openapi/generated_openapi/zz_generated.openapi.go | grep -A10 '"action"'

Repository: openshift/api

Length of output: 5611


Add Enum constraint and x-kubernetes-validations to the generated RelabelConfig schema.

The source code defines validation rules that are missing from the generated OpenAPI schema:

  • The action field should include an Enum constraint with the valid values: Replace, Keep, Drop, HashMod, LabelMap, LabelDrop, LabelKeep
  • The schema should encode the x-kubernetes-validations rule that enforces targetLabel as required when action is Replace or HashMod

This requires fixes to the OpenAPI schema generation to properly capture the validation metadata from the source type annotations.

🤖 Prompt for AI Agents
In @openapi/generated_openapi/zz_generated.openapi.go around lines 24058 -
24133, The generated OpenAPI definition returned by
schema_openshift_api_config_v1alpha1_RelabelConfig is missing the enum on the
"action" property and the x-kubernetes-validations enforcing that "targetLabel"
is required when action is Replace or HashMod; update the OpenAPIDefinition
construction for schema_openshift_api_config_v1alpha1_RelabelConfig to add an
Enum array
["Replace","Keep","Drop","HashMod","LabelMap","LabelDrop","LabelKeep"] to the
action SchemaProps and add a VendorExtensible Extensions entry
"x-kubernetes-validations" (with a rule checking .action in (Replace,HashMod) ->
require targetLabel) at the schema level or appropriate property so the
generator emits the conditional validation; locate and modify the action and
targetLabel entries inside schema_openshift_api_config_v1alpha1_RelabelConfig to
include these metadata keys.

Comment on lines 24193 to 24219
func schema_openshift_api_config_v1alpha1_Retention(ref common.ReferenceCallback) common.OpenAPIDefinition {
return common.OpenAPIDefinition{
Schema: spec.Schema{
SchemaProps: spec.SchemaProps{
Description: "Retention configures how long Prometheus retains metrics data and how much storage it can use.",
Type: []string{"object"},
Properties: map[string]spec.Schema{
"durationInDays": {
SchemaProps: spec.SchemaProps{
Description: "durationInDays specifies how many days Prometheus will retain metrics data. Prometheus automatically deletes data older than this duration. When omitted, the default is 15 days. Minimum value is 1 day. Maximum value is 365 days (1 year).",
Type: []string{"integer"},
Format: "int32",
},
},
"sizeInGiB": {
SchemaProps: spec.SchemaProps{
Description: "sizeInGiB specifies the maximum storage size in gibibytes (GiB) that Prometheus can use for data blocks and the write-ahead log (WAL). When the limit is reached, Prometheus will delete oldest data first. When omitted, no size limit is enforced and Prometheus uses available PersistentVolume capacity. Minimum value is 1 GiB. Maximum value is 16384 GiB (16 TiB).",
Type: []string{"integer"},
Format: "int32",
},
},
},
},
},
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for the Retention struct definition
echo "=== Searching for Retention struct definition ==="
find . -name "*.go" -type f ! -path "./vendor/*" ! -path "*generated_openapi*" | head -20

# Find files that might contain the Retention struct
rg -l "type\s+Retention\s+struct" --type go || echo "Direct search found nothing"

# Search more broadly for Retention with struct context
rg -n "type Retention struct" --type go -C2 || echo "No exact match"

Repository: openshift/api

Length of output: 764


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look for kubebuilder validation markers related to Retention
echo "=== Searching for kubebuilder validation markers ==="
rg -n "kubebuilder:validation:(Minimum|Maximum)" --type go -C2 | head -50 || echo "No validation markers found"

# Search for files mentioning durationInDays or sizeInGiB outside generated files
echo "=== Searching for durationInDays/sizeInGiB in source code ==="
rg -l "durationInDays|sizeInGiB" --type go | grep -v "generated_openapi" || echo "Not found in non-generated files"

Repository: openshift/api

Length of output: 4046


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for the config v1alpha1 package structure
find . -type d -name "v1alpha1" | grep -v vendor | grep -v node_modules

# List Go files in likely locations
find . -path "*/config/v1alpha1*" -name "*.go" -type f 2>/dev/null | head -20

Repository: openshift/api

Length of output: 713


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for Retention struct in types_cluster_monitoring.go
rg -n "type Retention struct" config/v1alpha1/types_cluster_monitoring.go -A20 || echo "Retention struct not found with exact pattern"

# Try broader search
rg -n "Retention" config/v1alpha1/types_cluster_monitoring.go -B2 -A10 | head -100

Repository: openshift/api

Length of output: 3533


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the file size first
wc -l config/v1alpha1/types_cluster_monitoring.go

# Read the whole file to find Retention and surrounding context
cat config/v1alpha1/types_cluster_monitoring.go

Repository: openshift/api

Length of output: 50369


Fix OpenAPI generation: add Minimum/Maximum properties to match source kubebuilder markers.

The source Retention struct (config/v1alpha1/types_cluster_monitoring.go) has proper kubebuilder validation markers:

  • durationInDays: +kubebuilder:validation:Minimum=1 and +kubebuilder:validation:Maximum=365
  • sizeInGiB: +kubebuilder:validation:Minimum=1 and +kubebuilder:validation:Maximum=16384

However, the generated OpenAPI schema at lines 24193-24219 is missing the corresponding Minimum and Maximum properties in the SchemaProps. The constraints are only documented in descriptions (text), but not enforced as schema properties. The OpenAPI generation tool must be updated to translate these kubebuilder validation markers into OpenAPI Minimum/Maximum properties.

🤖 Prompt for AI Agents
In @openapi/generated_openapi/zz_generated.openapi.go around lines 24193 -
24219, The generated OpenAPI schema for
schema_openshift_api_config_v1alpha1_Retention is missing SchemaProps.Minimum
and SchemaProps.Maximum for the integer properties durationInDays and sizeInGiB;
update the generator to map the kubebuilder validation markers
(+kubebuilder:validation:Minimum and :Maximum) into spec.SchemaProps.Minimum and
spec.SchemaProps.Maximum (they are *float64 fields) for those property entries
(durationInDays and sizeInGiB) so the SchemaProps include the numeric
constraints (e.g., Minimum=1, Maximum=365 for durationInDays and Minimum=1,
Maximum=16384 for sizeInGiB) while preserving Type and Format.

Comment on lines 24450 to 24477
func schema_openshift_api_config_v1alpha1_VolumeClaimConfig(ref common.ReferenceCallback) common.OpenAPIDefinition {
return common.OpenAPIDefinition{
Schema: spec.Schema{
SchemaProps: spec.SchemaProps{
Description: "VolumeClaimConfig defines the configuration for a PersistentVolumeClaim used for storage.",
Type: []string{"object"},
Properties: map[string]spec.Schema{
"storageClassName": {
SchemaProps: spec.SchemaProps{
Description: "storageClassName is the name of the StorageClass to use for the PersistentVolumeClaim. When omitted, the default StorageClass is used. Must be a valid DNS subdomain name (RFC 1123) when specified. Maximum length is 253 characters.",
Type: []string{"string"},
Format: "",
},
},
"sizeInGiB": {
SchemaProps: spec.SchemaProps{
Description: "sizeInGiB specifies the storage capacity in gibibytes (GiB) for the volume. This determines how much data can be stored. Minimum value is 1 GiB. Maximum value is 16384 GiB (16 TiB).",
Type: []string{"integer"},
Format: "int32",
},
},
},
Required: []string{"sizeInGiB"},
},
},
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.go" ! -path "*/vendor/*" ! -path "*/.git/*" | head -20

Repository: openshift/api

Length of output: 712


🏁 Script executed:

rg -l "type\s+VolumeClaimConfig\s+struct" --type go

Repository: openshift/api

Length of output: 100


🏁 Script executed:

rg "type VolumeClaimConfig struct" -A 10 -B 2 --type go

Repository: openshift/api

Length of output: 1452


🏁 Script executed:

rg "sizeInGiB" -A 5 -B 2 config/v1alpha1/types_cluster_monitoring.go

Repository: openshift/api

Length of output: 1387


Add Minimum/Maximum constraints for sizeInGiB and MaxLength/Pattern for storageClassName to the OpenAPI schema.

The source struct VolumeClaimConfig defines kubebuilder validation constraints (Minimum=1, Maximum=16384 for sizeInGiB; MaxLength=253 and RFC1123 validation for storageClassName), but these are not reflected in the generated OpenAPI schema. The descriptions document these constraints, but they should also be enforced at the schema level via Minimum, Maximum, and Pattern fields for proper API validation and client code generation.

🤖 Prompt for AI Agents
In @openapi/generated_openapi/zz_generated.openapi.go around lines 24450 -
24477, The generated OpenAPI definition returned by
schema_openshift_api_config_v1alpha1_VolumeClaimConfig is missing schema
constraints: add numeric bounds for the integer property "sizeInGiB" by setting
Minimum=1 and Maximum=16384 on that spec.Schema, and add validation for
"storageClassName" by setting MaxLength=253 and a Pattern matching RFC1123 DNS
subdomain on that spec.Schema (e.g. the existing RFC1123 regex used elsewhere in
your codegen). Update the SchemaProps for the "sizeInGiB" and "storageClassName"
entries accordingly so the OpenAPI schema enforces the same kubebuilder
constraints as the source struct.

Comment on lines 453 to 485
volumeClaimTemplate:
description: |-
volumeClaimTemplate Defines persistent storage for Alertmanager. Use this setting to
configure the persistent volume claim, including storage class, volume
size, and name.
volumeClaimTemplate defines persistent storage for Alertmanager. Use this setting to
configure the persistent volume claim, including storage class and volume size.
If omitted, the Pod uses ephemeral storage and alert data will not persist
across restarts.
This field is optional.
minProperties: 1
properties:
apiVersion:
sizeInGiB:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
sizeInGiB specifies the storage capacity in gibibytes (GiB) for the volume.
This determines how much data can be stored.
Minimum value is 1 GiB.
Maximum value is 16384 GiB (16 TiB).
format: int32
maximum: 16384
minimum: 1
type: integer
storageClassName:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
storageClassName is the name of the StorageClass to use for the PersistentVolumeClaim.
When omitted, the default StorageClass is used.
Must be a valid DNS subdomain name (RFC 1123) when specified.
Maximum length is 253 characters.
maxLength: 253
type: string
metadata:
description: |-
Standard object's metadata.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
type: object
spec:
description: |-
spec defines the desired characteristics of a volume requested by a pod author.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims
properties:
accessModes:
description: |-
accessModes contains the desired access modes the volume should have.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#access-modes-1
items:
type: string
type: array
x-kubernetes-list-type: atomic
dataSource:
description: |-
dataSource field can be used to specify either:
* An existing VolumeSnapshot object (snapshot.storage.k8s.io/VolumeSnapshot)
* An existing PVC (PersistentVolumeClaim)
If the provisioner or an external controller can support the specified data source,
it will create a new volume based on the contents of the specified data source.
When the AnyVolumeDataSource feature gate is enabled, dataSource contents will be copied to dataSourceRef,
and dataSourceRef contents will be copied to dataSource when dataSourceRef.namespace is not specified.
If the namespace is specified, then dataSourceRef will not be copied to dataSource.
properties:
apiGroup:
description: |-
APIGroup is the group for the resource being referenced.
If APIGroup is not specified, the specified Kind must be in the core API group.
For any other third-party types, APIGroup is required.
type: string
kind:
description: Kind is the type of resource being
referenced
type: string
name:
description: Name is the name of resource being
referenced
type: string
required:
- kind
- name
type: object
x-kubernetes-map-type: atomic
dataSourceRef:
description: |-
dataSourceRef specifies the object from which to populate the volume with data, if a non-empty
volume is desired. This may be any object from a non-empty API group (non
core object) or a PersistentVolumeClaim object.
When this field is specified, volume binding will only succeed if the type of
the specified object matches some installed volume populator or dynamic
provisioner.
This field will replace the functionality of the dataSource field and as such
if both fields are non-empty, they must have the same value. For backwards
compatibility, when namespace isn't specified in dataSourceRef,
both fields (dataSource and dataSourceRef) will be set to the same
value automatically if one of them is empty and the other is non-empty.
When namespace is specified in dataSourceRef,
dataSource isn't set to the same value and must be empty.
There are three important differences between dataSource and dataSourceRef:
* While dataSource only allows two specific types of objects, dataSourceRef
allows any non-core object, as well as PersistentVolumeClaim objects.
* While dataSource ignores disallowed values (dropping them), dataSourceRef
preserves all values, and generates an error if a disallowed value is
specified.
* While dataSource only allows local objects, dataSourceRef allows objects
in any namespaces.
(Beta) Using this field requires the AnyVolumeDataSource feature gate to be enabled.
(Alpha) Using the namespace field of dataSourceRef requires the CrossNamespaceVolumeDataSource feature gate to be enabled.
properties:
apiGroup:
description: |-
APIGroup is the group for the resource being referenced.
If APIGroup is not specified, the specified Kind must be in the core API group.
For any other third-party types, APIGroup is required.
type: string
kind:
description: Kind is the type of resource being
referenced
type: string
name:
description: Name is the name of resource being
referenced
type: string
namespace:
description: |-
Namespace is the namespace of resource being referenced
Note that when a namespace is specified, a gateway.networking.k8s.io/ReferenceGrant object is required in the referent namespace to allow that namespace's owner to accept the reference. See the ReferenceGrant documentation for details.
(Alpha) This field requires the CrossNamespaceVolumeDataSource feature gate to be enabled.
type: string
required:
- kind
- name
type: object
resources:
description: |-
resources represents the minimum resources the volume should have.
If RecoverVolumeExpansionFailure feature is enabled users are allowed to specify resource requirements
that are lower than previous value but must still be higher than capacity recorded in the
status field of the claim.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#resources
properties:
limits:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: |-
Limits describes the maximum amount of compute resources allowed.
More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
type: object
requests:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: |-
Requests describes the minimum amount of compute resources required.
If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
otherwise to an implementation-defined value. Requests cannot exceed Limits.
More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
type: object
type: object
selector:
description: selector is a label query over volumes
to consider for binding.
properties:
matchExpressions:
description: matchExpressions is a list of label
selector requirements. The requirements are
ANDed.
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the
selector applies to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
x-kubernetes-list-type: atomic
required:
- key
- operator
type: object
type: array
x-kubernetes-list-type: atomic
matchLabels:
additionalProperties:
type: string
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
type: object
x-kubernetes-map-type: atomic
storageClassName:
description: |-
storageClassName is the name of the StorageClass required by the claim.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#class-1
type: string
volumeAttributesClassName:
description: |-
volumeAttributesClassName may be used to set the VolumeAttributesClass used by this claim.
If specified, the CSI driver will create or update the volume with the attributes defined
in the corresponding VolumeAttributesClass. This has a different purpose than storageClassName,
it can be changed after the claim is created. An empty string or nil value indicates that no
VolumeAttributesClass will be applied to the claim. If the claim enters an Infeasible error state,
this field can be reset to its previous value (including nil) to cancel the modification.
If the resource referred to by volumeAttributesClass does not exist, this PersistentVolumeClaim will be
set to a Pending state, as reflected by the modifyVolumeStatus field, until such as a resource
exists.
More info: https://kubernetes.io/docs/concepts/storage/volume-attributes-classes/
type: string
volumeMode:
description: |-
volumeMode defines what type of volume is required by the claim.
Value of Filesystem is implied when not included in claim spec.
type: string
volumeName:
description: volumeName is the binding reference to
the PersistentVolume backing this claim.
type: string
type: object
status:
description: |-
status represents the current information/status of a persistent volume claim.
Read-only.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims
properties:
accessModes:
description: |-
accessModes contains the actual access modes the volume backing the PVC has.
More info: https://kubernetes.io/docs/concepts/storage/persistent-volumes#access-modes-1
items:
type: string
type: array
x-kubernetes-list-type: atomic
allocatedResourceStatuses:
additionalProperties:
description: |-
When a controller receives persistentvolume claim update with ClaimResourceStatus for a resource
that it does not recognizes, then it should ignore that update and let other controllers
handle it.
type: string
description: "allocatedResourceStatuses stores status
of resource being resized for the given PVC.\nKey
names follow standard Kubernetes label syntax. Valid
values are either:\n\t* Un-prefixed keys:\n\t\t-
storage - the capacity of the volume.\n\t* Custom
resources must use implementation-defined prefixed
names such as \"example.com/my-custom-resource\"\nApart
from above values - keys that are unprefixed or
have kubernetes.io prefix are considered\nreserved
and hence may not be used.\n\nClaimResourceStatus
can be in any of following states:\n\t- ControllerResizeInProgress:\n\t\tState
set when resize controller starts resizing the volume
in control-plane.\n\t- ControllerResizeFailed:\n\t\tState
set when resize has failed in resize controller
with a terminal error.\n\t- NodeResizePending:\n\t\tState
set when resize controller has finished resizing
the volume but further resizing of\n\t\tvolume is
needed on the node.\n\t- NodeResizeInProgress:\n\t\tState
set when kubelet starts resizing the volume.\n\t-
NodeResizeFailed:\n\t\tState set when resizing has
failed in kubelet with a terminal error. Transient
errors don't set\n\t\tNodeResizeFailed.\nFor example:
if expanding a PVC for more capacity - this field
can be one of the following states:\n\t- pvc.status.allocatedResourceStatus['storage']
= \"ControllerResizeInProgress\"\n - pvc.status.allocatedResourceStatus['storage']
= \"ControllerResizeFailed\"\n - pvc.status.allocatedResourceStatus['storage']
= \"NodeResizePending\"\n - pvc.status.allocatedResourceStatus['storage']
= \"NodeResizeInProgress\"\n - pvc.status.allocatedResourceStatus['storage']
= \"NodeResizeFailed\"\nWhen this field is not set,
it means that no resize operation is in progress
for the given PVC.\n\nA controller that receives
PVC update with previously unknown resourceName
or ClaimResourceStatus\nshould ignore the update
for the purpose it was designed. For example - a
controller that\nonly is responsible for resizing
capacity of the volume, should ignore PVC updates
that change other valid\nresources associated with
PVC.\n\nThis is an alpha field and requires enabling
RecoverVolumeExpansionFailure feature."
type: object
x-kubernetes-map-type: granular
allocatedResources:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: "allocatedResources tracks the resources
allocated to a PVC including its capacity.\nKey
names follow standard Kubernetes label syntax. Valid
values are either:\n\t* Un-prefixed keys:\n\t\t-
storage - the capacity of the volume.\n\t* Custom
resources must use implementation-defined prefixed
names such as \"example.com/my-custom-resource\"\nApart
from above values - keys that are unprefixed or
have kubernetes.io prefix are considered\nreserved
and hence may not be used.\n\nCapacity reported
here may be larger than the actual capacity when
a volume expansion operation\nis requested.\nFor
storage quota, the larger value from allocatedResources
and PVC.spec.resources is used.\nIf allocatedResources
is not set, PVC.spec.resources alone is used for
quota calculation.\nIf a volume expansion capacity
request is lowered, allocatedResources is only\nlowered
if there are no expansion operations in progress
and if the actual volume capacity\nis equal or lower
than the requested capacity.\n\nA controller that
receives PVC update with previously unknown resourceName\nshould
ignore the update for the purpose it was designed.
For example - a controller that\nonly is responsible
for resizing capacity of the volume, should ignore
PVC updates that change other valid\nresources associated
with PVC.\n\nThis is an alpha field and requires
enabling RecoverVolumeExpansionFailure feature."
type: object
capacity:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: capacity represents the actual resources
of the underlying volume.
type: object
conditions:
description: |-
conditions is the current Condition of persistent volume claim. If underlying persistent volume is being
resized then the Condition will be set to 'Resizing'.
items:
description: PersistentVolumeClaimCondition contains
details about state of pvc
properties:
lastProbeTime:
description: lastProbeTime is the time we probed
the condition.
format: date-time
type: string
lastTransitionTime:
description: lastTransitionTime is the time
the condition transitioned from one status
to another.
format: date-time
type: string
message:
description: message is the human-readable message
indicating details about last transition.
type: string
reason:
description: |-
reason is a unique, this should be a short, machine understandable string that gives the reason
for condition's last transition. If it reports "Resizing" that means the underlying
persistent volume is being resized.
type: string
status:
description: |-
Status is the status of the condition.
Can be True, False, Unknown.
More info: https://kubernetes.io/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-claim-v1/#:~:text=state%20of%20pvc-,conditions.status,-(string)%2C%20required
type: string
type:
description: |-
Type is the type of the condition.
More info: https://kubernetes.io/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-claim-v1/#:~:text=set%20to%20%27ResizeStarted%27.-,PersistentVolumeClaimCondition,-contains%20details%20about
type: string
required:
- status
- type
type: object
type: array
x-kubernetes-list-map-keys:
- type
x-kubernetes-list-type: map
currentVolumeAttributesClassName:
description: |-
currentVolumeAttributesClassName is the current name of the VolumeAttributesClass the PVC is using.
When unset, there is no VolumeAttributeClass applied to this PersistentVolumeClaim
type: string
modifyVolumeStatus:
description: |-
ModifyVolumeStatus represents the status object of ControllerModifyVolume operation.
When this is unset, there is no ModifyVolume operation being attempted.
properties:
status:
description: "status is the status of the ControllerModifyVolume
operation. It can be in any of following states:\n
- Pending\n Pending indicates that the PersistentVolumeClaim
cannot be modified due to unmet requirements,
such as\n the specified VolumeAttributesClass
not existing.\n - InProgress\n InProgress
indicates that the volume is being modified.\n
- Infeasible\n Infeasible indicates that the
request has been rejected as invalid by the
CSI driver. To\n\t resolve the error, a valid
VolumeAttributesClass needs to be specified.\nNote:
New statuses can be added in the future. Consumers
should check for unknown statuses and fail appropriately."
type: string
targetVolumeAttributesClassName:
description: targetVolumeAttributesClassName is
the name of the VolumeAttributesClass the PVC
currently being reconciled
type: string
required:
- status
type: object
phase:
description: phase represents the current phase of
PersistentVolumeClaim.
type: string
type: object
x-kubernetes-validations:
- message: storageClassName must be a valid DNS subdomain
name
rule: '!format.dns1123Subdomain().validate(self).hasValue()'
required:
- sizeInGiB
type: object
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRD volumeClaimTemplate reflects breaking API change.

The CRD schema for volumeClaimTemplate has been updated to match the simplified VolumeClaimConfig type, with only sizeInGiB and storageClassName fields.

This CRD change is consistent with the Go type definition but is part of the breaking serialization change that should trigger an API version bump, as discussed in the review comments for the Go types file.

The CRD manifest accurately reflects the Go types, but the underlying serialization change requires an API version bump from v1alpha1 to v1alpha2.

Based on learnings from PR discussion.

Copy link
Contributor

@marioferh marioferh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the addressed comments are out of the scope of this PR prometheusConfig. If you agree I can do a new one to fix that.

Comment on lines 459 to 460
// +kubebuilder:validation:Minimum=1
// +kubebuilder:validation:Maximum=1099511627776
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think practical values would be:
a minimum of 10 KB,
and anything over 1 GB doesn’t make sense.
But I’m not sure about futurecases, special or corner cases.

Comment on lines -254 to +265
VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"`
VolumeClaimTemplate VolumeClaimConfig `json:"volumeClaimTemplate,omitempty,omitzero"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the same here:

VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"`

and discussed before

// Maximum length is 253 characters.
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:XValidation:rule="self.matches('^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$')",message="must be a valid DNS subdomain: lowercase alphanumeric, hyphens, or periods"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thx

// instances.
// Possible values are `HTTP` or `HTTPS`.
// When omitted, defaults to `HTTP`.
// +kubebuilder:validation:Enum=HTTP;HTTPS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// +kubebuilder:validation:MaxItems=10
// +listType=map
// +listMapKey=name
WriteRelabelConfigs []RelabelConfig `json:"writeRelabelConfigs,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then should I remove minimum length 1 of every [] field?

// +kubebuilder:validation:MaxLength=1000
Regex string `json:"regex,omitempty"`
// targetLabel is the target label name where the result is written.
// Required for Replace and HashMod actions (enforced by validation).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// VolumeClaimConfig defines the configuration for a PersistentVolumeClaim used for storage.
// +kubebuilder:validation:MinProperties=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Prometheus automatically deletes data older than this duration.
// When omitted, the default is 15 days.
// Minimum value is 1 day.
// Maximum value is 365 days (1 year).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be, but prometheus is not done for long term retention.

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @config/v1alpha1/types_cluster_monitoring.go:
- Around line 286-300: The comments for the LogLevel constants are grammatically
inconsistent; update the doc comments for LogLevelWarn, LogLevelInfo, and
LogLevelDebug (the constant declarations with names LogLevelWarn, LogLevelInfo,
LogLevelDebug) to remove the leading commas and make them full sentences
matching LogLevelError (e.g., "LogLevelWarn logs warnings and errors.",
"LogLevelInfo logs general information, warnings, and errors.", "LogLevelDebug
logs detailed debugging information.").

In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml:
- Around line 1482-1486: The x-kubernetes-validations rule string is malformed;
replace the current rule for targetLabel under x-kubernetes-validations with a
properly quoted expression that checks when self.action is 'Replace' or
'HashMod' then has(self.targetLabel) and self.targetLabel != "" otherwise true;
ensure the inner list uses doubled single-quotes for YAML single-quoted string
(e.g., ''Replace'', ''HashMod''), the comparison uses != "" (empty double
quotes) and the ternary is wrapped correctly (self.action in [...] ?
(has(self.targetLabel) && self.targetLabel != "") : true) so the string has
matching quotes and parentheses.

In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml:
- Around line 1483-1487: The CEL expression in the relabel validation rule uses
a non-ASCII closing quote (self.targetLabel != ”) which makes the CEL invalid;
update the rule string to use an ASCII empty-string check and proper
quoting/escaping so the condition reads self.action in ['Replace','HashMod'] ?
has(self.targetLabel) && self.targetLabel != '' : true (i.e., replace the
Unicode quote with two ASCII single quotes inside the YAML single-quoted rule
value), ensuring the surrounding YAML quoting/escaping stays correct for the
rule field.
🧹 Nitpick comments (2)
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (1)

1244-1282: externalLabels.key is likely too permissive for Prometheus label-name rules

Schema currently allows (per description) any UTF-8 string; invalid label keys will likely make Prometheus reject config at runtime. Consider enforcing the Prometheus label-name regex (and adjust the description accordingly).

Proposed schema tightening (example)
                       externalLabels:
                         description: |-
                           externalLabels defines labels to be attached to time series and alerts
@@
                     items:
@@
                       properties:
                         key:
                           description: |-
                             key is the name of the label.
-                            Prometheus supports UTF-8 label names, so any valid UTF-8 string is allowed.
+                            Must be a valid Prometheus label name.
                             Must be between 1 and 128 characters in length.
                           maxLength: 128
                           minLength: 1
                           type: string
+                          x-kubernetes-validations:
+                          - message: key must be a valid Prometheus label name
+                            rule: self.matches('^[a-zA-Z_][a-zA-Z0-9_]*$')
openapi/openapi.json (1)

13117-13128: Consider whether URL is the appropriate map key for remoteWrite.

The remoteWrite field uses url as the x-kubernetes-list-map-key. While URLs are typically unique, there are edge cases where the same URL might be used with different configurations (e.g., different write relabeling rules, different authentication).

Using URL as the map key means:

  1. Users cannot configure the same endpoint multiple times with different settings
  2. Server-side apply will merge entries with the same URL, potentially causing configuration loss

Consider whether a name field (similar to AdditionalAlertmanagerConfig and RelabelConfig) would be more appropriate as the map key, with url remaining as a required but non-unique field. This would provide more flexibility for users who need multiple configurations to the same endpoint.

If the current design is intentional (enforcing one config per URL), this is acceptable, but the documentation should explicitly state that limitation.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between f218565 and ff7d44e.

⛔ Files ignored due to path filters (4)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (7)
  • config/v1alpha1/types_cluster_monitoring.go
  • config/v1alpha1/zz_generated.swagger_doc_generated.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
🧰 Additional context used
🧬 Code graph analysis (1)
config/v1alpha1/zz_generated.swagger_doc_generated.go (1)
config/v1alpha1/types_cluster_monitoring.go (10)
  • AdditionalAlertmanagerConfig (611-671)
  • AuthorizationConfig (858-871)
  • Label (674-688)
  • PrometheusConfig (434-597)
  • RelabelConfig (730-796)
  • RemoteWriteSpec (691-726)
  • Retention (916-936)
  • SecretKeySelector (875-892)
  • TLSConfig (803-831)
  • VolumeClaimConfig (895-912)
🔇 Additional comments (39)
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml (2)

453-485: Compatibility level 4 permits breaking changes to v1alpha1—no version bump needed

The Compatibility level 4 annotation explicitly states "No compatibility is provided, the API can change at any point for any reason." This is an alpha API with no stability guarantees, so schema changes do not require bumping the version. Additionally, only v1alpha1 exists for this API; there is no stable v1 to migrate clients to.

Likely an incorrect or invalid review comment.


1033-1050: Fix non-ASCII quote in targetLabel validation (line 1486)

Line 1486 contains a Unicode right double quotation mark (U+201D) instead of a standard ASCII quote in self.targetLabel != ", which will cause CEL validation to fail. Replace with the correct ASCII quote character.

Proposed fix
                           x-kubernetes-validations:
                           - message: targetLabel is required when action is Replace
                               or HashMod
                             rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
-                              && self.targetLabel != " : true'
+                              && self.targetLabel != "" : true'

The isURL() and url() functions used in staticConfigs (lines 1046–1048) and remoteWrite.url (lines 1382–1383) are supported in Kubernetes CEL and include the necessary accessors (getHostname(), getPort(), getScheme(), etc.). No changes needed there.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (1)

453-485: Remove concern: This is a Compatibility Level 4 API, which explicitly permits schema changes.

The CRD header states: "Compatibility level 4: No compatibility is provided, the API can change at any point for any reason. These capabilities should not be used by applications needing long term support." This designation explicitly authorizes the schema changes you're concerned about, making a v1alpha2 unnecessary for compatibility reasons. Only v1alpha1 is defined, and versioning decisions should be driven by feature stabilization, not compatibility guarantees that don't apply to this API.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2)

894-1881: Well-structured API with comprehensive validations.

The new prometheusConfig section demonstrates good API design practices:

  • Comprehensive validation: CEL validation rules effectively constrain inputs (DNS names, URLs, timeouts, sizes, etc.)
  • Consistent patterns: Similar structures (like TLSConfig, SecretKeySelector) are reused throughout, promoting consistency
  • Proper constraints: All list types have reasonable min/max items, strings have length limits, and numeric fields have appropriate ranges
  • Empty object prevention: Using minProperties: 1 on optional nested objects (lines 907, 1068, 1581) ensures they're not set to empty when specified
  • Clear documentation: Field descriptions provide good context and examples

The structure properly supports common Prometheus use cases like remote write, additional alertmanagers, and persistent storage configuration.

Note: The prometheusConfig.volumeClaimTemplate (lines 1849-1880) uses the same simplified structure as alertmanagerConfig.volumeClaimTemplate. Since this is a new field, there's no breaking change concern here.


453-484: This review comment is based on a false premise. The evidence shows that this is the initial introduction of the ClusterMonitoring v1alpha1 API, not a modification of an existing released API. The CRD files were newly created in this PR (commit ff7d44e) and are marked as DevPreviewNoUpgrade, meaning v1alpha1 has not yet shipped to users. Since there are no existing clients using the prior schema, the volumeClaimTemplate simplification is not a breaking change and does not require an API version bump.

The concern about backward compatibility would only apply if v1alpha1 had already been released to end users. In this case, the API is being introduced for the first time, so the simplified volumeClaimTemplate structure is the initial design and is appropriate.

Likely an incorrect or invalid review comment.

openapi/openapi.json (9)

12192-12217: Union discriminator pattern correctly implemented.

The AuthorizationConfig type properly uses the x-kubernetes-unions discriminator pattern to enforce that bearerToken is required when type is "BearerToken" and forbidden otherwise. This is the correct pattern for conditional field requirements in Kubernetes APIs.

Similar to the previous comment, ensure that the type field has an enum constraint (["None", "BearerToken"]) enforced via CEL validation rules in the CRD, since it's not expressed in this OpenAPI schema.


12472-12476: LGTM: Non-breaking addition of prometheusConfig field.

The addition of the prometheusConfig field to ClusterMonitoringSpec is a non-breaking additive change. The field is properly documented with comprehensive usage guidance and correctly references the new PrometheusConfig type.


12824-12841: Verify CEL validation for length constraints on Label fields.

The Label type descriptions specify that both key and value must be between 1 and 128 characters, but these constraints are not expressed in the OpenAPI schema via minLength and maxLength. Ensure these are enforced via CEL validation rules in the CRD.


13064-13267: Verify comprehensive CEL validation for PrometheusConfig and related types.

The PrometheusConfig, RelabelConfig, RemoteWriteSpec, and Retention types have extensive validation constraints documented in their descriptions, but these are not expressed in the OpenAPI schema. This pattern is consistent across the new types and is acceptable if enforced via CEL.

Examples of documented but not schema-enforced constraints:

  • additionalAlertmanagerConfigs: min 1, max 10 items
  • externalLabels: min 1, max 50 items
  • remoteWrite: min 1, max 10 items
  • enforcedBodySizeLimitBytes: min 1, max 1099511627776
  • durationInDays: min 1, max 365
  • sizeInGiB: min 1, max 16384
  • nodeSelector: max 10 entries
  • tolerations: min 1, max 10 items
  • topologySpreadConstraints: min 1, max 10 items
  • resources: min 1, max 10 items

Ensure all these constraints are properly enforced via CEL validation rules in the CRD definition.


13333-13351: LGTM: SecretKeySelector correctly uses atomic map type.

The SecretKeySelector type correctly uses x-kubernetes-map-type: atomic, which ensures that the entire secret reference is treated as a single unit during updates (replaced entirely rather than merged field-by-field). This is the correct pattern for secret references in Kubernetes APIs.

As with other types, ensure that the documented validation constraints (name and key format/length) are enforced via CEL rules.


13370-13398: Verify CEL validation for TLSConfig complex constraints.

The TLSConfig type has complex validation requirements documented in its description:

  1. "At least one TLS configuration option must be specified"
  2. "For mutual TLS (mTLS), both cert and key must be specified together, or both omitted"
  3. certificateVerification must be either "Verify" or "SkipVerify"

These constraints cannot be expressed in OpenAPI and must be enforced via CEL validation rules. Ensure the CRD includes CEL rules for:

  • At least one of {ca, cert, key, certificateVerification, serverName} must be set
  • (cert == nil && key == nil) || (cert != nil && key != nil) // mutual requirement
  • certificateVerification in ["Verify", "SkipVerify"]

13417-13434: VolumeClaimConfig: Verify CEL validation and review in context of breaking change.

The VolumeClaimConfig type is the simplified replacement for PersistentVolumeClaim in the volumeClaimTemplate field. This simplification reduces the API surface from ~40 PVC fields to just 2 fields (sizeInGiB and storageClassName), which improves validation and user experience.

However, this is part of the breaking change flagged earlier. The schema looks correct for the new simplified design, but ensure:

  1. CEL validation enforces sizeInGiB constraints (min=1, max=16384)
  2. storageClassName validation (DNS subdomain, max 253 chars)
  3. This change is properly versioned (v1alpha1 vs v1alpha2) based on release status

12049-12094: The review comment is incorrect. All validation constraints mentioned in the AdditionalAlertmanagerConfig descriptions are already enforced in the ClusterMonitoring CRD:

  • staticConfigs: Enforced via minItems: 1, maxItems: 10 in the schema, plus CEL validation rule for host:port format (validates hostname/IPv4/IPv6 and port range 1-65535)
  • timeoutSeconds: Enforced via minimum: 1, maximum: 600 in the schema
  • scheme: Enforced via enum: [HTTP, HTTPS] in the schema
  • pathPrefix: Enforced via three CEL validation rules (must start with /, must not end with / unless root, must not contain ? or #)

The OpenAPI JSON file referenced in the review does not display these detailed constraints because OpenAPI is derived from the YAML CRD, where the constraints are fully defined.

Likely an incorrect or invalid review comment.


12172-12176: No breaking change issue: v1alpha1 has not shipped yet.

This type change from io.k8s.api.core.v1.PersistentVolumeClaim to VolumeClaimConfig occurred during the initial development of v1alpha1 on January 8-9, 2026—before any release. The API version was first introduced in commit e20773b with PersistentVolumeClaim, then refined to VolumeClaimConfig in commit ff7d44e within the same development cycle.

Per the repository's API guidelines, breaking changes require new API versions only after an API has shipped. Since v1alpha1 hasn't been released to users yet, this change is acceptable during the pre-release development phase.

config/v1alpha1/zz_generated.swagger_doc_generated.go (1)

121-354: LGTM! Generated documentation accurately reflects the new API types.

The auto-generated Swagger documentation correctly maps all fields for the 10 new types (AdditionalAlertmanagerConfig, AuthorizationConfig, Label, PrometheusConfig, RelabelConfig, RemoteWriteSpec, Retention, SecretKeySelector, TLSConfig, VolumeClaimConfig) and properly updates the volumeClaimTemplate documentation to reflect the simplified VolumeClaimConfig structure.

All field descriptions match their corresponding source type definitions, and the documentation maintains consistency in style and terminology.

openapi/generated_openapi/zz_generated.openapi.go (12)

445-498: Schema registrations look correct.

The new v1alpha1 type registrations follow the established naming conventions and are properly alphabetized. The wiring of the new OpenAPI definitions appears consistent with the existing pattern.


22118-22194: AdditionalAlertmanagerConfig schema is well-structured.

The schema definition properly declares dependencies, includes comprehensive field descriptions, and uses appropriate Kubernetes validation markers (x-kubernetes-list-type: set). The use of name as a map key for server-side apply is correctly documented.


22376-22416: AuthorizationConfig discriminated union is correctly implemented.

The schema properly uses the x-kubernetes-unions extension to implement a discriminated union pattern based on the type field. The union configuration correctly specifies that bearerToken is only valid when type is "BearerToken".


22853-22871: PrometheusConfig field addition looks good.

The new prometheusConfig field is properly integrated into the ClusterMonitoringSpec schema with comprehensive documentation and correct dependency declarations.


23454-23480: Label schema definition is correct.

The simple key/value structure appropriately represents Prometheus external labels with proper validation constraints.


23855-24056: PrometheusConfig schema is comprehensive but has the same API versioning concern.

The schema definition is well-structured with appropriate Kubernetes list validations (x-kubernetes-list-map-keys, x-kubernetes-list-type) for server-side apply support. However, the volumeClaimTemplate field at lines 24043-24048 has the same API versioning concern as noted for AlertmanagerConfig - if v1alpha1 has shipped, changing from PersistentVolumeClaim to VolumeClaimConfig requires a version bump.


24058-24218: RelabelConfig, RemoteWriteSpec, and Retention schemas are well-defined.

These schemas follow consistent patterns with appropriate validation markers and comprehensive field descriptions. The x-kubernetes-list-map-keys annotations enable proper server-side apply support.


24220-24312: Backup retention types are correctly generated.

Note that RetentionNumberConfig, RetentionPolicy, and RetentionSizeConfig (lines 24220-24312) are backup retention configurations, distinct from the Prometheus metrics Retention type (lines 24193-24218). The schemas are properly generated from their source types.


24314-24345: SecretKeySelector schema is correctly defined.

The schema appropriately uses x-kubernetes-map-type: atomic to ensure the entire selector is treated as an atomic unit for updates. Field descriptions clearly specify namespace and validation constraints.


24377-24425: TLSConfig schema structure is correct.

The schema properly declares dependencies on SecretKeySelector for CA, certificate, and key references. Note that the description states "At least one TLS configuration option must be specified" and "both cert and key must be specified together, or both omitted," but these constraints are not enforced at the OpenAPI schema level. These validations are likely implemented via CEL rules or admission webhooks in the source API definition.


24450-24476: VolumeClaimConfig schema is correctly defined.

The simplified two-field structure (sizeInGiB and storageClassName) is properly defined. This is the new type that replaces the more complex corev1.PersistentVolumeClaim in the volumeClaimTemplate fields. As generated code, this accurately reflects the source API definition. The API versioning concern noted earlier for AlertmanagerConfig and PrometheusConfig applies to this type.


22342-22351: This review comment is based on an incorrect assumption and should be dismissed.

The v1alpha1 API for ClusterMonitoring is explicitly marked as pre-release with "Compatibility level 4: No compatibility is provided, the API can change at any point for any reason." This means v1alpha1 is not a shipped/stable API and is free to change. Furthermore, VolumeClaimConfig appears to be the original and only type used in AlertmanagerConfig.volumeClaimTemplate—there is no evidence of a prior change from PersistentVolumeClaim to VolumeClaimConfig. Since v1alpha1 has never been released and is explicitly designed to be unstable, the concern about breaking shipped APIs does not apply.

Likely an incorrect or invalid review comment.

config/v1alpha1/types_cluster_monitoring.go (12)

92-104: LGTM: Well-documented field addition.

The prometheusConfig field is properly documented with clear use cases and default behavior. The optional nature and omitzero tag are appropriate for this configuration.


430-597: LGTM: Comprehensive PrometheusConfig type with appropriate validations.

The PrometheusConfig struct is well-designed with:

  • Clear documentation for each field and its purpose
  • Reasonable limits (e.g., enforcedBodySizeLimitBytes max 1TB, externalLabels max 50)
  • Proper use of list types (map, set, atomic) for different use cases
  • Comprehensive validation rules for complex fields like queryLogFile
  • Consistent patterns with other config types in the file

The MinProperties=1 validation ensures configuration isn't empty when specified.


608-671: LGTM: Well-designed AdditionalAlertmanagerConfig with robust validation.

The validation rules are comprehensive:

  • DNS subdomain validation for name field
  • Complex CEL-based validation for staticConfigs to ensure proper host:port format
  • Path validation for pathPrefix prevents common mistakes (no trailing slash, no query strings)
  • Reasonable timeout bounds (1-600 seconds)

673-688: LGTM: Simple and appropriate Label type.

The 128-character limit for both key and value is reasonable for Prometheus labels, and requiring both fields is correct.


690-726: LGTM: RemoteWriteSpec properly validates URLs and timeout values.

The URL validation ensures only http/https schemes are used, and the timeout bounds (1-600 seconds) are reasonable for remote write operations.


728-796: LGTM: RelabelConfig with comprehensive validation rules.

The cross-field validation at line 729 correctly enforces that targetLabel must be present and non-empty for Replace and HashMod actions. The field-level validations (length limits, regex patterns) are appropriate for relabeling use cases.


798-831: LGTM: TLSConfig enforces mTLS pairing and secure defaults.

The validation at line 802 correctly ensures that client certificate and key are specified together for mutual TLS. The default of certificateVerification=Verify (line 829) is the secure choice.


833-853: LGTM: Clear enum definitions with appropriate security warnings.

The documentation appropriately warns that CertificateVerificationSkipVerify is insecure.


855-871: LGTM: Correctly implemented discriminated union.

The AuthorizationConfig follows Kubernetes union type patterns with proper validation to ensure bearerToken is present only when type is BearerToken.


873-892: LGTM: SecretKeySelector properly validated and marked as atomic.

The +structType=atomic marker (line 874) is appropriate for secret references, ensuring they are updated atomically during merges.


894-912: VolumeClaimConfig type is well-designed, but note the breaking change concern.

The simplified API with sizeInGiB and storageClassName is clearer than exposing the full PersistentVolumeClaim surface. However, this type is part of the breaking change discussed in the earlier comment about requiring a v1alpha2 API version bump.

Note: This approval is for the type design itself. The serialization compatibility issue requires addressing through API version bump (see earlier comment).


914-969: LGTM: Well-defined Retention, RelabelAction, and CollectionProfile types.

The retention bounds are reasonable (up to 1 year, 16 TiB storage), and the relabel actions cover all standard Prometheus operations. The MinProperties=1 on Retention ensures at least one policy is configured.

Comment on lines +260 to 266
// volumeClaimTemplate defines persistent storage for Alertmanager. Use this setting to
// configure the persistent volume claim, including storage class and volume size.
// If omitted, the Pod uses ephemeral storage and alert data will not persist
// across restarts.
// This field is optional.
// +optional
VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty"`
VolumeClaimTemplate VolumeClaimConfig `json:"volumeClaimTemplate,omitempty,omitzero"`
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Breaking API change requires version bump to v1alpha2.

This change modifies the VolumeClaimTemplate field from *v1.PersistentVolumeClaim to VolumeClaimConfig, which is a serialization-breaking change. As noted in the PR comments by @everettraven: if the v1alpha1 API has already shipped, older clients that created valid PersistentVolumeClaim objects will fail to deserialize them against this updated schema.

The correct approach is to bump the API version from v1alpha1 to v1alpha2 for such type changes. This ensures backward compatibility and prevents runtime failures for existing clients.

Based on PR comments.

Comment on lines +286 to 300
// LogLevel defines the verbosity of logs emitted by Alertmanager.
// Valid values are Error, Warn, Info and Debug.
// +kubebuilder:validation:Enum=Error;Warn;Info;Debug
type LogLevel string

const (
// Error only errors will be logged.
// LogLevelError only errors will be logged.
LogLevelError LogLevel = "Error"
// Warn, both warnings and errors will be logged.
// LogLevelWarn, both warnings and errors will be logged.
LogLevelWarn LogLevel = "Warn"
// Info, general information, warnings, and errors will all be logged.
// LogLevelInfo, general information, warnings, and errors will all be logged.
LogLevelInfo LogLevel = "Info"
// Debug, detailed debugging information will be logged.
// LogLevelDebug, detailed debugging information will be logged.
LogLevelDebug LogLevel = "Debug"
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix grammatical inconsistency in LogLevel constant comments.

The comments for LogLevelWarn, LogLevelInfo, and LogLevelDebug (lines 294, 296, 298) begin with leading commas, creating incomplete sentences:

  • Line 294: "LogLevelWarn, both warnings..."
  • Line 296: "LogLevelInfo, general information..."
  • Line 298: "LogLevelDebug, detailed debugging..."

In contrast, LogLevelError (line 292) correctly forms a complete sentence. For consistency and proper godoc formatting, remove the leading commas.

📝 Suggested fix
-	// LogLevelWarn, both warnings and errors will be logged.
+	// LogLevelWarn means both warnings and errors will be logged.
 	LogLevelWarn LogLevel = "Warn"
-	// LogLevelInfo, general information, warnings, and errors will all be logged.
+	// LogLevelInfo means general information, warnings, and errors will all be logged.
 	LogLevelInfo LogLevel = "Info"
-	// LogLevelDebug, detailed debugging information will be logged.
+	// LogLevelDebug means detailed debugging information will be logged.
 	LogLevelDebug LogLevel = "Debug"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// LogLevel defines the verbosity of logs emitted by Alertmanager.
// Valid values are Error, Warn, Info and Debug.
// +kubebuilder:validation:Enum=Error;Warn;Info;Debug
type LogLevel string
const (
// Error only errors will be logged.
// LogLevelError only errors will be logged.
LogLevelError LogLevel = "Error"
// Warn, both warnings and errors will be logged.
// LogLevelWarn, both warnings and errors will be logged.
LogLevelWarn LogLevel = "Warn"
// Info, general information, warnings, and errors will all be logged.
// LogLevelInfo, general information, warnings, and errors will all be logged.
LogLevelInfo LogLevel = "Info"
// Debug, detailed debugging information will be logged.
// LogLevelDebug, detailed debugging information will be logged.
LogLevelDebug LogLevel = "Debug"
)
// LogLevel defines the verbosity of logs emitted by Alertmanager.
// Valid values are Error, Warn, Info and Debug.
// +kubebuilder:validation:Enum=Error;Warn;Info;Debug
type LogLevel string
const (
// LogLevelError only errors will be logged.
LogLevelError LogLevel = "Error"
// LogLevelWarn means both warnings and errors will be logged.
LogLevelWarn LogLevel = "Warn"
// LogLevelInfo means general information, warnings, and errors will all be logged.
LogLevelInfo LogLevel = "Info"
// LogLevelDebug means detailed debugging information will be logged.
LogLevelDebug LogLevel = "Debug"
)
🤖 Prompt for AI Agents
In @config/v1alpha1/types_cluster_monitoring.go around lines 286 - 300, The
comments for the LogLevel constants are grammatically inconsistent; update the
doc comments for LogLevelWarn, LogLevelInfo, and LogLevelDebug (the constant
declarations with names LogLevelWarn, LogLevelInfo, LogLevelDebug) to remove the
leading commas and make them full sentences matching LogLevelError (e.g.,
"LogLevelWarn logs warnings and errors.", "LogLevelInfo logs general
information, warnings, and errors.", "LogLevelDebug logs detailed debugging
information.").

Comment on lines 1482 to 1486
x-kubernetes-validations:
- message: targetLabel is required when action is Replace
or HashMod
rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
&& self.targetLabel != ” : true'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix malformed validation rule for targetLabel.

The validation rule at line 1486 appears to have a syntax error. The rule string contains an unmatched quote and incorrect syntax:

rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
  && self.targetLabel != " : true'

The rule should properly check that targetLabel is non-empty when the action is "Replace" or "HashMod". The current rule has:

  • An unclosed string literal (!= ")
  • Missing closing parenthesis before the ternary operator's false branch
🔧 Proposed fix
                            x-kubernetes-validations:
                            - message: targetLabel is required when action is Replace
                                or HashMod
-                             rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
-                               && self.targetLabel != " : true'
+                             rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel)
+                               && self.targetLabel != "") : true'
🤖 Prompt for AI Agents
In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
around lines 1482 - 1486, The x-kubernetes-validations rule string is malformed;
replace the current rule for targetLabel under x-kubernetes-validations with a
properly quoted expression that checks when self.action is 'Replace' or
'HashMod' then has(self.targetLabel) and self.targetLabel != "" otherwise true;
ensure the inner list uses doubled single-quotes for YAML single-quoted string
(e.g., ''Replace'', ''HashMod''), the comparison uses != "" (empty double
quotes) and the ternary is wrapped correctly (self.action in [...] ?
(has(self.targetLabel) && self.targetLabel != "") : true) so the string has
matching quotes and parentheses.

Comment on lines 1483 to 1487
- message: targetLabel is required when action is Replace
or HashMod
rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
&& self.targetLabel != ” : true'
maxItems: 10
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

CRITICAL: invalid CEL due to non-ASCII quote in relabel validation

Same issue as the CustomNoUpgrade CRD: self.targetLabel != ” should be a normal empty-string check (and must not use Unicode quotes).

Proposed fix
                           x-kubernetes-validations:
                           - message: targetLabel is required when action is Replace
                               or HashMod
                             rule: 'self.action in [''Replace'', ''HashMod''] ? has(self.targetLabel)
-                              && self.targetLabel != ” : true'
+                              && self.targetLabel != "" : true'
🤖 Prompt for AI Agents
In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
around lines 1483 - 1487, The CEL expression in the relabel validation rule uses
a non-ASCII closing quote (self.targetLabel != ”) which makes the CEL invalid;
update the rule string to use an ASCII empty-string check and proper
quoting/escaping so the condition reads self.action in ['Replace','HashMod'] ?
has(self.targetLabel) && self.targetLabel != '' : true (i.e., replace the
Unicode quote with two ASCII single quotes inside the YAML single-quoted rule
value), ensuring the surrounding YAML quoting/escaping stays correct for the
rule field.

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @config/v1alpha1/types_cluster_monitoring.go:
- Around line 728-730: The CEL XValidation on RelabelConfig is malformed; update
the kubebuilder tag on the RelabelConfig type to replace the broken rule that
contains an unclosed string with a correct expression equivalent to the one used
in monitoring/v1/types.go — e.g. use "(self.action != 'Replace' && self.action
!= 'HashMod') || has(self.targetLabel)" — so the rule requires targetLabel when
action is Replace or HashMod and keeps the existing message text intact.

In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml:
- Around line 1483-1486: The CEL validation for writeRelabelConfigs' targetLabel
rule contains a curly quote causing invalid syntax; update the rule expression
to use straight double quotes for the empty string comparison so it reads
self.action in ['Replace','HashMod'] ? (has(self.targetLabel) &&
self.targetLabel != "") : true, ensuring all quotes in the expression are plain
ASCII quotes and the conditional remains unchanged.
🧹 Nitpick comments (3)
config/v1alpha1/types_cluster_monitoring.go (3)

894-912: Inconsistency: +required with omitempty on SizeInGiB.

SizeInGiB is marked +required but has omitempty in the JSON tag. While the Minimum=1 validation prevents zero values at the API level, the omitempty tag will omit the field when serializing a zero value, which could cause confusion during debugging or when inspecting resources.

For required fields in Kubernetes APIs, it's conventional to omit the omitempty tag to ensure the field is always present in serialized output.

♻️ Suggested fix
-	SizeInGiB int32 `json:"sizeInGiB,omitempty"`
+	SizeInGiB int32 `json:"sizeInGiB"`

611-658: Inconsistency: Required fields with omitempty tags.

The Name (line 622) and StaticConfigs (line 658) fields are marked +required but have omitempty in their JSON tags. This pattern is inconsistent across multiple required fields in this file.

For required fields, consider removing omitempty to ensure they're always serialized:

♻️ Suggested fixes
-	Name string `json:"name,omitempty"`
+	Name string `json:"name"`
-	StaticConfigs []string `json:"staticConfigs,omitempty"`
+	StaticConfigs []string `json:"staticConfigs"`

673-688: Consider removing omitempty from required Label fields.

Both Key and Value are marked +required but have omitempty. This is the same pattern noted elsewhere.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between ff7d44e and 03d3a6f.

⛔ Files ignored due to path filters (4)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (4)
  • config/v1alpha1/types_cluster_monitoring.go
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
🧰 Additional context used
🧬 Code graph analysis (1)
config/v1alpha1/types_cluster_monitoring.go (3)
config/v1alpha1/zz_generated.swagger_doc_generated.go (8)
  • PrometheusConfig (273-275)
  • VolumeClaimConfig (352-354)
  • AdditionalAlertmanagerConfig (132-134)
  • ContainerResource (228-230)
  • Retention (310-312)
  • AuthorizationConfig (176-178)
  • TLSConfig (333-335)
  • SecretKeySelector (320-322)
operator/v1/types.go (1)
  • LogLevel (94-94)
console/v1/types_console_plugin.go (1)
  • AuthorizationType (282-282)
🔇 Additional comments (11)
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml (1)

453-485: This API is being introduced for the first time, not modified.

The ClusterMonitoring API v1alpha1 is a new addition (all files are marked as "A" added, not modified). The volumeClaimTemplate structure with sizeInGiB and storageClassName is the initial design, not a breaking change from a prior version.

Additionally, the API explicitly declares Compatibility Level 4, which states: "No compatibility is provided, the API can change at any point for any reason. These capabilities should not be used by applications needing long term support." This level permits breaking changes by design.

Likely an incorrect or invalid review comment.

payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml (2)

894-1881: Well-structured prometheusConfig API.

The new prometheusConfig section is comprehensive and follows Kubernetes API conventions with:

  • Proper list types (x-kubernetes-list-type: map with appropriate keys)
  • Thorough field validations (DNS1123, URL validation, range constraints)
  • Clear documentation for each field
  • Sensible defaults and bounds

453-484: Review comment contains incorrect premise about API structure changes.

The concern about a breaking change from corev1.PersistentVolumeClaim to VolumeClaimConfig does not apply here. v1alpha1 already defines VolumeClaimTemplate using the custom VolumeClaimConfig struct (with storageClassName and sizeInGiB fields), not corev1.PersistentVolumeClaim. Additionally, there is no evidence that v1alpha1 has been released—the repository has no release tags and the git history shows only recent commits from PR #2643.

Likely an incorrect or invalid review comment.

config/v1alpha1/types_cluster_monitoring.go (8)

92-104: LGTM - PrometheusConfig field addition.

The new prometheusConfig field is well-documented with comprehensive comments explaining its purpose and scope. The omitempty,omitzero tags correctly handle optional struct fields.


873-892: LGTM - SecretKeySelector structure.

The SecretKeySelector struct is well-designed with:

  • +structType=atomic for proper merge semantics
  • Appropriate validation rules for secret names and keys
  • Clear documentation

The +required / omitempty pattern is consistent with other structs (noted separately).


798-831: LGTM - TLSConfig with proper mTLS validation.

Good design choices:

  • The mTLS validation ensures cert and key are specified together or both omitted
  • certificateVerification defaults to Verify (secure by default)
  • MinProperties=1 ensures at least one TLS option is configured when the struct is present

855-871: LGTM - AuthorizationConfig union pattern.

The union discriminator pattern is correctly implemented with appropriate CEL validation ensuring bearerToken is required when type is BearerToken and forbidden otherwise.


914-936: LGTM - Retention configuration.

Well-designed with:

  • MinProperties=1 ensuring meaningful configuration when set
  • Sensible bounds (1-365 days for duration, 1-16384 GiB for size)
  • Clear documentation of defaults

430-597: Comprehensive PrometheusConfig structure.

The PrometheusConfig struct is well-designed with:

  • Clear field groupings (scheduling, resources, retention, integrations)
  • Proper list types (+listType=map with keys, +listType=atomic where appropriate)
  • Comprehensive validation markers
  • Thorough documentation including defaults

938-969: LGTM - RelabelAction and CollectionProfile enums.

The enum types and constants are properly defined with:

  • Appropriate +kubebuilder:validation:Enum markers
  • Clear documentation for each value
  • Consistent naming conventions

260-266: Confirm API versioning strategy for VolumeClaimTemplate change.

The VolumeClaimTemplate field type changed from what was previously *v1.PersistentVolumeClaim to VolumeClaimConfig. As noted in the CRD review and PR discussion with @everettraven, this is a breaking serialization change.

If this API has been shipped in any release, clients using the old schema will fail. Please confirm the versioning strategy.

Comment on lines 1483 to 1486
- message: targetLabel is required when action is Replace
or HashMod
rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel)
&& self.targetLabel != ”) : true'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix invalid CEL validation (curly quote) in writeRelabelConfigs targetLabel rule.

self.targetLabel != ” (curly quote) is not valid CEL and is very likely the root cause for verify-crd-schema / verify-crdify failures.

Proposed fix
-                          - message: targetLabel is required when action is Replace
-                              or HashMod
-                            rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel)
-                              && self.targetLabel != ”) : true'
+                          - message: targetLabel is required when action is Replace or HashMod
+                            rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel) && self.targetLabel != "") : true'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- message: targetLabel is required when action is Replace
or HashMod
rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel)
&& self.targetLabel != ”) : true'
- message: targetLabel is required when action is Replace or HashMod
rule: 'self.action in [''Replace'', ''HashMod''] ? (has(self.targetLabel) && self.targetLabel != "") : true'
🤖 Prompt for AI Agents
In
@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
around lines 1483 - 1486, The CEL validation for writeRelabelConfigs'
targetLabel rule contains a curly quote causing invalid syntax; update the rule
expression to use straight double quotes for the empty string comparison so it
reads self.action in ['Replace','HashMod'] ? (has(self.targetLabel) &&
self.targetLabel != "") : true, ensuring all quotes in the expression are plain
ASCII quotes and the conditional remains unchanged.

// +kubebuilder:validation:MaxLength=255
// +kubebuilder:validation:XValidation:rule="self.startsWith('/') || !self.contains('/')",message="must be an absolute path starting with '/' or a simple filename without '/'"
// +kubebuilder:validation:XValidation:rule="!self.startsWith('/dev/') || self in ['/dev/stdout', '/dev/stderr', '/dev/null']",message="only /dev/stdout, /dev/stderr, and /dev/null are allowed as /dev/ paths"
QueryLogFile string `json:"queryLogFile,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall if this has already been discussed, but would it make sense to add some lightweight set of valid filename characters via a regular expression and that it follows the correct format?

For example, I think that with the current rules something like ///////////// would be considered a valid input but doesn't sound like it would actually be valid?

// +listMapKey=name
// +kubebuilder:validation:MaxItems=10
// +kubebuilder:validation:MinItems=1
Resources []ContainerResource `json:"resources,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have minimums enforced for specific resources to ensure that the container has enough resources to minimally function?

// +kubebuilder:validation:MaxItems=10
// +kubebuilder:validation:items:MaxLength=255
// +kubebuilder:validation:items:XValidation:rule="isURL('http://' + self) && url('http://' + self).getHostname() != '' && url('http://' + self).getPort() != '' && int(url('http://' + self).getPort()) >= 1 && int(url('http://' + self).getPort()) <= 65535",message="must be a valid 'host:port' where host is a DNS name, IPv4, or IPv6 address (in brackets), and port is 1-65535"
// +listType=set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify that each entry must be unique in the GoDoc please.

// +kubebuilder:validation:MaxItems=10
// +listType=map
// +listMapKey=name
WriteRelabelConfigs []RelabelConfig `json:"writeRelabelConfigs,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'm asking to add a minimum length of 1 to this field if writeRelabelConfigs: [] is the same as omitting the field (which it reads that way based on the GoDoc).

// +kubebuilder:validation:MaxItems=10
// +kubebuilder:validation:items:MinLength=1
// +kubebuilder:validation:items:MaxLength=128
// +listType=set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specify uniqueness constraint in the GoDoc please.

// Allowed values are "Verify" (performs certificate verification, secure) and "SkipVerify" (skips verification, insecure).
// When omitted, defaults to "Verify" (secure certificate verification is performed).
// +optional
// +kubebuilder:default=Verify
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For configuration APIs, we try to avoid the use of this marker because this becomes a contractual default you can never change without it being a breaking change.

Instead, we try to have a controller responsible for recognizing the omission of this field and performing some default behavior.

We normally document that with the terminology:

When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
The default value is ...

SourceLabels []string `json:"sourceLabels,omitempty"`
// separator is the character sequence used to join source label values.
// Common examples: ";" (default), ",", "::", "|||".
// When omitted, defaults to ";" (semicolon).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably use our standard terminology here:

When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
The default value is ;.

Separator string `json:"separator,omitempty"`
// regex is the regular expression to match against the concatenated source label values.
// Must be a valid RE2 regular expression (https://github.com/google/re2/wiki/Syntax).
// When omitted, defaults to "(.*)" (matches everything).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably use our standard terminology:

When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
The default value is (.*) to match everything.

TargetLabel string `json:"targetLabel,omitempty"`
// replacement is the value against which a regex replace is performed if the
// regular expression matches. Regex capture groups are available (e.g., $1, $2).
// When omitted, defaults to "$1" (the first capture group).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should stay consistent with our standard terminology:

When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
The default value is $1 (the first capture group).

type Retention struct {
// durationInDays specifies how many days Prometheus will retain metrics data.
// Prometheus automatically deletes data older than this duration.
// When omitted, the default is 15 days.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Standard terminology for consistency:

When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
The default value is 15.

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In
`@config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml`:
- Around line 714-733: Add tests that assert rejection of empty arrays for the
remaining prometheusConfig list fields that have minItems:1:
prometheusConfig.additionalAlertmanagerConfigs, prometheusConfig.remoteWrite,
prometheusConfig.externalLabels, and prometheusConfig.resources. For each,
create a test case similar to the existing ones ("Should reject
PrometheusK8sConfig with empty ... array") using an initial manifest that sets
the specific field to [] under spec.prometheusConfig and an expectedError string
matching the pattern used (e.g. 'spec.prometheusConfig.<field>: Invalid value:
0: spec.prometheusConfig.<field> in body should have at least 1 items') so they
mirror the tolerations/topologySpreadConstraints tests.

In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 734-737: Fix the malformed CEL string and align the XValidation
action list with the actual RelabelAction enum used by RelabelConfig: replace
the unclosed quote `self.targetLabel != ”` with a proper empty-string literal
`self.targetLabel != ''`, and update the first rule's action set to only
reference existing enum values (e.g. use
['Replace','HashMod','LabelMap','Keep','Drop','LabelDrop','LabelKeep'] or
whichever subset requires targetLabel), or alternatively add the missing enum
members (Lowercase, Uppercase, KeepEqual, DropEqual) to the RelabelAction enum
so the rule and enum are consistent.

In `@config/v1alpha1/zz_generated.deepcopy.go`:
- Around line 1059-1063: PrometheusConfig.VolumeClaimTemplate is still typed as
*v1.PersistentVolumeClaim; update the PrometheusConfig definition
(types_cluster_monitoring.go) to use the new VolumeClaimConfig type (same as
AlertmanagerCustomConfig.VolumeClaimTemplate uses), then re-run the deepcopy
generator to regenerate zz_generated.deepcopy.go so the DeepCopyInto/DeepCopy
methods use VolumeClaimConfig instead of *v1.PersistentVolumeClaim; ensure any
imports and references to VolumeClaimTemplate constructors or defaults are
adjusted to the new type.

In `@openapi/generated_openapi/zz_generated.openapi.go`:
- Around line 24043-24053: PrometheusConfig currently still references
k8s.io/api/core/v1.PersistentVolumeClaim for the volumeClaimTemplate; change the
PrometheusConfig struct to use your VolumeClaimConfig type (the same replacement
done for AlertmanagerConfig.volumeClaimTemplate) and remove the
PersistentVolumeClaim reference from the generated Dependencies list so it
references VolumeClaimConfig instead; after updating the source struct,
regenerate the openapi/zz_generated.openapi.go so the SchemaProps and
Dependencies reflect VolumeClaimConfig rather than
k8s.io/api/core/v1.PersistentVolumeClaim.

In `@openapi/openapi.json`:
- Around line 13168-13171: PrometheusConfig.volumeClaimTemplate currently $ref's
io.k8s.api.core.v1.PersistentVolumeClaim; change that $ref to the simplified
VolumeClaimConfig type used by AlertmanagerConfig (i.e. replace the reference
value "#/definitions/io.k8s.api.core.v1.PersistentVolumeClaim" with
"#/definitions/VolumeClaimConfig" for the PrometheusConfig.volumeClaimTemplate
entry) so Prometheus uses the compact sizeInGiB/storageClassName schema instead
of the full PVC definition.
♻️ Duplicate comments (7)
openapi/generated_openapi/zz_generated.openapi.go (5)

22146-22152: Enum constraint for scheme field is documented but not enforced in schema.

The description states allowed values are HTTP or HTTPS, but the generated schema lacks the Enum field. This means clients won't receive schema-level validation for invalid values.

This is a known issue with OpenAPI generation from kubebuilder markers - ensure the source type has +kubebuilder:validation:Enum=HTTP;HTTPS and regenerate.


22383-22389: type field missing Enum constraint for None and BearerToken.

The description documents valid values as "None" and "BearerToken", but the schema lacks the Enum field. Add +kubebuilder:validation:Enum=None;BearerToken to the source type field if not already present, then regenerate.


24119-24124: RelabelConfig action field missing Enum constraint.

The description documents valid actions (Replace, Keep, Drop, HashMod, LabelMap, LabelDrop, LabelKeep), but the schema lacks the corresponding Enum field. Add +kubebuilder:validation:Enum=Replace;Keep;Drop;HashMod;LabelMap;LabelDrop;LabelKeep to the source type and regenerate.


24199-24211: Retention fields durationInDays and sizeInGiB missing Min/Max constraints in schema.

Per descriptions:

  • durationInDays: Min 1, Max 365
  • sizeInGiB: Min 1, Max 16384

These bounds are documented but not enforced at the schema level. Ensure source has +kubebuilder:validation:Minimum and +kubebuilder:validation:Maximum markers and regenerate.


24463-24468: VolumeClaimConfig sizeInGiB missing Min/Max constraints.

Description states minimum 1 GiB and maximum 16384 GiB, but the schema lacks Minimum/Maximum properties. Ensure source has +kubebuilder:validation:Minimum=1 and +kubebuilder:validation:Maximum=16384 and regenerate.

config/v1alpha1/types_cluster_monitoring.go (2)

260-265: Breaking API change: VolumeClaimTemplate type modification.

This change from *v1.PersistentVolumeClaim to VolumeClaimConfig alters the serialization format. If v1alpha1 has already shipped, existing clients with valid PVC configurations will fail to deserialize.

As discussed in PR comments by @everettraven, the recommended approach is to bump to v1alpha2 for this type change, or revert to *v1.PersistentVolumeClaim for the v1alpha1 API.


292-299: Grammatical inconsistency in LogLevel constant comments.

The comments for LogLevelWarn, LogLevelInfo, and LogLevelDebug have leading commas creating incomplete sentences, while LogLevelError is correctly formatted.

📝 Suggested fix
-	// LogLevelError only errors will be logged.
+	// LogLevelError means only errors will be logged.
 	LogLevelError LogLevel = "Error"
-	// LogLevelWarn, both warnings and errors will be logged.
+	// LogLevelWarn means both warnings and errors will be logged.
 	LogLevelWarn LogLevel = "Warn"
-	// LogLevelInfo, general information, warnings, and errors will all be logged.
+	// LogLevelInfo means general information, warnings, and errors will all be logged.
 	LogLevelInfo LogLevel = "Info"
-	// LogLevelDebug, detailed debugging information will be logged.
+	// LogLevelDebug means detailed debugging information will be logged.
 	LogLevelDebug LogLevel = "Debug"
🧹 Nitpick comments (5)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)

392-420: Inconsistent YAML string quoting.

The resource values (100m, 500m, 128Mi, 512Mi) and names (cpu, memory) are unquoted here, but the analogous MetricsServerConfig tests (lines 232-258) use quoted strings. While YAML will parse both correctly, consistent quoting improves readability.

Suggested fix for consistent quoting
     - name: Should accept PrometheusK8sConfig with valid resources
       initial: |
         apiVersion: config.openshift.io/v1alpha1
         kind: ClusterMonitoring
         spec:
           userDefined:
             mode: "Disabled"
           prometheusConfig:
             resources:
-              - name: cpu
-                request: 100m
-                limit: 500m
-              - name: memory
-                request: 128Mi
-                limit: 512Mi
+              - name: "cpu"
+                request: "100m"
+                limit: "500m"
+              - name: "memory"
+                request: "128Mi"
+                limit: "512Mi"
       expected: |
         apiVersion: config.openshift.io/v1alpha1
         kind: ClusterMonitoring
         spec:
           userDefined:
             mode: "Disabled"
           prometheusConfig:
             resources:
-              - name: cpu
-                request: 100m
-                limit: 500m
-              - name: memory
-                request: 128Mi
-                limit: 512Mi
+              - name: "cpu"
+                request: "100m"
+                limit: "500m"
+              - name: "memory"
+                request: "128Mi"
+                limit: "512Mi"
openapi/generated_openapi/zz_generated.openapi.go (2)

23461-23474: Label key and value fields lack length constraints in schema.

Both fields' descriptions document 1-128 character limits, but the generated schema doesn't include MinLength/MaxLength properties. Ensure the source struct has appropriate kubebuilder markers (+kubebuilder:validation:MinLength=1, +kubebuilder:validation:MaxLength=128) and regenerate.


24411-24416: TLSConfig certificateVerification field lacks Enum constraint.

Description documents allowed values Verify and SkipVerify, but the schema doesn't include an Enum field. Add +kubebuilder:validation:Enum=Verify;SkipVerify to the source type and regenerate.

config/v1alpha1/types_cluster_monitoring.go (2)

644-650: Consider avoiding kubebuilder:default for configuration APIs.

Line 648 uses +kubebuilder:default=HTTP which creates a contractual default that cannot be changed without a breaking change. Per OpenShift API conventions, configuration APIs should allow the controller to apply defaults so they can evolve over time.

Consider updating the documentation to use the standard terminology:

 	// scheme defines the URL scheme to use when communicating with Alertmanager
 	// instances.
 	// Possible values are `HTTP` or `HTTPS`.
-	// When omitted, defaults to `HTTP`.
-	// +kubebuilder:default=HTTP
+	// When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
+	// The current default value is `HTTP`.
 	// +optional

787-795: Minor documentation clarification.

The replacement field documentation (lines 787-791) only mentions the Replace action, but the XValidation rule (line 736) also allows replacement with LabelMap action.

Consider updating the documentation:

-	// replacement value against which a Replace action is performed if the
+	// replacement value used for Replace or LabelMap actions when the
 	// regular expression matches. Regex capture groups are available (e.g., $1, $2).
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 03d3a6f and 10e16f8.

⛔ Files ignored due to path filters (4)
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (9)
  • config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
  • config/v1alpha1/types_cluster_monitoring.go
  • config/v1alpha1/zz_generated.deepcopy.go
  • config/v1alpha1/zz_generated.swagger_doc_generated.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_10_config-operator_01_clustermonitorings-TechPreviewNoUpgrade.crd.yaml
🧰 Additional context used
🧬 Code graph analysis (2)
config/v1alpha1/types_cluster_monitoring.go (1)
operator/v1/types.go (1)
  • LogLevel (94-94)
openapi/generated_openapi/zz_generated.openapi.go (1)
config/v1/types_feature.go (1)
  • Default (41-41)
🔇 Additional comments (34)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (4)

354-391: LGTM!

These acceptance tests for minimal PrometheusK8sConfig and valid nodeSelector follow the established patterns and correctly verify the expected behavior.


421-565: LGTM!

Comprehensive acceptance tests for tolerations, topologySpreadConstraints, additionalAlertmanagerConfigs, remoteWrite, and externalLabels. The test structures are consistent and correctly verify expected behavior.


566-776: LGTM!

The rejection tests thoroughly cover:

  • Empty object validation (minProperties: 1)
  • Maximum item constraints for all list fields (consistently testing 11 items against max of 10)
  • Empty array validation for tolerations and topologySpreadConstraints

Error messages correctly align with CRD structural validation constraints.


777-931: Well-designed queryLogFile validation tests.

Excellent coverage of path validation scenarios including:

  • Valid cases: simple filenames, absolute paths, allowed /dev/ paths
  • Security-focused rejections: path traversal (..), disallowed device paths, consecutive/trailing slashes

The validation rules appropriately prevent common path injection vectors while allowing legitimate use cases.

openapi/openapi.json (9)

12049-12094: LGTM!

The AdditionalAlertmanagerConfig type is well-structured with clear validation constraints documented in the descriptions. The use of x-kubernetes-list-type: set for staticConfigs ensures uniqueness, and the timeout range (1-600 seconds) is reasonable.


12172-12176: LGTM - Verify API versioning implications.

The change to use VolumeClaimConfig instead of PersistentVolumeClaim aligns with the PR objective to enable stricter admission-time validation. However, as noted in the PR comments by everettraven, if v1alpha1 has already shipped, this serialized type change could break existing clients. Verify whether a version bump to v1alpha2 is needed.


12192-12217: LGTM!

The AuthorizationConfig type correctly implements the Kubernetes discriminated union pattern with x-kubernetes-unions. The conditional requirement for bearerToken based on the type value is properly documented.


12472-12476: LGTM!

The prometheusConfig field addition is well-documented with a comprehensive description of configurable aspects including pod scheduling, resource allocation, retention policies, and external integrations.


12824-12841: LGTM!

The Label type is a straightforward key/value struct with appropriate length constraints (1-128 characters) and UTF-8 support that aligns with Prometheus's label name conventions.


13174-13266: LGTM!

The RelabelConfig, RemoteWriteSpec, and Retention type definitions are well-structured with:

  • Clear validation constraints documented in descriptions
  • Appropriate list types (set for unique items, map for keyed entries)
  • Reasonable bounds (regex up to 1000 chars, retention up to 365 days/16 TiB)
  • Comprehensive action documentation for relabeling rules

13332-13350: LGTM!

The SecretKeySelector type follows the standard Kubernetes pattern for referencing secret data. The x-kubernetes-map-type: atomic annotation and namespace constraint (openshift-monitoring) are appropriate.


13369-13397: LGTM with a note on validation.

The TLSConfig type is well-structured for TLS configuration. The description states "At least one TLS configuration option must be specified" and "both cert and key must be specified together, or both omitted" - ensure these constraints are enforced via CEL validation rules or admission webhooks, as the OpenAPI schema itself allows all fields to be omitted.


13416-13433: LGTM!

The VolumeClaimConfig type is a clean, simplified alternative to PersistentVolumeClaim, exposing only sizeInGiB (required) and storageClassName (optional). This aligns with the PR objective to enable stricter admission-time validation.

config/v1alpha1/zz_generated.swagger_doc_generated.go (1)

1-564: Auto-generated swagger documentation looks correct.

This file is auto-generated using hack/update-swagger-docs.sh. The swagger documentation for the new types (AdditionalAlertmanagerConfig, AuthorizationConfig, Label, PrometheusConfig, RelabelConfig, RemoteWriteSpec, Retention, SecretKeySelector, TLSConfig, VolumeClaimConfig) is comprehensive and follows consistent patterns with existing types in the file.

config/v1alpha1/zz_generated.deepcopy.go (2)

14-35: LGTM: New deepcopy implementations follow correct patterns.

The AdditionalAlertmanagerConfig.DeepCopyInto correctly handles:

  • Direct value copy for Authorization and TLSConfig (which contain only string/value fields via SecretKeySelector)
  • Proper slice deep copy for StaticConfigs

1272-1291: LGTM: VolumeClaimConfig.DeepCopyInto correctly handles the optional pointer field.

The implementation properly nil-checks and deep copies the StorageClassName pointer field.

openapi/generated_openapi/zz_generated.openapi.go (5)

445-449: LGTM - Schema function registrations follow established pattern.

The new v1alpha1 type registrations for AdditionalAlertmanagerConfig and AuthorizationConfig are properly added to the OpenAPI definitions map.


477-498: LGTM - Type registrations are properly structured.

New v1alpha1 types (Label, PrometheusConfig, RelabelConfig, RemoteWriteSpec, Retention, SecretKeySelector, TLSConfig, VolumeClaimConfig) are correctly registered following the established pattern.


22340-22351: LGTM - AlertmanagerConfig correctly uses VolumeClaimConfig.

The volumeClaimTemplate field properly references the new VolumeClaimConfig type with dependencies correctly updated. This aligns with the PR objective to replace the full PVC schema with a focused config struct.


22853-22871: LGTM - PrometheusConfig integration into ClusterMonitoringSpec.

The prometheusConfig field is properly added with a comprehensive description and correct type reference. Dependencies are updated accordingly.


24313-24344: LGTM - SecretKeySelector schema with atomic map type.

The x-kubernetes-map-type: atomic extension is correctly applied, ensuring proper server-side apply merge behavior for this type.

config/v1alpha1/types_cluster_monitoring.go (13)

92-104: LGTM!

The prometheusConfig field is well-documented with clear explanations of its purpose and configurable aspects. The optional semantics with omitempty,omitzero are appropriate.


695-732: LGTM!

The RemoteWriteSpec has comprehensive validation including URL scheme enforcement, proper bounds on timeout, and list constraints with uniqueness via listMapKey=name.


809-842: LGTM!

The TLSConfig struct properly enforces mutual TLS requirements with the XValidation rule ensuring cert and key are both present or both absent. The MinProperties=1 constraint ensures meaningful configuration.


866-882: LGTM!

The AuthorizationConfig correctly implements the union pattern with proper discriminator validation. The XValidation rule ensures bearerToken is required only when type is BearerToken.


884-903: LGTM!

The SecretKeySelector provides proper validation for secret references with DNS subdomain validation on names and character restrictions on keys. The +structType=atomic ensures the struct is treated as an indivisible unit.


905-923: LGTM!

The VolumeClaimConfig provides a simplified, focused API surface for storage configuration with appropriate bounds (1-16384 GiB) and DNS subdomain validation for storage class names.


925-948: LGTM!

The Retention struct properly constrains retention duration (1-365 days) and storage size. The MinProperties=1 validation ensures at least one retention limit is specified when the struct is used.


950-981: LGTM!

The RelabelAction and CollectionProfile enums are well-documented with clear descriptions for each value.

Note: The RelabelAction enum should be reconciled with the XValidation rule in RelabelConfig (line 735), which references actions not present in this enum.


500-520: LGTM!

The queryLogFile validation rules are comprehensive, covering:

  • Character restrictions (alphanumeric, ., _, -, /)
  • Path type enforcement (absolute or simple filename)
  • Explicit allowlist for /dev/ paths
  • Path traversal prevention

Edge cases like /// are correctly rejected by the !self.contains('//') rule.


746-770: LGTM!

The sourceLabels field has proper uniqueness constraint (+listType=set) documented in godoc. The separator field uses the standard OpenShift terminology for defaults.


678-693: LGTM!

The Label type correctly allows UTF-8 label names per Prometheus 2.x+ capabilities, with appropriate length constraints.


603-610: LGTM!

The AlertmanagerScheme enum is properly defined with HTTP and HTTPS values.


844-864: LGTM!

The CertificateVerificationType and AuthorizationType enums are well-defined with clear documentation distinguishing secure from insecure options.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +714 to +733
- name: Should reject PrometheusK8sConfig with empty tolerations array
initial: |
apiVersion: config.openshift.io/v1alpha1
kind: ClusterMonitoring
spec:
userDefined:
mode: "Disabled"
prometheusConfig:
tolerations: []
expectedError: 'spec.prometheusConfig.tolerations: Invalid value: 0: spec.prometheusConfig.tolerations in body should have at least 1 items'
- name: Should reject PrometheusK8sConfig with empty topologySpreadConstraints array
initial: |
apiVersion: config.openshift.io/v1alpha1
kind: ClusterMonitoring
spec:
userDefined:
mode: "Disabled"
prometheusConfig:
topologySpreadConstraints: []
expectedError: 'spec.prometheusConfig.topologySpreadConstraints: Invalid value: 0: spec.prometheusConfig.topologySpreadConstraints in body should have at least 1 items'
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find the CRD schema definition for ClusterMonitoring
find . -type f -name "*clustermonitoring*" -o -name "*monitoring*" | grep -E "\.(yaml|yml)$" | head -20

Repository: openshift/api

Length of output: 170


🏁 Script executed:

# Search for minItems in YAML files
rg -n "minItems" --type yaml --type yml | grep -i "prometheus\|config" | head -30

Repository: openshift/api

Length of output: 170


🏁 Script executed:

# Look for the actual CRD schema definition
find . -type f \( -name "*.yaml" -o -name "*.yml" \) | xargs grep -l "kind.*ClusterMonitoring" | grep -v test

Repository: openshift/api

Length of output: 836


🏁 Script executed:

# Examine the CRD manifest to find prometheusConfig schema
cat ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -A 500 "prometheusConfig"

Repository: openshift/api

Length of output: 32432


🏁 Script executed:

# Check the size of the CRD file first
wc -l ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 177


🏁 Script executed:

# Continue reading the CRD schema to find remoteWrite minItems, resources, tolerations, and topologySpreadConstraints
sed -n '1,2311p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -A 200 "remoteWrite:" | head -250

Repository: openshift/api

Length of output: 13067


🏁 Script executed:

# Also search for resources, tolerations, and topologySpreadConstraints in the prometheusConfig schema
rg -A 5 "resources:|tolerations:|topologySpreadConstraints:" ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 4967


🏁 Script executed:

# Search for tolerations and topologySpreadConstraints in prometheusConfig to confirm their minItems values
rg -B 5 -A 10 "tolerations:|topologySpreadConstraints:" ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -A 10 "prometheusConfig" | tail -100

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Find the exact minItems values for tolerations and topologySpreadConstraints in prometheusConfig
sed -n '1,2311p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -A 20 "tolerations:" | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Extract and summarize all minItems constraints for prometheusConfig fields
python3 << 'EOF'
import yaml
import sys

with open('./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml', 'r') as f:
    doc = yaml.safe_load(f)

# Navigate to prometheusConfig properties
spec = doc['spec']['names']['validation']['openAPIV3Schema']['properties']['spec']['properties']['prometheusConfig']['properties']

# Check which list fields have minItems
list_fields = {
    'additionalAlertmanagerConfigs': spec.get('additionalAlertmanagerConfigs', {}),
    'remoteWrite': spec.get('remoteWrite', {}),
    'externalLabels': spec.get('externalLabels', {}),
    'resources': spec.get('resources', {}),
    'tolerations': spec.get('tolerations', {}),
    'topologySpreadConstraints': spec.get('topologySpreadConstraints', {})
}

print("MinItems constraints for prometheusConfig list fields:")
for field, schema in list_fields.items():
    min_items = schema.get('minItems', 'Not set')
    max_items = schema.get('maxItems', 'Not set')
    has_type_array = schema.get('type') == 'array'
    print(f"  {field}: minItems={min_items}, maxItems={max_items}, is_array={has_type_array}")
EOF

Repository: openshift/api

Length of output: 152


🏁 Script executed:

# Extract minItems constraints from the prometheusConfig section directly
python3 << 'EOF'
import re

with open('./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml', 'r') as f:
    content = f.read()

# Find the prometheusConfig section
match = re.search(r'prometheusConfig:\s*(.*?)(?=\n\s{18}[a-zA-Z])', content, re.DOTALL)
if match:
    prom_config_section = match.group(0)
    
    # Find all field names and their minItems
    fields = re.findall(r'(\w+):\s*(?:.*?)minItems:\s*(\d+)', prom_config_section, re.DOTALL)
    
    print("Fields in prometheusConfig with minItems constraints:")
    for field, min_items in fields:
        print(f"  {field}: minItems={min_items}")
EOF

Repository: openshift/api

Length of output: 110


🏁 Script executed:

# More direct approach: search for each field with minItems context
for field in "additionalAlertmanagerConfigs" "remoteWrite" "externalLabels" "resources" "tolerations" "topologySpreadConstraints"; do
    echo "=== $field ==="
    rg -A 30 "^\s+$field:" ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems|description" | head -5
done

Repository: openshift/api

Length of output: 1610


🏁 Script executed:

# Direct search for minItems in the CRD file near each field
python3 << 'EOF'
with open('./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml', 'r') as f:
    lines = f.readlines()

# Find line numbers where prometheusConfig properties appear
fields_to_check = ['additionalAlertmanagerConfigs', 'remoteWrite', 'externalLabels', 'resources', 'tolerations', 'topologySpreadConstraints']

for i, line in enumerate(lines):
    for field in fields_to_check:
        if f'  {field}:' in line:
            # Look for minItems in the next 100 lines
            section = ''.join(lines[i:min(i+100, len(lines))])
            # Find the next field definition to know where this field ends
            next_field_idx = float('inf')
            for j in range(i+1, min(i+100, len(lines))):
                if lines[j].strip() and not lines[j].startswith(' ' * 20) and lines[j].startswith('  '):
                    next_field_idx = j - i
                    break
            
            section = ''.join(lines[i:i+next_field_idx])
            
            # Search for minItems in this field's section
            for line_offset, section_line in enumerate(section.split('\n')):
                if 'minItems:' in section_line:
                    min_val = section_line.strip().split('minItems:')[1].strip()
                    print(f"{field}: minItems={min_val}")
                    break
EOF

Repository: openshift/api

Length of output: 208


🏁 Script executed:

# Let's just extract the prometheusConfig section and grep more carefully
awk '/^\s+prometheusConfig:/,/^\s+[a-zA-Z]/' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "(^\s+(additionalAlertmanagerConfigs|remoteWrite|externalLabels|resources|tolerations|topologySpreadConstraints):)|(^\s+minItems:)" | head -20

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Search for minItems directly in context of prometheusConfig fields
rg "minItems:" ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml -B 50 | grep -E "(additionalAlertmanagerConfigs|remoteWrite|externalLabels|resources|tolerations|topologySpreadConstraints|minItems:)" | grep -A 1 -B 1 "additionalAlertmanagerConfigs\|remoteWrite\|externalLabels\|resources\|tolerations\|topologySpreadConstraints"

Repository: openshift/api

Length of output: 906


🏁 Script executed:

# Find line numbers for each field and look at the minItems nearby
grep -n "additionalAlertmanagerConfigs:\|remoteWrite:\|externalLabels:\|resources:\|tolerations:\|topologySpreadConstraints:" ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -v "description"

Repository: openshift/api

Length of output: 736


🏁 Script executed:

# For each field, extract the section and check for minItems
python3 << 'EOF'
with open('./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml', 'r') as f:
    content = f.read()

# Split by prometheusConfig and look for properties
if 'prometheusConfig:' in content:
    prom_idx = content.find('prometheusConfig:')
    # Find the properties section
    props_idx = content.find('properties:', prom_idx)
    if props_idx > 0:
        # Get next 30000 chars which should cover all prometheusConfig properties
        section = content[props_idx:props_idx+30000]
        
        # Find each field and its minItems
        fields = ['additionalAlertmanagerConfigs', 'remoteWrite', 'externalLabels', 'resources', 'tolerations', 'topologySpreadConstraints']
        
        for field in fields:
            field_pattern = f'{field}:'
            idx = section.find(field_pattern)
            if idx >= 0:
                # Look for minItems in next 2000 chars
                next_section = section[idx:idx+2000]
                lines = next_section.split('\n')
                min_items = None
                for line in lines[:50]:
                    if 'minItems:' in line:
                        min_items = line.strip()
                        break
                if min_items:
                    print(f"{field}: {min_items}")
                else:
                    print(f"{field}: no minItems found")
EOF

Repository: openshift/api

Length of output: 164


🏁 Script executed:

# Extract minItems for each prometheusConfig field at the specified line numbers
# Line 909: additionalAlertmanagerConfigs
sed -n '909,950p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Line 1346: remoteWrite
sed -n '1346,1400p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Line 1244: externalLabels
sed -n '1244,1280p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 121


🏁 Script executed:

# Line 1519: resources
sed -n '1519,1580p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Line 1626: tolerations (prometheusConfig)
sed -n '1626,1650p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Line 1676: topologySpreadConstraints (prometheusConfig)
sed -n '1676,1700p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "minItems|maxItems"

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Use awk to parse the YAML and find minItems for each field
awk '
/additionalAlertmanagerConfigs:/ {field="additionalAlertmanagerConfigs"; in_field=1}
/remoteWrite:/ {field="remoteWrite"; in_field=1}
/externalLabels:/ && !/description/ {field="externalLabels"; in_field=1}
/resources:/ && /prometheusConfig/ {field="resources"; in_field=1}
/tolerations:/ && !/description/ {field="tolerations"; in_field=1}
/topologySpreadConstraints:/ && !/description/ {field="topologySpreadConstraints"; in_field=1}

in_field && /minItems:/ {print field ": " $0; in_field=0}
in_field && /^\s+[a-zA-Z]+:/ && !/minItems/ && !/maxItems/ && !/description/ && !/properties/ && !/items/ && !/type/ {in_field=0}
' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Let's just look at larger sections around each field definition
echo "=== additionalAlertmanagerConfigs (line ~909) ===" && sed -n '905,1250p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -A 2 -B 2 "additionalAlertmanagerConfigs:" | head -20 && echo "...looking for minItems..." && sed -n '1230,1245p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep "minItems"

Repository: openshift/api

Length of output: 401


🏁 Script executed:

# Look at the full prometheusConfig section more systematically with context
sed -n '860,1750p' ./config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings-CustomNoUpgrade.crd.yaml | grep -E "^\s+(additionalAlertmanagerConfigs|remoteWrite|externalLabels|resources|tolerations|topologySpreadConstraints):" -A 325 | grep -E "^\s+(additionalAlertmanagerConfigs|remoteWrite|externalLabels|resources|tolerations|topologySpreadConstraints):|minItems:|maxItems:" | head -40

Repository: openshift/api

Length of output: 878


🏁 Script executed:

# Now check the test file to see which empty array tests exist
grep -n "Should reject.*empty.*array\|Should reject.*empty.*" ./config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml

Repository: openshift/api

Length of output: 296


🏁 Script executed:

# Count existing empty array test cases
grep -c "expectedError.*should have at least 1 items" ./config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml

Repository: openshift/api

Length of output: 58


🏁 Script executed:

# List which fields have rejection tests for empty arrays
rg -B 5 "should have at least 1 items" ./config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml | grep -E "(prometheusConfig\.|additionalAlertmanagerConfigs|remoteWrite|externalLabels|resources|tolerations|topologySpreadConstraints)" | head -20

Repository: openshift/api

Length of output: 448


Add empty array rejection tests for additional prometheusConfig list fields.

The CRD schema requires minItems: 1 for additionalAlertmanagerConfigs, remoteWrite, externalLabels, and resources under prometheusConfig, but empty array tests exist only for tolerations and topologySpreadConstraints. Add rejection tests for the missing fields to ensure consistent validation coverage.

🤖 Prompt for AI Agents
In
`@config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml`
around lines 714 - 733, Add tests that assert rejection of empty arrays for the
remaining prometheusConfig list fields that have minItems:1:
prometheusConfig.additionalAlertmanagerConfigs, prometheusConfig.remoteWrite,
prometheusConfig.externalLabels, and prometheusConfig.resources. For each,
create a test case similar to the existing ones ("Should reject
PrometheusK8sConfig with empty ... array") using an initial manifest that sets
the specific field to [] under spec.prometheusConfig and an expectedError string
matching the pattern used (e.g. 'spec.prometheusConfig.<field>: Invalid value:
0: spec.prometheusConfig.<field> in body should have at least 1 items') so they
mirror the tolerations/topologySpreadConstraints tests.

Comment on lines +595 to +601
// volumeClaimTemplate defines persistent storage for Prometheus. Use this setting to
// configure the persistent volume claim, including storage class and volume size.
// If omitted, the Pod uses ephemeral storage and Prometheus data will not persist
// across restarts.
// +optional
VolumeClaimTemplate *v1.PersistentVolumeClaim `json:"volumeClaimTemplate,omitempty,omitzero"`
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

API inconsistency: VolumeClaimTemplate type differs between configs.

PrometheusConfig.VolumeClaimTemplate uses *v1.PersistentVolumeClaim (line 600), while AlertmanagerCustomConfig.VolumeClaimTemplate uses VolumeClaimConfig (line 265). This creates an inconsistent API surface where users must use different schemas for the same conceptual configuration.

Consider aligning both to use the same type. If the custom VolumeClaimConfig approach is preferred for stricter validation, apply it consistently. If *v1.PersistentVolumeClaim is preferred for v1alpha1 compatibility, revert AlertmanagerCustomConfig as well.

Comment on lines +734 to +737
// RelabelConfig represents a relabeling rule.
// +kubebuilder:validation:XValidation:rule="self.action in ['Replace', 'HashMod', 'Lowercase', 'Uppercase', 'KeepEqual', 'DropEqual'] ? (has(self.targetLabel) && self.targetLabel != ”) : !has(self.targetLabel)",message="targetLabel is required when action is Replace, HashMod, Lowercase, Uppercase, KeepEqual or DropEqual, and forbidden otherwise"
// +kubebuilder:validation:XValidation:rule="self.action in ['Replace', 'LabelMap'] || !has(self.replacement)",message="replacement is only valid when action is Replace or LabelMap"
type RelabelConfig struct {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Malformed CEL rule and enum mismatch in RelabelConfig validation.

Two issues with the XValidation rule on line 735:

  1. Malformed CEL syntax: self.targetLabel != " has an unclosed string literal. Should use single quotes: self.targetLabel != ''

  2. Enum mismatch: The rule references actions Lowercase, Uppercase, KeepEqual, DropEqual that are not defined in the RelabelAction enum (lines 951-952). The enum only defines: Replace, Keep, Drop, HashMod, LabelMap, LabelDrop, LabelKeep.

Either add the missing actions to the enum or update the validation rule to match the defined actions.

🐛 Proposed fix (align rule with existing enum)
 // RelabelConfig represents a relabeling rule.
-// +kubebuilder:validation:XValidation:rule="self.action in ['Replace', 'HashMod', 'Lowercase', 'Uppercase', 'KeepEqual', 'DropEqual'] ? (has(self.targetLabel) && self.targetLabel != ") : !has(self.targetLabel)",message="targetLabel is required when action is Replace, HashMod, Lowercase, Uppercase, KeepEqual or DropEqual, and forbidden otherwise"
+// +kubebuilder:validation:XValidation:rule="self.action in ['Replace', 'HashMod'] ? (has(self.targetLabel) && self.targetLabel != '') : !has(self.targetLabel)",message="targetLabel is required when action is Replace or HashMod, and forbidden otherwise"
 // +kubebuilder:validation:XValidation:rule="self.action in ['Replace', 'LabelMap'] || !has(self.replacement)",message="replacement is only valid when action is Replace or LabelMap"
 type RelabelConfig struct {
🤖 Prompt for AI Agents
In `@config/v1alpha1/types_cluster_monitoring.go` around lines 734 - 737, Fix the
malformed CEL string and align the XValidation action list with the actual
RelabelAction enum used by RelabelConfig: replace the unclosed quote
`self.targetLabel != ”` with a proper empty-string literal `self.targetLabel !=
''`, and update the first rule's action set to only reference existing enum
values (e.g. use
['Replace','HashMod','LabelMap','Keep','Drop','LabelDrop','LabelKeep'] or
whichever subset requires targetLabel), or alternatively add the missing enum
members (Lowercase, Uppercase, KeepEqual, DropEqual) to the RelabelAction enum
so the rule and enum are consistent.

Comment on lines +1059 to +1063
if in.VolumeClaimTemplate != nil {
in, out := &in.VolumeClaimTemplate, &out.VolumeClaimTemplate
*out = new(v1.PersistentVolumeClaim)
(*in).DeepCopyInto(*out)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find source type files and check VolumeClaimTemplate definitions
echo "=== Looking for types files ==="
fd -e go -g '*types*.go' | head -20

echo ""
echo "=== Checking PrometheusConfig definition ==="
rg -n 'type PrometheusConfig struct' -A 30 --type=go

echo ""
echo "=== Checking AlertmanagerCustomConfig definition ==="
rg -n 'type AlertmanagerCustomConfig struct' -A 30 --type=go

echo ""
echo "=== Checking VolumeClaimConfig definition ==="
rg -n 'type VolumeClaimConfig struct' -A 20 --type=go

Repository: openshift/api

Length of output: 9831


🏁 Script executed:

#!/bin/bash
echo "=== Searching for VolumeClaimTemplate in PrometheusConfig ==="
rg -n 'VolumeClaimTemplate' config/v1alpha1/types_cluster_monitoring.go -B 2 -A 2

echo ""
echo "=== Full PrometheusConfig struct ==="
sed -n '434,600p' config/v1alpha1/types_cluster_monitoring.go | head -100

Repository: openshift/api

Length of output: 6976


*PrometheusConfig.VolumeClaimTemplate should use VolumeClaimConfig, not v1.PersistentVolumeClaim.

The PR changes volumeClaimTemplate from corev1.PersistentVolumeClaim to the new VolumeClaimConfig struct, but PrometheusConfig.VolumeClaimTemplate (line 600 in types_cluster_monitoring.go) still uses *v1.PersistentVolumeClaim, while AlertmanagerCustomConfig.VolumeClaimTemplate (line 265) correctly uses VolumeClaimConfig. Update the source type definition in types_cluster_monitoring.go line 600, then regenerate.

🤖 Prompt for AI Agents
In `@config/v1alpha1/zz_generated.deepcopy.go` around lines 1059 - 1063,
PrometheusConfig.VolumeClaimTemplate is still typed as
*v1.PersistentVolumeClaim; update the PrometheusConfig definition
(types_cluster_monitoring.go) to use the new VolumeClaimConfig type (same as
AlertmanagerCustomConfig.VolumeClaimTemplate uses), then re-run the deepcopy
generator to regenerate zz_generated.deepcopy.go so the DeepCopyInto/DeepCopy
methods use VolumeClaimConfig instead of *v1.PersistentVolumeClaim; ensure any
imports and references to VolumeClaimTemplate constructors or defaults are
adjusted to the new type.

Comment on lines +24043 to +24053
"volumeClaimTemplate": {
SchemaProps: spec.SchemaProps{
Description: "volumeClaimTemplate defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class and volume size. If omitted, the Pod uses ephemeral storage and Prometheus data will not persist across restarts.",
Ref: ref("k8s.io/api/core/v1.PersistentVolumeClaim"),
},
},
},
},
},
Dependencies: []string{
"github.com/openshift/api/config/v1alpha1.AdditionalAlertmanagerConfig", "github.com/openshift/api/config/v1alpha1.ContainerResource", "github.com/openshift/api/config/v1alpha1.Label", "github.com/openshift/api/config/v1alpha1.RemoteWriteSpec", "github.com/openshift/api/config/v1alpha1.Retention", "k8s.io/api/core/v1.PersistentVolumeClaim", "k8s.io/api/core/v1.Toleration", "k8s.io/api/core/v1.TopologySpreadConstraint"},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Inconsistency: PrometheusConfig.volumeClaimTemplate uses PersistentVolumeClaim instead of VolumeClaimConfig.

This contradicts the PR objective which states the volumeClaimTemplate field should be replaced with VolumeClaimConfig for both Alertmanager and Prometheus:

  • AlertmanagerConfig.volumeClaimTemplateVolumeClaimConfig ✓ (line 22344)
  • PrometheusConfig.volumeClaimTemplatek8s.io/api/core/v1.PersistentVolumeClaim ✗ (line 24046)

The Dependencies list at line 24053 confirms PersistentVolumeClaim is still referenced. Update the source PrometheusConfig struct to use VolumeClaimConfig and regenerate.

🤖 Prompt for AI Agents
In `@openapi/generated_openapi/zz_generated.openapi.go` around lines 24043 -
24053, PrometheusConfig currently still references
k8s.io/api/core/v1.PersistentVolumeClaim for the volumeClaimTemplate; change the
PrometheusConfig struct to use your VolumeClaimConfig type (the same replacement
done for AlertmanagerConfig.volumeClaimTemplate) and remove the
PersistentVolumeClaim reference from the generated Dependencies list so it
references VolumeClaimConfig instead; after updating the source struct,
regenerate the openapi/zz_generated.openapi.go so the SchemaProps and
Dependencies reflect VolumeClaimConfig rather than
k8s.io/api/core/v1.PersistentVolumeClaim.

Comment on lines +13168 to +13171
"volumeClaimTemplate": {
"description": "volumeClaimTemplate defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class and volume size. If omitted, the Pod uses ephemeral storage and Prometheus data will not persist across restarts.",
"$ref": "#/definitions/io.k8s.api.core.v1.PersistentVolumeClaim"
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Inconsistent volumeClaimTemplate type - should use VolumeClaimConfig.

The PR objective states that volumeClaimTemplate should be replaced with VolumeClaimConfig for both Alertmanager and Prometheus. AlertmanagerConfig.volumeClaimTemplate (line 12175) correctly references VolumeClaimConfig, but PrometheusConfig.volumeClaimTemplate still references io.k8s.api.core.v1.PersistentVolumeClaim.

This inconsistency means Prometheus storage configuration would still expose ~40 PVC fields instead of the simplified sizeInGiB and storageClassName fields.

🐛 Proposed fix
         "volumeClaimTemplate": {
           "description": "volumeClaimTemplate defines persistent storage for Prometheus. Use this setting to configure the persistent volume claim, including storage class and volume size. If omitted, the Pod uses ephemeral storage and Prometheus data will not persist across restarts.",
-          "$ref": "#/definitions/io.k8s.api.core.v1.PersistentVolumeClaim"
+          "default": {},
+          "$ref": "#/definitions/com.github.openshift.api.config.v1alpha1.VolumeClaimConfig"
         }
🤖 Prompt for AI Agents
In `@openapi/openapi.json` around lines 13168 - 13171,
PrometheusConfig.volumeClaimTemplate currently $ref's
io.k8s.api.core.v1.PersistentVolumeClaim; change that $ref to the simplified
VolumeClaimConfig type used by AlertmanagerConfig (i.e. replace the reference
value "#/definitions/io.k8s.api.core.v1.PersistentVolumeClaim" with
"#/definitions/VolumeClaimConfig" for the PrometheusConfig.volumeClaimTemplate
entry) so Prometheus uses the compact sizeInGiB/storageClassName schema instead
of the full PVC definition.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 14, 2026

@danielmellado: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/integration 10e16f8 link true /test integration
ci/prow/verify-crdify 10e16f8 link true /test verify-crdify
ci/prow/verify-crd-schema 10e16f8 link true /test verify-crd-schema

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants