Skip to content

feat(overrides): add CEL valueProgram support for dynamic override values#595

Open
JinghanFu wants to merge 14 commits intomainfrom
jinghanfu/kcl-override-celValue
Open

feat(overrides): add CEL valueProgram support for dynamic override values#595
JinghanFu wants to merge 14 commits intomainfrom
jinghanfu/kcl-override-celValue

Conversation

@JinghanFu
Copy link
Copy Markdown
Contributor

Summary

Adds a valueProgram field to overrides, allowing CEL expressions to dynamically compute override values from the current (live) resource state. Previously, overrides could only set static values. With valueProgram, the override value is evaluated at apply time against the actual cluster resource, enabling scenarios like preserving VPA-adjusted resource quantities.

Motivation

When external actors (e.g., VPA) modify resource fields, Eno's static overrides couldn't reference the current cluster state to decide what value to use. valueProgram solves this by letting a CEL expression read from the live resource and return a computed value.

Changes

internal/resource/mutation/mutation.go

  • Added ValueProgram cel.Program field to Op struct
  • Added valueProgram JSON field to jsonOp for deserialization
  • UnmarshalJSON: parses valueProgram string into a compiled cel.Program via enocel.Parse()
  • Apply: if ValueProgram is set, evaluates it against the current resource state to resolve the value dynamically; skips mutation if current is nil or CEL evaluates to null

internal/resource/mutation/mutation_test.go

  • Added tests for valueProgram evaluation, null handling, and missing current resource scenarios

pkg/function/overrides/override.go

  • Added ValueProgram string field to Override struct
  • validate(): if ValueProgram is non-empty, parses it as CEL to ensure validity

pkg/function/overrides/override_test.go

  • Added unit tests for ValueProgram validation and Override.Test() behavior

internal/controllers/reconciliation/overrides_test.go

  • Added integration-level tests exercising valueProgram through the full SnapshotWithOverridesOp.Apply path

docs/overrides.md

  • Added documentation for the valueProgram field with usage examples

Testing

  • All existing and new unit tests pass (go test ./internal/resource/mutation/... ./pkg/function/overrides/... ./internal/cel/...)

@JinghanFu JinghanFu force-pushed the jinghanfu/kcl-override-celValue branch from 107ff51 to 0b8abe2 Compare April 21, 2026 08:26
Comment thread pkg/function/overrides/override.go Outdated
Value any `json:"value"`
Condition string `json:"condition"`
Path string `json:"path"`
Value any `json:"value,omitempty"`
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is adding omitempty intentinoal?

@@ -55,6 +56,13 @@ func (o *Override) validate() (cel.Program, error) {
return nil, fmt.Errorf("failed to create program: %w", err)
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: If value and valueProgram are mutually exclusive, can we add a logging here for telemetry?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Ruinan. May I know whether you mean checking the cases for mutually exclusive and then return nil, Errorf("error message") here (consistent with all other checks in this function) or printing this out by fmt.Printf("error message")? Thank you!

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sorry maybe not here, but it would be better if we can add a logging in telemetry saying that we are currently using valueProgram instead of Value.

Not a blocker, if it's too complicated we don't need this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Ruinan! I put the logging in telemetry in mutation.go and they look like:
if o.ValueProgram != nil { ... logger.Info("override using valueProgram (resolved CEL value expression)", "path", o.Path.String()) } else { logger.Info("override using static default value", "path", o.Path.String()) }

Comment thread internal/resource/mutation/mutation.go Outdated
logger.Info("CEL value expression evaluated to null, skipping mutation", "path", o.Path.String())
return StatusInactive, nil
}
logger.Info("resolved CEL value expression", "path", o.Path.String(), "resolvedValue", resolvedValue)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would logging resolvedValue be too sensitive to log in telemetry.

@ruinan-liu
Copy link
Copy Markdown
Collaborator

ruinan-liu commented Apr 21, 2026

Could we also add a e2e test like in this PR? https://github.com/Azure/eno/pull/590/changes#diff-ff578928676707b352eb1dd2ee6a304cf2887ec8969d83e872ba5b4243b1f930.
You could ask copilot to generate the tests, it's pretty accurate in writing tests.

@ruinan-liu
Copy link
Copy Markdown
Collaborator

For smoke test failures, I have a fix in this PR: #526. I'm currently working on getting it to merge.

Copy link
Copy Markdown
Collaborator

@ruinan-liu ruinan-liu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jinghan for the Pr. I've added some minor comments. Please let me know if you have any question.

Copy link
Copy Markdown
Collaborator

@ruinan-liu ruinan-liu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reply to comments

…lues

Add valueProgram field to overrides that allows CEL expressions to
compute override values dynamically from the current resource state,
instead of only supporting static values. This enables use cases like
preserving VPA-managed resource quantities conditionally.

- Add ValueProgram field to Op and Override structs
- Evaluate valueProgram CEL expressions during Op.Apply
- Add validation for valueProgram in override helpers
- Add comprehensive unit and integration tests
- Add overrides documentation
… review feedback

- Merge TestAnnotateOverrides_* into single table-driven TestAnnotateOverrides_Table
- Rename TestOverrideValueExpression to TestOverrideValueProgram
- Fix valueExpression references in comments to valueProgram
- Add StatusInvalidValueProgram for CEL value program evaluation errors
- Remove '(returns Inactive)' from docs/overrides.md
- Add valueProgram field for dynamic CEL-based override values
- Add telemetry logging to distinguish valueProgram vs static value usage
- Add e2e tests for CEL override functionality
@JinghanFu JinghanFu force-pushed the jinghanfu/kcl-override-celValue branch from 7b81d01 to d239a3d Compare April 21, 2026 23:50
Comment thread pkg/function/overrides/override.go Outdated
Path string `json:"path"`
Value any `json:"value,omitempty"`
Condition string `json:"condition"`
ValueProgram string `json:"valueProgram,omitempty"`
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would be Expression or ValueExpression better?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially named it ValueExpression in override.go to reflect its string type, and ValueProgram in mutation.go to reflect its compiled cel.Program type. However, since the other three fields (Path, Condition, Value) use identical names across both files, I was concerned that having different names for this new field might introduce unnecessary confusion. So I opted to align them both as ValueProgram for consistency.

But if the naming discrepancy wouldn't cause confusion, I'm happy to rename it back to ValueExpression in override.go to more accurately reflect its string type!

Comment thread docs/overrides.md Outdated
[
{
"path": "self.spec.resourcePolicy.containerPolicies[0].maxAllowed.memory",
"valueProgram": "self.spec.resourcePolicy.containerPolicies[0].maxAllowed.memory",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename here too?

Comment thread internal/resource/mutation/mutation.go Outdated
StatusMissingParent Status = "MissingParent"
StatusIndexOutOfRange Status = "IndexOutOfRange"
StatusPathTypeMismatch Status = "PathTypeMismatch"
StatusInvalidValueProgram Status = "InvalidValueProgram"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename here too

Comment thread internal/resource/mutation/mutation.go Outdated
val, err := enocel.Eval(ctx, o.ValueProgram, comp, current, o.Path)
if err != nil {
logger.Error(err, "failed to evaluate value expression", "path", o.Path.String())
return StatusInvalidValueProgram, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return error

Comment thread internal/resource/mutation/mutation.go Outdated
return StatusInvalidValueProgram, nil
}
resolvedValue = val.Value()
if resolvedValue == nil || resolvedValue == structpb.NullValue_NULL_VALUE {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user want to unset a value?

…ueExpression, return error, handle null unset
…e error handling

- Rename Op.ValueProgram field to Op.ValueExpression for consistency
  with the existing jsonOp.ValueExpression and API naming convention
- Wrap returned errors with fmt.Errorf to provide path context
- Move error key-value pair to end of logger.Info calls for readability
- Update all test case names and struct literals accordingly
Comment thread go.work.sum
cloud.google.com/go/workflows v1.12.4/go.mod h1:yQ7HUqOkdJK4duVtMeBCAOPiN1ZF1E9pAMX51vpwB/w=
cloud.google.com/go/workflows v1.13.0/go.mod h1:StCuY3jhBj1HYMjCPqZs7J0deQLHPhF6hDtzWJaVF+Y=
codeberg.org/go-fonts/liberation v0.5.0/go.mod h1:zS/2e1354/mJ4pGzIIaEtm/59VFCFnYC7YV6YdGl5GU=
codeberg.org/go-latex/latex v0.1.0/go.mod h1:LA0q/AyWIYrqVd+A9Upkgsb+IqPcmSTKc9Dny04MHMw=
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are we using those changes?

Comment thread internal/resource/mutation/mutation.go Outdated

if resolvedValue == structpb.NullValue_NULL_VALUE {
logger.Info("value expression evaluated to null (explicit unset)", "path", o.Path.String())
return StatusActive, nil
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Why are we returning StatusActive here?

Copy link
Copy Markdown
Collaborator

@ruinan-liu ruinan-liu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments

if resolvedValue == structpb.NullValue_NULL_VALUE {
logger.Info("value expression evaluated to null (explicit unset)", "path", o.Path.String())
return StatusActive, nil
}
Copy link
Copy Markdown
Collaborator

@xiazhan xiazhan Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we apply the value as "null"?

Copy link
Copy Markdown
Collaborator

@xiazhan xiazhan Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we verify that it works in e2e? Shall we add an e2e test case as well?

Convert structpb.NullValue_NULL_VALUE to Go nil and fall through to
Apply() instead of returning early. This ensures that when a
valueExpression evaluates to null, the corresponding field is actually
deleted from the object rather than silently skipping the mutation.
@JinghanFu JinghanFu force-pushed the jinghanfu/kcl-override-celValue branch from ee62c03 to 5fe02c7 Compare April 25, 2026 01:01
Add e2e test verifying that a null valueExpression deletes the targeted
field. Remove condition to prevent oscillation; use polling to wait for
the second reconciliation to apply the deletion.
The null valueExpression override cannot work in non-SSA mode because
buildNonStrategicPatch applies the override to both sides of the diff,
producing an empty patch. Skip the test when DISABLE_SSA=true.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants