Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add logic to support custom liveness and readiness probes #57

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

AbrohamLincoln
Copy link

Closes #54

Retains current liveness and readiness probe logic while adding the ability to override the defaults.

@vladciobancai
Copy link

Are there any plans for this PR to be merged ?

templates/_helpers.tpl Outdated Show resolved Hide resolved
values.yaml Outdated
targetMemoryUtilizationPercentage: 60 # available only on Kubernetes ≥1.23 [required "autoscaling/v2"]
behavior: {} # available only on Kubernetes ≥1.23 [required "autoscaling/v2"]
targetMemoryUtilizationPercentage: 60 # available only on Kubernetes ≥1.23 [required "autoscaling/v2"]
behavior: {} # available only on Kubernetes ≥1.23 [required "autoscaling/v2"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spurious whitespace changes here. Please remove.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removes spaces. Yamllint has a rule that requires two spaces between content and comments by default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, interesting. I've never seen that rule before. Thanks for the context! I didn't know yamllint existed, and I'm glad it does. CI for this repo is currently doing helm lint --strict, which doesn't enforce nearly as much as one might naively expect it to. We should add yamllint to this repo's CI and get a PR together to fix any current violations so we have a basis for code style consistency.

Copy link
Member

@canterberry canterberry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! Just made a review pass and have quite minor blocking bits of feedback (just about adding a prefix to the new variable definitions and omitting extraneous whitespace changes). With that, I have no concerns with 👍 and having this merged.

@canterberry
Copy link
Member

canterberry commented Aug 9, 2022

During local manual verification after cloning this branch locally, I expected to be able to install the chart with --set readinessProbe.httpGet.path=/metrics to patch just the path of the readiness probe without impacting the rest of its configuration. Instead, I got an error for missing required field "port". As an end-user, I don't know what value I need to provide for that, because port 5000 is hard-coded into the chart as the port on which the container is listening. Further, in the default scenario, the chart decides when to set the scheme based on the presence of a TLS secret.

Upon closer inspection, then, and for the above reasons, I'm not sure that a full-on livenessProbe and readinessProbe override is an appropriate solution. It simply requires too much intermingled knowledge of anxilliary parts of the chart.

Based on the ask in #54, a more targeted solution may be more appropriate and less cumbersome.

To help in narrowing down on a solution:

  • Is it the livenessProbe or the readinessProbe (or both) where the failure count is too low? Maybe only one or the other needs to be adjusted. If they're essentially the same (which in the chart currently, they're identical), then it may be a signal for the chart to make them configurable as a set.
  • At what point are the failures happening? If it's as your registry is being launched, then the default initialDelaySeconds may be too low for you (and likely others).

Here are the timing and threshold related properties on liveness and readiness probes, for reference:

  • failureThreshold
  • initialDelaySeconds
  • periodSeconds
  • successThreshold
  • terminationGracePeriodSeconds
  • timeoutSeconds

My soft suggestion here would be to, instead of allowing the whole livenessProbe and readinessProbe blocks to be overridden, to support overriding each of these individual properties above, for each.

A simpler implementation might be to merge any configured livenessProbe into the default one, so the livenessProbe value becomes a patch rather than a replacement. It's much more powerful than it needs to be, though, to remedy the false-negatives reported in #54, and opens up a higher probability of misconfigured or dysfunctional probes.

@canterberry
Copy link
Member

@AbrohamLincoln Hey, sorry for what may have come across as shutting down your good work here. I'd really like to address the core issue! Any thoughts or suggestions on how to move forward?

@AbrohamLincoln
Copy link
Author

AbrohamLincoln commented Aug 15, 2022 via email

path: /
port: 5000
{{- end -}}
{{- end -}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add \n

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Add support for custom livenessProbe and readinessProbes
4 participants