I came across the following issue:
If for some reason the Passbolt application loses connectivity to the database or cache, Passbolt’s /healthcheck/status still returns “OK”, even though the service is degraded.
Users get an “An Internal Error Has Occurred. Try again” error or simply cannot log in.
Therefore, in such situations /healthcheck/status becomes unreliable for monitoring purposes.
Can we get the healthcheck status to return “Degraded” in such situations?
I am running the Passbolt application image 5.10.0-1 using the Helm chart on OpenShift with an external PostgreSQL database.
Yes, the current healthcheck status endpoint is really just there to check whether the site is up, not whether it can connect to the database or whether the cache is working, etc. We faced the same issue with Passbolt Cloud and fixed it as you mentioned.
Right now it is not possible to define strategies for the healthcheck in CE/PRO, although this is something we have done for the Cloud. I checked with the team and we will bring it back to the community edition as it’s quite useful.
In the meantime, you can use the Passbolt CLI and run it periodically to check. For example, build a service that performs a more complex check by calling multiple endpoints, and run it as a separate service that you probe instead.
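A minimal sketch of such a separate probe service, in Python with only the standard library, might look like the following. It is an illustration under assumptions, not something shipped with Passbolt: the URL, the database host and port, the listening port 8080, and the names `http_ok`, `tcp_ok`, `aggregate` and `ProbeHandler` are all hypothetical. Note that the TCP check only shows that the database port is reachable from the probe, not that Passbolt itself can query the database.

```python
import json
import socket
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical values for illustration; replace with your own deployment's.
PASSBOLT_URL = "https://passbolt.example.com/healthcheck/status"
DB_HOST, DB_PORT = "postgres.example.com", 5432


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        # Any failure (DNS, timeout, non-2xx response) counts as unhealthy.
        return False


def tcp_ok(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def aggregate(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Run every check; report 200/OK only if all of them pass."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as a failed check.
            results[name] = False
    healthy = all(results.values())
    body = {"status": "OK" if healthy else "Degraded", "checks": results}
    return (200 if healthy else 503), body


# The set of checks the probe runs on each request (assumed, adjust freely).
CHECKS = {
    "web": lambda: http_ok(PASSBOLT_URL),
    "database_tcp": lambda: tcp_ok(DB_HOST, DB_PORT),
}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every GET, whatever the path, returns the aggregated result as JSON.
        code, body = aggregate(CHECKS)
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()
```

The monitoring system would then probe this service on port 8080 instead of /healthcheck/status directly, and treat an HTTP 503 with “Degraded” as a failure. Further checks, such as a cache connectivity test, can be added as extra entries in `CHECKS`.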