Problem:
The pod entered a restart loop, restarting repeatedly over several hours. kubectl get pods showed the pod as 0/1 Running with a steadily increasing restart count.
Pod events:
Events:
  Type     Reason     Age                  From     Message
  ----     ------     ----                 ----     -------
  Warning  Unhealthy  10h (x3 over 10h)    kubelet  Liveness probe failed: Get "http://***HIDDEN***:8080/healthcheck": dial tcp ***HIDDEN***:8080: connect: connection refused
  Warning  Unhealthy  10h (x162 over 10h)  kubelet  Readiness probe failed: Get "http://***HIDDEN***:8080/healthcheck": dial tcp ***HIDDEN***:8080: connect: connection refused
  Normal   Killing    10h (x9 over 10h)    kubelet  Container iax-app-capacity-daemon failed liveness probe, will be restarted
Kubernetes events reported repeated Liveness probe failed and Readiness probe failed messages, followed by Normal Killing when the kubelet restarted the container:
- “Liveness probe failed … connection refused”
- “Readiness probe failed … connection refused”
- “Container … failed liveness probe, will be restarted”
Container logs did not show a hard error; instead, they stopped at arbitrary startup lines, e.g.:
Starting metric indexer for config …
HikariPool-7 - Starting… followed by HikariPool-7 - Start completed.
The service had not yet made its health endpoint responsive when the liveness probe fired.
Possible cause:
The pod's startup time exceeded the liveness probe window configured in the deployment. While the service was still initializing, the liveness probe attempted to call /healthcheck and the connection was refused or timed out, so Kubernetes marked the container unhealthy and restarted it prematurely, creating a loop. The settings in the deployment were too aggressive for this environment:
initialDelaySeconds: 120
timeoutSeconds: 5
failureThreshold: 3
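In a standard Deployment manifest, these values would sit in the container's livenessProbe stanza, roughly as sketched below (path and port taken from the probe failures above; this is an illustration of where the settings live, not the exact original manifest):

```yaml
# Sketch of the original, too-aggressive probe configuration
livenessProbe:
  httpGet:
    path: /healthcheck
    port: 8080
  initialDelaySeconds: 120   # shorter than the service's actual startup time
  timeoutSeconds: 5
  failureThreshold: 3        # only ~3 probe periods of grace before a restart
```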
Possible solution:
Extending the liveness probe window allowed the service to finish initialization and expose a healthy endpoint before the kubelet judged it. Specifically:
livenessProbe:
httpGet:
path: /healthcheck
port: 8080
initialDelaySeconds: 300 # increased from 120
timeoutSeconds: 10 # increased from 5
failureThreshold: 5 # increased from 3
periodSeconds: 10
After increasing initialDelaySeconds to 300 (with the optional timeout/failure tweaks), the pod stabilized: the pod reached a healthy state, the health probe began responding (HTTP health probe server listening … /healthcheck), and restarts ceased.
Steps:
- Identify the pod's parent deployment:
  kubectl get deployment -n <namespace>
- Edit the deployment and increase the livenessProbe settings:
  kubectl edit deployment <deployment_name> -n <namespace>
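As an alternative to editing the deployment interactively, the same change can be applied with a strategic-merge patch. The snippet below is a sketch: the container name iax-app-capacity-daemon is taken from the pod events above and must match the container name in your deployment.

```yaml
# probe-patch.yaml - strategic merge patch extending the liveness probe window
spec:
  template:
    spec:
      containers:
        - name: iax-app-capacity-daemon   # replace with your container name
          livenessProbe:
            httpGet:
              path: /healthcheck
              port: 8080
            initialDelaySeconds: 300
            timeoutSeconds: 10
            failureThreshold: 5
            periodSeconds: 10
```

Apply it with kubectl patch deployment <deployment_name> -n <namespace> --patch-file probe-patch.yaml, then watch the rollout with kubectl get pods -n <namespace> -w to confirm the restart count stops climbing.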
If you need further help:
- Please contact our support team via the chat service box on any of our websites or raise a support request.
- Make sure you provide us with:
- Background of the issue or request.
- Use cases, requirements, business impact, etc.
- Encountered error messages.
- Log files or diagnostic files.
- Screenshots.
- And other important information relevant to your inquiry.