Version: 2026.03.0

Troubleshooting

Check node status

Check node status with:

Terminal
kubectl get nodes

Expected output when the node is healthy:

Example output
NAME                          STATUS   ROLES                  AGE   VERSION
speech-platform.localdomain   Ready    control-plane,master   9s    v1.30.5+k3s1
info

The node list can be empty (No resources found), or the node can be in the NotReady state while the virtual appliance is starting up. This is normal and should resolve within a few moments.

The node also needs enough free disk and memory capacity. When resources are insufficient, pressure events are emitted. Run the following command to see node conditions:

Terminal
kubectl describe node | grep -A 6 Conditions:
Healthy node conditions
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 29 Apr 2024 08:13:54 +0000   Mon, 29 Apr 2024 07:46:39 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 29 Apr 2024 08:13:54 +0000   Mon, 29 Apr 2024 08:06:45 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 29 Apr 2024 08:13:54 +0000   Mon, 29 Apr 2024 07:46:39 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 29 Apr 2024 08:13:54 +0000   Mon, 29 Apr 2024 07:46:39 +0000   KubeletReady                 kubelet is posting ready status
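
If you only need a quick pass/fail signal, the condition rows can be filtered with awk. The snippet below is a sketch that simulates the condition lines with a fixed string; in practice you would pipe the output of `kubectl describe node` through the same filter. Any condition ending in `Pressure` with status `True` indicates a resource problem:

```shell
# Sketch only: simulated condition lines stand in for the real
# `kubectl describe node` output. Field 1 is the condition type,
# field 2 its status.
conditions='MemoryPressure False
DiskPressure True
PIDPressure False'
echo "$conditions" | awk '$1 ~ /Pressure$/ && $2 == "True" { print $1, "is active" }'
# prints: DiskPressure is active
```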

Disk pressure

A disk pressure node event is emitted when Kubernetes is running out of disk capacity in the /var filesystem:

Disk pressure detected
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                   Message
  ----             ------  -----------------                 ------------------                ------                   -------
  DiskPressure     True    Mon, 29 Apr 2024 08:13:54 +0000   Mon, 29 Apr 2024 08:06:45 +0000   KubeletHasDiskPressure   kubelet has disk pressure

Follow the procedure for extending the disks.
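
Before extending the disks, it can help to confirm how full the filesystem backing /var actually is. A plain df is enough (no appliance-specific tooling; output varies per machine, so none is shown):

```shell
# Print the usage percentage of the filesystem backing /var.
# -P forces POSIX single-line output so awk can rely on field positions.
df -P /var | awk 'NR == 2 { print $5 " used on " $6 }'
```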

Memory pressure

A memory pressure node event is emitted when Kubernetes is running out of free memory:

Memory pressure detected
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                         Message
  ----             ------  -----------------                 ------------------                ------                         -------
  MemoryPressure   True    Mon, 29 Apr 2024 08:50:50 +0000   Mon, 29 Apr 2024 08:50:50 +0000   KubeletHasInsufficientMemory   kubelet has insufficient memory available

You need to grant more memory to the virtual appliance.
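
To see how much memory the appliance currently has before resizing it, a plain free is sufficient (a sketch; values differ per machine, and the column layout assumes procps-ng free):

```shell
# Show total and available memory in mebibytes; row 2 is the "Mem:" line,
# field 2 the total and field 7 the available amount (procps-ng layout).
free -m | awk 'NR == 2 { print "total: " $2 " MiB, available: " $7 " MiB" }'
```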

View pod logs

Logs are stored in /data/log/pods/ or in /data/logs/containers. You can view them via Filebrowser if needed.

Alternatively, you can display logs with the kubectl command:

Terminal
kubectl -n speech-platform logs -f voiceprint-extraction-7867578b97-w7bzd
Example output
[2024-04-29 08:59:10.250] [Configuration] [info] model: /models/xl-5.0.0.model
[2024-04-29 08:59:10.250] [Configuration] [info] port: 8080
[2024-04-29 08:59:10.250] [Configuration] [info] device: cpu
[2024-04-29 08:59:10.250] [critical] base64_decode: invalid character ''<''
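
The critical line above shows a base64 decoding failure: the '<' suggests that something other than the license string (for example an HTML page) was pasted into the configuration. A hypothetical local check, not part of the platform, that you can run before pasting a value (assumes GNU coreutils base64):

```shell
# Hypothetical helper: verify a string is well-formed base64
# before putting it into the values file.
is_base64() {
  printf '%s' "$1" | base64 -d > /dev/null 2>&1 && echo valid || echo invalid
}
is_base64 'aGVsbG8='   # prints: valid
is_base64 '<html>'     # prints: invalid
```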

Changes in configuration are not applied

When to use this

Use this when you have made changes to /data/speech-platform/speech-platform-values.yaml but they do not seem to take effect (e.g., new settings aren't reflected in the application, services don't start properly, etc.).

Why this happens: The Helm controller automatically watches the configuration file for changes. If the YAML file is invalid, the update job fails and the system either keeps running with the old configuration or fails to deploy entirely.

How to troubleshoot: If the configuration is incorrect, the update job will not complete successfully, and the underlying pod will either restart or end up in an error state. The pod status will reflect this issue.

Step 1. Check the Helm install job status:

Terminal
kubectl get pods -n kube-system | grep -i helm-install
Example output — note the Error status on speech-platform
helm-install-filebrowser-2b7pn               0/1     Completed   0             51m
helm-install-ingress-nginx-m87d4             0/1     Completed   0             51m
helm-install-nginx-nrcvk                     0/1     Completed   0             51m
helm-install-dcgm-exporter-fjqzz             0/1     Completed   0             51m
helm-install-kube-prometheus-stack-jn5bz     0/1     Completed   0             51m
helm-install-keda-vsn95                      0/1     Completed   0             51m
helm-install-speech-platform-9l9vj           0/1     Error       4 (46s ago)   6m15s
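
To pick out the failing job's name for the next step without scanning the table by eye, the status column can be filtered. This sketch simulates two rows of the listing; in practice you would pipe `kubectl get pods -n kube-system` into the same awk filter:

```shell
# Sketch: simulated `kubectl get pods` rows; field 3 is the STATUS column.
pods='helm-install-filebrowser-2b7pn 0/1 Completed 0 51m
helm-install-speech-platform-9l9vj 0/1 Error 4 6m15s'
echo "$pods" | awk '/^helm-install/ && $3 == "Error" { print $1 }'
# prints: helm-install-speech-platform-9l9vj
```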

Step 2. Inspect the logs of the failing job:

Terminal
kubectl logs -f <failing-job-name> -n kube-system
Example output — YAML parsing error
Upgrading speech-platform
+ helm_v3 upgrade --namespace speech-platform speech-platform https://10.43.0.1:443/static/phonexia-charts/speech-platform-0.0.0-36638f5-helm.tgz --values /config/values-10_HelmChartConfig.yaml
Error: failed to parse /config/values-10_HelmChartConfig.yaml: error converting YAML to JSON: yaml: line 494: could not find expected ':'

Step 3. Validate the YAML (see next section).

Check configuration file validity

When to use this

Whenever changes are made to speech-platform-values.yaml, or if a Helm update job fails due to YAML syntax issues.

Why this matters: Helm requires a valid YAML configuration file to parse and apply configuration. A missing colon, incorrect indentation, or misplaced value can break the deployment.

Step 1. Validate the config:

Terminal
yq .spec.valuesContent /data/speech-platform/speech-platform-values.yaml | yq .

If the configuration file is valid, its content will be printed. Otherwise, an error with the offending line number will be printed:

Example error output
Error: bad file '-': yaml: line 253: could not find expected ':'
Line number offset

The actual configuration is nested under spec.valuesContent, usually starting on line 7. If you see an error on line 253, add 7 (253 + 7 = 260) to get the actual line in the file.
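
The offset arithmetic from the note can be done inline, for example:

```shell
# Map the line number from the yq error to the real line in the file.
offset=7      # line where the spec.valuesContent block starts (see note above)
err_line=253  # line number reported by yq
echo "real line: $((err_line + offset))"
# prints: real line: 260
```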

Step 2. View the lines around the error:

Terminal
cat -n /data/speech-platform/speech-platform-values.yaml | grep 260 -B 10 -A 10
Example output — invalid YAML
   250
   251  model:
   252    volume:
   253      hostPath:
   254        path: /data/models/enhanced_speech_to_text_built_on_whisper
   255
   256      # Name of a model file inside the volume, for example "large_v2-1.0.0.model"
   257      file: "large_v2-1.0.1.model"
   258    license:
   259      value:
   260  "eyJ2ZX...=="
   261
   262  # Uncomment this to grant access to GPU on whisper pod
   263  resources:

Step 3. Fix the error

In the example above, the license value is on a separate line (260) from its key (259). This is invalid YAML — merge the two lines:

Invalid
value:
"eyJ2ZX...=="
Correct
value: "eyJ2ZX...=="

The resulting file should look like this:

Fixed output
   258    license:
   259      value: "eyJ2ZX...=="
   260
   261  # Uncomment this to grant access to GPU on whisper pod
   262  resources:
   263    limits:
   264      nvidia.com/gpu: "1"
   265
   266  # Uncomment this to run whisper on GPU
   267  runtimeClassName: "nvidia"
   268
   269  service:
   270    clusterIP: "None"

Disable DNS resolving for specific domains

When to use this

Use this when you see long response times, timeout errors, or task processing delays due to DNS lookup issues, particularly when using DHCP or custom DNS setups.

Why this happens: This happens when DHCP is used for IP address assignment for the virtual appliance which usually configures nameserver and search domains in /etc/resolv.conf:

/etc/resolv.conf
nameserver 192.168.137.1
search localdomain

Check the CoreDNS logs first:

Terminal
kubectl -n kube-system logs -l k8s-app=kube-dns

The following lines in the logs indicate this issue:

coreDNS timeout errors
[ERROR] plugin/errors: 2 speech-platform-envoy.localdomain. AAAA: read udp 10.42.0.27:60352->192.168.137.1:53: i/o timeout
[ERROR] plugin/errors: 2 speech-platform-envoy.localdomain. AAAA: read udp 10.42.0.27:40254->192.168.137.1:53: i/o timeout
[ERROR] plugin/errors: 2 speech-platform-envoy.localdomain. AAAA: read udp 10.42.0.27:47838->192.168.137.1:53: i/o timeout

Communication within the virtual appliance does not use FQDNs, which means each DNS name is resolved against all search domains. Internal Kubernetes domains (<namespace>.svc.cluster.local, svc.cluster.local and cluster.local) are resolved immediately by CoreDNS; non-Kubernetes domains are resolved by the nameserver provided by DHCP. If access to that nameserver is blocked (for example, by a firewall), resolving a single name can take up to 10 seconds, which can significantly increase task processing duration.
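
To see why a blocked nameserver hurts so much, consider how one short name is expanded. This sketch only prints the candidate FQDNs a resolver would try for a short name; the domain list is illustrative, matching the defaults described above. Every non-cluster candidate goes to the DHCP-provided nameserver, and each blocked query waits out its timeout:

```shell
# Sketch: candidate FQDNs tried for a short (non-FQDN) service name.
name=speech-platform-envoy
for domain in speech-platform.svc.cluster.local svc.cluster.local cluster.local localdomain; do
  echo "$name.$domain."
done
```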

How to resolve: To avoid this issue, either allow communication from the virtual appliance to the DHCP-configured DNS server, or configure the Kubernetes resolver to skip lookups for the DHCP-provided domain(s):

Step 1. Create a DNS override file:

Create file /data/speech-platform/coredns-custom.yaml with the following content. Replace <domain1.com> and <domain2.com> with the domains you want to disable lookup for:

/data/speech-platform/coredns-custom.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system
data:
  custom.server: |
    <domain1.com>:53 {
        log
    }
    <domain2.com>:53 {
        log
    }

Step 2. Restart CoreDNS to apply the change:

Terminal
kubectl -n kube-system rollout restart deploy/coredns

Step 3. Verify CoreDNS is healthy and the pod is running:

Terminal
kubectl -n kube-system get pods -l k8s-app=kube-dns

Step 4. Restart all speech-platform pods:

Terminal
kubectl -n speech-platform rollout restart deploy
kubectl -n speech-platform rollout restart sts

Deployment in an air-gapped environment

If you plan to deploy the virtual appliance in an environment without DHCP or DNS availability, you will need to make certain adjustments.

Detected networking issues

In certain network configurations, the Speech Platform system fails to start and shows detected issues with networking in the welcome screen. Typically, this happens in peer-to-peer, ad-hoc networks without a router (e.g. multiple computers connected just to a switch), with either static IP addresses configured manually, or dynamically assigned by a local DHCP server.

The issue with such a network setup is that no default gateway is defined (none is needed in such a setup), but the main Speech Platform k3s service requires a gateway IP address to be defined.

As a result, when the Speech Platform system does not find a default gateway IP address assigned to a device, the main k3s service fails to start, and the entire Speech Platform system will not start.
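
You can check whether a default route is present with iproute2 (the result depends on your network, so no output is shown):

```shell
# Print a message depending on whether a default route exists;
# k3s fails to start when it is missing.
ip route show default 2>/dev/null | grep -q '^default' \
  && echo "default gateway present" \
  || echo "no default gateway"
```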

In that case, it is necessary to configure the network manually with statically defined IP addresses. Use the following commands to do that.

  1. First of all, check the name of the network connection. If you see a different name than Wired connection 1, use that name instead in the further commands
    Terminal
    nmcli con show
  2. Assign a static IP address (replace IP_ADDRESS with the IP address you want the Virtual Appliance to use)
    Terminal
    nmcli con mod "Wired connection 1" ipv4.addr IP_ADDRESS/24
  3. Set the gateway IP address (replace GATEWAY_ADDRESS with the gateway IP address; in networks without a router it can be any IP, e.g. the IP address of the machine itself)
    Terminal
    nmcli con mod "Wired connection 1" ipv4.gateway GATEWAY_ADDRESS
  4. Then configure the connection to use manual IP settings
    Terminal
    nmcli con mod "Wired connection 1" ipv4.method manual
  5. Deactivate the connection, so that the changes can be applied
    Terminal
    nmcli con down "Wired connection 1"
  6. Reactivate the connection
    Terminal
    nmcli con up "Wired connection 1"
  7. Once the connection is active, reset and restart the k3s service:
    Terminal
    systemctl reset-failed k3s
    systemctl start k3s
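
Before running the nmcli commands above, you may want to sanity-check the address you are about to assign. A hypothetical helper, not part of the appliance, that rejects obviously malformed input:

```shell
# Hypothetical pre-flight check: reject malformed IPv4 addresses
# before they reach `nmcli con mod`.
valid_ipv4() {
  printf '%s' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}
valid_ipv4 '192.168.1.50' && echo ok || echo bad   # prints: ok
valid_ipv4 '192.168.1'    && echo ok || echo bad   # prints: bad
```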

After completing these steps, Speech Platform should perform its startup sequence and complete it successfully, showing the welcome screen. However, if your network lacks access to upstream DNS, or does not contain DNS at all, further modifications may be needed; follow the instructions in the next section.

No upstream DNS available

If your environment has a router assigning IP addresses but is isolated from upstream DNS servers, complete the following steps.

warning

The speech-platform will start normally, but all processing tasks will return an error state until this is resolved.

  1. Create a file named coredns-config.yaml in the directory /data/speech-platform/
  2. Insert the following content into the file
    /data/speech-platform/coredns-config.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: coredns
      namespace: kube-system
    data:
      Corefile: |
        .:53 {
            errors
            health
            ready
            kubernetes cluster.local in-addr.arpa ip6.arpa {
              pods insecure
              fallthrough in-addr.arpa ip6.arpa
            }
            hosts /etc/coredns/NodeHosts {
              ttl 60
              reload 15s
              fallthrough
            }
            prometheus :9153
            cache 30
            loop
            reload
            loadbalance
            import /etc/coredns/custom/*.override
        }
        import /etc/coredns/custom/*.server
  3. Save the file and run:
    Terminal
    kubectl rollout restart deploy coredns -n kube-system