Version: 1.1.0

Speech to text whisper enhanced helm chart

Version: 1.1.0-helm Type: application AppVersion: 1.1.0

Phonexia Speech To Text (STT) with Whisper

Maintainers

| Name | Email | Url |
|------|-------|-----|
| Phonexia | support@phonexia.com | https://www.phonexia.com |

Helm: >= 3.2.0

Values

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| affinity | object | `{}` | |
| config.device | string | `"cpu"` | Compute device used for inference. Can be `cpu` or `cuda`. If you use `cuda`, you must also use an image tag with GPU support. |
| config.instancesPerDevice | int | `1` | Parallel tasks per device. GPU only. |
| config.license.useSecret | bool | `false` | Get license from secret object |
| config.license.value | string | `"invalidLicenseKey"` | License key |
| config.listeningAddress | string | `"[::]"` | Override address where the server will listen |
| config.logLevel | string | `info` | Override log level. Supported values: error, warning, info, debug, trace |
| config.model.file | string | `""` | Name of a model file inside the volume, for example "xl-5.0.0.model" |
| config.model.volume | object | `{}` | Volume with Phonexia model |
| config.port | int | `8080` | Port where the service will listen. The value must be the same as service.port. |
| config.threadsPerInstance | int | `8` | Number of threads to use when running on CPU |
| fullnameOverride | string | `""` | String to fully override speech-to-text-whisper-enhanced.fullname template |
| image.pullPolicy | string | `"IfNotPresent"` | Image pull policy |
| image.registry | string | `"registry.cloud.phonexia.com"` | Image registry |
| image.repository | string | `"phonexia/dev/technologies/microservices/speech-to-text-whisper-enhanced"` | Image repository |
| image.tag | string | appVersion specified in Chart.yaml | See speech-to-text-whisper-enhanced on dockerhub for available tags |
| imagePullSecrets | list | `[]` | Specify docker-registry secret names as an array |
| ingress.annotations | object | `{}` | |
| ingress.className | string | `""` | |
| ingress.enabled | bool | `false` | |
| ingress.hosts[0] | object | `{"host":"speech-to-text-whisper-enhanced.example.com","paths":[{"path":"/","pathType":"ImplementationSpecific"}]}` | Default host for the ingress resource |
| ingress.hosts[0].paths[0].pathType | string | `"ImplementationSpecific"` | Ingress path type |
| ingress.tls | list | `[]` | |
| initContainers | list | `[]` | Init containers. Evaluated as a template. |
| livenessProbe | object | `{"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1}` | Liveness probe settings |
| nameOverride | string | `""` | String to partially override speech-to-text-whisper-enhanced.fullname template (will maintain the release name) |
| nodeSelector | object | `{}` | Startup probe settings `startupProbe: initialDelaySeconds: 0 periodSeconds: 10 timeoutSeconds: 1 failureThreshold: 3 successThreshold: 1` |
| podAnnotations | object | `{}` | Annotations for pods |
| podSecurityContext | object | `{}` | Security context for pods |
| readinessProbe | object | `{"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1}` | Readiness probe settings |
| replicaCount | int | `1` | Number of replicas to deploy |
| resources | object | `{}` | |
| runtimeClassName | string | `""` | Specify runtime class |
| securityContext | object | `{}` | Security context for speech-to-text-whisper-enhanced container |
| service.clusterIP | string | `""` | |
| service.port | int | `8080` | |
| service.type | string | `"ClusterIP"` | |
| serviceAccount.annotations | object | `{}` | Annotations to add to the service account |
| serviceAccount.create | bool | `true` | Specifies whether a service account should be created |
| serviceAccount.name | string | `""` | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
| tolerations | list | `[]` | |
| updateStrategy | object | `{"type":"RollingUpdate"}` | Deployment update strategy |
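
As a rough illustration of how the GPU-related keys fit together, the following values fragment is a minimal sketch, not a tested configuration. It assumes the cluster runs the NVIDIA device plugin (which exposes GPUs as the `nvidia.com/gpu` resource); the image tag is a placeholder because the name of the GPU-enabled tag is not given here.

```yaml
# Sketch only: values for running inference on a GPU.
config:
  # Switch inference from the default "cpu" to "cuda"
  device: "cuda"
  # Parallel tasks per device (GPU only)
  instancesPerDevice: 1

image:
  # Placeholder: replace with a GPU-enabled tag from the registry
  tag: "<gpu-enabled-tag>"

# Assumption: the NVIDIA device plugin is installed, so a GPU can be
# requested through the "nvidia.com/gpu" extended resource.
resources:
  limits:
    nvidia.com/gpu: 1
```

Depending on the cluster setup, `runtimeClassName` may also need to be set; its value is specific to the cluster and is left out here.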

Installation

To install the chart successfully, you first have to obtain a model. The service cannot start without one. Contact Phonexia support to obtain a model for evaluation purposes.

Model

There are two ways to pass a model to pods:

  • Pass the model via initContainer
  • Pass the model via volume

Pass the model via initContainer

With this approach, no persistent volume is needed. An init container is added to the pod instead. It downloads the model from the specified location to an ephemeral volume that is shared with the main container. This happens each time the pod is re-deployed.

In the values file, it looks like this:

# Set config.model.volume to emptyDir
config:
  model:
    volume:
      emptyDir: {}
    file: "speech_to_text_whisper_enhanced-medium-1.0.0.model"

initContainers:
  - name: init-copy-model
    image: alpine
    command:
      - sh
      - -c
      - |
        set -e

        # Install aws-cli package
        apk add --no-cache aws-cli

        # Create directory for models
        mkdir -p /models

        # Download model from s3 and store it to volume
        aws s3 cp s3://some-bucket/some-path-to-model/speech_to_text_whisper_enhanced-medium-1.0.0.model ${PHX_MODEL_PATH}
    env:
      # PHX_MODEL_PATH variable must be same as in main container
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      # Set AWS_* variables to make aws cli work
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      - name: "AWS_ACCESS_KEY_ID"
        value: "AKAI...CN"
      - name: "AWS_SECRET_ACCESS_KEY"
        value: "0lW...Yw"
    # Mount empty volume to initContainer
    volumeMounts:
      - name: '{{ include "speech-to-text-whisper-enhanced.fullname" . }}-models-volume'
        mountPath: /models
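
The example above places the AWS credentials directly in the values file. One possible alternative, shown here as a sketch, is to read them from a Kubernetes Secret through the standard `valueFrom.secretKeyRef` mechanism. The Secret name `aws-model-credentials` and its key names are hypothetical; the Secret is not created by the chart and would have to exist in the release namespace beforehand.

```yaml
# Sketch only: a replacement for the env section of the init container.
# "aws-model-credentials" is a hypothetical Secret that you create yourself.
env:
  # PHX_MODEL_PATH variable must be same as in main container
  - name: "PHX_MODEL_PATH"
    value: "/models/{{ .Values.config.model.file }}"
  - name: "AWS_DEFAULT_REGION"
    value: "us-east-1"
  # Read the credentials from the Secret instead of hard-coding them
  - name: "AWS_ACCESS_KEY_ID"
    valueFrom:
      secretKeyRef:
        name: aws-model-credentials
        key: AWS_ACCESS_KEY_ID
  - name: "AWS_SECRET_ACCESS_KEY"
    valueFrom:
      secretKeyRef:
        name: aws-model-credentials
        key: AWS_SECRET_ACCESS_KEY
```

This keeps the credentials out of the values file and out of the rendered pod manifest.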

Pass the model via volume

With this approach, you need to create a persistent volume, copy the model to it, and mount it to the pod.

The following example shows how to do this in EKS with EBS-based dynamic provisioning.

  1. Create a PersistentVolumeClaim:
# filename: speech-to-text-whisper-enhanced.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: speech-to-text-whisper-enhanced
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ebs-sc

Then apply it:

kubectl apply -f speech-to-text-whisper-enhanced.yaml
  2. Create a job that downloads the model to the persistent volume:
# filename: job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: speech-to-text-whisper-enhanced-download-model
spec:
  template:
    spec:
      containers:
        - name: download-model
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e

              # Install aws-cli package
              apk add --no-cache aws-cli

              # Create directory for models
              mkdir -p /models

              # Download model from s3 and store it to volume
              aws s3 cp s3://some-bucket/some-path-to-model/speech_to_text_whisper_enhanced-medium-1.0.0.model ${PHX_MODEL_PATH}
          env:
            # PHX_MODEL_PATH variable must be same as .Values.config.model.file in values files
            - name: "PHX_MODEL_PATH"
              value: "/models/speech_to_text_whisper_enhanced-medium-1.0.0.model"
            # Set AWS_* variables to make aws cli work
            - name: "AWS_DEFAULT_REGION"
              value: "us-east-1"
            - name: "AWS_ACCESS_KEY_ID"
              value: "AKAI...CN"
            - name: "AWS_SECRET_ACCESS_KEY"
              value: "0lW...Yw"
          volumeMounts:
            - name: persistent-storage
              mountPath: /models
      volumes:
        - name: persistent-storage
          persistentVolumeClaim:
            claimName: speech-to-text-whisper-enhanced
      restartPolicy: Never
  backoffLimit: 3

Apply it and wait until the job is finished:

kubectl apply -f job.yaml
  3. Configure the values file to use the existing PVC:
config:
  model:
    # Volume with Phonexia model
    volume:
      persistentVolumeClaim:
        claimName: speech-to-text-whisper-enhanced

    # Name of a model file inside the volume, for example "xl-5.0.0.model"
    file: "speech_to_text_whisper_enhanced-medium-1.0.0.model"

Installing the Chart

Once you have configured the model, you can proceed with the installation itself. To install the chart with the release name my-release:

helm install my-release oci://registry-1.docker.io/phonexia/speech-to-text-whisper-enhanced

This command deploys speech-to-text-whisper-enhanced on the Kubernetes cluster in the default configuration.

Use the --version parameter to install a specific version:

helm install my-release oci://registry-1.docker.io/phonexia/speech-to-text-whisper-enhanced --version 1.0.0-helm

Exposing the service

To expose the service outside of the Kubernetes cluster, follow Using a Service to Expose Your App.
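
As one possible illustration, the chart's `service.type` value could be changed from the default `ClusterIP`. The fragment below is a sketch under two assumptions: that the chart passes `service.type` through to the Service object unchanged, and that the cluster can provision `LoadBalancer` services.

```yaml
# Sketch only: expose the service through a cloud load balancer.
service:
  # Assumption: the cluster supports Services of type LoadBalancer
  type: LoadBalancer
  port: 8080

config:
  # Must be the same as service.port
  port: 8080
```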

Ingress

The speech-to-text-whisper-enhanced service uses the gRPC protocol, which some ingress controllers can expose. For example, the nginx-ingress controller supports it. To expose the speech-to-text-whisper-enhanced service via ingress, use the following configuration:

ingress:
  # Deploy ingress object
  enabled: true
  # Ingress class name
  className: "nginx"
  annotations:
    # Force redirect to SSL
    nginx.ingress.kubernetes.io/ssl-redirect: "true"

    # Tell nginx that backend service use GRPC
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
  hosts:
    # Hostnames
    - host: speech-to-text-whisper-enhanced.example.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  # Use tls
  tls:
    # Secret containing TLS certificate
    - secretName: speech-to-text-whisper-enhanced-tls
      # TLS hostnames
      hosts:
        - speech-to-text-whisper-enhanced.example.com
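
The `tls` section refers to a Secret that holds the certificate. If nothing in the cluster creates that Secret for you, a manifest along the following lines could be applied in the release namespace. This is a sketch using the standard `kubernetes.io/tls` Secret type; the two data values are placeholders for your own base64-encoded certificate and private key.

```yaml
# Sketch only: TLS Secret referenced by ingress.tls[0].secretName.
apiVersion: v1
kind: Secret
metadata:
  name: speech-to-text-whisper-enhanced-tls
type: kubernetes.io/tls
data:
  # Placeholders: substitute your own base64-encoded PEM data
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded private key>
```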

Use grpcurl to check that everything works as expected. The output of the following command

$ grpcurl --insecure speech-to-text-whisper-enhanced.example.com:443 grpc.health.v1.Health/Check

should be

{
  "status": "SERVING"
}

Uninstalling the Chart

To uninstall/delete the my-release release:

helm delete my-release

The command removes all the Kubernetes components associated with the chart and deletes the release.