Speech to text whisper enhanced helm chart
Phonexia Speech To Text (STT) with Whisper
Maintainers
Name | Email | Url |
---|---|---|
Phonexia | support@phonexia.com | https://www.phonexia.com |
Helm: >= 3.2.0
Values
Key | Type | Default | Description |
---|---|---|---|
affinity | object | {} | |
config.device | string | "cpu" | Compute device used for inference. Can be cpu or cuda. If you use cuda, you must also use an image tag with GPU support. |
config.instancesPerDevice | int | 1 | Parallel tasks per device. GPU only. |
config.license.useSecret | bool | false | Get license from secret object |
config.license.value | string | "invalidLicenseKey" | License key |
config.listeningAddress | string | "[::]" | Override address where the server will listen |
config.logLevel | string | info | Override log level. Supported values: error, warning, info, debug, trace |
config.model.file | string | "" | Name of a model file inside the volume, for example "xl-5.0.0.model" |
config.model.volume | object | {} | Volume with Phonexia model |
config.port | int | 8080 | Port where the service will listen. The value must be the same as service.port. |
config.threadsPerInstance | int | 8 | Number of threads to use when running on CPU |
fullnameOverride | string | "" | String to fully override speech-to-text-whisper-enhanced.fullname template |
image.pullPolicy | string | "IfNotPresent" | Image pull policy |
image.registry | string | "registry.cloud.phonexia.com" | Image registry |
image.repository | string | "phonexia/dev/technologies/microservices/speech-to-text-whisper-enhanced" | Image repository |
image.tag | string | appVersion specified in Chart.yaml | See speech-to-text-whisper-enhanced on dockerhub for available tags |
imagePullSecrets | list | [] | Specify docker-registry secret names as an array |
ingress.annotations | object | {} | |
ingress.className | string | "" | |
ingress.enabled | bool | false | |
ingress.hosts[0] | object | {"host":"speech-to-text-whisper-enhanced.example.com","paths":[{"path":"/","pathType":"ImplementationSpecific"}]} | Default host for the ingress resource |
ingress.hosts[0].paths[0].pathType | string | "ImplementationSpecific" | Ingress path type |
ingress.tls | list | [] | |
initContainers | list | [] | Init containers. Evaluated as a template. |
nameOverride | string | "" | String to partially override speech-to-text-whisper-enhanced.fullname template (will maintain the release name) |
nodeSelector | object | {} | |
podAnnotations | object | {} | Annotations for pods |
podSecurityContext | object | {} | Security context for pods |
replicaCount | int | 1 | Number of replicas to deploy |
resources | object | {} | |
runtimeClassName | string | "" | Specify runtime class |
securityContext | object | {} | Security context for speech-to-text-whisper-enhanced container |
service.clusterIP | string | "" | |
service.port | int | 8080 | |
service.type | string | "ClusterIP" | |
serviceAccount.annotations | object | {} | Annotations to add to the service account |
serviceAccount.create | bool | true | Specifies whether a service account should be created |
serviceAccount.name | string | "" | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
tolerations | list | [] | |
updateStrategy | object | {"type":"RollingUpdate"} | Deployment update strategy |
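As an illustration of how the keys above combine, a values file for GPU inference might look like the following sketch. The image tag is a placeholder (check the available tags for one with GPU support), and the `nvidia.com/gpu` resource name and the `nvidia` runtime class are assumptions that hold only if the cluster runs the NVIDIA device plugin and defines such a runtime class; neither is prescribed by this chart.

```yaml
config:
  # Run inference on GPU; requires an image tag with GPU support
  device: cuda
  # Parallel tasks per device (GPU only)
  instancesPerDevice: 1
  # Must match service.port
  port: 8080

image:
  # Placeholder, not a real tag: substitute a GPU-enabled tag
  tag: "<gpu-enabled-tag>"

service:
  port: 8080

# Assumption: the NVIDIA device plugin is installed in the cluster
resources:
  limits:
    nvidia.com/gpu: 1

# Assumption: a RuntimeClass named "nvidia" exists; omit otherwise
runtimeClassName: "nvidia"
```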
Installation
To install the chart successfully, you first have to obtain a model. The service is unable to start without a model. Feel free to contact Phonexia support to obtain a model for evaluation purposes.
Model
There are two ways to pass a model to pods:
- Pass the model via initContainer
- Pass the model via volume
Pass the model via initContainer
With this approach, no persistent volume is needed. An init container is added to the pod instead. It downloads the model from the specified location to an ephemeral volume, which is shared with the main container. This happens each time the pod is re-deployed.
In the values file it looks like this:
# Set config.model.volume to emptyDir
config:
  model:
    volume:
      emptyDir: {}
    file: "speech_to_text_whisper_enhanced-medium-1.0.0.model"
initContainers:
  - name: init-copy-model
    image: alpine
    command:
      - sh
      - -c
      - |
        set -e
        # Install aws-cli package
        apk add --no-cache aws-cli
        # Create directory for models
        mkdir -p /models
        # Download model from s3 and store it to volume
        aws s3 cp s3://some-bucket/some-path-to-model/speech_to_text_whisper_enhanced-medium-1.0.0.model ${PHX_MODEL_PATH}
    env:
      # PHX_MODEL_PATH variable must be same as in main container
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      # Set AWS_* variables to make aws cli work
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      - name: "AWS_ACCESS_KEY_ID"
        value: "AKAI...CN"
      - name: "AWS_SECRET_ACCESS_KEY"
        value: "0lW...Yw"
    # Mount empty volume to initContainer
    volumeMounts:
      - name: '{{ include "speech-to-text-whisper-enhanced.fullname" . }}-models-volume'
        mountPath: /models
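The example above puts the AWS credentials directly into the values file. A possible variation, sketched here, reads them from a Kubernetes Secret through the standard `secretKeyRef` field of a container's `env` entries. The Secret name `aws-credentials` and its key names are hypothetical and would have to be created separately; only the `env` part of the init container changes.

```yaml
initContainers:
  - name: init-copy-model
    image: alpine
    # command and volumeMounts stay the same as in the example above
    env:
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      # Hypothetical Secret "aws-credentials" holding both keys
      - name: "AWS_ACCESS_KEY_ID"
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: access-key-id
      - name: "AWS_SECRET_ACCESS_KEY"
        valueFrom:
          secretKeyRef:
            name: aws-credentials
            key: secret-access-key
```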
Pass the model via volume
With this approach, you need to create a persistent volume, copy the model there, and mount it to the pod.
The following example shows how to do it in EKS with EBS-based dynamic provisioning.
- Create persistentVolumeClaim
# filename: speech-to-text-whisper-enhanced.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: speech-to-text-whisper-enhanced
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ebs-sc
and apply it
kubectl apply -f speech-to-text-whisper-enhanced.yaml
- Create a job which downloads the model to the persistent volume:
# filename: job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: speech-to-text-whisper-enhanced-download-model
spec:
  template:
    spec:
      containers:
        - name: download-model
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e
              # Install aws-cli package
              apk add --no-cache aws-cli
              # Create directory for models
              mkdir -p /models
              # Download model from s3 and store it to volume
              aws s3 cp s3://some-bucket/some-path-to-model/speech_to_text_whisper_enhanced-medium-1.0.0.model ${PHX_MODEL_PATH}
          env:
            # PHX_MODEL_PATH variable must be same as .Values.config.model.file in values files
            - name: "PHX_MODEL_PATH"
              value: "/models/speech_to_text_whisper_enhanced-medium-1.0.0.model"
            # Set AWS_* variables to make aws cli work
            - name: "AWS_DEFAULT_REGION"
              value: "us-east-1"
            - name: "AWS_ACCESS_KEY_ID"
              value: "AKAI...CN"
            - name: "AWS_SECRET_ACCESS_KEY"
              value: "0lW...Yw"
          volumeMounts:
            - name: persistent-storage
              mountPath: /models
      volumes:
        - name: persistent-storage
          persistentVolumeClaim:
            claimName: speech-to-text-whisper-enhanced
      restartPolicy: Never
  backoffLimit: 3
Apply it and wait until the job has finished:
kubectl apply -f job.yaml
- Configure values file to use existing PVC:
config:
  model:
    # Volume with Phonexia model
    volume:
      persistentVolumeClaim:
        claimName: speech-to-text-whisper-enhanced
    # Name of a model file inside the volume, for example "xl-5.0.0.model"
    file: "speech_to_text_whisper_enhanced-medium-1.0.0.model"
Installing the Chart
Once you have configured the model, you can proceed with the installation itself. To install the chart with the release name my-release:
helm install my-release oci://registry-1.docker.io/phonexia/speech-to-text-whisper-enhanced
This command deploys speech-to-text-whisper-enhanced on the Kubernetes cluster in the default configuration.
Use the --version parameter to install a specific version:
helm install my-release oci://registry-1.docker.io/phonexia/speech-to-text-whisper-enhanced --version 1.0.0
Exposing the service
To expose the service outside of the Kubernetes cluster, follow the Kubernetes tutorial "Using a Service to Expose Your App".
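For instance, one way to expose the service through the chart's own values is to change the service type. The sketch below assumes the cluster can provision external load balancers (typical for managed cloud clusters); `NodePort` would be the alternative otherwise.

```yaml
service:
  # Default is ClusterIP, which is reachable only inside the cluster
  type: LoadBalancer
  # Keep in sync with config.port
  port: 8080
```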
Ingress
The speech-to-text-whisper-enhanced service uses the gRPC protocol, which can be exposed by some ingress controllers. For example, the nginx-ingress controller supports this. To expose the speech-to-text-whisper-enhanced service via ingress, use the following configuration:
ingress:
  # Deploy ingress object
  enabled: true
  # Ingress class name
  className: "nginx"
  annotations:
    # Force redirect to SSL
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Tell nginx that backend service use GRPC
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
  hosts:
    # Hostnames
    - host: speech-to-text-whisper-enhanced.example.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  # Use tls
  tls:
    # Secret containing TLS certificate
    - secretName: speech-to-text-whisper-enhanced-tls
      # TLS hostnames
      hosts:
        - speech-to-text-whisper-enhanced.example.com
Use grpcurl to check if everything works as expected. Output of the following command
$ grpcurl --insecure speech-to-text-whisper-enhanced.example.com:443 grpc.health.v1.Health/Check
should be
{
  "status": "SERVING"
}
Uninstalling the Chart
To uninstall/delete the my-release release:
helm delete my-release
The command removes all the Kubernetes components associated with the chart and deletes the release.