Enhanced Speech To Text Built On Whisper helm chart
Phonexia Speech To Text (STT) built on Whisper
Maintainers
Name | Email | Url |
---|---|---|
Phonexia | support@phonexia.com | https://www.phonexia.com |
Helm: >= 3.2.0
Values
Key | Type | Default | Description |
---|---|---|---|
affinity | object | {} | Affinity for pod assignment |
annotations | object | {} | |
config.device | string | "cpu" | Compute device used for inference. Can be cpu or cuda. If you use cuda, you must also use an image tag with GPU support. |
config.instancesPerDevice | int | 1 | Parallel tasks per device. GPU only. |
config.keepAliveTime | int | 60 | Time between 2 consecutive keep-alive messages, that are sent if there is no activity from the client. |
config.keepAliveTimeout | int | 20 | Time to wait for keep alive acknowledgement until the connection is dropped by the server. |
config.license.useSecret | bool | false | Get license from secret object |
config.license.value | string | "invalidLicenseKey" | License key |
config.listeningAddress | string | "[::]" | Override address where the server will listen |
config.logLevel | string | "info" | Override log level. Supported values: error, warning, info, debug, trace. |
config.model.file | string | "" | Name of a model file inside the volume, for example "large_v2-1.0.0.model" |
config.model.volume | object | {} | Volume with Phonexia model |
config.port | int | 8080 | Port where the service will listen. The value must be the same as service.port. |
config.threadsPerInstance | int | 8 | Number of threads to use when running on CPU |
extraEnvVars | list | [] | |
fullnameOverride | string | "" | String to fully override enhanced-speech-to-text-built-on-whisper.fullname template |
image.pullPolicy | string | "IfNotPresent" | Image pull policy |
image.registry | string | "registry.cloud.phonexia.com" | Image registry |
image.repository | string | "phonexia/dev/technologies/microservices/enhanced-speech-to-text-built-on-whisper" | Image repository |
image.tag | string | appVersion specified in Chart.yaml | See enhanced-speech-to-text-built-on-whisper on dockerhub for available tags |
imagePullSecrets | list | [] | Specify docker-registry secret names as an array |
ingress.annotations | object | {} | |
ingress.className | string | "" | |
ingress.enabled | bool | false | |
ingress.hosts[0] | object | {"host":"enhanced-speech-to-text-built-on-whisper.example.com","paths":[{"path":"/","pathType":"ImplementationSpecific"}]} | Default host for the ingress resource |
ingress.hosts[0].paths[0].pathType | string | "ImplementationSpecific" | Ingress path type |
ingress.tls | list | [] | |
initContainers | list | [] | Init containers. Evaluated as a template. |
livenessProbe | object | {"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1} | Liveness probe settings |
nameOverride | string | "" | String to partially override enhanced-speech-to-text-built-on-whisper.fullname template (will maintain the release name) |
nodeSelector | object | {} | Node labels for pod assignment. |
podAnnotations | object | {} | Annotations for pods |
podSecurityContext | object | {} | Security context for pods |
readinessProbe | object | {"failureThreshold":3,"initialDelaySeconds":0,"periodSeconds":10,"successThreshold":1,"timeoutSeconds":1} | Readiness probe settings |
replicaCount | int | 1 | Number of replicas to deploy |
resources | object | {} | The resources limits/requests for the enhanced-speech-to-text-built-on-whisper container |
runtimeClassName | string | "" | Specify runtime class |
securityContext | object | {} | Security context for enhanced-speech-to-text-built-on-whisper container |
service.clusterIP | string | "" | Use None to create headless service |
service.port | int | 8080 | Service port. The port must be the same as config.port. |
service.type | string | "ClusterIP" | Service type |
serviceAccount.annotations | object | {} | Annotations to add to the service account |
serviceAccount.create | bool | true | Specifies whether a service account should be created |
serviceAccount.name | string | "" | The name of the service account to use. If not set and create is true, a name is generated using the fullname template |
tolerations | list | [] | Tolerations for pod assignment. |
updateStrategy | object | {"type":"RollingUpdate"} | Deployment update strategy |
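Several of the values above depend on each other: config.device set to cuda needs a GPU-enabled image tag, config.instancesPerDevice only applies on GPU, and config.port must match service.port. The following is a minimal sketch of a GPU-oriented values file, not a tested configuration. The file name, the instance count, the image tag placeholder, and the nvidia.com/gpu resource (which assumes the NVIDIA device plugin is installed in the cluster) are assumptions, not taken from the chart.

```yaml
# values-gpu.yaml (hypothetical file name)
config:
  # Run inference on GPU; requires an image tag built with GPU support
  device: "cuda"
  # Parallel tasks per device (GPU only); 2 is an illustrative value
  instancesPerDevice: 2
  # Must be the same as service.port
  port: 8080

service:
  # Must be the same as config.port
  port: 8080

image:
  # Placeholder, not a real tag: substitute a GPU-enabled tag available to you
  tag: "<gpu-enabled-tag>"

# Assumption: the cluster runs the NVIDIA device plugin, which exposes
# GPUs to pods as the nvidia.com/gpu resource
resources:
  limits:
    nvidia.com/gpu: 1
```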
Installation
To install the chart successfully, you must first obtain a model. The service cannot start without one. Contact Phonexia support to obtain a model for evaluation purposes.
Model
There are two ways to pass a model to pods:
- Pass the model via initContainer
- Pass the model via volume
Pass the model via initContainer
With this approach no persistent volume is needed. Instead, an init container is added to the pod. It downloads the model from the specified location to an ephemeral volume that is shared with the main container. This happens every time the pod is redeployed.
In the values file it looks like this:
# Set config.model.volume to emptyDir
config:
  model:
    volume:
      emptyDir: {}
    file: "enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model"
initContainers:
  - name: init-copy-model
    image: alpine
    command:
      - sh
      - -c
      - |
        set -e
        # Install aws-cli package
        apk add --no-cache aws-cli
        # Create directory for models
        mkdir -p /models
        # Download model from s3 and store it to volume
        aws s3 cp s3://some-bucket/some-path-to-model/enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model ${PHX_MODEL_PATH}
    env:
      # PHX_MODEL_PATH variable must be same as in main container
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      # Set AWS_* variables to make aws cli work
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      - name: "AWS_ACCESS_KEY_ID"
        value: "AKAI...CN"
      - name: "AWS_SECRET_ACCESS_KEY"
        value: "0lW...Yw"
    # Mount empty volume to initContainer
    volumeMounts:
      - name: '{{ include "enhanced-speech-to-text-built-on-whisper.fullname" . }}-models-volume'
        mountPath: /models
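The example above places the AWS credentials directly in the values file. One possible alternative is to read them from a Kubernetes Secret using the standard secretKeyRef mechanism. This is a sketch only: it assumes that a Secret already exists in the release namespace and that the chart renders the env entries of initContainers into the pod spec unchanged. The Secret name aws-model-credentials and its keys are hypothetical.

```yaml
initContainers:
  - name: init-copy-model
    image: alpine
    # command and volumeMounts stay the same as in the example above
    env:
      - name: "PHX_MODEL_PATH"
        value: "/models/{{ .Values.config.model.file }}"
      - name: "AWS_DEFAULT_REGION"
        value: "us-east-1"
      - name: "AWS_ACCESS_KEY_ID"
        valueFrom:
          secretKeyRef:
            # Hypothetical Secret name and key
            name: aws-model-credentials
            key: access-key-id
      - name: "AWS_SECRET_ACCESS_KEY"
        valueFrom:
          secretKeyRef:
            # Hypothetical Secret name and key
            name: aws-model-credentials
            key: secret-access-key
```

Keeping the credentials in a Secret means the values file can be committed or shared without exposing them.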
Pass the model via volume
With this approach you need to create a persistent volume, copy the model to it, and mount it to the pod.
The following example shows how to do this in EKS with EBS-based dynamic provisioning.
- Create a PersistentVolumeClaim:
# filename: enhanced-speech-to-text-built-on-whisper.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: enhanced-speech-to-text-built-on-whisper
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: ebs-sc
and apply it:
kubectl apply -f enhanced-speech-to-text-built-on-whisper.yaml
- Create a Job that downloads the model to the persistent volume:
# filename: job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: enhanced-speech-to-text-built-on-whisper-download-model
spec:
  template:
    spec:
      containers:
        - name: download-model
          image: alpine
          command:
            - sh
            - -c
            - |
              set -e
              # Install aws-cli package
              apk add --no-cache aws-cli
              # Create directory for models
              mkdir -p /models
              # Download model from s3 and store it to volume
              aws s3 cp s3://some-bucket/some-path-to-model/enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model ${PHX_MODEL_PATH}
          env:
            # PHX_MODEL_PATH variable must be same as .Values.config.model.file in values files
            - name: "PHX_MODEL_PATH"
              value: "/models/enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model"
            # Set AWS_* variables to make aws cli work
            - name: "AWS_DEFAULT_REGION"
              value: "us-east-1"
            - name: "AWS_ACCESS_KEY_ID"
              value: "AKAI...CN"
            - name: "AWS_SECRET_ACCESS_KEY"
              value: "0lW...Yw"
          volumeMounts:
            - name: persistent-storage
              mountPath: /models
      volumes:
        - name: persistent-storage
          persistentVolumeClaim:
            claimName: enhanced-speech-to-text-built-on-whisper
      restartPolicy: Never
  backoffLimit: 3
Apply it and wait until the job has finished:
kubectl apply -f job.yaml
- Configure the values file to use the existing PVC:
config:
  model:
    # Volume with Phonexia model
    volume:
      persistentVolumeClaim:
        claimName: enhanced-speech-to-text-built-on-whisper
    # Name of a model file inside the volume, for example "xl-5.0.0.model"
    file: "enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model"
Installing the Chart
Once you have configured the model, you can proceed with the installation itself. To install the chart with the release name my-release:
helm install my-release oci://registry-1.docker.io/phonexia/enhanced-speech-to-text-built-on-whisper
This command deploys enhanced-speech-to-text-built-on-whisper on the Kubernetes cluster in the default configuration.
Use the --version parameter to install a specific version:
helm install my-release oci://registry-1.docker.io/phonexia/enhanced-speech-to-text-built-on-whisper --version 1.0.0-helm
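The default configuration contains a placeholder license key and no model, so a working installation usually needs its own values file. The following is a minimal sketch, assuming the PVC-based model setup described in the Model section. The file name my-values.yaml and the license placeholder are illustrative and must be replaced with your own values.

```yaml
# my-values.yaml (hypothetical file name)
config:
  license:
    # Placeholder: replace with the license key you received
    value: "<your-license-key>"
  model:
    # Volume with Phonexia model (PVC created in the Model section)
    volume:
      persistentVolumeClaim:
        claimName: enhanced-speech-to-text-built-on-whisper
    # Name of the model file inside the volume
    file: "enhanced_speech_to_text_built_on_whisper-medium-1.0.0.model"
```

Such a file can be passed to Helm with the -f (--values) flag, for example by appending -f my-values.yaml to the helm install command.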
Exposing the service
To expose the service outside of the Kubernetes cluster, follow Using a Service to Expose Your App.
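One way to do this through the chart itself is to change service.type, which defaults to ClusterIP. This is a sketch under two assumptions: that the cluster can provision LoadBalancer services (typical for managed cloud clusters), and that the chart passes service.type straight through to the Service object.

```yaml
service:
  # Assumption: the cluster supports Services of type LoadBalancer
  type: LoadBalancer
  # Must be the same as config.port
  port: 8080
```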
Ingress
The enhanced-speech-to-text-built-on-whisper service uses the gRPC protocol, which some ingress controllers can expose. For example, the nginx-ingress controller supports it. To expose the enhanced-speech-to-text-built-on-whisper service via ingress, use the following configuration:
ingress:
  # Deploy ingress object
  enabled: true
  # Ingress class name
  className: "nginx"
  annotations:
    # Force redirect to SSL
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Tell nginx that backend service use GRPC
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
  hosts:
    # Hostnames
    - host: enhanced-speech-to-text-built-on-whisper.example.com
      paths:
        - path: /
          pathType: ImplementationSpecific
  # Use tls
  tls:
    # Secret containing TLS certificate
    - secretName: enhanced-speech-to-text-built-on-whisper-tls
      # TLS hostnames
      hosts:
        - enhanced-speech-to-text-built-on-whisper.example.com
Use grpcurl to check that everything works as expected. The output of the following command
$ grpcurl --insecure enhanced-speech-to-text-built-on-whisper.example.com:443 grpc.health.v1.Health/Check
should be
{
  "status": "SERVING"
}
Uninstalling the Chart
To uninstall/delete the my-release release:
helm delete my-release
The command removes all the Kubernetes components associated with the chart and deletes the release.