Version: 4.0.2

Components

The table below briefly describes the components that make up the virtual appliance. Some of them are described in more detail after the table.

Component	Overview
Operating system	The appliance runs Rocky Linux 9.5.
GPU support	The virtual appliance ships with all prerequisites for running GPU-powered workloads (especially enhanced-speech-to-text-built-on-whisper). NVIDIA drivers and the NVIDIA container toolkit are already installed. Time-based GPU sharing is enabled by default, allowing multiple technologies to run on a single GPU simultaneously. GPU-powered images of all technologies are included.
Kubernetes	The k3s Kubernetes distribution is deployed inside the appliance.
Container registry	The registry stores all necessary images, so no images need to be pulled from the internet.
Ingress controller	The ingress-nginx ingress controller serves as a reverse proxy and load balancer.
Speech platform	The application for solving various voice-related problems such as speaker identification and speech-to-text transcription. It is accessible via a web browser or the API.
Admin console	The admin console is a simple web page with links to admin-related tools: File Browser, Prometheus, and Grafana. It is located at `http://<IP_of_virtual_appliance>/admin`.
File Browser	File Browser is a web-based file browser and editor for working with data on the data disk. It is accessible at `<IP_address_of_VA>/filebrowser`.
Prometheus	Prometheus is a tool for providing monitoring information about Kubernetes components. It is accessible at `<IP_address_of_VA>/prometheus`.
Grafana	Grafana is a tool for visualization of Prometheus metrics. It is accessible at `<IP_address_of_VA>/grafana`.

Each component is pre-configured and ready to use out of the box.
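
The admin tool addresses listed in the table can be collected in one place. The following is a minimal sketch, not part of the appliance: the function name `admin_urls` is hypothetical, the `http` scheme is assumed for all tools (the table states it only for the admin console), and `192.0.2.10` is a placeholder documentation address.

```python
# Paths taken from the component table; everything else is illustrative.
ADMIN_PATHS = {
    "admin": "/admin",
    "filebrowser": "/filebrowser",
    "prometheus": "/prometheus",
    "grafana": "/grafana",
}


def admin_urls(appliance_ip: str, scheme: str = "http") -> dict:
    """Return the URL of each admin tool for the given appliance address."""
    base = f"{scheme}://{appliance_ip}"
    return {name: base + path for name, path in ADMIN_PATHS.items()}


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; use the IP address of your appliance.
    for name, url in admin_urls("192.0.2.10").items():
        print(f"{name}: {url}")
```

Replace the placeholder address with the IP address of your virtual appliance.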

Further Details on Selected Components

Grafana

Grafana is a tool for visualizing application and Kubernetes metrics. The most useful dashboards available in Grafana are:

Envoy Clusters - See Envoy cluster statistics
Kubernetes / Compute Resources / Pod - See resource consumption of individual pods
NGINX Ingress controller - See ingress controller stats
NVIDIA DCGM Exporter Dashboard - See GPU device stats
Node Exporter / Nodes - See stats about the virtual appliance
Speech Platform API capacity - See metrics about the Speech Platform itself

Speech Platform

Speech Platform consists of the following components:

frontend - simple web server serving static HTML, CSS, JavaScript, and image files
docs - simple web server serving documentation
assets - simple web server hosting examples
api - Python component providing the REST API
envoy - router and load balancer for gRPC messages
media-conversion - Python component that converts audio files from various formats to WAV format and splits multi-channel audio into multiple single-channel files
technology microservices
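
To illustrate the channel splitting that media-conversion performs, here is a minimal sketch using only the Python standard library. It is not the component's actual implementation: the function name `split_channels` and the output file naming are hypothetical, and the sketch assumes the input is already an uncompressed PCM WAV file.

```python
import wave


def split_channels(src_path: str, dst_prefix: str) -> list:
    """Write one mono WAV file per channel of src_path; return the new paths."""
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    frame_size = n_channels * sample_width
    paths = []
    for ch in range(n_channels):
        # Interleaved PCM: each frame holds one sample per channel, in order.
        offset = ch * sample_width
        mono = b"".join(
            frames[i + offset : i + offset + sample_width]
            for i in range(0, len(frames), frame_size)
        )
        path = f"{dst_prefix}_ch{ch}.wav"  # hypothetical naming scheme
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(sample_width)
            dst.setframerate(frame_rate)
            dst.writeframes(mono)
        paths.append(path)
    return paths
```

A stereo recording passed to this function would yield two mono files, one per channel, each keeping the original sample rate and sample width.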

Request flow

Graphical representation

Step-by-Step representation

The user sends a POST request (for example, to transcribe speech to text) to the API.
The API creates a processing task and returns the task ID to the user.
From this point on, the user can poll the task to get the result.
The API calls media-conversion via envoy.
Media-conversion converts the audio file to WAV format and, if needed, splits it into multiple single-channel files.
The API receives the converted audio file from media-conversion.
The API calls enhanced-speech-to-text-built-on-whisper via envoy.
Enhanced-speech-to-text-built-on-whisper transcribes the audio file.
The API receives the transcription.
The user can retrieve the task result.
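
From the user's side, the flow above is a submit-then-poll pattern. The following sketch shows that pattern in isolation, under stated assumptions: the function `run_task`, the response fields `task_id` and `state`, and the state names `pending` and `running` are all hypothetical and may differ from the real API. The `submit` and `fetch` callables stand in for the HTTP calls to the API, so the sketch runs without a network.

```python
import time
from typing import Callable


def run_task(
    submit: Callable[[], dict],
    fetch: Callable[[str], dict],
    interval_s: float = 1.0,
    max_polls: int = 60,
) -> dict:
    """Submit a task, then poll it until it is no longer in progress."""
    task_id = submit()["task_id"]  # hypothetical response field
    for _ in range(max_polls):
        task = fetch(task_id)
        # Hypothetical state names; any other state is treated as final.
        if task["state"] not in ("pending", "running"):
            return task
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # In-memory stand-ins for the API: the task finishes on the third poll.
    polls = {"count": 0}

    def fake_submit() -> dict:
        return {"task_id": "example-task"}

    def fake_fetch(task_id: str) -> dict:
        polls["count"] += 1
        state = "done" if polls["count"] >= 3 else "running"
        return {"task_id": task_id, "state": state}

    print(run_task(fake_submit, fake_fetch, interval_s=0.01))
```

In a real client, `submit` would wrap the POST request and `fetch` would wrap the request that reads the task by its ID. Bounding the number of polls keeps the client from waiting forever if a task never completes.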
