Version: 3.7.0

Upgrade Guide

This section describes the manual steps which need to be done prior to upgrading. There are various changes in the configuration which must be reflected before the upgrade. We suggest always using the configuration file bundled with the new version of the virtual appliance and updating it to suit your needs (insert licenses, enable/disable services, set replicas, ...). If you are not willing to do this, you must modify your current configuration file to work with the new version of the virtual appliance.

This section describes how to perform an upgrade of the virtual appliance.

Upgrade and retain data disk

This upgrade approach retains all the data and configuration stored on the data disk.

Pros:

  • No need to configure virtual appliance from scratch
  • Prometheus metrics are kept

Cons:

  • You have to do version-specific upgrade steps
  1. Import new version of virtual appliance (version X+1) into your virtualization platform
  2. Stop current version of virtual appliance (version X)
  3. Detach data disk from current version of virtual appliance (version X)
  4. Attach data disk to new version of virtual appliance (version X+1)
  5. Start new version of virtual appliance (version X+1)
  6. Delete old version of virtual appliance (version X)
  7. Follow version-specific upgrade steps

Upgrade and discard data disk

This upgrade approach discards the current data disk and uses a new one.

Pros:

  • Easier upgrade procedure
  • No version-specific upgrade steps
  • No accumulated disarray on the data disk

Cons:

  • You have to configure the virtual appliance from scratch:
    • Enable needed microservices
    • Insert license keys
    • Insert models
  1. Import new version of virtual appliance (version X+1) into your virtualization platform
  2. Stop current version of virtual appliance (version X)
  3. Start new version of virtual appliance (version X+1)
  4. Delete old version of virtual appliance (version X)
  5. Configure virtual appliance from scratch

Upgrade guides

Upgrade to 3.7.0

This section describes the manual steps which need to be done prior to upgrading to 3.7.0.

Add configuration for Gender Identification and Emotion Recognition

Gender Identification and Emotion Recognition were added in this release. Therefore they must be configured properly.

Add UI limits for Gender Identification and Emotion Recognition

Configuration section for Gender Identification and Emotion Recognition must be added.

Step by step upgrade guide to 3.7.0

This section describes how to upgrade the Virtual Appliance from 3.6.0 to 3.7.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Append the following content to .spec.valuesContent.frontend.config.limits:

    # Limits for Gender Identification
    genderIdentification:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300

    # Limits for Emotion Recognition
    emotionRecognition:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300
  3. Put the following content at the end of the file:

    ##########################################
    # Gender Identification sub-chart config #
    ##########################################
    gender-identification:
      enabled: false
      replicaCount: 1

      # Extra environment variables
      extraEnvVars: []

      config:
        # -- Number of threads to use when running on CPU
        threadsPerInstance: 1
        # -- Parallel tasks per device. GPU only.
        instancesPerDevice: 1
        # -- Index of device to use. GPU only.
        #deviceIndex: 0

        # Uncomment this to force gender-identification to run on GPU
        #device: cuda

        # Set logging level
        logLevel: info

        model:
          # Name of a model file, for example "xl-5.1.0.model"
          file: "xl-5.1.0.model"
        license:
          key: "xl-5.1.0"

      # Uncomment this to grant access to GPU for gender-identification pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run gender-identification on GPU
      #runtimeClassName: "nvidia"

      #updateStrategy:
      #  type: Recreate

    ########################################
    # Emotion Recognition sub-chart config #
    ########################################
    emotion-recognition:
      enabled: false
      replicaCount: 1

      # Extra environment variables
      extraEnvVars: []

      config:
        # -- Number of threads to use when running on CPU
        threadsPerInstance: 1
        # -- Parallel tasks per device. GPU only.
        instancesPerDevice: 1
        # -- Index of device to use. GPU only.
        #deviceIndex: 0

        # Uncomment this to force emotion-recognition to run on GPU
        #device: cuda

        # Set logging level
        logLevel: info

        model:
          # Name of a model file inside the volume, for example "generic-1.1.0.model"
          file: "generic-1.1.0.model"
        license:
          key: "generic-1.1.0"

      # Uncomment this to grant access to GPU for emotion-recognition pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run emotion-recognition on GPU
      #runtimeClassName: "nvidia"

      #updateStrategy:
      #  type: Recreate
  4. Save the file.

  5. The application automatically recognizes when the file is updated and redeploys itself with the updated configuration.

  6. Check that the configuration is valid and successfully applied (Step 5 of Installation Guide).
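A quick way to catch copy-paste slips after an edit like this is to check that each newly added sub-chart section appears exactly once. A minimal sketch on a scratch file (illustrative contents and path; on the appliance the keys sit indented under .spec.valuesContent in /data/speech-platform/speech-platform-values.yaml):

```shell
# Scratch stand-in for the values file; contents are illustrative only.
f=/tmp/speech-platform-values-demo.yaml
cat > "$f" <<'EOF'
gender-identification:
  enabled: false
emotion-recognition:
  enabled: false
EOF
# Each section should occur exactly once; a count != 1 signals a paste error.
for section in gender-identification emotion-recognition; do
  count=$(grep -cE "^[[:space:]]*${section}:" "$f")
  echo "${section}: ${count}"
done
```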

Upgrade to 3.6.0

This section describes the manual steps which need to be done prior to upgrading to 3.6.0.

Note: In this release we have significantly simplified the configuration file (speech-platform-values.yaml). We strongly suggest configuring the Virtual Appliance from scratch. If you are unable (or unwilling) to do so, you can proceed with the upgrade guide as usual.

Add configuration for Deepfake Detection

Deepfake detection technology was added in this release. Therefore it must be configured properly.

Add UI limits for Authenticity Verification

Configuration section for Authenticity Verification must be added.

Note: Authenticity Verification is the UI/frontend name for possibly multiple technologies. Currently there is only one: Deepfake Detection.

Step by step upgrade guide to 3.6.0

This section describes how to upgrade the Virtual Appliance from 3.5.0 to 3.6.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Append the following content to .spec.valuesContent.frontend.config.limits:

    # Limits for Authenticity Verification
    authenticityVerfication:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300
  3. Put the following content at the end of the file:

    # Deepfake Detection subchart config
    deepfake-detection:
      enabled: false
      replicaCount: 1

      # Extra environment variables
      extraEnvVars: []

      config:
        # -- Number of threads to use when running on CPU
        threadsPerInstance: 1
        # -- Parallel tasks per device. GPU only.
        instancesPerDevice: 1
        # -- Index of device to use. GPU only.
        #deviceIndex: 0

        # Uncomment this to force deepfake-detection to run on GPU
        #device: cuda

        # Set logging level
        logLevel: info

        model:
          # Name of a model file, for example "beta-1.0.0.model"
          file: "beta-1.0.0.model"
        license:
          key: "beta-1.0.0"

      # Uncomment this to grant access to GPU for deepfake-detection pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run deepfake-detection on GPU
      #runtimeClassName: "nvidia"

      #updateStrategy:
      #  type: Recreate
  4. Save the file.

  5. The application automatically recognizes when the file is updated and redeploys itself with the updated configuration.

  6. Check that the configuration is valid and successfully applied (Step 5 of Installation Guide).

Upgrade to 3.5.0

This section describes the manual steps which need to be done prior to upgrading to 3.5.0.

Add configuration for UI limits

UI limits can be configured for Language Identification, Speech Translation, Voice Activity Detection, Speaker Diarization and Audio Quality Estimation.

Use latest models

The microservices used in this version are not compatible with the previous versions of the models. Therefore, the microservices must be reconfigured to use the latest models.

Step by step upgrade guide to 3.5.0

This section describes how to upgrade the virtual appliance from 3.4.0 to 3.5.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Append the following content to .spec.valuesContent.frontend.config.limits:

    # Limits for language identification
    languageIdentification:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300

    # Limits for speech translation
    speechTranslation:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300

    # Limits for speaker diarization
    speakerDiarization:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300

    # Limits for voice activity detection
    voiceActivityDetection:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300

    # Limits for audio quality estimation
    audioQualityEstimation:
      # Maximum filesize for upload in bytes
      maxFileSize: 5000000
      # Maximum number of files which can be processed
      maxFilesCount: 100
      # Maximum duration the voice recorder is able to record, in seconds
      maxVoiceRecorderDuration: 300
  3. Change .spec.valuesContent.voiceprint-extraction.config.model.file to xl-5.2.0.model and .spec.valuesContent.voiceprint-extraction.config.license.key to xl-5.2.0:

    voiceprint-extraction:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "xl-5.2.0.model"
        license:
          <Not significant lines omitted>
          key: "xl-5.2.0"
  4. Change .spec.valuesContent.voiceprint-comparison.config.model.file to xl-5.2.0.model and .spec.valuesContent.voiceprint-comparison.config.license.key to xl-5.2.0:

    voiceprint-comparison:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "xl-5.2.0.model"
        license:
          <Not significant lines omitted>
          key: "xl-5.2.0"
  5. Change .spec.valuesContent.voice-activity-detection.config.model.file to generic-3.1.0.model and .spec.valuesContent.voice-activity-detection.config.license.key to generic-3.1.0:

    voice-activity-detection:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "generic-3.1.0.model"
        license:
          <Not significant lines omitted>
          key: "generic-3.1.0"
  6. Change .spec.valuesContent.language-identification.config.model.file to xl-5.3.0.model and .spec.valuesContent.language-identification.config.license.key to xl-5.3.0:

    language-identification:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "xl-5.3.0.model"
        license:
          <Not significant lines omitted>
          key: "xl-5.3.0"
  7. Change .spec.valuesContent.speaker-diarization.config.model.file to xl-5.1.0.model and .spec.valuesContent.speaker-diarization.config.license.key to xl-5.1.0:

    speaker-diarization:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "xl-5.1.0.model"
        license:
          <Not significant lines omitted>
          key: "xl-5.1.0"
  8. Change .spec.valuesContent.enhanced-speech-to-text-built-on-whisper.config.model.file to large_v2-1.1.0.model and .spec.valuesContent.enhanced-speech-to-text-built-on-whisper.config.license.key to large_v2-1.1.0:

    enhanced-speech-to-text-built-on-whisper:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        model:
          <Not significant lines omitted>
          file: "large_v2-1.1.0.model"
        license:
          <Not significant lines omitted>
          key: "large_v2-1.1.0"
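The repeated model file / license key updates above follow the same pattern, so they can also be applied with a stream editor. A minimal sketch using sed on a scratch copy (path, section, and the assumed previous version xl-5.1.0 are illustrative; the real file is /data/speech-platform/speech-platform-values.yaml):

```shell
# Scratch copy holding one of the sections to be updated.
f=/tmp/model-bump-demo.yaml
cat > "$f" <<'EOF'
voiceprint-extraction:
  config:
    model:
      file: "xl-5.1.0.model"
    license:
      key: "xl-5.1.0"
EOF
# Replace the old version string everywhere in the scratch file; on the
# real file, restrict the replacement to the intended section.
sed -i 's/xl-5\.1\.0/xl-5.2.0/g' "$f"
cat "$f"
```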
Upgrade to 3.4.0

This section describes the manual steps which need to be done prior to upgrading to 3.4.0.

Add configuration for Audio Quality Estimation

Audio Quality Estimation is being added in this release. Therefore it must be configured properly.

Add configuration for Voice Activity Detection

Voice Activity Detection is being added in this release. Therefore it must be configured properly.

Step by step upgrade guide to 3.4.0

This section describes how to upgrade the virtual appliance from 3.3.0 to 3.4.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Put the following content at the end of the file:

    # Audio Quality Estimation sub-chart
    audio-quality-estimation:
      enabled: false
      parallelism: 1

      grpcAdapter:
        image:
          registry: airgapped.phonexia.com
        config:
          license:
            useSecret: true
            secret: audio-quality-estimation-license
            key: grpc-adapter-license

      image:
        registry: airgapped.phonexia.com

      # Set defaults for onDemand instances
      onDemand:
        trigger:
          activationThreshold: "0.9"
          query: |
            '
            service_running_tasks{
              namespace="{{ $.Release.Namespace }}",
              exported_service="time_analysis"
            }
            +
            service_waiting_tasks{
              namespace="{{ $.Release.Namespace }}",
              exported_service="time_analysis"
            }
            '

      config:
        license:
          useSecret: true
          secret: audio-quality-estimation-license
          key: license

        instances:
          - name: sqe
            imageTag: 3.62.0
            onDemand:
              enabled: true

      annotations:
        secret.reloader.stakater.com/reload: "audio-quality-estimation-license"

      service:
        clusterIP: "None"

    # Voice Activity Detection subchart config
    voice-activity-detection:
      enabled: true
      replicaCount: 1
      image:
        repository: phonexia/dev/technologies/microservices/voice-activity-detection/main
        registry: airgapped.phonexia.com

      config:
        # Set logging level
        logLevel: debug

        # Uncomment this to force voice-activity-detection to run on GPU
        #device: cuda

        model:
          volume:
            hostPath:
              path: /data/models/voice_activity_detection

          # Name of a model file inside the volume, for example "generic-3.0.0.model"
          file: "generic-3.0.0.model"
        license:
          useSecret: true
          secret: voice-activity-detection-license
          key: "generic-3.0.0"

      annotations:
        secret.reloader.stakater.com/reload: "voice-activity-detection-license"

      # Uncomment this to grant access to GPU for voice-activity-detection pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run voice-activity-detection on GPU
      #runtimeClassName: "nvidia"

      service:
        clusterIP: "None"

      #updateStrategy:
      #  type: Recreate
  3. Update the configurator annotation .spec.valuesContent.configurator.annotations."secret.reloader.stakater.com/reload":

    # Configurator component
    configurator:
      annotations:
        secret.reloader.stakater.com/reload: >-
          audio-quality-estimation-license,
          audio-quality-estimation-license-extensions,
          enhanced-speech-to-text-built-on-whisper-license,
          enhanced-speech-to-text-built-on-whisper-license-extensions,
          language-identification-license,
          language-identification-license-extensions,
          speaker-diarization-license, speaker-diarization-license-extensions,
          speaker-identification-license, speaker-identification-license-extensions,
          speech-to-text-phonexia-license,
          speech-to-text-phonexia-license-extensions, time-analysis-license,
          time-analysis-license-extensions, voice-activity-detection-license,
          voice-activity-detection-license-extensions
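A secret missing from this reload annotation means the configurator will not restart when that license changes, so it is worth checking the list for completeness after editing. A hedged sketch on a scratch file (the annotation value below is truncated and illustrative):

```shell
# Scratch copy of the annotation value; on the appliance it lives inside
# speech-platform-values.yaml under configurator.annotations.
f=/tmp/reload-annotation.txt
cat > "$f" <<'EOF'
audio-quality-estimation-license, audio-quality-estimation-license-extensions,
enhanced-speech-to-text-built-on-whisper-license,
voice-activity-detection-license, voice-activity-detection-license-extensions
EOF
# Every secret used by an enabled microservice must appear in the list.
for secret in audio-quality-estimation-license voice-activity-detection-license; do
  grep -q "$secret" "$f" && echo "$secret: present" || echo "$secret: MISSING"
done
```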
Upgrade to 3.3.0

This section describes the manual steps which need to be done prior to upgrading to 3.3.0.

Upgrade Speech to Text Phonexia and Time Analysis

Both the Speech to Text Phonexia and Time Analysis microservices were updated to 3.62. You need to reflect this change in the values file.

Add configuration to Configurator service

The Configurator service is being used in this release. Therefore it must be configured properly.

Deploy additional components

A new technology, Speaker Diarization, was added. A configuration section must be added before using this technology.

Step by step upgrade guide to 3.3.0

This section describes how to upgrade the virtual appliance from 3.2.0 to 3.3.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Put the following content at the end of the file:

    # speaker-diarization subchart config
    speaker-diarization:
      enabled: true
      replicaCount: 1
      image:
        repository: phonexia/dev/technologies/microservices/speaker-diarization/main
        registry: airgapped.phonexia.com

      # Extra environment variables
      extraEnvVars: []

      config:
        # Uncomment this to force speaker-diarization to run on GPU
        #device: cuda

        model:
          volume:
            hostPath:
              path: /data/models/speaker_diarization

          # Name of a model file inside the volume, for example "xl-5.0.0.model"
          file: "xl-5.0.0.model"
        license:
          useSecret: true
          secret: speaker-diarization-license
          key: "xl-5.0.0"

      annotations:
        secret.reloader.stakater.com/reload: "speaker-diarization-license"

      # Uncomment this to grant access to GPU for speaker-diarization pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run speaker-diarization on GPU
      #runtimeClassName: "nvidia"

      service:
        clusterIP: "None"

      #updateStrategy:
      #  type: Recreate

    # Configurator component
    configurator:
      enabled: true
      image:
        registry: airgapped.phonexia.com

      annotations:
        secret.reloader.stakater.com/reload: >-
          enhanced-speech-to-text-built-on-whisper-license,
          enhanced-speech-to-text-built-on-whisper-license-extensions,
          language-identification-license,
          language-identification-license-extensions,
          speaker-diarization-license, speaker-diarization-license-extensions,
          speaker-identification-license,
          speaker-identification-license-extensions,
          speech-to-text-phonexia-license,
          speech-to-text-phonexia-license-extensions, time-analysis-license,
          time-analysis-license-extensions
  3. Locate .spec.valuesContent.time-analysis.config.instances

  4. Change imageTag version to 3.62.0.

  5. The section then looks like:

    # Time-analysis subchart
    time-analysis:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        instances:
          - name: tae
            imageTag: 3.62.0
            onDemand:
              enabled: true
  6. Locate .spec.valuesContent.speech-to-text-phonexia.config.instances

  7. Change imageTag version to 3.62.0.

  8. The section then looks like:

    # Speech-to-text-phonexia subchart
    speech-to-text-phonexia:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        instances:
          - name: ar-kw
            imageTag: 3.62.0
            onDemand:
              enabled: true
          - name: ar-xl
            imageTag: 3.62.0
            onDemand:
              enabled: true
          - name: bn
            imageTag: 3.62.0
            onDemand:
              enabled: true
          ...
  9. Save the file.

  10. The application automatically recognizes when the file is updated and redeploys itself with the updated configuration.

  11. Check that the configuration is valid and successfully applied (Step 5 of Installation Guide).
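The imageTag bumps above are identical across all instances, so they lend themselves to a one-line sed. A sketch on a scratch copy (the path and the assumed previous tag 3.61.0 are illustrative; the real file is /data/speech-platform/speech-platform-values.yaml):

```shell
# Scratch stand-in for an instances list in the values file.
f=/tmp/values-imagetag-demo.yaml
cat > "$f" <<'EOF'
instances:
  - name: tae
    imageTag: 3.61.0
    onDemand:
      enabled: true
EOF
# Bump every imageTag to 3.62.0.
sed -i 's/imageTag: .*/imageTag: 3.62.0/' "$f"
grep 'imageTag' "$f"
```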

Upgrade to 3.2.0

This section describes the manual steps which need to be done prior to upgrading to 3.2.0.

Add grpcAdapter license configuration for Time-Analysis and Speech-to-Text-Phonexia

Speech Engine microservices now require an additional license. The license is deployed automatically from the model package, but the license configuration must be added.

Create configuration for GPU sharing

GPU sharing is enabled by default, but it does not work until the configuration is created.

Deploy additional components

A new technology, Language Identification, was added. A configuration section must be added before using this technology.

Step by step upgrade guide to 3.2.0

This section describes how to upgrade the virtual appliance from 3.1.0 to 3.2.0 while retaining the data disk content.

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Put the following content at the end of the file:

    # language-identification subchart config
    language-identification:
      enabled: true
      replicaCount: 1
      image:
        repository: phonexia/dev/technologies/microservices/language-identification/main
        registry: airgapped.phonexia.com

      # Extra environment variables
      extraEnvVars: []

      config:
        # Uncomment this to force language-identification to run on GPU
        #device: cuda

        model:
          volume:
            hostPath:
              path: /data/models/language_identification

          # Name of a model file inside the volume, for example "xl-5.1.0.model"
          file: "xl-5.2.0.model"
        license:
          useSecret: true
          secret: language-identification-license
          key: "xl-5.2.0"

      annotations:
        secret.reloader.stakater.com/reload: "language-identification-license"

      # Uncomment this to grant access to GPU for language-identification pod
      #resources:
      #  limits:
      #    nvidia.com/gpu: "1"

      # Uncomment this to run language-identification on GPU
      #runtimeClassName: "nvidia"

      service:
        clusterIP: "None"

      #updateStrategy:
      #  type: Recreate
  3. Locate .spec.valuesContent.time-analysis.grpcAdapter

  4. Append the config section:

    config:
      license:
        useSecret: true
        secret: time-analysis-license
        key: grpc-adapter-license
  5. The section then looks like:

    # Time-analysis subchart
    time-analysis:
      <Not significant lines omitted>
      grpcAdapter:
        <Not significant lines omitted>
        config:
          license:
            useSecret: true
            secret: time-analysis-license
            key: grpc-adapter-license
  6. Locate .spec.valuesContent.speech-to-text-phonexia.grpcAdapter

  7. Append the config section:

    config:
      license:
        useSecret: true
        secret: speech-to-text-phonexia-license
        key: grpc-adapter-license
  8. The section then looks like:

    # Speech-to-text-phonexia subchart
    speech-to-text-phonexia:
      <Not significant lines omitted>
      grpcAdapter:
        <Not significant lines omitted>
        config:
          license:
            useSecret: true
            secret: speech-to-text-phonexia-license
            key: grpc-adapter-license
  9. Save the file.

  10. The application automatically recognizes when the file is updated and redeploys itself with the updated configuration.

  11. Check that the configuration is valid and successfully applied (Step 5 of Installation Guide).

  12. Create a new text file /data/speech-platform/nvidia-device-plugin-configs.yaml, either directly from inside the virtual appliance or via a file browser, with the following content:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nvidia-device-plugin-configs
      namespace: nvidia-device-plugin
    data:
      default: |-
        version: v1
        sharing:
          timeSlicing:
            renameByDefault: false
            failRequestsGreaterThanOne: false
            resources:
              - name: nvidia.com/gpu
                replicas: 3
  13. Save the file.

  14. GPU sharing will be configured automatically in a short while.
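The time-slicing configuration above advertises each physical GPU as several schedulable nvidia.com/gpu resources. A quick way to double-check the configured replica count is to extract it from the file; a sketch on a scratch copy (the path is illustrative):

```shell
# Scratch copy of the time-slicing part of the ConfigMap from step 12.
f=/tmp/nvidia-device-plugin-configs-demo.yaml
cat > "$f" <<'EOF'
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 3
EOF
# With replicas: 3, one physical GPU appears as 3 allocatable GPU resources.
awk '/replicas:/ {print $2}' "$f"
```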

Upgrade to 3.1.0

This section describes the manual steps which need to be done prior to upgrading to 3.1.0.

Change license secret field for Time-Analysis and Speech-to-Text-Phonexia

To unify how secrets are loaded across all microservices, the field used to load the license from a secret was changed in the Time-Analysis and Speech-to-Text-Phonexia microservices. This simplifies the user experience of loading licenses.

Upload licenses from secret

The way licenses are uploaded to the virtual appliance has been simplified. From now on, the licenses are imported from the models and licenses bundle (.zip file) provided by Phonexia, which, after unzipping, loads the licenses and models automatically. This requires a configuration change; however, the old way still works.

Deploy additional components

The billing feature is mature enough to be part of the virtual appliance. To deploy the billing-related components, add the following section to the configuration:

    billing:
      enabled: true
      image:
        registry: airgapped.phonexia.com

    restApiGateway:
      image:
        registry: airgapped.phonexia.com
      enabled: true

    postgresql:
      enabled: true
      auth:
        postgresPassword: postgresPassword
      image:
        registry: airgapped.phonexia.com
      metrics:
        enabled: true
        image:
          registry: airgapped.phonexia.com
        serviceMonitor:
          enabled: true
      primary:
        persistence:
          storageClass: manual
          selector:
            matchLabels:
              app.kubernetes.io/name: postgresql

Step by step upgrade guide to 3.1.0

This section describes how to upgrade the virtual appliance from 3.0.0 to 3.1.0 while retaining the data disk content.

IF YOU ARE ALREADY LOADING THE LICENSE THROUGH A SECRET:

  1. Rename the field loading the Speech-to-Text-Phonexia and Time-Analysis licenses
  2. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  3. Locate .spec.valuesContent.<speech-to-text-phonexia OR time-analysis>.config.license
  4. Change it from:

    license:
      existingSecret: <secret-name>

    To:

    license:
      useSecret: true
      secret: <secret-name>
      key: <secret-license-key>
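The existingSecret-to-useSecret conversion above is mechanical, so it can be scripted. A sketch on a scratch file (the secret name is illustrative, and <secret-license-key> remains a placeholder you must fill in yourself):

```shell
# Scratch stand-in for one license block in the values file.
f=/tmp/license-demo.yaml
cat > "$f" <<'EOF'
license:
  existingSecret: time-analysis-license
EOF
# Replace the old field with the new three-field form, reusing the old
# secret name; the key placeholder still has to be filled in by hand.
sed -i 's/existingSecret: \(.*\)/useSecret: true\n  secret: \1\n  key: <secret-license-key>/' "$f"
cat "$f"
```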

IF YOU WANT TO LOAD LICENSES FROM SECRETS:

  1. Load licenses from secret files
  2. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  3. Locate .spec.valuesContent.<microservice>.config.license. <microservice> are all the services requiring license (voiceprint-comparison, voiceprint-extraction, enhanced-speech-to-text-built-on-whisper, speech-to-text-phonexia, time-analysis).
  4. Change it from:

    license:
      value: "<license>"

    To:

    license:
      useSecret: true
      secret: "<microservice>-license"
      key: "<model_name>_<model_version>"

Example:

    license:
      useSecret: true
      secret: "enhanced-speech-to-text-built-on-whisper"
      key: "small-1.0.1"
  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.

  2. Locate .spec.valuesContent.envoy

  3. Put the following content before the envoy section:

    billing:
      enabled: true
      image:
        registry: airgapped.phonexia.com

    restApiGateway:
      image:
        registry: airgapped.phonexia.com
      enabled: true

    postgresql:
      enabled: true
      auth:
        postgresPassword: postgresPassword
      image:
        registry: airgapped.phonexia.com
      metrics:
        enabled: true
        image:
          registry: airgapped.phonexia.com
        serviceMonitor:
          enabled: true
      primary:
        persistence:
          storageClass: manual
          selector:
            matchLabels:
              app.kubernetes.io/name: postgresql
  4. The section then looks like this:

    serviceMonitor:
      enabled: true
      additionalLabels:
        release: kube-prometheus-stack

    billing:
      enabled: true
      image:
        registry: airgapped.phonexia.com

    restApiGateway:
      image:
        registry: airgapped.phonexia.com
      enabled: true

    postgresql:
      enabled: true
      auth:
        postgresPassword: postgresPassword
      image:
        registry: airgapped.phonexia.com
      metrics:
        enabled: true
        image:
          registry: airgapped.phonexia.com
        serviceMonitor:
          enabled: true
      primary:
        persistence:
          storageClass: manual
          selector:
            matchLabels:
              app.kubernetes.io/name: postgresql

    envoy:
      enabled: true
  5. Save the file.

  6. The application automatically recognizes when the file is updated and redeploys itself with the updated configuration.

  7. Check that the configuration is valid and successfully applied (Step 5 of Installation Guide).

Upgrade to 3.0.0

This section describes the manual steps which need to be done prior to upgrading to 3.0.0.

Rename Whisper microservice

Due to licensing reasons we had to rename the speech-to-text-whisper-enhanced microservice. The new name is enhanced-speech-to-text-built-on-whisper. This change must be reflected in the values file.

Step by step upgrade guide to 3.0.0

This section describes how to upgrade the virtual appliance from 2.1.0 to 3.0.0 while retaining the data disk content.

  1. Rename the whisper microservice in the currently running version of the virtual appliance.
  2. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  3. Locate .spec.valuesContent.speech-to-text-whisper-enhanced.
  4. Replace all occurrences of speech-to-text-whisper-enhanced with enhanced-speech-to-text-built-on-whisper.
  5. Replace all occurrences of speech_to_text_whisper_enhanced with enhanced_speech_to_text_built_on_whisper.
  6. The updated file should look like this:

    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: speech-platform
      namespace: kube-system
    spec:
      valuesContent: |-
        <Not significant lines omitted>
        enhanced-speech-to-text-built-on-whisper:
          <Not significant lines omitted>
          image:
            repository: phonexia/dev/technologies/microservices/enhanced-speech-to-text-built-on-whisper/main
          <Not significant lines omitted>
          config:
            <Not significant lines omitted>
            model:
              volume:
                hostPath:
                  path: /data/models/enhanced_speech_to_text_built_on_whisper
  7. Save the file.
  8. Rename the directory with Whisper models with the following command:

    mv /data/models/speech_to_text_whisper_enhanced /data/models/enhanced_speech_to_text_built_on_whisper
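The rename described above can be done with sed and mv in one go. A hedged sketch on a scratch copy (the /tmp paths are illustrative; on the appliance the values file is /data/speech-platform/speech-platform-values.yaml and the models live under /data/models):

```shell
# Scratch copies of the values file and the model directory.
mkdir -p /tmp/upgrade-demo/models/speech_to_text_whisper_enhanced
f=/tmp/upgrade-demo/values.yaml
cat > "$f" <<'EOF'
speech-to-text-whisper-enhanced:
  config:
    model:
      volume:
        hostPath:
          path: /data/models/speech_to_text_whisper_enhanced
EOF
# Rename both the chart key (dashes) and the model directory name (underscores).
sed -i -e 's/speech-to-text-whisper-enhanced/enhanced-speech-to-text-built-on-whisper/g' \
       -e 's/speech_to_text_whisper_enhanced/enhanced_speech_to_text_built_on_whisper/g' "$f"
mv /tmp/upgrade-demo/models/speech_to_text_whisper_enhanced \
   /tmp/upgrade-demo/models/enhanced_speech_to_text_built_on_whisper
grep -n 'enhanced' "$f"
```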
  9. Import new version of virtual appliance (version X+1) into your virtualization platform
  10. Stop current version of virtual appliance (version X)
  11. Detach data disk from current version of virtual appliance (version X)
  12. Attach data disk to new version of virtual appliance (version X+1)
  13. Start new version of virtual appliance (version X+1)
  14. Delete old version of virtual appliance (version X)
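The file replacements and the model-directory rename described above can also be done non-interactively. A minimal sketch, assuming GNU sed (`-i`) is available inside the appliance; the `VALUES` and `MODELS` defaults match the appliance layout, and the guards make the script a no-op where the paths do not exist:

```shell
# Sketch only: performs the same renames as the manual steps above.
VALUES="${VALUES:-/data/speech-platform/speech-platform-values.yaml}"
MODELS="${MODELS:-/data/models}"

if [ -f "$VALUES" ]; then
  # Replace both the hyphenated and the underscored occurrences.
  sed -i \
    -e 's/speech-to-text-whisper-enhanced/enhanced-speech-to-text-built-on-whisper/g' \
    -e 's/speech_to_text_whisper_enhanced/enhanced_speech_to_text_built_on_whisper/g' \
    "$VALUES"
fi

if [ -d "$MODELS/speech_to_text_whisper_enhanced" ]; then
  # Rename the model directory to match the new microservice name.
  mv "$MODELS/speech_to_text_whisper_enhanced" \
     "$MODELS/enhanced_speech_to_text_built_on_whisper"
fi
```

Review the values file afterwards; the result should match the example above.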
Upgrade to 2.1.0

This section describes the manual steps which need to be done prior to upgrading to 2.1.0.

Load Speech to Text Phonexia and Time Analysis model from data disk instead of image

In the new version, the default way of loading models for Speech to Text Phonexia and Time Analysis changes. Previously, models were loaded from the image, which led to a lot of duplication across images. From now on, loading models from the data disk is the default. However, the old way of loading models from the image still works.

Upgrading to load models from the data disk (/data/models) requires updating the speech platform values file:

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  2. Locate the .spec.valuesContent.speech-to-text-phonexia.config.instances or .spec.valuesContent.time-analysis.config.instances key.
  3. Define the versions of the images (imageTag) without the model (e.g. 3.62.0).
  4. The updated file should look like this:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: speech-platform
  namespace: kube-system
spec:
  valuesContent: |-
    <Not significant lines omitted>
    speech-to-text-phonexia:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        instances:
          - name: ar-kw
            imageTag: 3.62.0
            onDemand:
              enabled: true
          - name: ar-kx
            imageTag: 3.62.0
            onDemand:
              enabled: true
    time-analysis:
      <Not significant lines omitted>
      config:
        <Not significant lines omitted>
        instances:
          - name: tae
            imageTag: 3.62.0
            onDemand:
              enabled: true
  5. Locate the .spec.valuesContent.speech-to-text-phonexia.image or .spec.valuesContent.time-analysis.image key and uncomment the image section.
  6. The updated file should look like this:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: speech-platform
  namespace: kube-system
spec:
  valuesContent: |-
    <Not significant lines omitted>
    speech-to-text-phonexia:
      <Not significant lines omitted>
      image:
        registry: airgapped.phonexia.com
        <Not significant lines omitted>

    time-analysis:
      <Not significant lines omitted>
      image:
        registry: airgapped.phonexia.com
        <Not significant lines omitted>
  7. Save the file

Add ingressAdmin section

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  2. Locate the key .spec.valuesContent.ingress.extraBackends.
  3. Remove the extraBackends scope with all of its contents.
  4. Add a new ingressAdmin scope at the same indentation level as the ingress scope. The resulting file should look like this:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: speech-platform
  namespace: kube-system
spec:
  valuesContent: |-
    ingress:
      <Not significant lines omitted>

    ingressAdmin:
      enabled: true
      annotations: {}
      singleFileUploadSize: "5368709120"
      singleFileUploadTimeout: 120
      <Not significant lines omitted>
  5. Save the file
  6. Proceed with the upgrade

Fix permissions for Prometheus storage

This is a post-upgrade task. It must be run after the virtual appliance has been upgraded to 2.1.0.

  1. Run the following command in the virtual appliance to fix the permissions of the Prometheus storage:
$ chmod -R a+w /data/storage/prometheus/prometheus-db/
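To confirm the command took effect, you can list anything under the Prometheus storage that is still missing the write bit for others; empty output means all entries are world-writable. A minimal sketch (the path exists only inside the appliance, so the check is guarded):

```shell
# Sketch only: print entries that are still NOT world-writable.
# No output means "chmod -R a+w" was applied everywhere.
PROM_DIR="${PROM_DIR:-/data/storage/prometheus/prometheus-db}"
if [ -d "$PROM_DIR" ]; then
  find "$PROM_DIR" ! -perm -o=w -print
fi
```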
Upgrade to 2.0.0

This section describes the manual steps which need to be done prior to upgrading to 2.0.0.

Rename speech-engine subchart to speech-to-text-phonexia

Due to the renaming of the speech-engine subchart, you have to update the speech platform values file before upgrading:

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  2. Locate the key .spec.valuesContent.speech-engine.
  3. Rename speech-engine to speech-to-text-phonexia.
  4. The updated file should look like this:
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: speech-platform
      namespace: kube-system
    spec:
      valuesContent: |-
        <Not significant lines omitted>
        speech-to-text-phonexia:
          <Not significant lines omitted>
  5. Save the file

Rename speech-to-text-phonexia instances

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  2. Locate the key .spec.valuesContent.speech-to-text-phonexia.config.instances.
  3. Remove the stt- prefix from the name of each instance.
  4. The updated file should look like this:
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: speech-platform
      namespace: kube-system
    spec:
      valuesContent: |-
        <Not significant lines omitted>
        speech-to-text-phonexia:
          <Not significant lines omitted>
          config:
            <Not significant lines omitted>
            instances:
              - name: ar-kw
                imageTag: 3.62.0-stt-ar_kw_6
                onDemand:
                  enabled: true
              - name: ar-kx
                imageTag: 3.62.0-stt-ar_xl_6
                onDemand:
                  enabled: true
  5. Save the file
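If the values file contains many instances, the stt- prefix can also be stripped non-interactively. A minimal sketch, assuming GNU sed and that instance names follow the `- name: stt-...` pattern shown above; only those lines are rewritten, so imageTag values such as 3.62.0-stt-ar_kw_6 stay intact:

```shell
# Sketch only: remove the stt- prefix from every instance name entry.
VALUES="${VALUES:-/data/speech-platform/speech-platform-values.yaml}"
if [ -f "$VALUES" ]; then
  sed -i 's/\(- name: \)stt-/\1/' "$VALUES"
fi
```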

Add proper tag suffix for Media Conversion

  1. Open the text file /data/speech-platform/speech-platform-values.yaml either directly from inside the virtual appliance or via a file browser.
  2. Locate the key .spec.valuesContent.media-conversion.image.
  3. Change the value of the tagSuffix key to -free.
  4. The updated file should look like this:
    apiVersion: helm.cattle.io/v1
    kind: HelmChartConfig
    metadata:
      name: speech-platform
      namespace: kube-system
    spec:
      valuesContent: |-
        <Not significant lines omitted>
        media-conversion:
          <Not significant lines omitted>
          image:
            <Not significant lines omitted>
            tagSuffix: "-free"
        <Not significant lines omitted>
  5. Save the file
  6. Proceed with upgrade

Update path to models

The default model location was changed from /data/models to /data/models/<microservice>. If you plan to upgrade and keep the current data disk, no steps are needed; models are still loaded from the old location, /data/models. If you plan to upgrade from scratch (discarding the current data disk), no steps are needed either, as models are loaded from the new location, /data/models/<microservice>.