Releases: yum-food/TaSTT

v0.19.2

10 Jun 00:23

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 2:

  • Include CUDA 12 .dlls in the .zip.

Hotfix 1:

  • Bump cuDNN to v8.9.7.
  • Drop two apparently unused CUDA .dlls.
  • Disable flash attention by default, since it requires a 3000-series (Ampere) GPU or newer.
  • Disable flash attention when CPU mode is selected. Flash attention is a GPU algorithm, so it doesn't make sense to request it when CPU mode is on.
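
The two flash attention rules above can be sketched as a single check. This is a minimal illustration, not TaSTT's actual code: the function name, its parameters, and the use of CUDA compute capability to detect Ampere are all assumptions made for the example. Ampere (3000-series) GPUs report compute capability 8.x.

```python
from typing import Optional, Tuple

# Ampere GPUs (RTX 3000 series) report CUDA compute capability 8.x.
AMPERE_MAJOR = 8


def should_use_flash_attention(
    user_enabled: bool,
    use_cpu: bool,
    compute_capability: Optional[Tuple[int, int]],
) -> bool:
    """Hypothetical helper: decide whether to request flash attention.

    user_enabled: the user's setting (off by default, per Hotfix 1).
    use_cpu: True when CPU mode is selected.
    compute_capability: (major, minor) of the GPU, or None if unknown.
    """
    if not user_enabled:
        return False
    if use_cpu:
        # Flash attention is a GPU algorithm; never request it on CPU.
        return False
    if compute_capability is None:
        # Unknown GPU: stay on the safe path.
        return False
    major, _minor = compute_capability
    return major >= AMPERE_MAJOR
```

With this check, an RTX 3080 (compute capability 8.6) passes only when the user has opted in and GPU mode is active, while a 2000-series card (7.5) never does.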

Significant changes:

  • Upgrade to ctranslate2 4.2.1, which implements flash-attention. This should improve performance across the board.

Minor changes:

  • Change defaults to work with custom chatbox prefab from Gumroad

Avatar resources used by custom chatbox:

  • 1 material slot
  • 12 polygons
  • 108 parameter bits
  • 5.33 MB of texture memory
  • 5 audio sources

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under that directory, so deleting it leaves nothing behind.

v0.19.1

09 Jun 23:49

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 1:

  • Bump cuDNN to v8.9.7.
  • Drop two apparently unused CUDA .dlls.
  • Disable flash attention by default, since it requires a 3000-series (Ampere) GPU or newer.
  • Disable flash attention when CPU mode is selected. Flash attention is a GPU algorithm, so it doesn't make sense to request it when CPU mode is on.

Significant changes:

  • Upgrade to ctranslate2 4.2.1, which implements flash-attention. This should improve performance across the board.

Minor changes:

  • Change defaults to work with custom chatbox prefab from Gumroad

Avatar resources used by custom chatbox:

  • 1 material slot
  • 12 polygons
  • 108 parameter bits
  • 5.33 MB of texture memory
  • 5 audio sources

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under that directory, so deleting it leaves nothing behind.

v0.19.0

08 Jun 03:13

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Significant changes:

  • Upgrade to ctranslate2 4.2.1, which implements flash-attention. This should improve performance across the board.

Minor changes:

  • Change defaults to work with custom chatbox prefab from Gumroad

Avatar resources used by custom chatbox:

  • 1 material slot
  • 12 polygons
  • 108 parameter bits
  • 5.33 MB of texture memory
  • 5 audio sources

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under that directory, so deleting it leaves nothing behind.

v0.18.1

10 Feb 01:52

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 1:

  • Finish plumbing GPU compute type

Significant changes:

  • Add additional filtering targeted at removing common hallucinations like "Thanks for watching!".
  • Add dropdown to select GPU compute type. Users with older GPUs should experiment with this setting.
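
A hallucination filter of the kind described above could look roughly like the sketch below. It is an illustration only: the blocklist, the function names, and the matching rule (whole segment, ignoring case and surrounding punctuation) are assumptions, and only the phrase "Thanks for watching!" comes from these notes.

```python
import string

# Hypothetical blocklist. Only "Thanks for watching!" is named in the
# release notes; a real list would likely hold more phrases.
KNOWN_HALLUCINATIONS = {"thanks for watching"}


def _normalize(text: str) -> str:
    """Lowercase and strip surrounding whitespace and punctuation."""
    return text.strip().lower().strip(string.punctuation + " ")


def filter_hallucinations(segments):
    """Drop segments that consist entirely of a known hallucinated phrase."""
    return [s for s in segments if _normalize(s) not in KNOWN_HALLUCINATIONS]
```

Matching whole segments rather than substrings keeps a sentence like "Thanks for watching the game with me" intact.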

Minor changes:

  • Fix how mipmap levels are selected for the font bitmaps with the custom chatbox.

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.18.0

10 Feb 01:31

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Significant changes:

  • Add additional filtering targeted at removing common hallucinations like "Thanks for watching!".
  • Add dropdown to select GPU compute type. Users with older GPUs should experiment with this setting.

Minor changes:

  • Fix how mipmap levels are selected for the font bitmaps with the custom chatbox.

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.17.0

10 Dec 02:23

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Significant changes:

  • Add distilled Whisper models. These might be suitable for certain users, but I would generally recommend using the base.en or small.en models.
  • Reduce OSC sync rate from 5 Hz to 3 Hz, improving reliability in busy lobbies at the theoretical cost of speed. In most cases, OSC sync rate isn't a significant bottleneck, so this shouldn't negatively impact user experience.
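
To make the sync rate change concrete: at 5 Hz a sync may go out every 200 ms, and at 3 Hz every ~333 ms. The sketch below shows one way to cap a send loop at a fixed rate. The class name and the injectable clock are assumptions for illustration, not TaSTT's actual OSC code.

```python
import time


class SyncThrottle:
    """Hypothetical rate limiter: allow at most `rate_hz` syncs per second."""

    def __init__(self, rate_hz: float, clock=time.monotonic):
        self.interval_s = 1.0 / rate_hz  # 3 Hz -> ~0.333 s between syncs
        self._clock = clock              # injectable so it can be tested
        self._next_send = None           # earliest time of the next sync

    def ready(self) -> bool:
        """Return True if a sync may be sent now, and book the next slot."""
        now = self._clock()
        if self._next_send is None or now >= self._next_send:
            self._next_send = now + self.interval_s
            return True
        return False
```

A send loop would call `ready()` before each OSC message and skip the send when it returns False.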

Hardware requirements:

  • ~5GB of disk space (my Python and NVIDIA dependencies are big)
  • An NVIDIA GPU with at least 2GB of VRAM to spare (recommended)

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.16.0

07 Oct 01:16

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Significant changes:

  • Long pauses in speech (>= 10 seconds) cause text before the pause to be removed from the transcript. The length of the pause can be configured in the UI with the "Reset after silence" field. Set it to -1 to disable the feature.
  • Add ability to select CPU process priority of transcription application in UI. By default, it's set to standard priority. Higher priorities may improve performance under load, at the cost of other applications' performance.
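
The "Reset after silence" behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class, its method names, and the use of segment start/end timestamps are hypothetical, and any negative value is treated as "disabled" (the notes only mention -1).

```python
class Transcript:
    """Hypothetical transcript that clears itself after a long pause."""

    def __init__(self, reset_after_silence_s: float = 10.0):
        # A negative value disables the feature (the notes specify -1).
        self.reset_after_silence_s = reset_after_silence_s
        self.segments = []
        self._last_speech_end = None

    def add_segment(self, text: str, start_s: float, end_s: float) -> None:
        """Append a segment, dropping older text if the pause was long enough."""
        enabled = self.reset_after_silence_s >= 0
        if enabled and self._last_speech_end is not None:
            pause = start_s - self._last_speech_end
            if pause >= self.reset_after_silence_s:
                self.segments.clear()
        self.segments.append(text)
        self._last_speech_end = end_s

    def text(self) -> str:
        return " ".join(self.segments)
```

For example, with the default of 10 seconds, a segment that starts 10 or more seconds after the previous one ended replaces the transcript instead of extending it.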

Minor changes:

  • Normalize mic audio volume before transcribing, improving accuracy when speaking very softly, at the cost of some accuracy when speaking normally.
  • Browser source deletes old segments in larger batches now, making text suddenly reflow less often.
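
The notes do not say which normalization method is used for mic audio. As one possible reading, the sketch below applies simple peak normalization to floating-point samples in [-1, 1]; the function name and the 0.9 target are assumptions. Because the gain is applied to the whole buffer, background noise is amplified along with quiet speech.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale samples so the loudest one reaches `target_peak`.

    Peak normalization is an assumption here; the release notes only say
    that volume is normalized before transcribing.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Pure silence (or an empty buffer): nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]
```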

Known bugs:

  • Speech typing is broken.

Hardware requirements:

  • ~5GB of disk space (my Python and NVIDIA dependencies are big)
  • An NVIDIA GPU with at least 2GB of VRAM to spare (recommended)

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.15.4

16 Sep 23:51

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 4:

  • Fix how paths are passed to the script that removes audio sources from the custom chatbox prefab
  • Fix uwu filter

Hotfix 3:

  • Listing input devices works again

Hotfix 2:

  • Pin huggingface_hub to 0.16.4, fixing an issue where models failed to download.
  • Fix the "remove trailing period" filter: non-final periods are no longer removed.
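
The intended behavior of the trailing period filter, as described above, can be sketched like this. The function name is hypothetical, and leaving a trailing ellipsis untouched is an assumption rather than something the notes state.

```python
def remove_trailing_period(text: str) -> str:
    """Remove one final period, leaving periods inside the text alone."""
    stripped = text.rstrip()
    # Assumption: a trailing ellipsis ("...") is kept as-is.
    if stripped.endswith(".") and not stripped.endswith(".."):
        return stripped[:-1]
    return stripped
```

Only the last character is ever inspected, so "Hello. How are you." keeps its first period and loses only the final one.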

Hotfix 1:

  • Fix animator bug: some letter layers could hit a dead end leading to undefined behavior. Now they always have a valid path to take.
  • Fix paging bug: custom chatbox rows/cols are correctly passed to OSC layer again.
  • Add ability to choose custom chatbox font texture size in UI.

Significant changes:

  • Switch to voice activity detection (VAD) based segmentation of speech
    • More accurate, more computationally efficient.
    • Better able to keep up with fast speech.
  • Add plugin interface. See transcribe_v2.py, class Plugin.

Minor changes:

  • Chatbox is locked at spawn by default.
  • Browser source now shows preview text slightly transparent.
  • Previews can be disabled.
  • Transcription loop delay is configurable.
  • VAD min silence duration & max speech duration are configurable.

Bugfixes:

  • UI text buffers, log file, and transcript are all constrained in size now.

Known bugs:

  • Speech typing is broken.

Hardware requirements:

  • ~5GB of disk space (my Python and NVIDIA dependencies are big)
  • An NVIDIA GPU with at least 2GB of VRAM to spare (recommended)

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.15.3

14 Sep 04:56

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 3:

  • Listing input devices works again

Hotfix 2:

  • Pin huggingface_hub to 0.16.4, fixing an issue where models failed to download.
  • Fix the "remove trailing period" filter: non-final periods are no longer removed.

Hotfix 1:

  • Fix animator bug: some letter layers could hit a dead end leading to undefined behavior. Now they always have a valid path to take.
  • Fix paging bug: custom chatbox rows/cols are correctly passed to OSC layer again.
  • Add ability to choose custom chatbox font texture size in UI.

Significant changes:

  • Switch to voice activity detection (VAD) based segmentation of speech
    • More accurate, more computationally efficient.
    • Better able to keep up with fast speech.
  • Add plugin interface. See transcribe_v2.py, class Plugin.

Minor changes:

  • Chatbox is locked at spawn by default.
  • Browser source now shows preview text slightly transparent.
  • Previews can be disabled.
  • Transcription loop delay is configurable.
  • VAD min silence duration & max speech duration are configurable.

Bugfixes:

  • UI text buffers, log file, and transcript are all constrained in size now.

Known bugs:

  • Speech typing is broken.

Hardware requirements:

  • ~5GB of disk space (my Python and NVIDIA dependencies are big)
  • An NVIDIA GPU with at least 2GB of VRAM to spare (recommended)

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters

v0.15.1

11 Sep 02:11

Download the .zip at the bottom of this page, then install 1 dependency:

  • Microsoft Visual C++ Redistributable. Direct link: vc_redist.x64.exe. Required by some Python dependencies.

Then follow the v0.13 setup guide: https://youtu.be/SqOGTnKXgag

Please report all issues to the discord: https://discord.gg/YWmCvbCRyn

Hotfix 1:

  • Fix animator bug: some letter layers could hit a dead end leading to undefined behavior. Now they always have a valid path to take.
  • Fix paging bug: custom chatbox rows/cols are correctly passed to OSC layer again.
  • Add ability to choose custom chatbox font texture size in UI.

Significant changes:

  • Switch to voice activity detection (VAD) based segmentation of speech
    • More accurate, more computationally efficient.
    • Better able to keep up with fast speech.
  • Add plugin interface. See transcribe_v2.py, class Plugin.

Minor changes:

  • Chatbox is locked at spawn by default.
  • Browser source now shows preview text slightly transparent.
  • Previews can be disabled.
  • Transcription loop delay is configurable.
  • VAD min silence duration & max speech duration are configurable.

Bugfixes:

  • UI text buffers, log file, and transcript are all constrained in size now.

Known bugs:

  • Speech typing is broken.

Hardware requirements:

  • ~5GB of disk space (my Python and NVIDIA dependencies are big)
  • An NVIDIA GPU with at least 2GB of VRAM to spare (recommended)

Avatar resources used:

  • 65 bits of avatar parameters (may be increased for faster paging)
  • 340 KB of texture memory (non-English configurations use 130MB)
  • 1 material slot
  • 1 slot in avatar menu
  • No write defaults on your animations (may work with them, but this is not tested)

How to update an existing avatar:

  • Simply regenerate your avatar assets using the GUI, then upload. Unity will automatically pick up the new shader, animations, and avatar descriptor components.

How to uninstall:

  • Simply delete the TaSTT/ directory. All runtime dependencies are installed under Resources/.
  • If you put it on an avatar:
    • Delete TaSTT_Generated
    • Delete World-Constraint from your hierarchy
    • Update your avatar descriptor to use your old FX controller, menu, and parameters