TTS support with dataset bug fixes by pcsid · Pull Request #19 · ServiceNow/AU-Harness

pcsid · 2025-11-10T01:26:19Z

…mosv2 for mean opinion score estimation. Small dataset pathway bug adjustments.

📌 Description

This feature was added to support text-to-speech (TTS) evaluations in AU-Harness. Cartesia, Deepgram, and ElevenLabs clients are supported.

Also fixes some bugs in the dataset paths for the ASR task.

🛠️ Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality including new tasks)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactor / Code cleanup
Maintenance / Chore / Task
Other (please describe):

✅ How Has This Been Tested?

Unit tests
Integration tests
Manual testing

Test Results / Screenshots (if applicable):

📸 Screenshots / Demos

📋 Checklist

Code follows project style guidelines
Tests have been added/updated (if applicable)
Documentation has been updated (if applicable)
Linked relevant issue(s)
Self-reviewed my code

🙌 Additional Notes

…mosv2 for mean opinion score estimation. Small dataset pathway bug adjustments.

jonggunp · 2026-04-22T20:19:24Z

+
+        # URL is only required for non-TTS inference types
+        inference_type = info.get('inference_type')
+        if inference_type not in ['cartesia_tts', 'elevenlabs_tts', 'deepgram_tts']:


not a big deal, but should we be using constants instead?
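
A minimal sketch of what this suggestion could look like, assuming the three string values from the diff above are the full set of TTS inference types. The names `TTS_INFERENCE_TYPES` and `requires_url` are hypothetical and do not come from the PR:

```python
# Hypothetical constants; the string values are the ones used in the diff.
CARTESIA_TTS = "cartesia_tts"
ELEVENLABS_TTS = "elevenlabs_tts"
DEEPGRAM_TTS = "deepgram_tts"

# One shared definition, so every validation check refers to the same set.
TTS_INFERENCE_TYPES = frozenset({CARTESIA_TTS, ELEVENLABS_TTS, DEEPGRAM_TTS})


def requires_url(info: dict) -> bool:
    """Return True when the model entry is not a TTS type and so needs a URL."""
    return info.get("inference_type") not in TTS_INFERENCE_TYPES
```

With a shared set like this, the later `voice_id` check could use a second named constant instead of repeating string literals.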

jonggunp · 2026-04-22T20:22:25Z

                raise ValueError(f"Model {index}: '{field}' must be a non-empty string")
+
+        # Require voice_id for TTS inference types
+        if inference_type in ['cartesia_tts', 'elevenlabs_tts']:


do we need deepgram here?

jonggunp · 2026-04-22T20:32:42Z

+                for i, audio_file in enumerate(batch_files):
+                    temp_name = f"audio_{i:06d}.wav"
+                    temp_path = os.path.join(temp_dir, temp_name)
+                    shutil.copy(audio_file, temp_path)


audio_paths in tts_postprocessor.py can contain an empty string, which means "audio_file" can be an empty string here.

You may need to update line 109 so that shutil.copy only runs when audio_file is non-empty and the file exists.
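
One possible guard, sketched under assumptions: the helper name `copy_existing_audio` and its return value are hypothetical, and the loop body mirrors the diff above rather than the actual file. It skips empty or missing paths and keeps the original index in the temp file name:

```python
import os
import shutil


def copy_existing_audio(batch_files, temp_dir):
    """Copy each existing audio file into temp_dir.

    Returns a dict mapping the original index in batch_files to the
    copied path. Empty strings and paths that are not files are skipped.
    """
    copied = {}
    for i, audio_file in enumerate(batch_files):
        # Skip empty strings and paths that do not point at a real file.
        if not audio_file or not os.path.isfile(audio_file):
            continue
        temp_path = os.path.join(temp_dir, f"audio_{i:06d}.wav")
        shutil.copy(audio_file, temp_path)
        copied[i] = temp_path
    return copied
```

Returning the index mapping lets the caller tell which samples were skipped, so they can be scored as missing instead of silently shifting the remaining results.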

jonggunp · 2026-04-22T20:36:30Z

+
+        Args:
+            message: Input message containing ground_truth_text
+            run_params: Runtime parameters for the inference request


run_params is never used in this method.

text-to-speech pathway for Cartesia, ElevenLabs, and Deepgram with ut…

f0cfc7f

…mosv2 for mean opinion score estimation. Small dataset pathway bug adjustments.

pcsid requested review from akshaykalkunte, nhhoang96, oluwanifemibamgbose and shruthan November 10, 2025 01:26

pcsid self-assigned this Nov 10, 2025

language specification removal

2073f26

pcsid changed the title ~~text-to-speech pathway for Cartesia, ElevenLabs, and Deepgram with ut…~~ TTS support with bug fixes Nov 10, 2025

pcsid changed the title ~~TTS support with bug fixes~~ TTS support with dataset bug fixes Nov 10, 2025

nhhoang96 requested a review from jonggunp April 22, 2026 20:06

jonggunp reviewed Apr 22, 2026

pcsid wants to merge 2 commits into main from feat/tts


2 participants
