Connected Speech SOP

Connected speech refers to natural, continuous spoken language produced in conversation or narrative, rather than isolated words or sentences. It provides rich acoustic and linguistic information on lexical retrieval, syntax, fluency, and speech rate, making it a sensitive measure of language impairment in PPA.

The administration and analysis of connected speech samples supports Aim 2 of the R01: Identify bilingualism factors associated with differential patterns of language impairment in Hispanics with PPA using metrics derived from connected speech.

Overview of the processes and this guide

Connected speech work involves several distinct processes. Our main communication takes place in the R01_CS_Data_Processing | Multilingual Aphasia and Dementia Research Lab | Microsoft Teams channel and on the wiki: https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/482182128

Process and guide

Current leads

Responsibilities of the lead

 

https://cloud.wikis.utexas.edu/wiki/x/7oBHI

Connected Speech supervisor

 

  • Oversees data accuracy and completeness across all Connected Speech and VISTA–Connected Speech SmartSheets and REDCap entries.

  • Supports coordination of data tracking, ensures missing data are identified and added, and assists with supervising data processes related to connected speech.

https://cloud.wikis.utexas.edu/wiki/x/7oBHI

Data processing lead: Jada Li

  • Tasks assigned by the Connected Speech supervisor

Administration, Recording and Saving Audios

https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/342033746

https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/134283924

Speech and Language Pathologists

DFT Sant Pau (Depanem): Jesús

  • Follows SOPs to ensure accurate administration, recording, and data-saving procedures

Clipping+Whisper team (Arely, Aaliyah, Carmen, Jimena C, Jimena P, Jada)

3.1 Clipping and Whisper Student Team | Multilingual Aphasia and Dementia Research Lab | Microsoft Teams

Meeting: Biweekly meeting with the student leads; any other student on the team who is in the lab is welcome to join. Thursdays at 3:30 pm: https://cloud.wikis.utexas.edu/wiki/x/8IO9H

https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197971

Arely Aguilar

  • Train new students on clipping

  • Supervise clipping status using the report

  • Supervise reclipping status using the reclipping reports

  • Make any changes to the Connected Speech SmartSheets or reports if indicated by the supervisor

https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197201

Aaliyah M Segura

  • Train new students on Whisper

  • Once a week, supervise pending samples to run (Picnic Scene or Important Event)

  • Run samples or post them in the Whisper & Clipping channel for the next student starting their shift

  • Ensure all samples requested via the channel are being run

  • Supervise Whisper status using the reports

  • Keep running Connected Speech samples in the priority order set on the “3. Whisper Transcription Process (Research Assistants)” page. Deadline goal: run all samples before December 5, 2025

Transcription team (Whendy, Helena, Jaume)

https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197560

 

4.2 Utterance segmentation process (transcribers)

 

6. Coding process (transcribers) (archived)

 

Data processing team

5. Language analysis process/overview

Project lead

 

6. Acoustic Derivations Guide

Project lead

 

7. Connected Speech/Transcription Reliability

Project lead

 

Detailed methodology

[Figure: detailed methodology diagram (image-20251031-170526.png). Source: AoA_2025_Grasso_Santos]

 

Training videos

Specific training videos for these processes: CS_Connected SpeechR01Protocol_Training Videos

Data structure overview

The connected speech folder structure, which explains how the files are saved in Box:

 

Pre-R01 Connected Speech samples

Before April 2024, all Connected Speech samples were recorded via Zoom. These samples are part of Connected Speech Data Raw, but the videos/audios are kept in a separate folder because they were recorded with a different procedure and in a different format.

The Pre-R01 tasks include:

  1. Per the October 1, 2025 meeting between Dr. Grasso and Sonia: for Pre-R01 samples, the Connected Speech SmartSheets currently list the first visit date of each timepoint as the administration date of the Connected Speech samples. We plan to record the exact administration date for every timepoint to have more accurate data, but for now we will use the dates extracted from the MADR participant SmartSheet for all timepoints except POST, since the MADR participant SmartSheet has no post timepoint. Jada (student RA) is currently adding the exact post-tx dates to the Connected Speech SmartSheets; once those are added for both Spanish and Catalan, we plan to start replacing the remaining approximate dates with exact ones.

Schematic of where the pre-R01 samples are saved:

 

Overview of Processes/Procedures

Can the following elements be included in the app? (Y/N/M = yes/no/maybe)

The general overview of processing procedures is as follows:

  1. Sample is recorded. YES

  2. Sample is saved in a folder on Box. NO, and maybe not necessary.

  3. Sample is clipped to ensure no clinician speech or background noise is included. MAYBE. This could probably happen OUTSIDE of the app, with an option to re-upload when finished so that the following steps happen automatically. Could diarization take care of this? Kesha, has it gotten any better? If so, clipping could happen in the app, with a choice of whether to apply diarization so the result is patient-only speech (see the diarization sketch after this list).

    1. Cody says there should be a way to clip within the app so that users can stay in the app.

      1. We still want an upload option for audios from older samples.

  4. a. Audio sample is run through the following steps:

    1. A script that cuts pauses from the start and end of samples (Kesha confirmed this is correct and is already in the Acoustic Pipeline). MAYBE. We would definitely like this included. Kesha, is silence currently detected via a specific dB level? The script takes in an audio sample and, for each audio segment, determines whether it is silent or not. Kesha will look into it. (A sketch of this step follows the list.)

    2. Audio is run through the Acoustic Pipeline (Kesha’s script). MAYBE. We would definitely like this included.

    3. https://www.youtube.com/watch?v=YxZ8cLGWDaE
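Regarding the diarization question in step 3: below is a minimal sketch of how patient-only clipping could work, using the open-source pyannote.audio diarization pipeline plus pydub. This is an illustration under stated assumptions, not the lab's implementation: the file names, the Hugging Face token placeholder, and the heuristic that the longest-talking speaker is the participant are all hypothetical.

```python
# Hypothetical sketch: keep participant-only speech via speaker diarization.
# Assumptions (not lab policy): file names, HF token, and the heuristic that
# the speaker with the most talk time is the participant.
from pyannote.audio import Pipeline
from pydub import AudioSegment

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HF_TOKEN",  # replace with a real Hugging Face token
)

diarization = pipeline("raw_sample.wav")

# Tally total speaking time per diarized speaker label.
totals = {}
for turn, _, speaker in diarization.itertracks(yield_label=True):
    totals[speaker] = totals.get(speaker, 0.0) + (turn.end - turn.start)

# Heuristic: the participant talks most; clinician prompts are brief.
participant = max(totals, key=totals.get)

# Concatenate only the participant's turns into a "clipped" file.
audio = AudioSegment.from_wav("raw_sample.wav")
clipped = AudioSegment.empty()
for turn, _, speaker in diarization.itertracks(yield_label=True):
    if speaker == participant:
        clipped += audio[int(turn.start * 1000):int(turn.end * 1000)]

clipped.export("clipped_sample.wav", format="wav")
```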
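For step 4.a.1, a hedged sketch of dB-threshold edge-silence trimming with pydub. The actual Acoustic Pipeline script may work differently; the -40 dBFS threshold, the default chunk size, and the file names are assumptions.

```python
# Sketch of trimming leading/trailing silence by a dB threshold.
# The -40 dBFS threshold is an assumption; the Acoustic Pipeline's
# actual detection logic may differ.
from pydub import AudioSegment
from pydub.silence import detect_leading_silence

def trim_edge_silence(in_path, out_path, threshold_dbfs=-40.0):
    audio = AudioSegment.from_file(in_path)
    # Milliseconds of silence at the start of the file
    lead_ms = detect_leading_silence(audio, silence_threshold=threshold_dbfs)
    # Reverse the audio to measure trailing silence the same way
    trail_ms = detect_leading_silence(audio.reverse(), silence_threshold=threshold_dbfs)
    trimmed = audio[lead_ms:len(audio) - trail_ms]
    trimmed.export(out_path, format="wav")
    return trimmed

trim_edge_silence("clipped_sample.wav", "clipped_sample_trimmed.wav")
```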

4. b. Audio sample goes through the following steps:

a. Processed through Whisper. MAYBE. Currently run on TACC (need to see if Kesha has a more verbatim model for Whisper yet). We would like this included, but it seems like the step that could get pricey; it would need to be available only to certain users (true of the steps above and below as well). A sketch of this step follows below.

b. Transcriber finalizes the Whisper transcript for CLAN (corrects and formats it). MAYBE?? This could probably happen OUTSIDE of the app, with an option to re-upload when finished so that the following steps happen automatically. (How hard do we think it would be to automate?)

c. Transcript is formatted for CLAN so we can derive specific features from CLAN’s system (a minimal CHAT example follows below).

d. Transcript is then stripped using a specific script to ensure that elements that can negatively influence the linguistic pipeline aren’t included. MAYBE. We would definitely like this included. A sketch of this step also follows below.

e. Linguistic features are extracted by the linguistic pipeline (Kesha’s script). MAYBE. We would definitely like this included.
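For step a, a minimal sketch using the open-source openai-whisper package; the lab's TACC setup may differ, and the model size, language code, and file name here are assumptions.

```python
# Sketch of the Whisper transcription step (assumed model/options).
import whisper

model = whisper.load_model("large-v3")  # smaller models ("base", "medium") are cheaper

result = model.transcribe(
    "clipped_sample_trimmed.wav",
    language="es",          # set per participant (e.g., "es" Spanish, "ca" Catalan)
    word_timestamps=True,   # timestamps help downstream pause/fluency checks
)

print(result["text"])
for seg in result["segments"]:
    print(f'{seg["start"]:.2f}-{seg["end"]:.2f}  {seg["text"]}')
```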
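For step c, a minimal illustration of the CHAT format that CLAN expects: headers on @-lines, participant speech on *-tiers, with a tab character after each colon. The tier code, @ID fields, and sentences are placeholders; the lab's CHAT conventions may differ.

```
@Begin
@Languages:	spa
@Participants:	PAR Participant
@ID:	spa|corpus|PAR|||||Participant|||
*PAR:	la niña está corriendo en el parque .
*PAR:	y el perro la sigue .
@End
```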
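And for step d, a hypothetical sketch of the stripping script. The SOP does not specify which elements are removed, so the CHAT markers stripped below (bracketed codes, fillers, shortening parentheses) are illustrative assumptions.

```python
# Hypothetical stripping step: remove CHAT annotation that could skew the
# linguistic pipeline. Which markers the lab actually strips is an assumption.
import re

def strip_chat_markers(text: str) -> str:
    text = re.sub(r"\[[^\]]*\]", "", text)    # bracketed codes, e.g. [//], [: word]
    text = re.sub(r"&[-=+]?\S+", "", text)    # fillers/events, e.g. &-um, &=laughs
    text = re.sub(r"\+\S+", "", text)         # linkers/terminators, e.g. +..., +//.
    text = re.sub(r"\((\w*)\)", r"\1", text)  # unfold shortenings: (be)cause -> because
    return re.sub(r"\s+", " ", text).strip()

stripped_lines = []
with open("transcript.cha", encoding="utf-8") as f:
    for line in f:
        if line.startswith("*PAR:"):          # keep participant tiers only
            body = line.split("\t", 1)[-1]    # text after the tier code
            stripped_lines.append(strip_chat_markers(body))

print("\n".join(stripped_lines))
```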

MUCH LATER: Graphs showing how the person did on each feature relative to controls (or others with the same diagnosis) on the same task.