Connected Speech SOP
Connected speech refers to natural, continuous spoken language produced in conversation or narrative, rather than isolated words or sentences. It provides rich acoustic and linguistic information on lexical retrieval, syntax, fluency, and speech rate, making it a sensitive measure of language impairment in PPA.
The administration and data analysis of the connected speech samples constitute the research for Aim 2: Identify bilingualism factors associated with differential patterns of language impairment in Hispanics with PPA using metrics derived from connected speech.
Overview of the processes and this guide
Connected speech work involves several processes. Our main communication takes place through R01_CS_Data_Processing | Multilingual Aphasia and Dementia Research Lab | Microsoft Teams and on the wiki: https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/482182128
| Process and guide | Current leads | Responsibilities of the lead |
|---|---|---|
| Connected Speech supervisor | | |
| Data processing lead: Jada Li | | |
| Administration, Recording and Saving audios | | |
| https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/342033746 https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/134283924 | Speech and Language Pathologists DFT Sant Pau (Depanem): Jesús | |
| Clipping+Whisper team (Arely, Aaliyah, Carmen, Jimena C, Jimena P, Jada). Meeting: biweekly meeting with student leads; any other student on the team is welcome to join if they are in the lab on Thursdays at 3:30 pm: https://cloud.wikis.utexas.edu/wiki/x/8IO9H | | |
| https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197971 | Arely Aguilar | |
| https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197201 | Aaliyah M Segura | |
| Transcription team (Whendy, Helena, Jaume) | | |
| https://cloud.wikis.utexas.edu/wiki/spaces/MADRWiki/pages/56197560 | | |
| Data processing team | | |
| Project lead | | |
| Project lead | | |
| Project lead | | |
Detailed methodology
Training videos
Training videos for each of these processes are available here: CS_Connected SpeechR01Protocol_Training Videos
Data structure overview
Overview of the Connected Speech folder structure, which explains how the files are saved in Box:
The Connected Speech data management Smartsheets are here: CONNECTED SPEECH DATA MANAGEMENT SMARTSHEET FOLDER
All participants follow this structure, regardless of treatment site (Mexico, Barcelona, Austin, etc.) (Decision 20241115)
Pre-R01 Connected Speech samples
Before April 2024, all Connected Speech samples were recorded via Zoom. These samples are part of the Connected Speech Data Raw, but the videos/audios are kept in a separate folder because they were recorded differently and are in a different format.
The Pre-R01 tasks include:
STUDY: Therapy trial pre-R01.
Picnic Scene
Cat Rescue
Important Event
STUDY: SpeechFTLD A and B, samples can currently be found here: https://utexas.box.com/s/mu7f0437ls63p22wy4on6f04jxotpgx2
Procedural task: Brushing teeth
Picnic Scene
- 20251001 Dr. Grasso and Sonia meeting: for Pre-R01 samples, the Connected Speech smartsheets currently list the 1st visit date of each timepoint as the administration date of the Connected Speech samples. We plan to record the exact date of administration for every timepoint so the data are more accurate. For now, we will use the dates extracted from the MADR participant smartsheet for all timepoints except POST, because the MADR participant smartsheet has no post timepoint. Jada (RA student) is currently adding the exact post-tx dates to the Connected Speech smartsheets. Once the post-tx dates are added for both Spanish and Catalan, we plan to start replacing the remaining approximate dates with exact ones.
Scheme of where the pre-R01 samples are saved:
Overview of Processes/ Procedures
The general overview of processing procedures is as follows. Each step is annotated with whether it can be included in the app (Yes/No/Maybe):
Sample is recorded: YES.
Sample is saved in a folder on Box: NO, and maybe not necessary.
Sample is clipped to ensure no clinician speech or background noise is included: MAYBE. This could probably happen outside of the app, with an option to re-upload when finished so that the following steps happen automatically. Could diarization take care of this? Kesha, has it gotten any better? If so, clipping could happen in the app, with a choice of whether or not to apply diarization so that only patient speech remains.
Cody says there should be a way to clip within the app so that users can stay in the app.
We still want an upload option for audio from older samples.
4. a. Audio sample is run through the following steps:
A script that cuts pauses from the start and end of samples (Kesha confirmed this is correct and is already in the Acoustic Pipeline): MAYBE. Would definitely like this included. Kesha, is silence currently detected via a specific dB level? The script takes in an audio sample and determines, for each audio segment, whether it is silent or not. Kesha will look into it.
Audio is run through the Acoustic Pipeline (Kesha’s script): MAYBE. Would definitely like this included.
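To make the pause-cutting step above concrete, here is a minimal sketch of one common approach: split the audio into short frames, measure each frame's level in dB, and drop the silent frames at the start and end only. This is not the Acoustic Pipeline's code. The function names, the -40 dB threshold, and the 20 ms frame length are all assumptions chosen for illustration, and whether the real script uses a dB threshold at all is still an open question for Kesha.

```python
import math


def frame_dbfs(frame):
    """Return the RMS level of one frame in dB relative to full scale (1.0)."""
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def trim_silence(samples, sample_rate, threshold_db=-40.0, frame_ms=20):
    """Drop leading and trailing frames whose level is below threshold_db.

    samples: list of floats in [-1.0, 1.0].
    threshold_db and frame_ms are hypothetical defaults, not values
    taken from the lab's Acoustic Pipeline.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    # Indices of frames that are loud enough to count as non-silent.
    voiced = [i for i, f in enumerate(frames) if frame_dbfs(f) >= threshold_db]
    if not voiced:
        return []  # the whole sample is below the threshold
    start = voiced[0] * frame_len
    end = min(len(samples), (voiced[-1] + 1) * frame_len)
    # Pauses between the first and last voiced frame are kept untouched.
    return samples[start:end]


# Example: 100 silent samples, 200 "speech" samples, 100 silent samples.
audio = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
trimmed = trim_silence(audio, sample_rate=1000)
print(len(trimmed))
```

Only the edges are trimmed, so pauses inside the sample, which may themselves be informative for fluency measures, are preserved.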
4. b. Audio sample goes through the following steps:
a. Processed through Whisper: MAYBE. Currently run on TACC (need to see if Kesha has a more verbatim model for Whisper yet). Would like this included, but it seems like the step that could get pricey. It would need to be available only to certain users (true of the steps above and below as well).
b. Transcriber finalizes the Whisper transcript for CLAN (corrects and formats it): MAYBE? This could probably happen outside of the app, with an option to re-upload when finished so that the following steps happen automatically (how hard do we think it would be to automate it?).
c. Transcript is formatted for CLAN so we can derive specific features from CLAN’s system
d. Transcript is then run through a script that strips specific elements that could negatively influence the linguistic pipeline: MAYBE. Would definitely like this included.
e. Linguistic features are extracted by the linguistic pipeline (Kesha’s script): MAYBE. Would definitely like this included.
MUCH LATER: Some graphs showing how the person did on each feature relative to controls (or others with the dx) on the same task
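As an illustration of the transcript-stripping step (4b, step d), the sketch below takes a CLAN/CHAT-formatted transcript, keeps one speaker's main-tier lines, and removes a few common CHAT annotations. It is not the lab's stripping script, and this SOP does not list which elements that script removes. The function name, the `PAR` speaker code, and the set of stripped codes (bracketed codes such as `[/]`, `&`-prefixed fillers and fragments, and `xxx`/`yyy`/`www`) are assumptions made for the example.

```python
import re


def strip_chat_transcript(chat_text, speaker="PAR"):
    """Return the given speaker's utterances as plain text, one per line.

    Hypothetical helper: the codes removed here are examples only and may
    not match what the lab's own stripping script removes.
    """
    utterances = []
    for line in chat_text.splitlines():
        if not line.startswith(f"*{speaker}:"):
            continue  # skip @ headers, % dependent tiers, and other speakers
        text = line.split(":", 1)[1]
        text = re.sub(r"\[[^\]]*\]", " ", text)         # bracketed codes, e.g. [/] or [//]
        text = re.sub(r"&\S+", " ", text)               # &-um fillers and &+ fragments
        text = re.sub(r"[<>]", " ", text)               # scope markers (the words inside are kept)
        text = re.sub(r"\b(xxx|yyy|www)\b", " ", text)  # unintelligible or untranscribed material
        text = re.sub(r"\s+", " ", text).strip()        # collapse whitespace
        text = re.sub(r"\s+([.?!])$", r"\1", text)      # attach the final punctuation mark
        if text.strip(".?! "):                          # drop utterances left empty after stripping
            utterances.append(text)
    return "\n".join(utterances)


sample = (
    "@Begin\n"
    "*INV:\ttell me about the picture .\n"
    "*PAR:\tthe &-um boy [/] boy is flying a kite .\n"
    "%com:\tpoints at the picture\n"
    "*PAR:\txxx .\n"
    "@End\n"
)
print(strip_chat_transcript(sample))
```

This sketch ignores multi-line utterances (continuation lines that start with a tab) and keeps repeated or retraced words; whether those should be removed depends on what the linguistic pipeline expects.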