VISTA Reliability - Blinding Procedure

Because our projects have multiple steps in which lab members must transcribe patients' speech, we have decided to use a system that takes advantage of work that has already been done. Checking the reliability of transcriptions normally requires comparing the transcriptions of two (or more) individuals and identifying any discrepancies between them. Because we derive two independent pipelines for VISTA data (POM and Connected Speech), this would typically require 4 individuals to transcribe patients' data (2 for POM and 2 for Connected Speech). However, we elected to use a reliability structure that requires only 3 individuals by using the transcriptions of one transcriber (whom we denote as Rater 2) for both POM and Connected Speech reliability. This minimizes the work required to assess reliability.

 

For a diagram, please see here.

 

In order to use Rater 2’s transcriptions for both POM and Connected Speech, we need to control for potential biases in the same way across both procedures. In other words, we need to be careful about how much information we give the Raters/transcribers for any given observation that they need to transcribe. Any information that must be withheld from the transcribers to prevent bias must be blinded. While the specific factors that are relevant for this process differ according to whether Reliability is for POM or Connected Speech (see Script Selection boxes below), here are some key factors:

  • Patient ID

  • Language (Spanish, Catalan)

  • Observation (pre1, pre2, post1, tx session, post2, 3mo, 6mo, 12mo)

  • Script (1, 2, 3, 4, 5, 6, 7, 8)

  • Script training status (trained, untrained)

 

Our goal is to harmonize the two VISTA Reliability procedures as much as possible. For blinding, there are some differences that need to be addressed. For POM, the SLP will always be Rater 1. As such, they will have access to the Patient ID, the language, the observation, the script, and the script training status. For Connected Speech, however, both transcribers must be blinded to the observation time point of the audio.

We have decided that the most straightforward way to unify these processes is to have VISTA POM Reliability use clipped audio samples (rather than session videos, as was done previously) that come directly from the current transcription pipeline, along with different transcripts to serve as the base for each transcriber's coding. The Reliability Supervisor will provide Transcriber 1 with the SLP’s POM transcript and Transcriber 2 with the Whisper transcript. This means that the clipping and Whisper teams will carry on as normal for the VISTA Connected Speech sessions, with the addition of the two treatment sessions needed for VISTA POM Reliability.

 

The Reliability Supervisor will achieve the above goals by blinding the materials needed for Reliability in the Box folders and linking to these blinded materials in the relevant SmartSheets. This means that both the audio and transcript files for the Raters to use will need to have a naming convention that hides the observation time point. VISTA POM and Connected Speech have different requirements for the number of transcriptions that must be checked and the number of possible observations that can be chosen. Read through the tabs in this table for a more in-depth explanation and overview of each procedure’s script selection and rationale.

Script selection for VISTA POM takes into account the language of the scripts and the observation time point. For VISTA POM, there are 6 total sessions. Unlike Connected Speech, however, Reliability for POM involves checking ALL scripts for each observation point. Also note that the Pre_1 and Post_1/Post_2 sessions are probed in one language per session (either Spanish OR Catalan/English, depending on the multilingual being probed), that the treatment sessions have scripts probed in both languages, and that the post-treatment session can be either Post_1 or Post_2.

Language \ Observation | Pre_1 | Tx_Phase1 | Tx_Phase2 | Post_1 OR Post_2
Spanish | ✓ | - | - | -
Catalan/English | ✓ | - | - | -
Spa + Cat/Eng | - | ✓ | ✓ | -
Spanish | - | - | - | ✓
Catalan/English | - | - | - | ✓

VISTA POM Reliability will use blinded SmartSheets as sources for materials. The Reliability Supervisor will populate the SmartSheet with the Box links to the blinded materials.

As shown above, VISTA POM Reliability requires that we choose two treatment sessions for Reliability, so we need to be careful in applying randomization to treatment session selection. There are 18 treatment sessions split across two phases (the first 9 treatment sessions in phase 1 and the last 9 treatment sessions in phase 2). We can use a Random Number Generator to draw numbers from 1-18 until we have one number that falls between 1-9 for Tx_Phase1 and one number that falls between 10-18 for Tx_Phase2.
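The selection rule above can be sketched in Python. This is a minimal illustration, not the lab's actual tool: the function name `pick_treatment_sessions` and the dictionary keys are hypothetical. It draws directly from each phase's range, which should give the same uniform result as generating numbers from 1-18 until one lands in each phase.

```python
import random

PHASE1_SESSIONS = range(1, 10)   # treatment sessions 1-9
PHASE2_SESSIONS = range(10, 19)  # treatment sessions 10-18


def pick_treatment_sessions(rng=None):
    """Return one randomly chosen session number per treatment phase.

    `rng` is an optional random.Random instance; passing a seeded one
    makes the selection reproducible (an assumption of this sketch,
    not a documented requirement of the procedure).
    """
    if rng is None:
        rng = random.Random()
    return {
        "Tx_Phase1": rng.choice(PHASE1_SESSIONS),
        "Tx_Phase2": rng.choice(PHASE2_SESSIONS),
    }


if __name__ == "__main__":
    # One draw per patient; the seed here is arbitrary.
    print(pick_treatment_sessions(random.Random(2025)))
```

Recording the seed (or the drawn numbers) alongside each patient would make it possible to show later how the two sessions were chosen.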

Mid-Tx (between Tx_Phase1 and Tx_Phase2) will be a good time to do Reliability for the first two sessions (Pre_1 and Tx_Phase1).

Reliability for Connected Speech will use a different approach to randomization than VISTA POM. We have randomized which scripts will be selected for each Observation rather than for each individual patient. For each observation, one trained script (1, 2, 3, 5, 6, 7) and one untrained script (4, 8) will be selected for each patient. For trained scripts, we rotate through so that all trained scripts are selected at least once. For untrained scripts, we alternate between script 4 and script 8.

The Reliability Supervisor will need to blind different materials for each Transcriber. For both Transcribers, the audio files for the relevant time point must be blinded and linked. For Transcriber 1, the SLP’s POM transcript will need to be pasted into the SmartSheet. For Transcriber 2, the Whisper transcript will need to be blinded and linked in the SmartSheet.

Observations: Pre_1, Pre_2, Post_1, Post_2, 3moFU*, 6moFU, 12moFU

Script | Status | Observations ticked (count)
1 | Trained | 1
2 | Trained | 2
3 | Trained | 1
4 | Untrained | 3
5 | Trained | 1
6 | Trained | 1
7 | Trained | 1
8 | Untrained | 4

* Not all participants will have a 3-month follow-up.
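The rotation logic for Connected Speech can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure that produced the schedule above: the function name `build_schedule`, the shuffled starting order, and the output layout are all hypothetical. It builds one schedule that is shared by all patients, cycling through the trained scripts and alternating between the two untrained scripts.

```python
import random
from itertools import cycle

OBSERVATIONS = ["Pre_1", "Pre_2", "Post_1", "Post_2", "3moFU", "6moFU", "12moFU"]
TRAINED = [1, 2, 3, 5, 6, 7]
UNTRAINED = [4, 8]


def build_schedule(rng=None):
    """Assign one trained and one untrained script to each observation."""
    if rng is None:
        rng = random.Random()

    # Randomize the starting order once, then rotate through it.
    trained_order = TRAINED[:]
    rng.shuffle(trained_order)
    untrained_order = UNTRAINED[:]
    rng.shuffle(untrained_order)

    trained_cycle = cycle(trained_order)
    untrained_cycle = cycle(untrained_order)

    schedule = {}
    for obs in OBSERVATIONS:
        schedule[obs] = {
            "trained": next(trained_cycle),
            "untrained": next(untrained_cycle),
        }
    return schedule


if __name__ == "__main__":
    for obs, scripts in build_schedule(random.Random(7)).items():
        print(obs, scripts)
```

With 7 observations and 6 trained scripts, rotating guarantees that every trained script is selected at least once and exactly one is selected twice; alternating the untrained scripts gives one of them 4 selections and the other 3.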

For the naming convention, we simply remove the observation and the date from the clipped Data Raw files and replace them with the code below for the corresponding observation. Example: BISE004_Literatura_AU

Codename | Meaning
AU | Pre1
AD | Pre2
PU | Post1
PD | Post2
TE | 3mo
SE | 6mo
DD | 12mo
LC | trained
BS | untrained
NN | treatment 1
WW | treatment 2
RR | treatment 3
UO | treatment 4
VI | treatment 5
II | treatment 6
VV | treatment 7
GG | treatment 8
NI | treatment 9
EE | treatment 10
VE | treatment 11
LE | treatment 12
TI | treatment 13
TU | treatment 14
TF | treatment 15
TX | treatment 16
TN | treatment 17
TH | treatment 18
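As a rough illustration of the naming convention, the code table can be expressed as a lookup that builds a blinded file name. This is a sketch under assumptions: the helper names `observation_code` and `blinded_name` are hypothetical, and the layout `<patient>_<script>_<code>` is inferred only from the single example BISE004_Literatura_AU. The format of the unblinded Data Raw file names is not specified here, so the sketch takes the parts as separate arguments instead of parsing a file name. The trained/untrained codes (LC, BS) are left out because their position in the name is not described.

```python
# Observation codes, copied from the table above.
OBSERVATION_CODES = {
    "Pre1": "AU", "Pre2": "AD", "Post1": "PU", "Post2": "PD",
    "3mo": "TE", "6mo": "SE", "12mo": "DD",
}

# Treatment session codes, keyed by session number 1-18.
TREATMENT_CODES = dict(zip(range(1, 19), [
    "NN", "WW", "RR", "UO", "VI", "II", "VV", "GG", "NI",
    "EE", "VE", "LE", "TI", "TU", "TF", "TX", "TN", "TH",
]))


def observation_code(observation):
    """Map an observation label (str) or treatment session number (int) to its code."""
    table = TREATMENT_CODES if isinstance(observation, int) else OBSERVATION_CODES
    if observation not in table:
        raise ValueError(f"Unknown observation: {observation!r}")
    return table[observation]


def blinded_name(patient_id, script_name, observation):
    """Build a blinded name such as 'BISE004_Literatura_AU' (assumed layout)."""
    return f"{patient_id}_{script_name}_{observation_code(observation)}"


if __name__ == "__main__":
    print(blinded_name("BISE004", "Literatura", "Pre1"))  # BISE004_Literatura_AU
    print(blinded_name("BISE004", "Literatura", 18))      # BISE004_Literatura_TH
```

Keeping the codes in one lookup means the audio and transcript files for the same observation get the same code, so the Raters cannot infer the time point from a mismatch between the two names.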

Page-specific notes:

July 15, 2025:
- Updated the page to use advanced tabs/tables.

March 7, 2025:
- All the scripts probed in the session are checked, and the source of the sample is changed from the video of the session to the individual audio files of the scripts.
- We can randomly select the two treatment sessions for each patient.
- Pre1 and Post sessions will be probed in one language per session. The Tx sessions will include probes in both languages during each session.
- When we do this by language, only the Pre + Post observations/sessions will be language-specific because the Tx sessions include both languages.
- Mid-Tx will be a good time to do Reliability for the first two observations.