7. Supervision of Connected Speech Data

Procedures

Connected Speech Mastersheets

Maintain and periodically review the Connected Speech Data Quality smartsheets, which hold the up-to-date data for all connected speech procedures:

  • These sheets provide an overview of the full Connected Speech workflow, from raw data to analysis-ready output.

  • Supervisors ensure that no data is missing, filling in gaps themselves and, where necessary, sending reminders to the relevant teams to complete the required information.

Recommendation: create a filter per participant to check for and fill in missing information.

image-20251105-005325.png
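
The per-participant check can also be scripted on an exported copy of a sheet. The following is a minimal sketch in Python, assuming the sheet has been exported to CSV; the column names (`Participant ID`, `Timepoint`, `Picnic Scene Date`), the sample rows, and the function name `missing_by_participant` are hypothetical and should be adapted to the real sheet.

```python
import csv
import io
from collections import defaultdict


def missing_by_participant(csv_text, id_column="Participant ID"):
    """Return {participant: [(row_number, column), ...]} for every empty cell."""
    gaps = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        participant = (row.get(id_column) or "").strip() or "(no ID)"
        for column, value in row.items():
            # Skip the ID column itself and any overflow cells (key None).
            if column is None or column == id_column:
                continue
            if value is None or not value.strip():
                gaps[participant].append((row_number, column))
    return dict(gaps)


# Hypothetical export: column names and rows are illustrative only.
SAMPLE = """Participant ID,Timepoint,Picnic Scene Date
BILP006,Pre,2024-01-10
BILP006,Post,
BILP007,Pre,2024-02-01
"""

report = missing_by_participant(SAMPLE)
for participant, cells in report.items():
    print(participant, cells)
```

Participants with no empty cells do not appear in the report, so the output doubles as a list of who still needs follow-up.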

R Dashboards and Data Visualization

  • The Connected Speech Dashboard in R provides a real-time visualization of data completeness and progress, using data from the Spanish Connected Speech Data Analysis and Catalan Connected Speech Data Analysis sheets.

    • Track project metrics such as sample size for a specific study, available visit pairs (e.g., Pre–Follow-up, Pre–Post), and days between visits (for sanity checks and for longitudinal studies).

  • Supervisors use these dashboards to monitor trends, confirm data coverage, and identify outliers or missing entries.

    • When errors are detected, refer to the REDCap reports and the MADR participant smartsheet to check the data, and correct it in the MADR participant smartsheet. If clarification is needed from clinicians or data entry, raise it in the R01_CS_Data_Processing Teams channel, tagging the clinicians.

  • R project dashboard: https://utexas.box.com/s/96jez19c5hdh2w2rpov4w7addvt95pw1
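
To illustrate the visit-pair and days-between-visits sanity check described above, here is a minimal sketch in Python. It is not the dashboard's R code; the function names, the rule that a non-positive gap is suspicious, and the sample dates are assumptions made for illustration.

```python
from datetime import date

# Visit pairs named in this guide; extend as needed.
VISIT_PAIRS = [("Pre", "Post"), ("Pre", "Follow-up")]


def visit_pair_gaps(visits, pairs=VISIT_PAIRS):
    """Return {(first, second): days_between} for each pair with both dates present."""
    gaps = {}
    for first, second in pairs:
        if first in visits and second in visits:
            gaps[(first, second)] = (visits[second] - visits[first]).days
    return gaps


def suspicious_pairs(gaps):
    """Pairs whose second visit is not after the first: a likely date or label error."""
    return [pair for pair, days in gaps.items() if days <= 0]


# Hypothetical participant with two recorded visits.
visits = {"Pre": date(2024, 1, 10), "Post": date(2024, 3, 20)}
gaps = visit_pair_gaps(visits)
print(gaps)
print(suspicious_pairs(gaps))
```

A pair is only reported when both visits have a date, so the number of keys returned per participant is also the number of available visit pairs.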


REDCap Data Monitoring


SmartSheet Data Monitoring

  • Use the MADR Participant SmartSheet as the master tracking sheet for decisions about participant inclusion in the connected speech smartsheets.

    • Clinicians update the sheet after each visit or follow-up to indicate:

      • Participant status (timepoints completed, pending, withdrawn, DNQ (Do Not Qualify))

      • Corrections in timepoints or visit labels

  • Supervisors cross-reference this SmartSheet to ensure all data is reflected consistently across systems.

  • Check the participant naming convention here: https://cloud.wikis.utexas.edu/wiki/x/IrXhB


Quality Control and Review - Connected Speech Supervisor

  • Weekly: Review REDCap reports and SmartSheets for missing or inconsistent data.

  • Monthly: Audit R dashboard summaries and confirm data integrity across projects.

  • Quarterly (or before the start of each cohort): Check the progress of the different processes (clipping, Whisper, transcription, reliability), using Smartsheet reports to visualize it.

Procedure When a Connected Speech Sample Is Missing or Data Are Inconsistent

  1. Checking Available Data

  • Verify expected timepoints using the MADR participant smartsheet and:

    • assessment timeline (in Participants folder)

    • REDCap (Connected Speech instrument).

  • Check the Connected Speech Smartsheets for existing rows and confirm that each row corresponds to an actual recording. (Note: if no audio, video, or transcription is available, there will be no row in the Connected Speech Smartsheet.)

  • Check in Box for the participant’s video/audio:

    • the Connected Speech Data Raw folder

    • the ALL IP SESSIONS folder (a backup folder for all files; clinicians upload audio and video recordings here and then copy them to Connected Speech Data Raw)

    • the participant’s Box folder

  • Identify any mismatches (row without file, file without row, wrong date, etc.).

  2. Correcting Mismatches: Copying Files if They Exist

  • If the recording exists but is misplaced or mislabeled, copy it to the correct timepoint folder and rename using CS conventions.

  • If a file exists but there is no Smartsheet row, add the row.

  • If the Smartsheet row exists but the file does not, delete the row.

  3. Adding Notes and Documenting Inconsistencies

  • In Smartsheet: Add comments explaining added/removed rows or missing recordings.

  • In REDCap: Add a note in the Connected Speech instrument indicating missing or inconsistent data.

  • In Box: Ensure final folder structure and naming reflect the corrected timepoint status.
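
The row/file comparison in this procedure can be sketched as a set comparison. The following is a minimal illustration in Python, assuming each Smartsheet row and each Box recording has already been reduced to a (participant, timepoint, task) key; that key format, the function name `reconcile`, and the sample data are hypothetical. Before treating a row as having no file, check all three Box locations listed above.

```python
def reconcile(sheet_rows, box_files):
    """Classify mismatches between Smartsheet rows and Box recordings.

    Both arguments are sets of (participant, timepoint, task) keys.
    """
    return {
        # Recording exists but has no row: per this guide, add the row.
        "file_without_row": sorted(box_files - sheet_rows),
        # Row exists but no recording was found: per this guide, delete the row.
        "row_without_file": sorted(sheet_rows - box_files),
        # Row and recording agree: nothing to do.
        "matched": sorted(sheet_rows & box_files),
    }


# Hypothetical keys for one participant.
sheet_rows = {("BILP006", "Pre", "Picnic Scene"), ("BILP006", "Post", "Picnic Scene")}
box_files = {("BILP006", "Pre", "Picnic Scene"), ("BILP006", "3m", "Picnic Scene")}

result = reconcile(sheet_rows, box_files)
for category, keys in result.items():
    print(category, keys)
```

This only detects presence mismatches; wrong dates or mislabeled timepoints still need a manual check against the MADR participant smartsheet and REDCap.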

Data Analysis Tasks

Tasks are assigned through the student supervisor chat.

Recurrent tasks:

  • Create filters for new participants to help visualize them in the Smartsheet report, e.g.:

image-20251126-173527.png

Assigned Tasks:

@Jada Li Add the exact date of administration for all pre-R01 CS samples, working in this order: post, pre, 3m, 6m, 12m. We start with post to make quick progress on including participants in the longitudinal picnic scene PPA study, but we want exact dates for all three tasks: important event, cat rescue, and picnic scene.
@Jada Li Create 3 columns in the English Smartsheet for the dates of picnic scene, cat rescue, and important event.

Future RA Tasks (not assigned yet):

  • High priority:

Add the “In Progress-Codeswitch question” option to TaskName_4.Coded for all connected speech tasks in SPANISH, CATALAN, and ENGLISH. The options should be: “Completed, In Progress, In Progress-Codeswitch question, No audio clip, Not administered”. Sonia completed this for picnic and important event in both Spanish and Catalan.
Create REDCap Connected Speech DellMed reports using the BACC ones as a template, and update this guide (7. Supervision of Connected Speech Data | REDCap Data Monitoring).
  • Mid priority:

1. Finish creating the FRENCH Connected Speech Data analysis smartsheet: delete all rows with participants, but keep all the grey headers (“clinical trial”, “screening”, etc.).
2. Copy the FRENCH smartsheet and create the GERMAN smartsheet.
  • Low priority:

Update Connected Speech Detailed overview and Box folder structure.
Get familiar with the Box participant folder structure and document it: we used to have a tree scheme in Box, so start from that and create a whiteboard in Box.

Completed Tasks

1st task: Catalan: move the pre-R01 data from “Picnic Scene Pre-R01 recorded in (audio format)” to the current “Picnic Description Clip Recorded in” column, and delete the pre-R01 column.
@Jada Li 2nd task: Add pre-R01 dates to the Catalan Connected Speech smartsheet using the MADR_Participant Information: pre, mid, 3m, 6m (actual date of 6m), 12m (actual date of 12m). No connected speech happens in tx or at 9m.
@Jada Li Fill out the Dx and Center of origin columns in CATALAN Connected Speech Data analysis by looking at the SPANISH Connected Speech Data analysis. Feel free to filter by participant, which makes the data easier to fill out: e.g., for BILP006, add the diagnosis in the first row and then copy and paste it to the rest of the rows; do the same with center of origin.
@Jada Li Fill out the mid and 3m dates in SPANISH Connected Speech Data analysis (just to have them there, not for this specific project; a future plan would be to analyze pre-post samples to see the tx effect, but that's not the objective of the current study).
@Jada Li Add a date column for Cat Rescue, Picnic Scene, and Important event in SPANISH and CATALAN.