We need to clean up and de-identify RP023's Intern Application interview transcripts so that we can move on to the data analysis and insights generation phase.
Resources/Instructions: This section is at the bottom of this issue (scroll to the bottom to view it now). You will be asked to add links to this section while completing the issue.
Tip: Use two windows side by side. One with the issue open and the other one with resource links displayed to avoid back and forth. To prevent loss of work, refresh both windows after each edit.
Action Items
UX lead adds the assignee to the Internship - PII's "My Drive" as a Contributor so that they have access to internship - PII's My Drive where the recordings and transcripts with PII are stored
UX lead accesses the Internship - PII's "My Drive"
Choose "Manage members", which is located towards the top right side of the browser
Enter the assignee's email address and select the role as "Content Manager"
Confirm with the assignee that they have access to the Internship - PII's "My Drive"
Customize Resource Links
Customize Resource for Wiki Page Link
Go to the wiki page: Research Output Overview (Resources # 1.01)
Choose the link in the Research by Plan Number section
Locate relevant wiki page for RP023
Copy the link for the wiki page
In Resources # 2.01, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for this Research Plan's Google Drive Folder
Open the Google Drive's Research by Type Folder (Resources # 1.02)
Choose the Intern folder
Choose the RP023 folder
Copy the link of the RP023 folder
Choose the three vertical dots on the right side of the RP023 folder
Choose "Share"
Choose "Copy Link"
In Resources # 2.02, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for Interview Recordings and Transcriptions Tracking Sheet
Open the Research Plan's Google Drive folder in the Internship's shared drive (Resource # 2.02)
Locate the TWE: IS22: RP023: Intern Application Interview Recordings and Transcriptions Tracking Sheet in the folder
Copy the link of TWE: IS22: RP023: Intern Application Interview Recordings and Transcriptions Tracking Sheet
Choose the three vertical dots on the top right side of the file
Choose "Share"
Choose "Copy Link"
In Resources # 2.03, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for De-identified Participants List from Interviews (UXR Excluded) spreadsheet
Open the Research Plan's Google Drive folder in the Internship's shared drive (Resource # 2.02) if it is not open yet
Locate the TWE: IS22: RP023: De-identified Participants List from Interviews (UXR Excluded) in the folder
Copy the link of TWE: IS22: RP023: De-identified Participants List from Interviews (UXR Excluded)
Choose the three vertical dots on the top right side of the file
Choose "Share"
Choose "Copy Link"
In Resources # 2.04, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for the De-identified Transcripts Folder in the Shared Drive
Go to RP023 folder (Resources # 2.02)
Locate the RP023 De-identified Transcripts folder
Copy the link of the folder
Choose the three vertical dots on the right side of the folder
Choose "Share"
Choose "Copy Link"
In Resources # 2.05, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for Interview Recording Folder stored in the Internship - PII's My Drive
Log into your Google account associated with TWE project so you will be able to access the Internship - PII's 'My Drive' in the next steps
Choose the TWE Raw Data Collection by Plan # folder in Internship - PII's My Drive (linked in Resources # 1.03)
Locate the video recording folder for RP023 inside the folder TWE Raw Data Collection by Plan #
Copy the link of the video recording folder for RP023
Choose the three vertical dots on the right side of the folder
Choose "Share"
Choose "Copy Link"
In Resources # 2.06, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resource for Participants List from Interviews (UXR Excluded) spreadsheet
Open the RP023 Interview Recording Folder in the Internship - PII drive (Resource # 2.06)
Locate the TWE: IS22: RP023: Participants List from Interviews (UXR Excluded) in the folder
Copy the link of TWE: IS22: RP023: Participants List from Interviews (UXR Excluded)
Choose the three vertical dots on the top right side of the file
Choose "Share"
Choose "Copy Link"
In Resources # 2.07, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Customize Resources for the Recording and Transcript you are Assigned to
Check the title of this issue to identify the participant number, which comes after De-identify:
Open the TWE: IS22: RP023: Intern Application Interview Recordings and Transcriptions Tracking Sheet (Resources # 2.03)
Locate the recording video in .mp4 format that matches the participant number in Column B of the tracking sheet
Copy the link of the recording that matches the participant number in Column B of the tracking sheet
In Resources # 2.08, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Locate the transcript in .txt format that matches the participant number in Column C of the tracking sheet
Copy the link of the .txt file that matches the participant number in Column C of the tracking sheet
In Resources # 2.09, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Prepare for cleaning up and de-identification by converting the .txt file into a Google Doc
Click on the link in Column C Transcription Link that matches the participant number (Resources # 2.09)
Choose Open with Google Docs
A Google Doc is generated in a new window with the same .txt file name
The Google Doc is now saved into the same folder with the corresponding video and .txt file
Make a copy of the Google Doc that was just generated and rename it
Choose "File" in the Google Doc you recently created
Choose "Make a copy"
In the Name text box, delete Copy of from the file name
Copy
To be de-identified-
Paste what you copied into the beginning of the file name text box
The new file name should be formatted like "To be de-identified-RP023-UX Researcher Abbreviation###-Intern Abbreviation###"
An example: "To be de-identified-RP006-U007-I001"
Choose "Make a copy"
Copy the link of the "To be de-identified" Google Doc transcript you just created
Choose "Share"
Choose "Copy link" and "Done"
In Resources # 2.10, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Clean up and de-identify the transcript
Open the "To be de-identified" Google Doc transcript if it is not open yet (Resource # 2.10)
Listen to the recording to get yourself familiar with the transcript
Go to Resources # 1.04 to get a refresher on the cleaning up and de-identification process. This is particularly important if it is your first time cleaning up an interview transcript.
No need to correct all grammatical mistakes in the transcript because the transcript needs to stay authentic
Check Resources # 1.04.01 to understand the conventions for cleaning up and de-identifying the transcript, so you can learn to use them properly.
Watch the videos in Resources # 1.04.02 as they walk you through the basics on how to clean up and de-identify the transcript so we can stay consistent in this process
Read Resources # 1.04.03 and 1.04.04 for more best practices to clean up and de-identify an interview transcript
Follow these guidelines on transcript formatting:
Use a six-digit format (HH:MM:SS) for all timestamps; for example, 01:01:01 means hour one, minute one, second one
Use "UXR ###" and "Intern ###" throughout the transcript to indicate the interviewer and interviewee. See an example here in RP012 Intern 008's transcript.
If there were other people recorded on the transcript, name the unknown person based on the interview set-up or context, such as Program Manager, Notetaker, etc. If you are not sure what to name them, ask leads on the project for clarification.
If there are numerical values in the transcript other than the timestamps, UXR ###, and Intern ###, transcribe them using APA formatting so we can easily scan the transcript. For example, write out numbers below ten as words ("one", "two"), and use numerals for numbers ten and above ("10", "20"). For more info on number formatting, please visit https://apastyle.apa.org/style-grammar-guidelines/numbers/numerals.
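The two mechanical formatting rules above (six-digit timestamps and APA-style numbers) can be sketched in Python. This is only an illustration: the function names `to_six_digit_timestamp` and `apa_number` are hypothetical and not part of any project tooling, and the sketch assumes timestamps arrive as colon-separated fields such as `5:07` or `1:01:01`.

```python
# Hypothetical helpers illustrating the timestamp and number conventions.

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def to_six_digit_timestamp(raw: str) -> str:
    """Pad a 'SS', 'M:SS', or 'H:MM:SS' style timestamp to 'HH:MM:SS'."""
    parts = [int(p) for p in raw.strip().split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected timestamp: {raw!r}")
    # Fill in missing hour/minute fields with zeros on the left.
    parts = [0] * (3 - len(parts)) + parts
    hours, minutes, seconds = parts
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def apa_number(n: int) -> str:
    """Spell out whole numbers below ten; use numerals for ten and above."""
    if 0 <= n < 10:
        return NUMBER_WORDS[n]
    return str(n)
```

For instance, `to_six_digit_timestamp("5:07")` gives `00:05:07`, and `apa_number(2)` gives `two`. Timestamps still need to be checked against the video by hand.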
Listen to the interview recording again and correct inaccuracies in the transcript as you read along, because the auto transcription is often inaccurate
Separate the texts based on speakers and timestamps (timestamps should match the video file timestamps in case a researcher needs to go back and double-check the original video)
Add any missed words if the transcribing process missed any or to provide more context
Write down exactly what they say, even if it is grammatically incorrect or a topic you are not familiar with
If there are any typos and misidentified words in the transcript, please edit them based on what you hear in the original video because the transcribing software is not 100% accurate
Delete repeated words
Delete non-important verbal fillers such as "um" and "uh". However, you may keep interjections like “hmm”, “woah”, “yeah”, “ohh”, “mmm” because they often carry emotions, reactions, and meanings (e.g. Mmmm [no], Mhmm [yes])
See Resources # 1.04.04 for more on verbal filler words.
When the verbal fillers are distractions and don't serve any purpose, you may remove them.
When the verbal fillers indicate the interviewee's emotions or thoughts, keep them and provide context to demonstrate the emotions or thoughts. E.g., (The participant hesitated for a while before coming up with the answer).
Where needed, add context so that a reader can understand what was happening without watching the video. E.g., he [the mentor] was helpful; or (steps off camera).
Read through the transcripts again and make edits to keep track of and de-identify any personally identifiable information (PII)
Search for the interviewee's name
Replace with their participant number, e.g. Intern 001 is I001.
Search for interviewer or unknown speakers in the transcript
Replace with either their UXR number or their role (e.g. notetaker) associated with the project
Look for any other names being mentioned by the interviewee during the interviews
If a person's name is mentioned, write down the name and relevant information of that individual in the Participants List from Interviews (UXR Excluded) spreadsheet (Resources # 2.07), and assign a participant number to them. You may need to open the spreadsheet in a new tab for easier access.
Their name(s) in the transcript might be spelled wrong, so pay close attention to the original interview video.
If you are not sure of the roles of any of the names mentioned, please list the names, timestamps, and transcript links in the spreadsheet, and leave the role column and the participant number column blank. Then ask Research Lead or Project Lead for clarification.
Please check each participant type's abbreviation under the "Research Documents by Participant Type" section of the Research Output Overview Wiki Page. For example, mentor is M.
The participant numbers are sequential based on the time of entry and occurrence of the mention. For example, if two Hack for LA website team members are already listed in the spreadsheet, then the next Hack for LA website team member will be HfLAWTM003.
If the name is already included in the sheet because other interviewees have mentioned them, please still list out their names and relevant information, and make sure to use the same role and the same participant number that has already been assigned to them so we can keep track of the people being mentioned across all interviews associated with one research plan.
If an interviewee repeatedly mentioned an individual throughout the interview, please list out all the timestamps when the individual was mentioned.
No need to include the interviewee and the UXR names since we track them in the roll call and session table.
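The sequential numbering rule above can be sketched as a small Python helper. This is a hedged illustration only: `next_participant_number` is a hypothetical name, and it assumes every participant number is a type abbreviation followed by exactly three digits, as in the HfLAWTM003 example.

```python
import re
from typing import Iterable


def next_participant_number(existing: Iterable[str], abbreviation: str) -> str:
    """Return the next sequential number for one participant type.

    `existing` holds the participant numbers already in the spreadsheet;
    entries for other participant types are ignored.
    """
    pattern = re.compile(rf"^{re.escape(abbreviation)}(\d{{3}})$")
    used = []
    for participant_number in existing:
        match = pattern.match(participant_number)
        if match:
            used.append(int(match.group(1)))
    next_index = max(used, default=0) + 1
    return f"{abbreviation}{next_index:03d}"
```

Before assigning a new number, still check whether the person already appears in the sheet, since a previously mentioned individual keeps their existing role and number.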
After confirming the names, roles, and other relevant information of other individuals mentioned during the interviews with the Research Lead or Project Lead, use the search and replace function in Google Doc to replace their name(s) in the transcript with the participant number assigned to them in the spreadsheet
Open De-identified Participants List from Interviews (UXR Excluded) spreadsheet (Resources # 2.04), and enter the de-identified info based on Participants List from Interviews (UXR Excluded) spreadsheet (Resources # 2.07), so we have de-identified info in the research plan folder in the shared Internship drive
Read through the transcripts to search for any other identifiable information in the interviews, such as entities, places, etc.
Replace any personally identifiable information (PII) with non-identifiable terms. For example, if their specific school is mentioned, we should redact that info with either [high school] or [college].
If they mentioned any specific issues they worked on, make sure to remove the issue numbers and rephrase the issues they worked on
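The search-and-replace steps above are done by hand in Google Docs, but the idea can be sketched in Python for a plain-text copy of a transcript. The function `replace_identifiers` is hypothetical, and the names in the usage note are made up. A scripted pass would not catch misspelled names, so it does not replace reading the transcript against the recording.

```python
import re
from typing import Dict


def replace_identifiers(transcript: str, replacements: Dict[str, str]) -> str:
    """Replace each identifying term with its de-identified stand-in.

    Keys are names or entities as they appear in the transcript; values are
    participant numbers (e.g. "M001") or bracketed terms (e.g. "[college]").
    """
    # Handle longer terms first so a full name is replaced before a first name.
    for term in sorted(replacements, key=len, reverse=True):
        stand_in = replacements[term]
        pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
        # A function replacement keeps any backslashes in stand_in literal.
        transcript = pattern.sub(lambda _match, s=stand_in: s, transcript)
    return transcript
```

With the made-up mapping `{"Jane Doe": "M001", "Jane": "M001", "Example University": "[college]"}`, the sentence "Jane Doe said Jane helped at Example University." becomes "M001 said M001 helped at [college]."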
Read through the edited transcripts again
Focus on punctuation, readability, and formatting
One recommendation is to install the LanguageTool Chrome Extension (see Resources 1.05) to clean up the punctuation and verbal tics in the transcript. But there is no need to correct grammatical mistakes made by interviewers and interviewees.
When you're satisfied that the transcript is completely de-identified and cleaned up, create a new copy and save it into the shared Research folder
Make a new copy of the transcript by selecting "File" and "Make a copy". This is to make sure that the new version of the Google Doc does not include any editing history or PII.
"Copy document" window pops up
In the "Name" text box, delete the "Copy of To be de-identified-" text at the beginning of the file name
Copy
De-identified Interview Transcript-
Paste the text you just copied to the beginning of the file name in the "Name" text box
The new file name should be formatted like "De-identified Interview Transcript-RP023-UX Researcher Abbreviation###-Intern Abbreviation###"
An example: "De-identified Interview Transcript-RP006-U007-I001"
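The renaming steps above amount to swapping one prefix for another. A minimal Python sketch, assuming the copy's default name starts with "Copy of To be de-identified-" as described; the function name `final_transcript_name` is hypothetical, and the researcher number U007 in the usage note is only a placeholder.

```python
TO_BE_PREFIX = "To be de-identified-"
FINAL_PREFIX = "De-identified Interview Transcript-"


def final_transcript_name(copy_name: str) -> str:
    """Turn 'Copy of To be de-identified-<rest>' into the final file name."""
    expected_start = "Copy of " + TO_BE_PREFIX
    if not copy_name.startswith(expected_start):
        raise ValueError(f"unexpected file name: {copy_name!r}")
    # Keep everything after the old prefix (plan, UXR, and intern numbers).
    return FINAL_PREFIX + copy_name[len(expected_start):]
```

For example, `final_transcript_name("Copy of To be de-identified-RP023-U007-I010")` yields `De-identified Interview Transcript-RP023-U007-I010`.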
Follow the steps below to save the renamed de-identified Google Doc transcript into the RP023 De-identified Transcripts folder on the shared Google Drive. This ensures that the new copy, which does not include any identifiable information, is saved in the shared Google Drive.
Choose the folder icon under "Folder"
Choose "All locations"
Choose Shared drives > Internship > Internships > Research > Research by Participant Type > Intern > folder for RP023 > RP023 De-identified Transcripts
Choose "Select"
Select "Make a copy"
The copy of the file is generated in a new browser
Update the de-identified transcript URL in the Recording and Transcription tracking sheet (Resources # 2.03), De-identified Participants List (Resources # 2.04), Participants List (Resources # 2.07), and in Resource # 2.11
Copy the link of the de-identified transcript in the shared drive
Choose "Share"
Choose "Copy Link"
Open the tracking spreadsheet link in Resources # 2.03 in a new tab
Paste what you copied into the matching participant's cell in Column D Transcription Link - de-identified in the tracking sheet (Resources # 2.03)
If any person (excluding the Interviewee and UXR) is being mentioned in the transcript, paste the de-identified transcript URL in Column E in the De-identified Participants List (Resource # 2.04)
If any person (excluding the interviewee and UXR) is being mentioned in the transcript, paste the de-identified transcript URL in Column G in the Participants List (Resource # 2.07)
In Resources # 2.11, place the link you just copied between parentheses at the end of the line with no space in between the right bracket ] and the left parenthesis (, so it turns into a hyperlink
Choose "Update comment" in GitHub and make sure all the checkboxes above have been checked
Review Process
UXR Lead assigns another team member to conduct a peer review
Go to peer review for cleaning up and de-identifying transcripts wiki page and read through to see if and how to implement the peer review process (Resources 1.06)
Copy the template for a peer review for cleaning up and de-identifying transcripts from the wiki (Resources 1.06)
Paste the template in a comment in this issue
Make sure the dependency section is completed
Customize the template by adding the transcript name and link in the peer review instructions
Edit to include the @ syntax and the peer reviewer so the peer reviewer gets the notification
Peer reviewer discusses the changes needed with the issue assignee
Review with UX Lead
Product sign-off
UXR Lead or PM removes the assignee and peer reviewer from the Internship - PII's My Drive when closing the issue if they no longer need to access the PII drive.
UXR Lead deletes the .txt and all extra Google Doc transcripts that contain PII from the Internship - PII drive after either the de-identified transcript gets approved or the entire research plan's transcripts get approved.
After the entire research plan's transcripts get approved, UXR Lead hides columns that contain PII, such as the recording link and transcription link in Resources 2.03 and the transcription link in Resources 2.07, since they are no longer needed.
Resources/Instructions
Resources for creating this issue
1.01 Wiki: Research Output Overview
1.02 Google Drive Folder: Research by Type
1.03* TWE Raw Data Collection by Plan #
1.04 Transcript Cleaning-up and De-identification Resources
1.04.01 Guidelines to Interviews Page 8-10
1.04.02 Video folder
1.04.03 Cleaning Up Zoom Transcriptions for Qualitative Research
1.04.04 Determining Best Practice for Filler Words in Captions and Transcripts
1.05 LanguageTool Chrome Extension
1.06 Wiki and Template for Cleaning Up and De-identifying Transcripts Peer Review
Resources gathered during the completion of this issue
2.01 [Wiki: Research Plan: RP023]
2.02 [Google Drive Folder: RP023]
2.03 [TWE: IS22: RP023: Intern Application Interview Recordings and Transcriptions Tracking Sheet]
2.04 [TWE: IS22: RP023: De-identified Participants List from Interviews (UXR Excluded)]
2.05 [RP023 De-identified Transcripts Folder]
2.06* [RP023 Interview Recording Folder]
2.07* [TWE: IS22: RP023: Participants List from Interviews (UXR Excluded)]
2.08* [Intern 010 Recording Link]
2.09* [Intern 010 Transcript .txt File]
2.10* [Intern 010 To be De-identified Transcript Google Doc]
2.11 [Intern 010 De-identified Transcript Google Doc]
*This folder can only be accessed from the Internship - PII's "My Drive"
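The "Customize Resource" steps ask for each copied link to be placed in parentheses directly after the right bracket ], which is the Markdown link form `[text](url)`. A minimal Python sketch of that edit; `add_link` is a hypothetical helper and the URL in the usage note is a placeholder, not a real resource link.

```python
def add_link(resource_line: str, url: str) -> str:
    """Append (url) right after the closing bracket, with no space between."""
    line = resource_line.rstrip()
    if not line.endswith("]"):
        raise ValueError("expected the line to end with a right bracket ]")
    return f"{line}({url})"
```

For example, `add_link("2.01 [Wiki: Research Plan: RP023]", "https://example.com/rp023")` returns `2.01 [Wiki: Research Plan: RP023](https://example.com/rp023)`, which GitHub renders as a hyperlink.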