to the most essential;
during the informed consent process, describing the types of questions that will be asked so potential participants can decline if they feel that the questions are too intrusive or that they might become upset from answering the questions; and
letting participants know ahead of time that they can choose not to answer any question that makes them feel uncomfortable.
If a participant becomes upset, it is a good practice to stop the survey or interview—if it is a face-to-face or telephone interaction—and allow the participant time to regain composure. If in person, interviewers should offer the participant tissues and a glass of water. Once the participant is ready, the interview or survey can resume. However, participants should be allowed to stop the interview or survey completely if they do not want to continue. In these situations, the interviewer should obtain the participant’s permission to allow any data provided thus far to be included in the research—even if a consent form has been signed. If participants decline, their data should not be included in the dataset. Even when the survey or interview was not completed, interviewers should provide the participant with the reimbursement or incentive for completing the interview or survey.
Identifying and minimizing psychological harm can be trickier in research that is not conducted in person, particularly for online surveys, because the researcher will not know if the participant becomes upset. For research conducted in any setting, when applicable, researchers should arrange for referral to counselors and list community programs for any participants who want such services—even for those who do not become upset during the research. These services can be described to participants at the end of the survey/interview or earlier, if appropriate.
More questions? See #11, #12, and #97.
Question #18 What Is Meant by “Privacy” and “Confidentiality,” and Is There a Difference?
Privacy and confidentiality are two critical concepts that all researchers must address when designing and implementing research. Often assumed to have the same meaning, privacy and confidentiality are, in fact, two discrete but related concepts. An easy way to distinguish the two is to think of privacy as protecting individuals and confidentiality as protecting information—or data—that people share with researchers.
Privacy can be defined as having control over oneself—that is, people can choose when to share information about themselves and with whom. During recruitment, you can protect the privacy of prospective participants by implementing procedures that do not disclose information to others that would identify prospective participants as being part of a specific group, engaging in a specific behavior, or having a specific health condition. During data collection, you can reduce the likelihood of a violation of privacy by implementing procedures that allow participants to share their information with researchers where others cannot hear or see them.
When participants privately share their information with researchers, they expect that their information will remain confidential—that is, they expect that only the research team and other authorized individuals will have access to their data. In a practical sense, confidentiality refers to the specific steps researchers implement to keep information about participants unknown to others, to the extent possible. Federal research regulations require researchers to establish procedures to protect the confidentiality of information that is individually identifiable (meaning, the participant can be identified directly by the researcher or through identifiers that are linked with the data). However, researchers often implement the same confidentiality procedures for all types of data, as they are good standard research practices.
More questions? See #21, #22, and #23.
Question #19 What Makes Data De-Identified?
Datasets that have been stripped of all personal identifiers are considered to be de-identified. The federal research regulations do not list specific personal identifiers. Instead, they loosely define identifiable to mean that “the identity of the subject is or may readily be ascertained by the investigator or associated with the information” (45 C.F.R. § 46.102(e)(5)). Although a universal list of personal identifiers does not exist, the 18 identifiers listed in the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule (such as participant name and date of birth) are reasonable identifiers that researchers should consider removing from their datasets when de-identifying them (USHHS, 2015a, 2015b).
The premise of de-identifying datasets is that once all personal identifiers are removed, those who see the data are unlikely to be able to determine participants’ identities. Even after datasets are de-identified, however, a slight risk remains that participants can be re-identified by someone with the interest and the means to do so. Therefore, researchers and ethicists debate the extent to which data can truly be de-identified.
Typically, researchers must de-identify their datasets when they plan to share them with researchers outside the original study team (such as for secondary data analysis), when the data are to be made publicly available, or when they prepare data for long-term storage. Adequately de-identifying datasets may take considerable effort, depending on the type of information collected. Numerous procedures exist for removing or masking identifiers in quantitative datasets. For example, a specific process is required for removing all HIPAA identifiers from quantitative datasets in research that must follow the HIPAA Privacy Rule (USHHS, 2015a, 2015b).
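As a rough illustration of what removing or masking identifiers can look like in a quantitative dataset, the sketch below drops columns that hold direct identifiers and reduces a date of birth to a year. The column names, the function name, and the choice to keep only the year are assumptions made for this example; the set shown is not the full list of HIPAA identifiers, and running code like this does not by itself make a dataset de-identified.

```python
# Hypothetical column names chosen for illustration only.
# This is NOT the complete list of HIPAA identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}

# Hypothetical mapping: source date column -> name of the masked output column.
DATE_COLUMNS = {"date_of_birth": "birth_year"}


def deidentify_rows(rows, identifier_columns=DIRECT_IDENTIFIERS,
                    date_columns=DATE_COLUMNS):
    """Return new rows with identifier columns removed and dates masked.

    Each row is a dict. Dates are assumed to be ISO strings (YYYY-MM-DD);
    only the four-digit year is kept. The input rows are not modified.
    """
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if column in identifier_columns:
                continue  # remove the identifier entirely
            if column in date_columns:
                new_row[date_columns[column]] = str(value)[:4]  # mask to year
            else:
                new_row[column] = value
        cleaned.append(new_row)
    return cleaned


raw = [{"name": "Ann Lee", "email": "ann@example.org",
        "date_of_birth": "1985-04-12", "score": 7}]
shared = deidentify_rows(raw)
```

Because the function builds new rows rather than editing the originals, the identifiable dataset stays intact for the study team while the stripped copy is the one prepared for sharing or storage.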
De-identifying qualitative data is less straightforward and, overall, very difficult. Researchers typically modify easily identifiable data in interview transcripts. For example, proper names said by the participant, such as “my friend Bob,” are removed and replaced with a general description (“my friend”) or a pseudonym. However, that step alone likely does not make qualitative data de-identified. Larger segments, including very specific or unusual experiences, may need to be redacted from transcripts to protect participants’ identities. Social and behavioral scientists must therefore be mindful of the quality of their data (both quantitative and qualitative) if a large amount of stripping must be done to de-identify them, and of whether enough context will remain for others to make valid interpretations.
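The name-replacement step for transcripts can be sketched in a few lines. The function name and the name-to-pseudonym table below are hypothetical, and the table has to be compiled by a person who has read the transcript. As noted above, swapping names is only a first step; unusual experiences and other contextual details still need human review.

```python
import re


def replace_names(transcript, replacements):
    """Replace each listed proper name with its pseudonym or description.

    `replacements` is a hypothetical, hand-built dict such as
    {"Bob": "Sam"}. Only whole words are matched, so "Bob" does not
    alter "Bobby". Longer names are tried first so that a full name is
    replaced before a first name contained in it.
    """
    if not replacements:
        return transcript
    names = sorted(replacements, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(name) for name in names) + r")\b"
    # A function is used as the replacement so pseudonyms are inserted literally.
    return re.sub(pattern, lambda match: replacements[match.group(0)], transcript)


original = "My friend Bob drove me there. Bobby stayed home."
masked = replace_names(original, {"Bob": "Sam"})
```

Matching is case-sensitive and limited to the names supplied, so misspellings, nicknames, and names the reviewer missed will pass through unchanged.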
When de-identifying data for sharing or storage, the master list linking personal identifiers to the study data does not necessarily have to be destroyed. Institutional review boards often allow the original researcher to maintain the master list that links the participants’ names to their identification numbers, but that list must be stored securely and not shared.
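A master list of this kind can be pictured as a simple lookup table that is kept apart from the study data. The sketch below is one possible arrangement, not a prescribed procedure; the function name, the field names, and the "P001" identifier format are assumptions for illustration.

```python
def assign_study_ids(records, name_field="name", prefix="P"):
    """Split identifiable records into a master list and coded study data.

    Returns (master_list, study_data):
      master_list maps each participant name to a study ID; it is the part
                  that must be stored securely and not shared.
      study_data  holds the same records with the name removed and the
                  study ID added.
    A participant who appears more than once keeps the same study ID.
    """
    master_list = {}
    study_data = []
    for record in records:
        name = record[name_field]
        if name not in master_list:
            master_list[name] = f"{prefix}{len(master_list) + 1:03d}"
        coded = {key: value for key, value in record.items() if key != name_field}
        coded["study_id"] = master_list[name]
        study_data.append(coded)
    return master_list, study_data


records = [
    {"name": "Ann Lee", "score": 7},
    {"name": "Raj Patel", "score": 4},
    {"name": "Ann Lee", "score": 9},
]
master_list, study_data = assign_study_ids(records)
```

The sequential IDs used here reflect the order in which participants appear in the records, which may itself reveal something in a small study; randomly generated IDs are a common alternative.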
More questions? See #18, #20, and #24.
Question #20 What Makes Data Anonymous?
Data are anonymous when they are not linked to any participant identifiers. In other words, the identity of a participant cannot be determined through his or her data. If the data are truly anonymous, even the study team cannot determine participants’ identities. Researchers often choose to collect data anonymously for studies on stigmatized or illegal behaviors. Then, if unauthorized persons gain access to the data, or if the data are purposefully shared with other researchers for secondary analyses, participants’ identities cannot be detected because identifying information was never collected or known by the researchers. Importantly, data do not need to be anonymous for research to be considered ethical; employing secure procedures that limit the risk of a confidentiality breach of identifiable data is ethically sufficient. Anonymous data collection is preferable only in certain situations where extra protections are needed. However, some researchers, regardless of whether the research topic is sensitive, choose to collect data anonymously because they do not need participants’ identifiers to answer their research questions.
If you want to collect data anonymously, you must consider several factors. First, your study design matters. Collecting anonymous data is likely an unrealistic option for research that requires data to be linked from multiple interactions with the same participant, such as in longitudinal research. In these situations, researchers should keep a master list linking participant names