Human Subjects Data Security

One of the biggest potential risks to research participants is that confidential information will be exposed to people outside the research team. Storing human subject data securely, with the appropriate level of confidentiality, is a key part of minimizing risk to your study participants.

To ensure confidentiality risks are properly addressed, your IRB protocol must clearly describe the precautions you will take to protect participant data and how these measures will be communicated to participants—whether during the informed consent process or through another method.

Data security should be considered at every stage of the study, including data collection, transmission or transport, access for review and analysis, collaboration, storage, reporting, and final disposition. When planning, think about the tools and resources you will use for data collection, how you will restrict access to identifiable data to authorized research personnel only, and who will be responsible for secure storage and eventual destruction or disposition of the data.

It is important to understand the type of data you are working with to determine the appropriate security measures required. Different categories of human subject data carry varying levels of sensitivity, and selecting the correct safeguards depends on accurately identifying the data classification.

The Information Security Office (ISO) provides data classification definitions, which are outlined below along with common categories of human subjects data.

  1. Anonymous Data: Data that never contains identifying values that can link the information to any participant. Once anonymous data has been collected, there is no way for the researcher (or anyone else) to identify any of the contributing participants. If any values or combination of values can be used to identify any specific participant, regardless of the kind of information provided, it is not considered anonymous; the data would be considered identifiable.
  2. Coded Data: A dataset containing information about a living individual from which the direct identifiers (e.g., name, SSN, student number) have been removed and replaced with a code (e.g., 101, 102, 103). The research team typically keeps a separate file, the “Master List”, containing the subject code numbers and the corresponding identifiable information so that the coded dataset can be re-linked to the subjects’ identities if needed. Coded data is therefore treated as identifiable data.
  3. Confidential Data (ISO definition): University data protected specifically by federal or state law or UT Arlington rules and regulations (e.g., HIPAA; FERPA; Sarbanes-Oxley, Gramm-Leach-Bliley; the Texas Identity Theft Enforcement and Protection Act; University of Texas System Policies; specific donor and employee data). University data that are not otherwise protected by a known civil statute or regulation, but which must be protected due to contractual agreements requiring confidentiality, integrity, or availability considerations (e.g., Non-Disclosure Agreements, Memoranda of Understanding, Service Level Agreements, Granting or Funding Agency Agreements, etc.).
  4. Controlled Data (ISO definition): University data not otherwise identified as Confidential data, but which are releasable in accordance with the Texas Public Information Act (e.g., contents of specific e-mail, date of birth, salary, etc.). Such data must be appropriately protected to ensure a controlled and lawful release.
  5. De-identified Data: A dataset containing only information about living individual(s) that had identifiable information at one time, but has since had all identifiers removed in such a way that no member of the research team can identify the individual(s) from whom the information was collected. Links between the data and the individual about whom the data was recorded may still exist, but are not readily accessible and will not be made available to the researcher(s) at UTA. Note that studies utilizing a coding system with a “Master List” linking subject codes to identifiable information are not considered de-identified; instead, these datasets are considered “Coded Data”.
  6. Identifiable Data: A dataset containing any information that would allow someone (including members of the research team) to directly or indirectly identify the person from whom the information was collected; a dataset in which the subject’s identity is associated with the information or can be readily ascertained.
  7. Non-Sensitive Data: Data that is not likely to cause harm to subjects in the event of a data breach; a dataset containing information about living individuals that may contain individually identifiable information, but which is not likely to place the subjects at risk of criminal or civil liability or be damaging to their financial standing, employability, educational advancement, or reputation if the information was disclosed outside of the research context.
  8. Published or Public Data (ISO definition): University data not otherwise identified as Confidential data, but which are releasable in accordance with the Texas Public Information Act (e.g., contents of specific e-mail, date of birth, salary, etc.). Such data must be appropriately protected to ensure a controlled and lawful release.
  9. Sensitive Data: Data that could potentially cause harm to subjects in the event of a data breach; a dataset containing information about living individuals that could reasonably place the subjects at risk of criminal or civil liability or be damaging to their financial standing, employability, educational advancement, or reputation if the information was disclosed outside of the research context.
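
The relationship between a coded dataset and its Master List, described under Coded Data above, can be illustrated with a minimal Python sketch. The field names (`name`, `ssn`, `student_id`), the function `code_dataset`, and the sample records are hypothetical and are not part of any UTA tool or requirement. The sketch only separates direct identifiers from study data; it does not address indirect identifiers, encryption, or storage.

```python
# Hypothetical field names for direct identifiers; adjust to your own dataset.
DIRECT_IDENTIFIERS = {"name", "ssn", "student_id"}


def code_dataset(records, first_code=101):
    """Return (coded_dataset, master_list) for a list of participant dicts."""
    coded_dataset, master_list = [], []
    for subject_code, record in enumerate(records, start=first_code):
        # Direct identifiers go to the Master List, everything else to the coded dataset.
        identifiers = {k: v for k, v in record.items() if k in DIRECT_IDENTIFIERS}
        study_data = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        coded_dataset.append({"subject_code": subject_code, **study_data})
        master_list.append({"subject_code": subject_code, **identifiers})
    return coded_dataset, master_list


# Made-up example records.
participants = [
    {"name": "Participant A", "student_id": "S-0001", "score": 5},
    {"name": "Participant B", "student_id": "S-0002", "score": 7},
]
coded, master = code_dataset(participants)
print(coded)  # [{'subject_code': 101, 'score': 5}, {'subject_code': 102, 'score': 7}]
```

Because the Master List can re-link every code to a person, it should be kept in a separate file from the coded dataset, and the pair as a whole remains identifiable data.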

All electronic human subjects data must be maintained on UTA-sanctioned storage tools to ensure compliance with university policies and protect participant confidentiality. All UTA-approved applications for data collection and analysis can be accessed through My Apps.

If you need to use a platform or application that is not included in My Apps for human subjects research recruitment, data collection, analysis, or storage, UTA requires you to complete the OIT Software Review and TAPREQ process.

Additionally, an Information Security Office (ISO) Risk Assessment is required for any software or application used in your research. This assessment verifies that appropriate safeguards are in place to protect confidential or controlled data.

  • Projects handling confidential information (e.g., FERPA data, biometric identifiers, identifiable human subjects research) must complete an ISO Risk Assessment annually.
  • Projects handling controlled or public information (e.g., de-identified human subjects research, anonymous research) must complete an ISO Risk Assessment every two years.
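
For planning purposes, the two review intervals above can be sketched as a small Python helper. The function name `next_assessment_due` and the rule of counting whole years from the last completed assessment are assumptions made for illustration; the ISO determines the actual due dates.

```python
from datetime import date

# Review intervals in years, taken from the two bullets above.
REVIEW_INTERVAL_YEARS = {
    "confidential": 1,  # annual
    "controlled": 2,    # every two years
    "public": 2,        # every two years
}


def next_assessment_due(classification, last_completed):
    """Estimate when the next ISO Risk Assessment is due (illustrative only)."""
    years = REVIEW_INTERVAL_YEARS[classification.lower()]
    target_year = last_completed.year + years
    try:
        return last_completed.replace(year=target_year)
    except ValueError:
        # Feb 29 with no matching day in the target year: fall back to Feb 28.
        return last_completed.replace(year=target_year, day=28)


print(next_assessment_due("confidential", date(2024, 3, 1)))  # 2025-03-01
```

An unknown classification raises a `KeyError` rather than silently choosing an interval, which is the safer behavior for a compliance reminder.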

IRB Submission Requirements

IRB Protocol Application:
The IRB protocol application must clearly specify the types of data to be collected, the tools used for data collection and storage, and the measures in place to ensure data security. When outlining data security procedures, consider the following:

  • Paper Records: All original paper documents must be stored on the UTA campus unless the IRB grants an exception. UTA and the IRB must have access to research records and consent forms at any time.
  • Research Conducted Off-Campus: Special considerations apply to collaborative or field research conducted outside the UTA campus. If human subjects data will initially be collected off-campus, the protocol must include detailed plans for securing data at the collection site and for securely transporting it to the UTA campus (or a secure UTA server) for storage.
  • Record Retention: All records—paper or electronic—must be securely maintained for at least three years after protocol closure or as required by the funding agency (whichever is longer). Student PIs should plan for long-term storage if leaving UTA before the retention period ends.
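
The "whichever is longer" retention rule can be worked through with a short Python sketch. It assumes the funding agency's requirement is expressed in whole years counted from protocol closure, which may not match every award; the function name `retention_end` and the `funder_years` parameter are hypothetical.

```python
from datetime import date

MINIMUM_RETENTION_YEARS = 3  # minimum stated in the Record Retention item above


def retention_end(closure_date, funder_years=None):
    """Earliest date records may be disposed of (illustrative only)."""
    # Use the longer of the three-year minimum and the funder's requirement.
    years = max(MINIMUM_RETENTION_YEARS, funder_years or 0)
    target_year = closure_date.year + years
    try:
        return closure_date.replace(year=target_year)
    except ValueError:
        # Feb 29 with no matching day in the target year: fall back to Feb 28.
        return closure_date.replace(year=target_year, day=28)


print(retention_end(date(2024, 6, 30)))                  # 2027-06-30
print(retention_end(date(2024, 6, 30), funder_years=5))  # 2029-06-30
```

A student PI who expects to leave UTA before the computed date can use it to plan long-term storage ahead of time.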

ISO Risk Assessment:
An ISO Risk Assessment is required for each research project that uses a platform or application not sanctioned by UTA to access, collect, or store confidential or controlled data. Because each study uses these tools differently, the assessment must be completed per project—not just per app/platform. Please attach an approved ISO Risk Assessment to your protocol submission. To avoid delays, it is strongly recommended the ISO Risk Assessment be completed before submitting your protocol.

Approved Data Collection and Storage Devices
All data categories may use the following data collection and storage devices:
  • UTA-Sanctioned Cloud Storage can be used for both internal and external collaboration, and can be used by faculty, staff and students for the storage of human subject data. When sharing data, it is important to ensure folders are password protected or have appropriate access control to prevent accidental data compromise or leak. Special caution must be taken when handling identifiable data. To request access, please contact the OIT Help Desk.
  • UTA-owned computer that is encrypted and has OIT standard image: All computers containing confidential information must be encrypted following institutional standards. Where possible, all data must be stored on secure UTA-sanctioned cloud storage. Always ensure that you are following established practices for protecting regulated data. Consult the Information Security Office if you have questions. Access control and encrypted devices must be used for most regulated data; additional controls, such as physical security (cable locks, locked room, etc.), may be required.
  • UTA-owned ISO approved external drives that are hardware encrypted: All portable devices containing confidential information must be encrypted following institutional standards. Where possible, all data must be stored on secure UTA-sanctioned cloud storage. Always ensure that you are following established practices for protecting regulated data. Consult the Information Security Office if you have questions. Access control and encrypted devices must be used for most regulated data; additional controls, such as physical security (cable locks, locked room, etc.), may be required.
  • QuestionPro: QuestionPro enables faculty, staff, and students to create and conduct unlimited surveys for University-related academic or administrative purposes. The tool offers a range of features to create web forms, conduct offline research studies, collect and analyze data, and more. While the collection and storage of identifiable data is permitted, collecting and storing SSNs is not permitted without first consulting the Information Security Office. When possible, data should be removed from QuestionPro and stored on an encrypted UTA computer.
De-identified, Anonymous, and Non-Sensitive data may use the following data collection and storage devices:
  • UTA Office 365 OneDrive: OneDrive is currently available to employees and students. Confirm that OIT has applied security settings to your OneDrive so that data is not inadvertently shared.
  • UTA-owned computer that is not encrypted (with encryption exception)
  • UTA-owned external drives that are not encrypted
  • Exchange and UTA Office 365 Email: Email and texting generally may be used for recruitment, scheduling of appointments, and non-sensitive informational purposes ONLY, per your IRB protocol. Email may not be used for human subjects data collection or storage.

The following are NOT permitted for data collection and/or storage:

  • Dropbox
  • Google Workspace with UTA SSO
  • iCloud
  • Elsevier Mendeley
  • Non-UTA owned computers, external drives or phones that are not encrypted
  • SurveyMonkey
  • Gmail

If you would like to request another process for data collection and/or storage, you must contact the Information Security Office.

Supplemental training webinars are available in the CITI Program under the IPS for Researchers course. Investigators are encouraged to complete this training for additional education in data security.

Data Management and Security for Student Researchers: An Overview (ID 20423)

  • The runtime is 1 hour.
  • Learning Objectives:
    • Define the basic principles governing protection and confidentiality of research data.
    • Identify best practices to use in securing research data.
    • Assess situations in which research data may need extra security protections.

Partnering with Technology Companies

  • The runtime is 1 hour, 2 minutes, and 58 seconds.
  • Learning Objectives:
    • Review the goals of digital health, the interdisciplinary struggles of working with technology companies, and an overall approach to the problems.
    • Identify best practices for researchers, technology companies, institutions, and Institutional Review Boards (IRBs) for creating partnerships.
    • Explore some of the common challenges faced by the research community and technology companies in the design and conduct of research.