Data
Data are the foundation of research and science. Once an appropriate research topic is determined, proper data collection, retention, and sharing are vital to the research enterprise.
There are a number of methodological issues of which researchers should be aware when selecting data. These include choices about:
- Data types (e.g., nominal, ordinal or interval measures).
- Samples ("frames"), sample size, and instruments.
- Methodologies.
Different disciplines have preferences for different approaches, and for what constitutes acceptable "rigor" for reliability and validity of results. This is one reason why a careful prior review of the existing literature on a topic is imperative when designing a research protocol. For example, a key component of most protocol designs will be the sample size (or "n"). From a purely methodological perspective, that decision hinges on how large an error one is willing to tolerate in estimating population parameters or, put differently, on how small an effect the study must be able to detect as statistically significant. These choices must be made before data collection begins. Statistical power, however, must be balanced against time, cost, and other practical considerations, just like every other element of the protocol.
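As a rough illustration of the sample-size trade-off described above, the sketch below computes the smallest sample for a target margin of error using the standard normal-approximation formulas, n = (z * sigma / E)^2 for a mean and n = z^2 * p * (1 - p) / E^2 for a proportion. It assumes simple random sampling from a large population. The function names, the 95% default confidence level, and the example values are illustrative assumptions, not part of any UT Arlington guidance, and a power analysis for detecting a given effect size would require a different calculation.

```python
import math
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """Two-sided standard normal critical value (about 1.96 for 95%)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_mean(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n whose confidence-interval half-width is at most `margin`.

    Assumes `sigma` (the population standard deviation) is known or has been
    estimated from a pilot study.
    """
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    z = critical_value(confidence)
    return math.ceil((z * sigma / margin) ** 2)


def sample_size_for_proportion(margin: float, confidence: float = 0.95, p: float = 0.5) -> int:
    """Smallest n for estimating a proportion to within `margin`.

    p = 0.5 is the most conservative guess because it maximizes p * (1 - p).
    """
    if margin <= 0 or not 0 < p < 1:
        raise ValueError("margin must be positive and p between 0 and 1")
    z = critical_value(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Hypothetical example: standard deviation 15, tolerated error of 3 units.
    print(sample_size_for_mean(sigma=15, margin=3))   # 97
    # Hypothetical example: a proportion estimated to within 5 percentage points.
    print(sample_size_for_proportion(margin=0.05))    # 385
```

Under these assumptions, halving the tolerated margin of error roughly quadruples the required sample, which is where the balance against time and cost becomes concrete.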
Data collection methods vary by discipline and according to the data types of interest, but the emphasis on ensuring accurate and honest collection remains the same. Consequences of improperly collected data include:
- Inability to answer research questions accurately.
- Inability to repeat and validate the study.
- Distorted findings resulting in wasted resources.
- Misleading other researchers to pursue fruitless avenues of investigation.
- Compromising decisions for public policy or private decision-making.
- Causing harm to human participants and animal subjects.
As with data selection, it is critical that researchers have sufficient methodological skills to assure the quality of data collection efforts. Everyone who participates in the investigative effort should be trained in the methods. Where possible, researchers should build checks and balances into the collection process.
In information security, it is conventional to speak of three core goals for information protection:
- Confidentiality - limiting information access and disclosure to authorized users.
- Integrity - ensuring that data are not changed inappropriately after recording, whether by accident or by deliberate action. Integrity also covers whether the person or entity in question entered the right information in the first place: that the information reflects the actual circumstances ("validity") and that the same circumstances would generate identical data (what statisticians call "reliability").
- Availability - ensuring that information resources remain accessible to authorized users. Everyday risks such as fire, water, or other environmental damage, and simple technical failures such as hard disk crashes, must be considered. It is essential to make frequent, periodic backup copies of a data collection and to store those copies in a secure secondary location that is protected from both intruders and environmental threats.
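One way to check that a backup copy still matches the original data collection is to record a cryptographic checksum for every file and compare the two sets. The following is a minimal sketch using Python's standard hashlib module. The function names and directory layout are hypothetical. It detects changed, missing, or unexpected files, but it does not prevent changes, and it cannot say which copy is the correct one.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        path.relative_to(data_dir).as_posix(): sha256_of(path)
        for path in sorted(data_dir.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original: dict[str, str], backup: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing from, changed in, or unexpected in the backup."""
    shared = original.keys() & backup.keys()
    return {
        "missing": sorted(original.keys() - backup.keys()),
        "changed": sorted(name for name in shared if original[name] != backup[name]),
        "unexpected": sorted(backup.keys() - original.keys()),
    }
```

A manifest built when the data are first recorded can be stored alongside the backup and recomputed later. An empty report from compare_manifests suggests the copy is intact, while any entry is a prompt to investigate.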
UTA guidance on information security and data can be found here: https://www.uta.edu/security/encryption/fulldiskencryption/index.php.
Read more about The Practice of Keeping Research Notebooks: Paper vs. Electronic.
Data handling procedures should specify who may handle data, when, and how, for storage, retrieval, sharing, archiving, and disposal. These procedures may depend on the nature of the project, the cost of maintaining the data, research sponsors' requirements, and similar factors.
Retaining data in paper files and on electronic media long past the end of a project can increase the chances of unauthorized access. Disposal of sensitive data requires care and technical expertise to ensure that the information cannot be reconstructed from the storage media. Review UT Arlington's Records Information Management policies here: http://www.uta.edu/ouc/rim/.
The practice of ensuring research integrity extends to the stage of documenting and preparing results for publication. Publishing in peer-reviewed journals and presenting at scholarly meetings are the primary mechanisms by which investigators disseminate their findings to the research community. This community relies on authors to report the events of a study honestly and accurately. All researchers should be aware of the issues that compromise the integrity of data reporting and publishing:
- Misrepresentation of data quality, or of the data itself.
- Analysis of data by several methods until one yields a significant result.
- Fabrication or falsification of data.
- Inadequate evaluation of prior research.
- Misleading discussion of observations.
- Reporting conclusions that are not supported by the data.
- Failure to disclose conflicts of interest.
- Plagiarism.
- Unjust attribution of authorship.
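One item in the list above, analyzing data by several methods to find a significant result, can be illustrated with simple arithmetic. The sketch below is an illustration only: it assumes the analyses are independent and that no real effect exists, and it uses the Bonferroni correction as one common (and conservative) adjustment. The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across `num_tests` independent
    tests, each run at significance level `alpha`, when no real effect exists."""
    if not 0 < alpha < 1 or num_tests < 1:
        raise ValueError("alpha must be in (0, 1) and num_tests at least 1")
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance level that keeps the overall rate at or below `alpha`."""
    if not 0 < alpha < 1 or num_tests < 1:
        raise ValueError("alpha must be in (0, 1) and num_tests at least 1")
    return alpha / num_tests


if __name__ == "__main__":
    # Ten independent analyses at the 0.05 level: about a 40% chance that
    # at least one looks "significant" by chance alone.
    print(round(familywise_error_rate(0.05, 10), 2))  # 0.4
    # Per-test threshold under the Bonferroni correction for ten analyses.
    print(bonferroni_threshold(0.05, 10))
```

Deciding on the analysis method before data collection, and reporting every analysis that was run, avoids the problem without needing a correction at all.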
Review UT Arlington's policies here: http://www.uta.edu/research/administration/departments/tm/index.php.