Data Management Plan (DMP)
A Data Management Plan (DMP) outlines how data will be collected, stored, processed, analyzed, and shared throughout a randomized controlled trial (RCT). A well-structured DMP helps ensure data integrity, security, regulatory compliance, and reproducibility.
1. Key Components of a Data Management Plan (DMP)
A. Study Overview
- Title: Full name of the RCT.
- Principal Investigator (PI): Name, institution, contact details.
- Study Objectives: Summary of the trial's aims and endpoints.
- Trial Design: Parallel, crossover, stepped wedge, etc.
- Number of Sites: Single-center or multi-center study.
- Funding Source: Grant agency, industry funding, or institutional support.
B. Data Collection & Database Management
1. Data Sources
- Electronic Data Capture (EDC): REDCap, OpenClinica, Castor EDC.
- Case Report Forms (CRFs): Paper or electronic forms for data entry.
- Patient-Reported Outcomes (PROs): Surveys, diaries, mobile applications.
- Wearable & Sensor Data: Activity trackers (e.g., Fitbit), glucose monitors, ECG devices.
- Laboratory Data: Biochemical assays, imaging, genomic sequencing.
2. Data Entry Process
| Step | Description |
| --- | --- |
| Source Data | Data collected from participants (e.g., interviews, medical records). |
| Data Entry | Entered into the EDC system by research staff and validated in real time. |
| Double Entry (if required) | Independent second entry of the same data to verify accuracy. |
| Validation Checks | Automatic range checks, logic checks, missing data reports. |
| Audit Trails | Log of who entered or edited data, with timestamps. |
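The validation checks in the table can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any particular EDC system: the field names, the range limits, and the `validate_record` helper are all hypothetical, and real rules would come from the trial protocol.

```python
from typing import Any

# Hypothetical rules for illustration; real limits come from the protocol.
RANGE_RULES = {"age": (18, 90), "systolic_bp": (70, 250)}
REQUIRED_FIELDS = ["participant_id", "age", "systolic_bp",
                   "consent_date", "visit_date"]


def validate_record(record: dict[str, Any]) -> list[str]:
    """Return a list of data queries raised by one CRF record."""
    queries = []

    # Missing-data check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            queries.append(f"{field}: missing value")

    # Range check: numeric values must fall inside the allowed interval.
    for field, (low, high) in RANGE_RULES.items():
        value = record.get(field)
        if isinstance(value, (int, float)) and not low <= value <= high:
            queries.append(f"{field}: {value} outside {low}-{high}")

    # Logic check: a visit cannot precede consent.
    # Dates are assumed to be ISO strings (YYYY-MM-DD), which sort correctly.
    consent, visit = record.get("consent_date"), record.get("visit_date")
    if consent and visit and visit < consent:
        queries.append("visit_date: earlier than consent_date")

    return queries


example = {"participant_id": "P0001", "age": 17, "systolic_bp": 120,
           "consent_date": "2024-02-01", "visit_date": "2024-01-15"}
for query in validate_record(example):
    print(query)
```

Returning a list of queries rather than raising an error on the first problem mirrors how sites usually receive all open discrepancies for a record at once.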
3. Data Collection Timeline
- Baseline data: Collected at recruitment.
- Follow-up data: Scheduled at predefined intervals (e.g., 3 months, 6 months, 12 months).
- Final dataset lock: After data cleaning and verification.
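As a rough illustration of the timeline, the target follow-up dates can be derived from each participant's baseline date. The sketch below assumes follow-ups are counted in whole calendar months and that a date falling past the end of a shorter month is clamped to that month's last day; both choices, and the helper names, are assumptions rather than a protocol requirement.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def follow_up_schedule(baseline: date, months=(3, 6, 12)) -> dict[str, date]:
    """Map each follow-up label to its target date."""
    return {f"month_{m}": add_months(baseline, m) for m in months}


schedule = follow_up_schedule(date(2024, 11, 30))
for label, target in schedule.items():
    print(label, target.isoformat())
```

In practice a protocol usually also defines an allowed window around each target date (for example, plus or minus two weeks), which could be added as a second pair of dates per visit.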
C. Data Storage & Security
1. Storage & Backup
- Primary Storage: Institutional servers, REDCap, cloud-based EDC.
- Backup Strategy:
  - Daily automated backups for active databases.
  - Offsite backup copies stored securely.
- Retention Policy: Data retained for 5–10 years after study completion, or longer where regulations or the sponsor require it.
2. Data Security & Confidentiality
| Measure | Description |
| --- | --- |
| Access Control | Role-based access (PI, statisticians, site coordinators). |
| De-Identification | Replace names and other direct identifiers with participant ID numbers; store the linking key separately. |
| Encryption | Data encrypted in transit and at rest. |
| Two-Factor Authentication (2FA) | Required for database access. |
| HIPAA/GDPR Compliance | Adhere to the data protection laws that apply where the trial is run (e.g., HIPAA in the U.S., GDPR in the EU). |
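The de-identification step can be sketched as splitting each record into a coded analysis record and a separate linkage key. This is a minimal sketch under stated assumptions: the list of direct identifiers, the `P0001`-style ID format, and the `assign_study_ids` helper are hypothetical, and a real trial would follow its own identifier list and ID scheme.

```python
# Hypothetical set of direct identifiers; adapt to the trial's own data dictionary.
DIRECT_IDENTIFIERS = ("name", "date_of_birth", "email", "phone")


def assign_study_ids(records, id_prefix="P"):
    """Split identifiable records into a linkage key and a coded dataset.

    Returns (linkage_key, coded), where linkage_key maps study ID to the
    removed identifiers and coded holds the remaining fields plus the ID.
    """
    linkage_key = {}
    coded = []
    for n, rec in enumerate(records, start=1):
        study_id = f"{id_prefix}{n:04d}"
        # Identifiers go into the key, which should be stored separately
        # from the analysis data and under stricter access control.
        linkage_key[study_id] = {f: rec[f] for f in DIRECT_IDENTIFIERS if f in rec}
        # Everything else stays in the coded record.
        coded.append({
            "participant_id": study_id,
            **{k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS},
        })
    return linkage_key, coded


raw = [{"name": "Jane Doe", "date_of_birth": "1980-01-01", "age": 44, "arm": "A"}]
key, coded = assign_study_ids(raw)
print(coded)
```

Note that coded data of this kind is pseudonymized rather than anonymous: as long as the linkage key exists, the records can be re-identified, so they generally still fall under data protection rules.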
D. Randomization & Blinding
- Randomization Method: Simple, blocked, stratified.
- Randomization Software: Built into REDCap or Castor EDC, or generated with R, SAS, or Stata.
- Blinding Strategy: Single-blind, double-blind, or open-label; state whether data managers are blinded.
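To make the blocked method concrete, the sketch below generates a permuted-block allocation list with Python's standard `random` module. It is an illustration only: the arm labels, block size, and `blocked_allocation` function are assumptions, and a production trial would normally rely on validated randomization software with concealed allocation.

```python
import random


def blocked_allocation(n_participants, arms=("A", "B"), block_size=4, seed=None):
    """Return an allocation list built from randomly permuted blocks.

    Each block contains every arm equally often, so group sizes stay
    balanced after every complete block.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    allocation = []
    while len(allocation) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    # The final block may be cut short if n is not a multiple of block_size.
    return allocation[:n_participants]


sequence = blocked_allocation(8, seed=2024)
print(sequence)
```

For stratified randomization, one such list would be generated per stratum (for example, per site or per disease-severity category), each with its own seed.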
E. Data Quality Assurance
| Quality Control Measure | Description |
| --- | --- |
| Real-Time Data Validation | Automatic checks on range, logic, and missing values. |
| Data Monitoring Committee (DMC) | Independent committee reviews data periodically. |
| Source Data Verification (SDV) | Random checks comparing CRF entries to medical records. |
| Query Resolution | System flags discrepancies for review by site coordinators. |
| Interim Analysis | Reviews accumulating data at planned points; unblinded results are restricted to the DMC so that investigators remain blinded. |
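The random checks used for SDV can be drawn reproducibly, so that monitors can later show how the sample was chosen. The following is a sketch under stated assumptions: the 10% sampling fraction, the ID format, and the `select_for_sdv` helper are hypothetical and not taken from any monitoring guideline.

```python
import random


def select_for_sdv(record_ids, fraction=0.1, seed=None):
    """Pick a random subset of record IDs for source data verification."""
    ids = list(record_ids)
    if not ids:
        return []
    rng = random.Random(seed)  # record the seed in the monitoring log
    k = max(1, round(len(ids) * fraction))  # always verify at least one record
    return sorted(rng.sample(ids, k))


all_ids = [f"P{n:04d}" for n in range(1, 51)]
sdv_sample = select_for_sdv(all_ids, fraction=0.1, seed=42)
print(sdv_sample)
```

Risk-based monitoring plans often verify critical fields (such as consent and primary endpoint data) for every participant and sample only the remainder.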
F. Data Analysis & Statistical Plan
1. Primary & Secondary Outcomes
- Define how primary and secondary endpoints will be measured.
- List the pre-specified subgroups (e.g., age, sex, disease severity).
2. Analysis Plan
| Analysis Type | Description |
| --- | --- |
| Intention-to-treat (ITT) analysis | Includes all randomized participants, analyzed in their assigned groups regardless of protocol adherence. |
| Per-protocol analysis | Includes only participants who fully adhered to the intervention. |
| Missing data handling | Multiple imputation; last observation carried forward (LOCF), which can bias estimates and is best limited to sensitivity analyses. |
| Longitudinal analysis | Repeated measures analysis for follow-ups. |
| Interim analyses | Conducted at predefined time points with statistical stopping rules. |
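The difference between the ITT and per-protocol populations is easiest to see on a small example. The sketch below uses made-up participant records; the field names (`arm`, `adherent`) and both helper functions are hypothetical, and a real analysis would define adherence in the statistical analysis plan.

```python
from collections import Counter

# Made-up records of randomized participants, for illustration only.
participants = [
    {"id": "P0001", "arm": "treatment", "adherent": True},
    {"id": "P0002", "arm": "treatment", "adherent": False},
    {"id": "P0003", "arm": "control", "adherent": True},
    {"id": "P0004", "arm": "control", "adherent": True},
]


def analysis_populations(randomized):
    """Return (ITT, per-protocol) populations from randomized participants."""
    itt = list(randomized)  # everyone randomized, kept in the assigned arm
    per_protocol = [p for p in randomized if p["adherent"]]
    return itt, per_protocol


def count_by_arm(population):
    """Count participants per assigned arm."""
    return dict(Counter(p["arm"] for p in population))


itt, per_protocol = analysis_populations(participants)
print("ITT:", count_by_arm(itt))
print("Per-protocol:", count_by_arm(per_protocol))
```

Here the per-protocol population drops the non-adherent participant from the treatment arm, which is why per-protocol results can lose the balance that randomization provides.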
3. Statistical Software
- R, SAS, SPSS, Stata, or Python for analysis.
- REDCap exports available in multiple formats.
G. Data Sharing & Dissemination
1. Data Sharing Policy
- Access: Approved investigators during the study; public access post-study (if applicable).
- Repositories: Institutional repository, Dryad, Figshare.
- Timeline: 6–12 months after publication.
- Data Use Agreement (DUA): Required for secondary use.
2. Knowledge Translation
- Publications, conference presentations.
- Lay summaries for patient groups.
- Media outreach: Press releases, social media.
H. Ethical & Regulatory Compliance
- Ethics approval from IRB/REB.
- Registration on ClinicalTrials.gov or a primary registry in the WHO ICTRP network.
- Informed consent: Clear, plain-language process.
- Adverse event reporting per regulatory requirements.
I. Budget Considerations for Data Management
| Category | Description | Estimated Cost |
| --- | --- | --- |
| EDC Software | REDCap (free) or commercial EDC | $0–$50,000 |
| Data Entry Personnel | Research assistants | $20,000–$50,000 |
| IT Support | Maintenance, troubleshooting | $10,000–$30,000 |
| Backup & Security | Cloud storage, encryption | $5,000–$15,000 |
| Statistical Analysis | Software licenses | $5,000–$20,000 |
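Summing the low and high ends of each row gives an overall range for the data management budget. The short sketch below does that arithmetic with the figures from the table; the `total_range` helper is just for illustration, and the table does not say whether the amounts are per year or for the whole study.

```python
# (low, high) estimates in US dollars, copied from the table above.
BUDGET = {
    "EDC Software": (0, 50_000),
    "Data Entry Personnel": (20_000, 50_000),
    "IT Support": (10_000, 30_000),
    "Backup & Security": (5_000, 15_000),
    "Statistical Analysis": (5_000, 20_000),
}


def total_range(budget):
    """Sum the low and high estimates across all categories."""
    low = sum(lo for lo, _ in budget.values())
    high = sum(hi for _, hi in budget.values())
    return low, high


low, high = total_range(BUDGET)
print(f"Total: ${low:,} to ${high:,}")  # Total: $40,000 to $165,000
```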
Final Recommendations
- Use a secure EDC system (REDCap, OpenClinica).
- Standardize data collection forms with validation.
- Implement real-time data quality checks.
- Ensure compliance with GCP, GDPR, and HIPAA, as applicable.
- Plan for long-term data storage and sharing.
Adapted for educational use. Please cite relevant trial methodology sources when using this material in research or teaching.