Data Management

Data Management plays a key role throughout the life cycle of studies coordinated at the EDC, from grant proposal through publication and data archiving. We believe that the primary role of Data Management is to define methods and use technology in ways that ensure the highest possible degree of integrity in our research. A secondary role is to ensure that methods and practices shorten the timeline from data collection to data analysis and publication. Our Data Management teams work in close collaboration with other researchers from study design through the analysis phase, so that data are collected correctly from the inception of a study and data integrity is built in at every stage of the project life cycle.


Data management plans (DMPs) are an important element of any grant proposal and are based on a careful analysis of several areas: 1) the data to be collected, e.g., what data are necessary to answer the research questions and what standardized, validated instruments may be used; 2) who will provide the data, e.g., interviewers, participant self-reports, medical record review, or laboratory data via electronic transfer; 3) how the data will be collected: direct entry into laptops or other mobile devices is preferred, but when data must be collected on paper, procedures to ensure quality collection are of great importance; 4) what software platform best meets the study’s data collection and management needs; 5) what quality control features must be in place to ensure prompt and accurate collection of quality data for analysis and the smooth execution of the research study; 6) how data will be shared both during the study and after its final analyses are complete; and 7) plans for final archiving of the study database.


Once a study has been funded, data managers work to implement the DMP. Specific data collection forms are identified or created to implement the protocol. These include forms for recording screening, baseline, treatment, and outcome measures, as well as tools to monitor data accrual, visit scheduling, and events such as adverse reactions to study treatment or out-of-protocol events. Data dictionaries are constructed to clearly document the data being collected, and procedures are established for the transfer of external data files such as imaging or laboratory assay results. Manuals of operations are written to translate the study protocol into clear operational steps and to provide study-specific definitions.

Data collection systems are developed and user guides are written. Careful training of data collection personnel and onsite study coordinators helps to ensure the successful execution of the study protocol, and recertification exams are developed when needed. Data management personnel lead regularly scheduled conference calls with clinical coordinators to address any questions or problems that may arise during the course of the study.

Quality Control

Quality control systems are designed to check data as they are received for completeness, correctness, and logical consistency, and to send prompt notices to data collectors when corrections or clarifications are needed. Weekly study status reports allow all study personnel to monitor recruitment and retention on an ongoing basis, and high-level summaries of data accrual and quality are shared with investigators and other personnel as appropriate.
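
As a rough illustration of the three kinds of checks named above (completeness, correctness, and logical consistency), the sketch below validates a single form record against a small data dictionary. The field names, the age limits, and the check_record function are hypothetical and chosen only for illustration; they do not describe the EDC's actual systems.

```python
from datetime import date

# Hypothetical data dictionary: field name -> rules. Field names and limits
# are illustrative assumptions, not taken from any actual study.
DATA_DICTIONARY = {
    "participant_id": {"required": True},
    "age_years": {"required": True, "min": 18, "max": 100},
    "screening_date": {"required": True},
    "baseline_date": {"required": False},
}


def check_record(record):
    """Return a list of query messages for one submitted form record."""
    queries = []
    for field, rules in DATA_DICTIONARY.items():
        value = record.get(field)
        # Completeness: required fields must be present and non-empty.
        if value is None or value == "":
            if rules.get("required"):
                queries.append(f"{field}: missing required value")
            continue
        # Correctness: numeric values must fall inside the allowed range.
        if "min" in rules and value < rules["min"]:
            queries.append(f"{field}: {value} is below minimum {rules['min']}")
        if "max" in rules and value > rules["max"]:
            queries.append(f"{field}: {value} is above maximum {rules['max']}")
    # Logical consistency: a baseline visit cannot precede screening.
    screening = record.get("screening_date")
    baseline = record.get("baseline_date")
    if screening and baseline and baseline < screening:
        queries.append("baseline_date: earlier than screening_date")
    return queries


record = {
    "participant_id": "P001",
    "age_years": 17,
    "screening_date": date(2024, 3, 1),
    "baseline_date": date(2024, 2, 1),
}
for message in check_record(record):
    print(message)
```

In a design like this, each message would become a query routed back to the data collector, and an empty list would mean the record passed all checks.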

Analysis and Reporting

During the data collection period, data managers institute regular “freezes” of accruing data, taking a snapshot of the dynamic database to provide a stable set of datasets for interim analyses and reporting. Freeze dates are announced in advance so that clinical center personnel can work to ensure that the cleanest and most complete data are available. Using these freeze datasets, data managers prepare analytic and summary datasets that can be used for reporting to the project steering committee, funding agency, safety or oversight committees, and other entities as appropriate.
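
One way to picture a freeze is as a dated copy of the live records that later edits cannot touch. The sketch below is a minimal illustration under that assumption; the freeze function, the record fields, and the use of a SHA-256 checksum to confirm the snapshot is unchanged are all hypothetical choices, not a description of the EDC's actual procedure.

```python
import copy
import hashlib
import json
from datetime import date


def freeze(live_records, freeze_date):
    """Take a stable snapshot of the live records as of freeze_date."""
    # Deep copy so later edits to the live database do not alter the freeze.
    snapshot = copy.deepcopy(live_records)
    # A checksum over a canonical serialization lets anyone confirm later
    # that the frozen dataset has not changed.
    canonical = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return {
        "freeze_date": freeze_date.isoformat(),
        "record_count": len(snapshot),
        "sha256": hashlib.sha256(canonical).hexdigest(),
        "records": snapshot,
    }


live = [{"participant_id": "P001", "visit": "baseline", "sbp": 128}]
frozen = freeze(live, date(2024, 6, 30))

# Data collection continues after the freeze; the snapshot is unaffected.
live[0]["sbp"] = 131
print(frozen["freeze_date"], frozen["records"][0]["sbp"])
```

Interim analyses would then read only from the frozen records, so results can be reproduced even as the live database keeps changing.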

These analytic datasets are shared with study statisticians for use in analyses that result in abstracts and publications.

Data may be shared with investigators throughout the project; at study closeout, data must be archived and, if requested by funding agencies, publicly shared. Data management teams create deidentification processes, track data shared with recipients both within and outside the study team, and prepare and document data for final archiving or for use as publicly available datasets.
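
A deidentification process can take many forms. The sketch below shows one simplified possibility: dropping direct identifiers, replacing the participant ID with a keyed hash, and converting calendar dates to days since enrollment. The field names, the deidentify function, and the secret key are hypothetical, and a real process would follow the study's own data sharing plan and the applicable privacy regulations.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical field names; a real study's data dictionary would define these.
DIRECT_IDENTIFIERS = {"name", "address", "phone"}


def deidentify(record, secret_key):
    """Return a copy of record with direct identifiers removed."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the study ID with a keyed hash so records for the same
    # participant still link together without exposing the original ID.
    digest = hmac.new(
        secret_key, record["participant_id"].encode("utf-8"), hashlib.sha256
    )
    out["participant_id"] = digest.hexdigest()[:12]
    # Replace calendar dates with the number of days since enrollment.
    enrolled = out.pop("enrollment_date")
    visit = out.pop("visit_date")
    out["visit_day"] = (visit - enrolled).days
    return out


record = {
    "participant_id": "P001",
    "name": "Jane Doe",
    "phone": "555-0100",
    "enrollment_date": date(2024, 1, 10),
    "visit_date": date(2024, 2, 9),
    "sbp": 128,
}
shared = deidentify(record, b"example-key")
print(sorted(shared))
```

The original record is left unmodified, so the identified data can stay in the secured study database while only the returned copy is shared.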


M-F, 8:30am to 5pm ET


Mailing Address

Epidemiology Data Center
University of Pittsburgh
4420 Bayard Street, Suite 600
Pittsburgh, PA 15260


Collaboration with other institutions has always been essential to our work. Please contact us for more information on how working with the EDC can benefit your organization.