RFA-R14-002: EMR Rare Conditions

DRDC Solicitation #:  RFA-R14-002

Project Title: Linkage of electronic medical records and administrative databases: a novel tool for surveillance and health services research for rare conditions

Maximum Budget: Year 1 $100,000; Year 2 $100,000; Year 3 $100,000  (includes direct and indirect costs of applicant)
NOTE: All budget amounts are subject to the availability of funding.
Project period: up to 3 years
Anticipated Number of Awards:  1
Project start date:  September 30, 2014

Eligible Applicants:

Universities or other organizations that have partnerships with state health departments or other state agencies which host administrative data systems, EMR developers and their host institutions, and clinical data registries.


Healthy People 2020 Focus Area(s) aligned with this project: Identify critical research, evaluation, and data collection needs.

CDC Research winnable battles aligned with this research project: Strengthening surveillance, epidemiology, and laboratory services.

Center/Division goal(s) and priorities aligned with this research project:
NCBDDD: characterize the problem, incidence, prevalence, and distribution of our Center’s priority health conditions to inform public health research, priority setting and program monitoring.
DHDD: Improving developmental outcomes of children.

Purpose: This project will explore the feasibility of developing a novel surveillance and research tool for rare conditions by linking electronic medical records (EMRs) to large administrative databases. This project will use spina bifida, muscular dystrophy, or fragile X syndrome as examples of rare conditions. Applicants may choose to address one or more of the conditions; addressing more than one is preferred where possible. The awardee will identify and obtain access to a linkable EMR and a large, linkable database (administrative/medical claims/registry) and develop a linkage algorithm to detect pre-selected rare disorders. It is expected that such data linkage will make better use of the combined information from EMRs and administrative databases. EMRs provide timely clinical details, usually lacking in other data sources, at a relatively low cost, enhancing the research potential of administrative databases. If successful, this method would be transferable to other conditions.

Background: Collectively, rare conditions affect an estimated 25 to 30 million Americans. Advances in diagnosis and treatment have improved survival of many individuals with these conditions. However, long-term clinical and epidemiological outcomes, quality of life, education, and social participation remain major challenges for them.1  The fields of surveillance, epidemiology, and health services and outcomes research for rare conditions are still in their infancy, partially due to difficulties in collecting data. National surveys do not routinely collect information on rare disorders, and it is unclear whether self-reported information on rare disorders is reliable.2 Registries and surveillance systems require infrastructure and significant investments of resources.3 Secondary data from non-research settings, such as state population-based linked administrative data systems (State Medicaid, department of education, department of social services, etc.), have proven to be a relatively low-cost and valuable source of data for studies,4 but health information from these data sources is usually limited to information extracted from medical claims data, which are byproducts of patient enrollment and billing records submitted to insurance companies.

Many existing administrative databases have the inherent limitation of lacking essential clinical details, such as laboratory tests and physical examinations (vital signs such as blood pressure). A related key limitation is the lack of long-term follow-up and relevant outcome measures. Such clinical information can be supplemented with electronic medical records (EMRs). The primary advantages of EMRs include their potentially comprehensive and relatively timely clinical information that is not typically available in medical claims. Typical components of an EMR include: problem/diagnosis and progress charting, medication orders and administration, past medical history, lifestyle, physical examination, laboratory test orders and results, procedures, and family history. The use of EMRs in research has gained notable momentum in recent years as their data become available. Yet their potential to be linked to other databases has not been investigated outside of oncology studies.5,6

Research Goals and Objectives:

YEAR 1


  1. Identify and obtain access to a linkable administrative/registry/other data system. This data system should have basic health information, such as that contained in state Medicaid databases, statewide inpatient hospital discharge records, or all-payer medical claims databases.
  2. Identify and obtain access to a linkable EMR dataset that has been fully implemented for at least one complete year and can potentially identify at least 150 cases of a given rare condition. The populations covered by the EMR and the administrative data system should overlap.
  3. Design the case identification strategy when using administrative data only (administrative data cohort) and EMRs only (EMR data cohort).
  4. Compare demographic, clinical, and treatment characteristics between the administrative data cohort and the EMR data cohort.
  5. Explore the feasibility of linking records across data sources: identify potential variables that can be used for linkage; clarify procedures needed for data linkage; review algorithms for data linkage.
  6. Summarize the feasibility of data linkage and propose at least one potential project with linked data.
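
As an illustration of the case identification step in objective 3, the following is a minimal sketch in Python, not a prescribed method. It assumes each record is a hypothetical (person_id, service_date, diagnosis_code) tuple and flags a person as a case when a qualifying diagnosis code appears on at least two distinct service dates. The ICD-9-CM code prefixes and the two-encounter threshold are illustrative assumptions; an actual study would need a validated code list and case definition.

```python
from collections import defaultdict

# Illustrative ICD-9-CM code prefixes with dots removed (assumption, not a
# validated case definition; applicants would supply their own code lists).
CODE_PREFIXES = {
    "spina_bifida": ("741",),
    "muscular_dystrophy": ("3590", "3591"),
    "fragile_x": ("75983",),
}


def normalize(code):
    """Strip dots and whitespace so '741.90' and '74190' compare equal."""
    return code.replace(".", "").strip().upper()


def identify_cases(records, min_encounters=2):
    """Return {condition: set of person_ids} for persons who have a
    qualifying code on at least `min_encounters` distinct service dates.

    `records` is an iterable of (person_id, service_date, diagnosis_code).
    """
    dates = defaultdict(set)  # (condition, person_id) -> distinct service dates
    for person_id, service_date, code in records:
        norm = normalize(code)
        for condition, prefixes in CODE_PREFIXES.items():
            if norm.startswith(prefixes):
                dates[(condition, person_id)].add(service_date)

    cases = defaultdict(set)
    for (condition, person_id), seen in dates.items():
        if len(seen) >= min_encounters:
            cases[condition].add(person_id)
    return dict(cases)
```

The same function could be run separately on the administrative data and on diagnosis fields extracted from the EMR, giving the two cohorts that objective 4 compares. Requiring repeated encounters is one common way to reduce false positives from rule-out diagnoses, at the cost of missing people with few visits.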

YEARS 2 and 3

  1. Develop algorithms to link records across data sources; proceed with the linkage of EMRs and the administrative database; and describe the variables used for linking as well as the linkage algorithm.
  2. Design the case identification strategy when using the combined claims-EMR data (combined data cohort).
  3. Compare demographic, clinical, and treatment characteristics between the administrative data cohort, the EMR data cohort, and the combined data cohort.
  4. In Year 1, identify a study question, research design, and implementation plan, in collaboration with the CDC and data source owners. Implement the plan in Years 2 and 3.
  5. Summarize the practicality of this approach and its potential scalability to include other data to be linked and combined.
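
The linkage and cohort comparison described in the Years 2 and 3 objectives could be prototyped along the following lines. This is a minimal sketch of deterministic (exact-match) linkage only, assuming hypothetical record dictionaries with `id`, `last_name`, `dob`, and `sex` fields; the actual linkage variables and algorithm are for the awardee to determine. Probabilistic approaches such as Fellegi-Sunter scoring are a common alternative when identifiers are incomplete or error-prone.

```python
from collections import Counter


def link_key(rec):
    """Build a normalized match key from hypothetical linkage variables."""
    return (rec["last_name"].strip().lower(), rec["dob"], rec["sex"].strip().upper())


def link_records(emr_records, admin_records):
    """Return (emr_id, admin_id) pairs whose keys match exactly.

    Keys that occur more than once on either side are skipped as
    ambiguous, so the returned pairs are one-to-one.
    """
    admin_index = {}
    for rec in admin_records:
        admin_index.setdefault(link_key(rec), []).append(rec["id"])
    emr_counts = Counter(link_key(rec) for rec in emr_records)

    pairs = []
    for rec in emr_records:
        key = link_key(rec)
        matches = admin_index.get(key, [])
        if len(matches) == 1 and emr_counts[key] == 1:
            pairs.append((rec["id"], matches[0]))
    return pairs


def summarize_cohorts(emr_cases, admin_cases, pairs):
    """Count cases found by both sources, by one source only, and combined.

    `emr_cases` and `admin_cases` are sets of source-specific IDs;
    `pairs` is the one-to-one output of link_records.
    """
    both = sum(1 for e, a in pairs if e in emr_cases and a in admin_cases)
    emr_only = len(emr_cases) - both
    admin_only = len(admin_cases) - both
    return {
        "both": both,
        "emr_only": emr_only,
        "admin_only": admin_only,
        "combined": both + emr_only + admin_only,
    }
```

The summary counts give a first view of how much the combined data cohort adds over either source alone. Skipping ambiguous keys favors precision over completeness, which is a design choice an applicant might revisit depending on the quality of the available identifiers.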

Special Instructions for applicants:

The successful applicant will describe how they will accomplish each of the objectives in their submission, with as much specificity as possible. The applicant is encouraged to include letters of collaboration in their appendices and to provide tables that demonstrate they can meet the numeric quotas for the project.

Describe the potential public health impact of this opportunity:

If the proposal objectives are successfully achieved, this project will demonstrate a cost-effective way to perform surveillance of rare conditions. It will strengthen epidemiology and health services research for rare conditions. In the long term, the innovation of combining EMRs and other administrative data systems will improve the health of those affected by rare conditions and decrease disease burden by helping to identify cost-effective treatments and interventions.

This project may be transferable to other programs at the CDC because it provides a general surveillance method using emerging data systems. Rare conditions are used as examples, but their surveillance is much harder than surveillance of common conditions. If the project succeeds with rare conditions, the approach should be relatively easy to transfer to common conditions.


  1. Institute of Medicine.  Rare Diseases and Orphan Products: Accelerating Research and Development.  Brief Report.  Available at: http://www.iom.edu/Reports/2010/Rare-Diseases-and-Orphan-Products-Accelerating-Research-and-Development.aspx
  2. Ouyang L, Grosse SD, Fox MH, Bolen J. A national profile of health care and family impacts of children with muscular dystrophy and special health care needs in the United States. J Child Neurol. 2012;27(5):569-76.
  3. Miller LA, Romitti PA, Cunniff C, Druschel C, Mathews KD, Meaney FJ, Matthews D, Kantamneni J, Feng ZF, Zemblidge N, Miller TM, Andrews J, Fox D, Ciafaloni E, Pandya S, Montgomery A, Kenneson A. The Muscular Dystrophy Surveillance Tracking and Research Network (MD STARnet): surveillance methodology. Birth Defects Res A Clin Mol Teratol. 2006;76(11):793-7.
  4. Jutte DP, Roos LL, Brownell MD. Administrative record linkage as a tool for public health research. Annu Rev Public Health. 2011;32:91-108.
  5. Kurian AW, Mitani A, Desai M, Yu PP, Seto T, Weber SC, Olson C, Kenkare P, Gomez SL, de Bruin MA, Horst K, Belkora J, May SG, Frosch DL, Blayney DW, Luft HS, Das AK. Breast cancer treatment across health care systems: Linking Electronic Medical Records and State Registry Data to Enable Outcomes Research. Cancer. 2013 Sep 24.
  6. Edelman LS, Guo JW, Fraser A, Beck SL. Linking clinical research data to population databases. Nurs Res. 2013 Nov-Dec;62(6):438-44.