Rare cancers, defined by the National Cancer Institute (NCI) as those that affect fewer than 40,000 new patients in the United States each year, account for nearly 30% of all cancers. These tumors present diverse challenges for clinicians and patients, from diagnosis through treatment. Because these tumors are scarce relative to the most common malignancies (such as lung, colorectal, breast, and prostate cancer), they have received less funding and less research into both their natural history and their molecular basis. Not only can a rare cancer be difficult to diagnose, but once the diagnosis is made, there are typically fewer effective treatments available than for the more common tumors.
The information acquired through a natural history study can play an important role throughout the life cycle of drug development, especially in the design and analysis of clinical trials and in postmarketing studies. There is an urgent need to develop tools and processes to capture patient-level detail about rare cancer progression from the time of diagnosis. These data, which are often collected during care delivery, can be mined from electronic health records and organized into disease-focused, rather than patient-focused, databases called registries. A disease registry can be used to derive best practices for patient management, identify prognostic biomarkers, and test hypotheses that would be very difficult to test if the data were left in individual patient records.
These registries need to be developed by clinical and biostatistical subject matter experts to ensure that they capture all relevant data characterizing a disease’s natural history, its presentation and diagnosis, its clinical management (including therapeutic selection and response), and its biomarkers (such as clinical chemistries and genomics). Organizing these patient-centered data from electronic medical records into a disease-centered database is labor-intensive work. Medical professionals add value not only by organizing the data in a new way, but also by reviewing, correcting, and enriching the primary data.
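The reorganization from patient-centered records to a disease-centered registry can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record layout, the field names (`patient_id`, `events`, `disease`, `type`, `detail`), the example values, and the `build_registry` helper are hypothetical and do not reflect any particular electronic health record system or an actual registry schema. The sketch covers only the regrouping step; the expert review, correction, and enrichment described above are not modeled.

```python
from collections import defaultdict
from datetime import date

# Hypothetical patient-centered records, loosely resembling an EHR export.
# Each patient carries a mixed list of events across unrelated conditions.
patient_records = [
    {
        "patient_id": "P001",
        "events": [
            {"date": date(2020, 4, 15), "disease": "chordoma",
             "type": "treatment", "detail": "surgical resection"},
            {"date": date(2020, 3, 1), "disease": "chordoma",
             "type": "diagnosis", "detail": "biopsy confirmed"},
            {"date": date(2019, 11, 2), "disease": "hypertension",
             "type": "diagnosis", "detail": "routine visit"},
        ],
    },
    {
        "patient_id": "P002",
        "events": [
            {"date": date(2021, 6, 10), "disease": "chordoma",
             "type": "diagnosis", "detail": "imaging and biopsy"},
        ],
    },
]


def build_registry(records, diseases_of_interest):
    """Regroup patient-centered records into a disease-centered structure.

    Returns {disease: {patient_id: [events sorted by date]}}.
    """
    registry = defaultdict(dict)
    for record in records:
        for event in record["events"]:
            disease = event["disease"]
            if disease not in diseases_of_interest:
                continue  # keep only the diseases the registry covers
            timeline = registry[disease].setdefault(record["patient_id"], [])
            timeline.append(
                {"date": event["date"], "type": event["type"],
                 "detail": event["detail"]}
            )
    # Order each patient's timeline chronologically, so that progression
    # can be read forward from the earliest recorded event.
    for patients in registry.values():
        for timeline in patients.values():
            timeline.sort(key=lambda e: e["date"])
    return dict(registry)


registry = build_registry(patient_records, {"chordoma"})
```

Keying the top level by disease rather than by patient is what allows questions to be asked across all patients with the same diagnosis, for example comparing the interval between diagnosis and first treatment, without opening each patient record individually.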