The painstaking process of collecting and curating data is at the heart of scientific progress. What happens in the lab over weeks, months, and years can lead to new treatments, new medications, and sometimes, new cures for disease.
But when a researcher makes her data available to others so they can test new hypotheses or re-analyze the data, her initial contribution can get lost or buried. Work done in collaboration with others may mean a researcher is listed as an author on a second publication. But as the time between the initial work and new exploration grows, that researcher may be included in a footnote, acknowledged in the text of an article, or disappear altogether from the data she was responsible for generating.
Although data sharing is becoming increasingly common, there has been no standard way to track and credit the data contributions of academic researchers. That’s why leaders in academic medicine are intensifying efforts to usher in a 21st-century model of research, where scientists can track the use of shared datasets and better determine their value and impact.
Data sharing is a requirement for many research funders
Researchers looking to fund their next study may not be allowed to do so without a plan to share their data. Organizations like the Bill and Melinda Gates Foundation and the Patient-Centered Outcomes Research Institute now require grantees to share data. The National Institutes of Health, the major federal funder of research at medical schools and teaching hospitals, just released a proposal to update its data sharing guidelines for the first time in 15 years. Major biomedical journals are also focusing on data by enacting more stringent policies on data sharing and requiring data sharing plans for any clinical trials they publish.
Organizations are stepping up to help
While some fields and programs have been quick to invest in and adopt data sharing as part of the research process, others have struggled with the associated challenges, such as finding the time and resources and having the technical knowledge and infrastructure needed to make the data ready to share. Many organizations, including the Research Data Alliance and FORCE11, have worked to create community-based standards and best practices to assist researchers with sharing data.
Removing barriers to data sharing
However, a major barrier to data sharing remains: Although data are a valuable research asset, typically only original peer-reviewed publications earn academic researchers recognition, promotion, and external research funding. “Even with the best of intentions, it is difficult to capture and quantify the value of shared data and recognize the contribution of the people who made that data valuable,” says Heather Pierce, AAMC senior director for science policy and regulatory counsel. “To date, there has been no standard process to track the use of shared data.” To address this issue, the AAMC began working with other organizations to implement a system that would better recognize researchers who gather and share data for their contributions to scientific progress.
“Credit for Data Sharing” project
Two years ago, the AAMC launched the “Credit for Data Sharing” project, in collaboration with the Multi-Regional Clinical Trials Center of Brigham and Women’s Hospital and Harvard and the New England Journal of Medicine, to identify a systematic process that would enable individuals and organizations to track the use of shared research datasets. Earlier this year, the AAMC hosted a workshop with more than 50 organizations to focus specifically on the actions needed from academic institutions, journals and publishers, nonprofit organizations, and government funding agencies to support and implement this process. The AAMC is currently working to disseminate the workshop’s findings more broadly and is further engaging specific stakeholder groups on their role in the process of tracking data use and re-use.
What can academic institutions do?
Academic institutions play a critical role in future efforts to promote data sharing and facilitate the tracking of datasets produced by their investigators, Pierce says. Steps they can take include:
- Increasing training in data management, citation, and sharing, both as a part of graduate curricula and continuing education for researchers;
- Developing a comprehensive, institution-wide data policy that includes research data;
- Supporting the data sharing and citation ecosystem by incorporating evaluation of data sharing into hiring and promotion processes;
- Employing institutions’ libraries to develop needed infrastructure, especially for data storage, curation, and access, as well as training and support. In most instances, libraries are also responsible for the management of institutional data repositories. Libraries can also assist in identifying and leveraging data that already exist within an institution.
As the value of data sharing is more widely understood, it should be better integrated into the research process from the outset. This, in turn, will maximize the impact of research data on scientific and medical progress.