The Role of Data in Drug Development
How connecting the data dots can lead to better development and clinical trials.
Daniel Chancellor | 4 min read | Opinion

The pharma industry is built on solid foundations of data. From biological insights to clinical trial results and prescription claims, data has guided the decisions behind treatments that have dramatically improved the standard of living throughout the world.
However, the status quo is resulting in productivity challenges in bringing the next generation of therapies to patients, and we must look to new insights within the ever-expanding data universe.
Pharma companies have always been voracious users – and producers – of data. Each step of the development lifecycle is informed by extensive experimentation and data. Once in the pipeline, product positioning and clinical development strategies are guided by a blend of human insights and market intelligence. A Tufts Center for the Study of Drug Development report suggests that each phase III trial generates 3.6 million data points, a threefold increase in the last decade. When preparing for launch, real-world data (RWD) and reimbursement information are critical to understanding the patient journey and ensuring access to treatment.
The data universe is also expanding rapidly. The genomics revolution has unlocked powerful targeted therapies that are guided by biomarkers. Our digital health footprint is increasing dramatically, and now plays an important role in diagnosis and treatment decisions. Thanks to natural language processing, unstructured information such as physician notes, social media, and market research can be systematically consumed. Add to this the all-time high levels of industry R&D (biopharma’s collective $300 billion research budget is sustaining a pipeline of more than 23,000 drugs), and the data challenge is now becoming one of abundance.
Connecting the dots
While the pipeline may be at record levels, Deloitte’s reporting on the return on investment in pharmaceutical R&D reveals a longstanding trend towards declining productivity. With such a vast amount of data available to inform drug development decisions, attention must turn to recognizing the signals amongst the noise and improving speed and efficiency.
It is imperative to connect data along the patient journey, building a strategy with feedback loops that learn from past successes and accurately predict future scenarios. Pharmaceutical companies should ensure that a data strategy sits independently of functional teams and that insights can be integrated along the product development lifecycle. In practice, this means equipping clinical planning teams with the downstream knowledge of commercial colleagues, and empowering market access strategy with clinical context and competitive intelligence. Only with an integrated data framework can the true power of artificial intelligence be harnessed across decision making. For example, predictive modelling in drug forecasting should account for an incredible breadth of information, from study timelines and endpoints through to pricing and regulatory considerations.
Data connectivity in practice
Here’s an example from Norstella’s work that shows the benefits of data connectivity. A company was launching a multiple myeloma drug. The diagnostic requirements to determine eligibility for this drug were complex, and there was a risk that important information could be lost through unstructured notes on electronic medical records (EMR).
By applying natural language processing to EMR data and linking with other RWD sources such as lab and biomarker information, the manufacturer was able to build comprehensive data signatures that represent typical multiple myeloma patient journeys. With the addition of lab data, the manufacturer can be alerted in near real-time when a new patient candidate enters a treatment center.
Compared to traditional targeting methods, this approach led to a tenfold increase in the number of newly eligible patients that the pharma company could identify, and a total pool of over 25,000 new high-risk patients to prioritize. These are patients who would otherwise be delayed in accessing the most appropriate treatment, or would not receive treatment at all, with poorer outcomes as a result.
Here's a second case study from the COVID-19 pandemic. It was becoming clear that social determinants of health (SDOH) were hugely influential as the virus exacerbated existing health inequalities. This created an additional need to generate clinical data that would build confidence among traditionally vaccine-hesitant populations.
Connecting SDOH data with parameters such as experienced and potential investigators, clinical sites, and predictive enrolment rates allowed one vaccine developer to build a phase III clinical trial that was truly representative of the US population. Despite tremendous pressure on timelines, the trial met its enrolment goals, both in terms of schedule and diversity targets. This engendered trust in the results and supported broader utilization of the vaccine at a time of critical need.
As these two examples show, creating linkages between the right datasets allows decision makers to understand and align with the patient journey. From these foundations, pharmaceutical companies can begin to develop drugs that bring true improvements while generating sustainable returns.
From increasing the pace of R&D through to therapies that are intimately matched to the patients who will benefit most, data is fundamental to informed decision making. Increasing R&D costs and declining productivity are among the greatest challenges and threats to the biopharma industry’s future, stemming from the status quo of high failure rates and inadequate access to new treatments. Through a connected data strategy, drug developers can address this decline and ensure a healthy long-term future for their innovations and investments.
VP Thought Leadership at Norstella