Nutan B., our Vice President for Pharma Consulting, is excited about implementing Gen AI in Pharma operations. With over 15 years of experience in handling pharma data, Nutan has led multiple projects deploying advanced analytics and Gen AI solutions.
His recent interactions with leaders from the pharma industry raised a few questions about NLP’s impact on clinical reporting, innovation in drug development, overcoming pitfalls in pharma innovation, and more.
Nutan: Having had firsthand experience with these applications, I can say that Generative AI has been revolutionary within the pharmaceutical and life sciences industry, much like it has in various other domains and sectors. In an industry marked by stringent regulations, the adoption of Generative AI is still in its nascent stages, yet it holds immense potential to transform pharmaceutical R&D, clinical trials, patient engagement, and numerous other crucial areas.
Nutan: In highly regulated industries, adopting analytics as a capability has always been challenging. The reason is that analytics relies on probabilities, which is quite different from the deterministic approaches that these industries are accustomed to. With the introduction of generative AI, the situation becomes even more complex.
However, it’s crucial to start thinking about the implications of this technology early, as in some cases, generative AI offers solutions.
There are key dimensions to consider in this context:
Nutan: Inherent biases pose a big threat to the integrity of clinical trials and drug development, potentially resulting in severe consequences if left unaddressed.
The root of this issue lies in skewed data, which has been well-documented in various studies, particularly with regard to the underrepresentation of specific demographics and gender in datasets related to conditions such as cancer and heart disease.
This bias is passed on to Generative AI and its applications if not carefully managed.
Bias can enter at multiple steps of a clinical trial, from patient recruitment and protocol design to data analysis. A lack of diversity among trial participants can skew our comprehension of how drugs affect different ethnic groups, genders, and demographics, ultimately undermining the generalizability of the trial’s results.
Publication and reporting biases further complicate the problem. It’s widely acknowledged that research incentives tend to favor the publication of positive outcomes over negative ones, painting an incomplete picture of the overall efficacy and safety of drugs.
To mitigate these biases, a combination of expert-designed solutions, sampling techniques, the use of synthetic data, and data augmentation can be applied.
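As an illustration, here is a minimal sketch of one such sampling technique, oversampling an underrepresented group before model training. The `trial_data` table and its columns are hypothetical, and the example uses pandas and scikit-learn:

```python
import pandas as pd
from sklearn.utils import resample

# Hypothetical trial dataset with a 'sex' column skewed toward male participants
trial_data = pd.DataFrame({
    "age": [54, 61, 47, 58, 63, 49, 55, 60],
    "sex": ["M", "M", "M", "M", "M", "M", "F", "F"],
    "outcome": [1, 0, 1, 1, 0, 1, 0, 1],
})

majority = trial_data[trial_data["sex"] == "M"]
minority = trial_data[trial_data["sex"] == "F"]

# Oversample the underrepresented group so both groups contribute equally
minority_upsampled = resample(
    minority,
    replace=True,               # sample with replacement
    n_samples=len(majority),    # match the majority group size
    random_state=42,            # reproducibility
)

balanced = pd.concat([majority, minority_upsampled])
print(balanced["sex"].value_counts())
```

Resampling alone does not create new information, which is why it is typically paired with synthetic data generation or augmentation in practice.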
Certain biases are deeply ingrained in the data itself, necessitating a more balanced approach from a process standpoint.
Implementing measures such as data safety monitoring boards and even straightforward practices like mandating the publication of all clinical trial results can contribute significantly to creating more equitable and unbiased outcomes that prioritize the well-being of patients.
Nutan: This is a very important question. Generative models, like any other tools, do not inherently have the capability to detect biases on their own. Ultimately, it comes down to the researchers, scientists, and developers involved to create processes and solutions that mitigate these biases.
As we discussed earlier, generative AI primarily learns from the data it’s trained on. If the data itself contains biases, there’s a high likelihood that the AI will replicate these biases in its outputs.
To tackle these, various strategies need to be used:
Finally, defining endpoints and processes is crucial, too. Multi-disciplinary teams help recognize potential blind spots early in the process, and review boards can establish guidelines and governance mechanisms for assessing the outputs.
All of these processes can be time-consuming, may dampen the excitement of implementing generative applications, and add some overhead, but they are essential for generative AI to be used sustainably in the future.
Nutan: GPT-4 is reportedly trained with on the order of a trillion parameters.
And fundamentally, retraining these algorithms is becoming an almost impossible activity for individual healthcare and pharmaceutical organizations. That is also why most of these models are called foundation models.
So if these models are learning from biased data, correcting them requires a very meticulous approach and a very conscious set of processes defined to mitigate them.
Fine-tuning techniques that leverage representative training data, prioritizing diversity across factors such as race, gender, age, and geography, can correct these biases to a certain extent.
Solution approaches like those discussed in the earlier answers, such as data augmentation, adversarial training, and human-in-the-loop designs that gather feedback from experts, can also help address this challenge.
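As a concrete illustration of such a feedback loop, the sketch below runs a simple demographic-parity check on model outputs and flags large gaps for expert review; the data, group labels, and the 0.2 threshold are all illustrative assumptions:

```python
import pandas as pd

# Hypothetical model predictions on a held-out set, alongside demographics
results = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "B"],
    "predicted_responder": [1, 1, 0, 1, 0, 0, 0, 0],
})

# Demographic parity check: positive prediction rate per group
rates = results.groupby("group")["predicted_responder"].mean()
print(rates)

# Flag large gaps for human review (threshold is an illustrative choice)
if rates.max() - rates.min() > 0.2:
    print("Warning: large disparity in predicted response rates; route to review board")
```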
Explainability and interpretability also play crucial roles, both in the adoption of the technology and in addressing the bias problem.
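For example, feature-attribution tools can help audit whether a model leans on demographic attributes. A minimal sketch with the shap library, using toy data and hypothetical feature names, might look like this:

```python
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

# Toy data: two clinical features plus one demographic feature (illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # outcome depends only on clinical features
feature_names = ["biomarker_level", "dose_mg", "age"]

model = RandomForestClassifier(random_state=0).fit(X, y)

# Model-agnostic SHAP attributions; a large mean contribution from the
# demographic feature would be a signal to investigate potential bias
explainer = shap.Explainer(model.predict, X)
attributions = explainer(X)
for name, value in zip(feature_names, np.abs(attributions.values).mean(axis=0)):
    print(f"{name}: {value:.3f}")
```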
Overarching frameworks from regulatory bodies like the FDA or WHO and from organizations themselves can serve as a road map for developing these algorithms, which prioritize ethical principles, fairness, and transparency.
Nutan: NLP technology has made remarkable progress in recent years and, when strategically applied, can effectively differentiate and protect patient-specific or company data. This is crucial not only for regulatory compliance but also for maintaining trust in the life sciences industry.
Let’s break down the process into two parts – what and how to anonymize.
Identification of What to Anonymize:
Approaches for How to Anonymize:
Leveraging these techniques while customizing them to a company’s internal documents and fine-tuning them on proprietary corpora can provide a resilient and reliable approach to ensuring full anonymity and protection.
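As one concrete building block, named-entity recognition can flag the spans to mask. The sketch below uses spaCy's off-the-shelf English model; a production pipeline would add pattern rules for IDs, dates, and domain-specific identifiers:

```python
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

text = "Patient John Smith was enrolled at Mercy Hospital in Boston on 12 March."
doc = nlp(text)

# Replace detected entities with placeholder tags, working right-to-left
# so character offsets stay valid while we edit the string
anonymized = text
for ent in reversed(doc.ents):
    if ent.label_ in {"PERSON", "ORG", "GPE", "DATE"}:
        anonymized = anonymized[:ent.start_char] + f"[{ent.label_}]" + anonymized[ent.end_char:]

print(anonymized)
# e.g. "Patient [PERSON] was enrolled at [ORG] in [GPE] on [DATE]."
```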
It’s important to note that this pertains to the analytics context alone.
To ensure comprehensive data protection, other elements such as access controls, encryption, and best practices should also be implemented in conjunction with NLP-based anonymization techniques.
These measures collectively safeguard sensitive information and maintain the trust essential in the life sciences industry.
Nutan: In addressing the question of how NLP technology facilitates compliance with international data protection regulations, we can divide the discussion into two parts: the “what” and the “how.”
Identification of What to Anonymize:
NLP technology is highly adaptable to multilingual setups, allowing it to process documents and text in various languages natively. Prominent offerings from Azure and Google, as well as BERT-based models, are proficient in handling over 100 languages, and GPT-style large language models exhibit similar versatility across languages.
Furthermore, NLP models can process geographical information, including organizational details and geographical identifiers such as city names and street addresses. This adaptability enables the identification of sensitive information across diverse linguistic and geographic contexts.
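A minimal sketch of this multilingual capability, using the Hugging Face transformers pipeline (the specific model choice here is illustrative), could be:

```python
from transformers import pipeline

# One publicly available multilingual NER model (model choice is illustrative)
ner = pipeline(
    "ner",
    model="Davlan/bert-base-multilingual-cased-ner-hrl",
    aggregation_strategy="simple",  # merge sub-word tokens into full entity spans
)

# The same pipeline handles English, German, etc. without reconfiguration
for text in [
    "Maria Garcia was treated at Hospital Clinic de Barcelona.",
    "Herr Müller wurde in der Charité in Berlin behandelt.",
]:
    print([(e["word"], e["entity_group"]) for e in ner(text)])
```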
While NLP technology can identify what needs to be anonymized, ensuring fairness and unbiased outcomes requires meticulous model training, data preparation, and rigorous testing procedures.
Approaches for How to Anonymize:
The “how” of anonymization processes is heavily influenced by country-specific regulations.
As discussed earlier, techniques like differential privacy, anonymization algorithms (e.g., K-anonymity, L-diversity), and customized risk algorithms remain relevant. However, they must be customized and configured to align with specific country regulations, such as those established by the EMA, HIPAA, and others.
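For instance, a basic k-anonymity check verifies that every combination of quasi-identifiers appears at least k times in a released dataset. A minimal pandas sketch, with illustrative columns and an illustrative threshold, might look like this:

```python
import pandas as pd

K = 3  # illustrative threshold; the required k depends on the applicable regulation

# Hypothetical de-identified extract with quasi-identifier columns
records = pd.DataFrame({
    "age_band": ["40-49", "40-49", "40-49", "50-59", "50-59", "60-69"],
    "sex":      ["F",     "F",     "F",     "M",     "M",     "F"],
    "zip3":     ["021",   "021",   "021",   "100",   "100",   "945"],
})
quasi_identifiers = ["age_band", "sex", "zip3"]

# Count how many records share each quasi-identifier combination
group_sizes = records.groupby(quasi_identifiers).size()

violations = group_sizes[group_sizes < K]
if violations.empty:
    print(f"Dataset satisfies {K}-anonymity over {quasi_identifiers}")
else:
    print(f"{len(violations)} combinations fall below k={K}; "
          "generalize or suppress these records before release")
```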
In a multi-country environment, data processing, storage, and best practices must also adapt to comply with various regulatory frameworks. For instance, the European Union’s General Data Protection Regulation (GDPR) imposes stringent rules on data privacy, dictating where data can be processed and stored.
NLP can also play indirect roles in supporting compliance with international data protection:
In summary, NLP technology’s adaptability to multilingual and geographic contexts makes it a powerful tool for identifying sensitive information. However, the “how” of anonymization must be meticulously customized to align with country-specific regulations.
Nutan: Natural Language Processing (NLP) has enabled a transformative shift in clinical study report generation, particularly with the advent of large language models like GPT.
This transformation, which began earlier this year, has significantly enhanced the accuracy and comprehensiveness of clinical reports while maintaining sensitivity and confidentiality.
One of the primary advancements enabled by NLP models like GPT is their ability to extract and process data from various structured and unstructured sources. This includes medical reports, patient profiles, trial records, and adverse event reports. The extracted information can then be used to construct coherent natural language reports, which can be further personalized to highlight the treatment’s impact on specific patient groups.
A notable feature of this process is its adaptability. Updates to these reports, based on new information, can be largely automated. New reports can be generated, and information can be anonymized using techniques discussed earlier.
These reports, generated through NLP, often serve as mature drafts. They can then be reviewed by domain experts for corrections and validation before finalization. This hybrid “human-in-the-loop” approach brings efficiency, time savings, and comprehensiveness to the report generation process while simultaneously reducing manual effort and minimizing the risk of human error.
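A minimal sketch of this drafting step, assuming access to a hosted model via the OpenAI Python client (the model name, prompt, and input fields are all illustrative), with the output treated strictly as a draft for expert review:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative structured inputs extracted upstream (already anonymized)
trial_summary = {
    "study_id": "[STUDY_ID]",
    "arm_a_response_rate": "62%",
    "arm_b_response_rate": "48%",
    "common_adverse_events": ["nausea", "fatigue"],
}

draft = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        {"role": "system",
         "content": "You draft clinical study report sections. Flag any claim "
                    "you cannot support from the provided data for human review."},
        {"role": "user",
         "content": f"Draft an efficacy and safety summary from: {trial_summary}"},
    ],
)

# The output is a draft only: route to domain experts for correction and sign-off
print(draft.choices[0].message.content)
```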
Furthermore, NLP technologies can be continuously refined through learning and improvement. They can assimilate user feedback to adapt over time, resulting in more accurate and informative clinical documentation.