Presentation Type
Poster Presentation
Abstract
This study evaluated Sherpa Rx, an artificial intelligence platform that uses large language models and retrieval-augmented generation (RAG) for pharmacogenomics, across key response metrics. Sherpa Rx integrated Clinical Pharmacogenetics Implementation Consortium (CPIC) guidelines with Pharmacogenomics Knowledgebase (PharmGKB) data to generate contextually relevant responses. A dataset of 260 queries spanning 26 CPIC guidelines was used to evaluate responses on drug-gene interactions, dosing recommendations, and therapeutic implications. In Phase 1, only CPIC data were embedded; Phase 2 additionally incorporated PharmGKB. Responses were scored on accuracy, relevance, clarity, and completeness (each on a 5-point Likert scale) and on recall. Wilcoxon signed-rank tests compared accuracy between Phase 1 and Phase 2, and between Phase 2 and ChatGPT-4o mini. A 20-question quiz assessed the tool’s real-world applicability against other models. In Phase 1 (N=260), Sherpa Rx performed well: accuracy 4.9, relevance 5.0, clarity 5.0, completeness 4.8, and recall 0.99. In a subset analysis (N=20), Phase 2 scored higher than the Phase 1 subset in accuracy (4.6 vs. 4.4) and completeness (5.0 vs. 4.8). ChatGPT-4o mini performed comparably in relevance (5.0) and clarity (4.9) but lagged in accuracy (3.9) and completeness (4.2). The difference in accuracy between Phase 1 and Phase 2 was not statistically significant; however, Phase 2 was significantly more accurate than ChatGPT-4o mini (p < 0.05). On the 20-question quiz, Sherpa Rx achieved 90% accuracy, outperforming other models. These results suggest that integrating resources such as CPIC and PharmGKB with RAG enhances AI accuracy and performance. This study highlights the transformative potential of generative AI tools like Sherpa Rx in pharmacogenomics to improve decision-making with accurate, personalized responses.
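The paired comparison named in the abstract can be illustrated with a minimal sketch of the Wilcoxon signed-rank test. The abstract does not say which software or test variant was used, so everything below is an assumption: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are dropped, tied ranks are averaged, and the p-value comes from a normal approximation without continuity correction, which is only rough for small samples. The scores are made up for illustration and are not the study's data.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, two-sided p) for paired samples x and y.

    Assumptions: zero differences are dropped, ties get average ranks,
    and p uses a normal approximation with a tie-corrected variance.
    """
    # Paired differences; pairs with no difference carry no sign.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0

    # Rank the absolute differences, averaging ranks within tied groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group_size = j - i + 1
        tie_term += group_size**3 - group_size
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of the rank sum.
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, min(p_value, 1.0)


# Hypothetical Likert accuracy scores for the same five queries
# under two configurations (illustrative only, not study data).
scores_a = [5, 4, 5, 3, 4]
scores_b = [4, 4, 3, 4, 2]
w_plus, w_minus, p_value = wilcoxon_signed_rank(scores_a, scores_b)
print(f"W+ = {w_plus}, W- = {w_minus}, p = {p_value:.3f}")
```

The signed-rank test suits this design because each query is scored under both conditions, so the scores are paired, and Likert ratings are ordinal rather than normally distributed. For real analyses with few pairs, an exact test from an established statistics library would be preferable to this approximation.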
Faculty Mentor
Jay Dorris, PharmD
Recommended Citation
Rector, Ashley; Breeden, Beth; and Dorris, Jay, "Validating Pharmacogenomics Generative Artificial Intelligence Query Prompts using Retrieval-Augmented Generation (RAG)" (2025). Student Scholar Symposium. 161.
https://digitalcollections.lipscomb.edu/student_scholars_symposium/2025/Full_schedule/161
Included in
Artificial Intelligence and Robotics Commons, Biomedical Informatics Commons, Health Information Technology Commons