Using Large Language Models for Educational Grant Review | Office of Education and Student Affairs

Background: Large language models (LLMs) are currently being used for multiple purposes in medical education and research (1, 2). The UCSF Haile T. Debas Academy of Medical Educators Innovations Funding program provides intramural grants for innovations in health professions education. Annually, 40-50 letters of intent are received, of which 25-30 advance to proposal submission. Each proposal is assessed by three reviewers, and their comments are summarized so that every team receives feedback. This project explored the feasibility and initial results of using a custom-built AI assistant to summarize reviewer comments. We aimed to reduce manual work, improve the consistency and specificity of feedback, and eliminate redundancy.

Methods: In 2024, we piloted Versa Chat (UCSF’s secure generative AI chat interface) to summarize reviewer comments. We developed prompts iteratively, focusing on proposal strengths and weaknesses, and then compared the Versa output with the feedback that program leaders had compiled manually. The second phase of the project took place from May to November 2025. Based on lessons learned in the pilot phase, we used the Versa API (application programming interface) via custom-built software to automate and standardize two parts of the workflow: early checks on proposal submissions (e.g., word count and timeline) and, later, summarization of reviewer comments using section-based instructions (strengths, weaknesses, timeline, budget). This standardized process applies the same rules across all proposals and produces organized summaries, which are compiled into a structured Word report for human review.
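The two automated steps described in the Methods can be illustrated with a minimal Python sketch. It is not the project's software: the word limit, the required-section check, the prompt wording, and the names `early_check` and `build_section_prompt` are all assumptions made for illustration, and no call to the Versa API is shown because its interface is not described here.

```python
from dataclasses import dataclass, field

# Hypothetical rules: the program's actual limits are not stated in this abstract.
MAX_WORDS = 1500
REQUIRED_SECTIONS = ("timeline", "budget")

# Section names taken from the Methods paragraph.
SUMMARY_SECTIONS = ("strengths", "weaknesses", "timeline", "budget")


@dataclass
class CheckResult:
    proposal_id: str
    word_count: int
    issues: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # A proposal passes the early check when no issue was recorded.
        return not self.issues


def early_check(proposal_id: str, text: str, max_words: int = MAX_WORDS) -> CheckResult:
    """Apply the same rule-based checks to every proposal."""
    words = text.split()
    result = CheckResult(proposal_id, len(words))
    if len(words) > max_words:
        result.issues.append(f"word count {len(words)} exceeds limit {max_words}")
    lowered = text.lower()
    for section in REQUIRED_SECTIONS:
        # Simple keyword test (assumed); a real check might parse headings instead.
        if section not in lowered:
            result.issues.append(f"missing section: {section}")
    return result


def build_section_prompt(section: str, reviewer_comments: list) -> str:
    """Build one section-specific instruction for summarizing reviewer comments."""
    if section not in SUMMARY_SECTIONS:
        raise ValueError(f"unknown section: {section}")
    numbered = "\n".join(
        f"Reviewer {i}: {comment}" for i, comment in enumerate(reviewer_comments, start=1)
    )
    return (
        f"Summarize only what the reviewers say about the proposal's {section}. "
        "Do not add information that is not in the comments.\n\n" + numbered
    )
```

Building one prompt per section, rather than a single free-form request, is one way to keep the summaries organized in the same order for every proposal before they are assembled into a report for human review.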

Results: In the early check phase, Versa and human reviewers agreed on 23 of the 27 proposals submitted (85%). The final review stage will occur in late January 2026. We will log the edits that program leaders make to the Versa summaries, focusing on conflicting or missing information and misinterpreted reviewer comments.
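The concordance figure is simple percent agreement. A minimal sketch of the arithmetic, using the counts reported above (the function name `percent_agreement` is illustrative):

```python
def percent_agreement(agreements: int, total: int) -> float:
    """Share of proposals on which the two checks reached the same result."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= agreements <= total:
        raise ValueError("agreements must be between 0 and total")
    return 100 * agreements / total


rate = percent_agreement(23, 27)
print(f"{rate:.0f}%")  # prints "85%"
```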

Conclusion: LLMs can be helpful for the meta-review (3) of educational grant proposals. AI output should be carefully reviewed and edited to avoid inadvertent bias or misinformation.

Acknowledgments:

Funding for this project was provided by the UCSF Academy of Medical Educators and UCSF Technology Enhanced Education (TEE) program.

References:

1. Abdulnour RE, Gin B, Boscardin CK. Educational Strategies for Clinical Supervision of Artificial Intelligence Use. N Engl J Med. 2025 Aug 21;393(8):786-797.
2. Goh E, Gallo R, Hom J, Strong E, Weng Y, Kerman H, et al. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial. JAMA Netw Open. 2024 Oct 1;7(10):e2440969.
3. Perlis RH, Christakis DA, Bressler NM, Öngür D, Kendall-Taylor J, Flanagin A, et al. Artificial Intelligence in Peer Review. JAMA. 2025 Nov 4;334(17).

Contacts

Andreea Seritan
Nick Wadsworth
Abigail Phillips
Sierra Niblett
Ann Poncelet