Use Case 4: Predictive Scoring with Smart Summaries

Challenge

Selection committees must review thousands of applications containing both quantitative metrics and qualitative materials. Manual review is time-intensive, yet purely data-driven approaches miss important context from written materials.

Solution

Combine machine learning (ML) predictions based on structured data with summaries of unstructured content generated by a large language model (LLM) to provide a comprehensive yet efficient review tool.

How it Works

  • Build a prediction model using historical data (USMLE scores, publications, research experience).
  • Generate an "Interview Likelihood Score" (0-100) based on past successful candidates.
  • Identify key statistical factors influencing predictions.
  • Process personal statements, activity descriptions, and letters.
  • Programs first define priority areas for summarization.
  • Create a targeted summary highlighting, for example:
    • Program value alignment (e.g., community service, research excellence).
    • Socioeconomic context (e.g., education background, financial circumstances).
    • Key experiences and achievements.
    • Notable characteristics or qualities.
    • Unique background elements.
  • Include relevant quotes as evidence.
  • Example output for one applicant:
  • Interview Likelihood Score: 85/100.
  • Statistical factors.
    • USMLE Step 1: 245 (top 15% of past interviewees).
    • USMLE Step 1 attempts: 2.
    • Research: 2 first-author publications.
    • Clinical experience: 1,000-plus hours.
  • Program value alignment.
    • Community focus: “Created mobile health clinic for underserved areas.”
      • Found in: Activities section, entry #3.
      • Reasoning: Demonstrates initiative in addressing health care access.
    • Research excellence: “Led quality improvement study on ED wait times.”
      • Found in: CV research section and personal statement paragraph 2.
      • Reasoning: Shows both leadership and research methodology skills.
    • Educational innovation: “Peer tutoring program for premed students.”
      • Found in: Activities section, entry #7.
      • Reasoning: Indicates commitment to medical education.
  • Context and background.
    • First-generation college student.
      • Found in: Secondary application essay #2.
      • Reasoning: Explicitly stated in response about challenges.
    • Worked 20-plus hours per week during undergrad.
      • Found in: CV employment history and referenced in personal statement.
      • Reasoning: Indicates financial need and time management skills.
    • Rural health care experience in medical desert region.
      • Found in: Personal statement opening paragraph and activities #4.
      • Reasoning: Shows exposure to underserved health care settings.
  • Key experiences.
    • ED quality improvement project leader.
      • Found in: Research experience section and LOR from ED director.
      • Reasoning: Major leadership role with measurable impact.
    • 3 years EMT experience.
      • Found in: CV clinical experience section.
      • Reasoning: Sustained clinical commitment during premedical years.
    • Health care disparities research focus.
      • Found in: CV research section and personal statement theme.
      • Reasoning: Consistent thread across multiple experiences.
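
The predictive half of this workflow can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended implementation: it assumes a plain logistic regression fitted by gradient descent, and the historical records, the feature names in FEATURES, and the helper functions train and score are all hypothetical. A real program would choose its own model, features, and validation process.

```python
import math

# Hypothetical historical records, invented for illustration only:
# (Step 1 score, first-author publications, clinical hours, interviewed?)
HISTORY = [
    (250, 3, 1200, 1),
    (245, 2, 1000, 1),
    (238, 1, 800, 1),
    (242, 0, 900, 1),
    (225, 0, 300, 0),
    (230, 1, 200, 0),
    (218, 0, 500, 0),
    (235, 0, 100, 0),
]
FEATURES = ["usmle_step1", "first_author_pubs", "clinical_hours"]


def standardize(rows):
    """Return column means and standard deviations used for z-scoring."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(history, epochs=2000, lr=0.1):
    """Fit a logistic regression by batch gradient descent."""
    xs = [row[:-1] for row in history]
    ys = [row[-1] for row in history]
    means, stds = standardize(xs)
    zs = [[(v - m) / s for v, m, s in zip(x, means, stds)] for x in xs]
    weights, bias = [0.0] * len(FEATURES), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for z, y in zip(zs, ys):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, z))) - y
            grad_w = [g + err * v for g, v in zip(grad_w, z)]
            grad_b += err
        weights = [w - lr * g / len(zs) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(zs)
    return {"weights": weights, "bias": bias, "means": means, "stds": stds}


def score(model, applicant):
    """Return a 0-100 score and per-feature contributions, largest first."""
    z = [(v - m) / s
         for v, m, s in zip(applicant, model["means"], model["stds"])]
    contribs = [w * v for w, v in zip(model["weights"], z)]
    prob = sigmoid(model["bias"] + sum(contribs))
    ranked = sorted(zip(FEATURES, contribs),
                    key=lambda kv: abs(kv[1]), reverse=True)
    return round(100 * prob), ranked


model = train(HISTORY)
likelihood, factors = score(model, (245, 2, 1000))
print(f"Interview Likelihood Score: {likelihood}/100")
for name, contribution in factors:
    print(f"  {name}: {contribution:+.2f}")
```

The per-feature contributions are one simple way to surface the "key statistical factors" behind a score. With a linear model they are just weight times standardized value; other model families would need a different attribution method.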

Key Takeaways

Core Benefits

  • Hybrid analysis. Combines predictive scoring with qualitative insights.
  • Adaptable framework. Updates with evolving priorities and fresh analysis.
  • Evidence-based. Clear sourcing and reasoning for all insights.
  • Efficient review. Streamlines document analysis while maintaining depth.
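
The "evidence-based" property can be partly checked mechanically before a human reviewer sees a summary. The sketch below assumes the LLM returns each insight with a quote, a source location, and its reasoning, mirroring the "Found in" and "Reasoning" fields shown earlier. The Insight class, the document keys, and the sample text are hypothetical. The check flags any quote that does not appear verbatim in the cited source so that a person can look at it.

```python
from dataclasses import dataclass


@dataclass
class Insight:
    category: str   # e.g. "Community focus"
    quote: str      # text the LLM claims to have quoted
    found_in: str   # key of the source document the LLM cited
    reasoning: str  # the LLM's stated rationale


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks do not hide a match."""
    return " ".join(text.lower().split())


def unsupported(insights, documents):
    """Return insights whose quote is not found verbatim in the cited source."""
    flagged = []
    for item in insights:
        source = documents.get(item.found_in, "")
        if normalize(item.quote) not in normalize(source):
            flagged.append(item)
    return flagged


# Hypothetical application materials and LLM output, for illustration only.
documents = {
    "activities_3": "Created mobile health clinic\nfor underserved areas in 2021.",
    "activities_7": "Organized a peer tutoring program for premed students.",
}
insights = [
    Insight("Community focus",
            "Created mobile health clinic for underserved areas",
            "activities_3",
            "Demonstrates initiative in addressing health care access."),
    Insight("Educational innovation",
            "Founded a national tutoring nonprofit",
            "activities_7",
            "Indicates commitment to medical education."),
]

flagged = unsupported(insights, documents)
for item in flagged:
    print(f"Needs human review: {item.category!r} cites {item.found_in}")
```

A verbatim match only confirms that the quote exists where the summary says it does. Whether the reasoning attached to it is sound remains a question for human evaluators.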

Resource Requirements

  • Technical: Both ML and LLM infrastructure.
  • Personnel: Technical team, SMEs for evaluation standards.
  • Effort: High initial setup for both prediction models and LLM framework.

Challenges, Solutions, and Information Triangulation

Building on Table 1's challenges around example libraries and expert consensus, Table 3 provides a non-exhaustive list of challenges in building with LLMs. The key to success lies in systematic human evaluation, the same careful approach needed for building reliable example libraries and achieving expert consensus.

Best suited for

  • Programs seeking both efficiency and depth in review.
  • Institutions with resources for dual AI implementation.
  • Teams wanting fresh analysis beyond historical patterns.
  • Programs handling large application volumes.

Bottom Line

Predictive scoring with smart summaries combines quantitative metrics with qualitative LLM analysis for comprehensive applicant evaluation. Its unique advantage is integrating historical patterns with adaptable evaluation methods that avoid overreliance on past decisions. This approach requires substantial technical and expert resources to manage dual systems effectively. It works best for well-resourced programs seeking both efficiency and depth in their evaluation processes.