Although the competition has started, we are still accepting registrations for the Dr. Bot Challenge; please fill out this form to register.
Dr. Bot Challenge
As part of the 5th Midwest Healthcare Conference, we are excited to host a data challenge that invites participants to push the frontiers of large language model (LLM) development in service of a critical mission: improving how AI understands and responds to real-world medical questions. This competition fosters innovation at the intersection of artificial intelligence, clinical reasoning, and digital health, offering a unique opportunity to contribute to transformative research with real-world impact.

Leveraging Large Language Models for Healthcare
A large language model (LLM) is a type of deep learning algorithm designed to understand, summarize, translate, predict, and generate text by learning from vast amounts of data.
LLMs represent one of the most impactful uses of transformer models. Their capabilities extend far beyond processing human language: in addition to powering natural language processing tools such as translators, chatbots, and virtual assistants, LLMs are increasingly used in healthcare, coding, and a wide range of other domains. As these models become more integrated into decision-making and service delivery, their influence on research, communication, and everyday human-computer interaction continues to grow rapidly.
In healthcare, LLMs are transforming clinical decision support, streamlining documentation, and improving patient engagement through intelligent interfaces. Their ability to analyze unstructured medical text and integrate diverse sources of health data gives them the potential to significantly enhance diagnostic accuracy, alleviate administrative burdens, and support more personalized care, among other impactful applications.
Challenge Overview and Aim
The competition invites participants to develop large language models (LLMs) specifically trained to respond to patient-initiated clinical questions with clarity, accuracy, and contextual sensitivity. Unlike general-purpose LLMs trained primarily on web-based text, this contest emphasizes the use of diverse and domain-relevant data sources, including medical literature, clinical notes, structured datasets, and curated question-answer pairs, to build models with specialized healthcare competencies.

Participants are encouraged to explore innovative strategies in data selection, model architecture, and fine-tuning. The aim is not only to improve factual correctness and safety in medical responses, but also to design models that can handle nuanced patient language, simulate clinical reasoning, express uncertainty appropriately, and communicate with empathy, responding in ways that acknowledge patients’ emotions and concerns. Ultimately, the competition aims to advance the development of LLMs suitable for real-world deployment in healthcare settings, where they can assist with symptom triage, health education, and digital health navigation, while upholding standards of safety, equity, and patient-centered communication, and helping to reduce the burden on healthcare providers.
Task Description
The competition challenges participants to develop cutting-edge large language models (LLMs) specifically designed to answer patient-initiated clinical questions with accuracy, clarity, and empathy. The ultimate goal is to enhance LLM capabilities in providing safe, context-aware, and clinically relevant responses to patients seeking medical guidance. This task mirrors real-world applications where patients pose complex and often ambiguous questions based on their symptoms, concerns, or test results—areas where both technical precision and ethical sensitivity are critical.
Such systems have the potential to serve as first-line digital triage tools, guiding patients toward appropriate care pathways and alleviating pressure on overburdened healthcare systems. When implemented responsibly, they can improve access to reliable health information, especially for underserved populations with limited healthcare resources. These models can be used for triaging symptoms, helping patients decide whether to seek urgent care, schedule a primary care visit, or manage symptoms at home. They can also enhance patient education before and after clinical visits, support medication adherence through personalized instructions, and provide clear explanations of diagnoses and treatment plans. Furthermore, they can assist in discharge planning by helping patients understand follow-up care, recognize warning signs, and manage self-care, ultimately improving outcomes and reducing preventable readmissions.
Data Source
Participants are not expected to train a language model from scratch. Instead, they are encouraged to build upon existing pretrained models—such as GPT-style transformers or open-source architectures like LLaMA, Mistral, Falcon, or similar—by fine-tuning and adapting them to the healthcare domain using innovative and high-quality data sources.
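As a starting point, the sketch below shows one common way to adapt a pretrained model: parameter-efficient fine-tuning with LoRA adapters via the Hugging Face transformers, datasets, and peft libraries. The base model name, the file patient_qa.jsonl, its question/answer fields, and all hyperparameters are illustrative assumptions, not competition requirements.

```python
# Minimal sketch: parameter-efficient fine-tuning of an open base model with
# LoRA adapters. The base model, data file, field names, and hyperparameters
# below are placeholder assumptions, not competition requirements.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"  # any permissively licensed base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA-style tokenizers lack one
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model so only small low-rank adapter matrices are trained.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         task_type="CAUSAL_LM"))

# Hypothetical file: one JSON record per line with "question" and "answer".
ds = load_dataset("json", data_files="patient_qa.jsonl", split="train")

def tokenize(example):
    text = f"Patient: {example['question']}\nClinician: {example['answer']}"
    return tokenizer(text, truncation=True, max_length=1024)

ds = ds.map(tokenize, remove_columns=ds.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="drbot-lora", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=ds,
    # mlm=False makes the collator pad batches and set next-token labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

LoRA keeps memory requirements modest enough for a single GPU, but it is only one option; full fine-tuning, instruction tuning, or retrieval-augmented approaches are equally valid directions.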
To guide model development and help set expectations, a limited training set will be provided at the beginning of the competition. This dataset includes examples of patient-initiated clinical questions curated by medical students across various common healthcare scenarios. While this dataset is not exhaustive, it serves to illustrate the style, complexity, and subject matter of questions that will appear in the evaluation phase.
Participants are permitted to utilize publicly available datasets, including Common Crawl, Wikipedia, Project Gutenberg, PubMed, arXiv, and other domain-relevant corpora. Dialogue-oriented datasets from sources such as OpenSubtitles, Reddit, or patient forums can also be used to enhance conversational fluency. Importantly, the focus is on how participants creatively select, combine, and refine these data sources to enhance their model’s ability to understand and respond to real-world clinical queries.
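For illustration, the sketch below shows one simple way to combine such sources: weighted interleaving with the Hugging Face datasets library. The file names, the shared "text" field, and the 30/70 mixing weights are hypothetical choices, not a prescribed recipe.

```python
# Minimal sketch: blending a general-dialogue corpus with a medical corpus by
# weighted interleaving. File names, the shared "text" field, and the 30/70
# mix are hypothetical choices for illustration.
from datasets import load_dataset, interleave_datasets

# Hypothetical local extracts, each a JSONL file with a single "text" field.
general = load_dataset("json", data_files="forum_dialogues.jsonl", split="train")
medical = load_dataset("json", data_files="pubmed_extracts.jsonl", split="train")

# Oversample the medical corpus: each training example is drawn from it with
# probability 0.7, keeping some general text for conversational fluency.
mix = interleave_datasets([general, medical], probabilities=[0.3, 0.7], seed=42)
mix.to_json("training_mix.jsonl")
```

Tuning such mixing ratios, filtering for quality, and deduplicating across sources are where much of the creative leverage in data curation lies.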
All data used must be legally sourced, high-quality, and ethically curated, with a particular emphasis on fairness, factual accuracy, and relevance to patient-centered healthcare communication. Participants are solely responsible for ensuring that any data they use complies with legal, ethical, and licensing standards. This includes verifying the provenance of datasets, respecting privacy and consent where applicable, and adhering to all applicable data use policies. The goal is to demonstrate how innovative data use and thoughtful adaptation of existing models can lead to safer, smarter, and more responsive medical LLMs.
Ethical Considerations and Data Privacy
- Model Transparency: All models developed for the competition must be made publicly available under an appropriate open-source license upon completion of the competition. In addition, the results of each model—including evaluation scores for creativity, accuracy, and overall performance—will be published to promote transparency, reproducibility, and collective learning within the research community.
- Ethical Compliance: The challenge adheres to ethical standards in data handling and analysis, ensuring the integrity of the research process.
- Safeguards Against Misinformation: Participants are strongly encouraged to implement strategies to detect and mitigate LLM-generated misinformation, including hallucination detection methods, uncertainty estimation, and expert adjudication of outputs (a minimal sketch of one such signal follows this list). These safeguards help ensure that model responses are clinically sound, ethically responsible, and aligned with patient safety standards.
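As one concrete illustration of the uncertainty-estimation idea above, the sketch below samples several answers to the same question and flags low agreement for human review. The placeholder model ("gpt2"), the lexical-overlap measure, and the 0.3 threshold are assumptions for demonstration, not a prescribed method.

```python
# Minimal sketch of one uncertainty signal: sampling-based self-consistency.
# Sample several answers to the same question; low mutual overlap suggests the
# model is guessing and its response should be routed to expert review.
# The model handle and the 0.3 threshold are placeholder assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder model

def jaccard(a: str, b: str) -> float:
    """Crude lexical agreement between two generated answers."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(len(sa | sb), 1)

def self_consistency(question: str, k: int = 5) -> float:
    """Mean pairwise agreement across k sampled answers (1.0 = identical)."""
    outs = generator(question, do_sample=True, temperature=0.8,
                     max_new_tokens=64, num_return_sequences=k)
    answers = [o["generated_text"][len(question):] for o in outs]
    pairs = [jaccard(a, b) for i, a in enumerate(answers)
             for b in answers[i + 1:]]
    return sum(pairs) / len(pairs)

if self_consistency("Is a fever of 38.5 C dangerous for an adult?") < 0.3:
    print("Low agreement across samples: flag for expert adjudication.")
```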
Evaluation Metrics
Submissions will be evaluated on the following primary metrics through manual review by a panel of judges with clinical and technical expertise. In the event of a tie, additional criteria, such as model safety, explainability, or user experience, may be considered at the judges’ discretion.
Algorithmic Innovation
Each LLM will be assessed on the novelty of its overall design, training methodology, and the originality and diversity of the data sources used in its development. Judges will look for inventive strategies, such as the use of underexplored or domain-specific datasets, novel fine-tuning techniques, or architectural enhancements, that expand the boundaries of how language models can be applied in healthcare. The goal is to reward not just technical rigor, but also imaginative thinking and bold experimentation. Creativity will be scored on a Likert scale from 1 (lowest) to 7 (highest).
Context Quality/Clinical Relevance
The responses generated by each model to clinical questions will be evaluated based on their accuracy, relevance, clarity, and depth of contextual understanding. Judges will assess whether the answers reflect sound medical reasoning, align with current clinical knowledge, and address the key symptoms or concerns presented in each prompt. Beyond correctness, emphasis will be placed on the model’s ability to communicate appropriately and sensitively, recognizing the potential implications of medical language in patient-facing contexts. Context quality will also be scored on a Likert scale from 1 (lowest) to 7 (highest).
Timeline
- Start Date: Monday, July 7
- Q&A Webinar 1: Wednesday, July 15
- Q&A Webinar 2: Wednesday, July 30
- Submission Deadline: Friday, August 15
- Announcement of Winner: Friday, August 22
- Healthcare Workshop: Friday, August 22 (Top three teams will be asked to give a brief presentation at the workshop)
How To Participate
The challenge is open to the entire UIUC community. The competition will be hosted on the Kaggle platform, providing participants with an accessible, user-friendly environment to build, test, and refine their models. Kaggle’s collaborative tools and intuitive interface support rapid experimentation and reproducibility, enabling users to iterate efficiently. Participants will also benefit from access to powerful computational resources and a library of pre-trained language models, streamlining development and enhancing the quality of submitted solutions. This setup ensures that both beginners and experienced practitioners can focus on innovation and performance while maintaining transparency and scalability in their approaches.
- Fill out this form to register your team. (Teams may have 1-4 members, and each team should submit only one form. You must log in with your UIUC email to access the registration form, and please include the UIUC email addresses of all team members in the submission.)
- After you submit the form, you and your team members will receive a link to the competition webpage on the Kaggle platform before the competition start date.
- Sign in to the Kaggle platform and accept the competition rules and data use agreement.
- Analyze the data, build and refine your model.
- Submit your final entry by the deadline.
Incentives
- The top three teams will be invited to join the organizing team as co-authors on a planned peer-reviewed publication.
- These teams will also be invited to give an oral presentation at the conference.
- Awards include a First Prize of $1,500, a Second Prize of $500, and a Third Prize of $250.
Contact Information
Main Contact: midwest.data.competition@gmail.com
Organizers:
Mehmet Eren Ahsen, Assistant Professor, Gies College of Business, Carle Illinois School of Medicine, ahsen@illinois.edu
Rand Kittani, Medical Student at Carle Illinois School of Medicine
Michael Chen, Medical Student at Carle Illinois School of Medicine
Gaurav Nigam, Medical Student at Carle Illinois School of Medicine
Alexis Watson, Medical Student at Carle Illinois School of Medicine
Talha Coskun, Student, Department of Computer Engineering