Birger Moëll

I build AI systems for healthcare, language, and human wellbeing.

My work sits between research and production: evaluating medical AI, building open health infrastructure, developing language and speech technology, and turning clinical questions into usable software. I am a licensed clinical psychologist with a PhD from KTH Royal Institute of Technology, where my thesis covered AI evaluation in medicine. I am also a Senior Research Scientist at AI Sweden, a Senior Lecturer in Computational Linguistics at Uppsala University, and a co-founder of Eir Space.

Current Work

Role	Focus
AI Sweden	Senior Research Scientist in the NLU team, working on language models and applied language technology for Swedish and European needs
Uppsala University	Senior Lecturer in Computational Linguistics, teaching and researching NLP, speech technology, and AI in healthcare
Eir Space	Co-founder, building privacy-first health tools that help people understand and use their own health data
Open Source	Medical AI benchmarks, health data tools, Swedish NLP resources, local-first apps, datasets, and models

What I Work On

Medical AI evaluation: benchmarks and methods for testing how language models reason in clinical contexts.
Privacy-first health software: tools that keep sensitive health data local and useful.
Speech technology for diagnostics: acoustic and language-based signals for clinical assessment.
Swedish and Nordic language technology: models, datasets, and evaluation resources for underrepresented language contexts.
Human-centered AI: clinical safety, harm reduction, explainability, and tools that fit real workflows.

Flagship Projects

Eir Open

Open-source infrastructure for privacy-first health AI. Eir Open brings together health data standards, medication lookup tools, local medical apps, clinical documentation experiments, and agent-ready modules for building patient-centered health systems.

Highlights:

Local-first tools for viewing and working with Swedish medical records.
Medication lookup resources for Swedish and US medications.
Health.md, a lightweight standard for LLM-readable health records.
Open medical scribe experiments that run locally.
Swedish provider discovery and care navigation tooling.

Swedish Medical LLM Benchmark

A benchmark and evaluation framework for assessing large language models in the Swedish medical domain.

The work was published in Frontiers in Artificial Intelligence:

Moëll, B., Farestam, F., & Beskow, J. (2025). Swedish Medical LLM Benchmark: development and evaluation of a framework for assessing large language models in the Swedish medical domain. Frontiers in Artificial Intelligence, 8, 1557920.

Health Journal

An exploration of AI-assisted personal health management through journaling interfaces, where the design of the interaction matters as much as the underlying AI model.

Pataka Test

Audio processing and machine learning tools for automatic evaluation of the Pataka test, used in speech-language pathology and motor speech assessment.

Research

My research focuses on the bridge between advanced AI methods and clinical utility.

Recent themes include:

Evaluation of large language models in medical reasoning.
Speech and language markers for neurological and psychological assessment.
Harm reduction strategies for safe use of generative AI in healthcare.
Synthetic clinical data and model evaluation.
Human-AI interaction for personal health tools.

Selected publications:

High-accuracy prediction of mental health scores from English BERT embeddings trained on LLM-generated synthetic self-reports. Frontiers in Digital Health, 2026.
Medical reasoning in LLMs: an in-depth analysis of DeepSeek R1. Frontiers in Artificial Intelligence, 2025.
Swedish Medical LLM Benchmark (SMLB). Frontiers in Artificial Intelligence, 2025.
Harm reduction strategies for thoughtful use of large language models in the medical domain. Journal of Medical Internet Research, 2025.
Automatic Evaluation of the Pataka Test Using Machine Learning and Audio Signal Processing. Acta Logopaedica, 2025.
Multimodal capture of patient behaviour for improved detection of early dementia. Frontiers in Computer Science, 2021.

More: Google Scholar

Models, Datasets, and Tools

I publish experimental models, datasets, and demos on Hugging Face, including:

Clinical and Swedish-language LLM experiments.
Swedish medical benchmark datasets.
Speech and health-related datasets.
Local model demos and comparison tools.

Teaching

At Uppsala University, I teach and supervise in computational linguistics and language technology, including information retrieval and research-oriented language technology projects.

Links

Website: birgermoell.com
GitHub: github.com/BirgerMoell
Google Scholar: Birger Moëll
Hugging Face: huggingface.co/birgermoell
Uppsala University: staff profile
AI Sweden: NLU team
Eir Space: eir.space
