BabySLM — Tiny Transformer Language Model

A minimal PyTorch implementation of a Transformer-based language model for educational purposes.

What Does This Model Do?

BabySLM is a next-token prediction model: given a sequence of tokens (represented as integer IDs), it predicts, at each position, which token should come next. This is the same core task that powers models like GPT, in a much smaller, educational form.
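
As a rough illustration of what "predict the next token at each position" means for training data, the sketch below builds input/target pairs by shifting a sequence by one. The token IDs are made up for the example; real IDs would come from whatever tokenizer is used.

```python
# Hypothetical token IDs for a short sequence (not taken from the repository).
token_ids = [5, 12, 7, 3, 9]

# The model reads every token except the last...
inputs = token_ids[:-1]
# ...and at each position is asked to predict the token that follows it.
targets = token_ids[1:]

for position, (seen, expected) in enumerate(zip(inputs, targets)):
    print(f"position {position}: input token {seen} -> target token {expected}")
```

Each position therefore contributes one prediction, so a single sequence of length N yields N - 1 training examples.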

Architecture

Token & Position Embeddings: Converts word indices into vectors and adds positional information
Single Transformer Block: Uses multi-head attention (4 heads) to learn relationships between tokens
Output Head: Projects back to vocabulary space for next-token predictions
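
The three components above could be wired together roughly as follows. This is a minimal sketch, not the code in model.py: the class name `TinyLM`, the learned position embeddings, the pre-LayerNorm layout, the feed-forward sub-layer, and the causal mask are all assumptions chosen to match common practice.

```python
import torch
import torch.nn as nn


class TinyLM(nn.Module):
    """Illustrative stand-in for the described architecture (hypothetical name)."""

    def __init__(self, vocab_size=1000, embed_dim=32, context_length=16, num_heads=4):
        super().__init__()
        # Token and position embeddings (learned positions are an assumption).
        self.token_emb = nn.Embedding(vocab_size, embed_dim)
        self.pos_emb = nn.Embedding(context_length, embed_dim)
        # Single Transformer block: attention plus a small feed-forward network.
        self.ln1 = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim),
            nn.GELU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )
        # Output head: projects each position back to vocabulary-sized logits.
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, idx):
        # idx: (batch, seq_len) integer token IDs, with seq_len <= context_length.
        _, seq_len = idx.shape
        positions = torch.arange(seq_len, device=idx.device)
        x = self.token_emb(idx) + self.pos_emb(positions)
        # Boolean mask: True marks future positions a token may NOT attend to.
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=idx.device),
            diagonal=1,
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal_mask, need_weights=False)
        x = x + attn_out
        x = x + self.mlp(self.ln2(x))
        return self.head(x)  # (batch, seq_len, vocab_size)


model = TinyLM()
tokens = torch.randint(0, 1000, (2, 16))  # batch of 2 random sequences
logits = model(tokens)
print(logits.shape)
```

The output has one row of vocabulary-sized scores per input position, which is what makes per-position next-token prediction possible.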

Key Parameters

vocab_size: Number of unique tokens the model can handle (e.g., 1000 words)
embed_dim: Dimensionality of token embeddings (e.g., 32 or 128)
context_length: Maximum sequence length the model can process (e.g., 16 or 64 tokens)
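
To give a feel for how these three parameters drive model size, the arithmetic below counts the weights in the embedding tables and the output projection for the smaller example values. It assumes learned position embeddings and an output layer with a bias that does not share weights with the token embeddings; the actual model.py may differ.

```python
vocab_size = 1000      # example value from the list above
embed_dim = 32         # example value from the list above
context_length = 16    # example value from the list above

# One embed_dim-sized vector per token in the vocabulary.
token_embedding_params = vocab_size * embed_dim
# One embed_dim-sized vector per position (assumes learned positions).
position_embedding_params = context_length * embed_dim
# Weight matrix plus bias for the projection back to the vocabulary
# (assumes an untied linear layer with bias).
output_head_params = embed_dim * vocab_size + vocab_size

total = token_embedding_params + position_embedding_params + output_head_params
print(f"token embeddings:    {token_embedding_params}")
print(f"position embeddings: {position_embedding_params}")
print(f"output head:         {output_head_params}")
print(f"total (excluding the Transformer block): {total}")
```

Under these assumptions, vocab_size dominates the count, while context_length contributes comparatively little.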

Quick Start (Windows PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python model.py

Notes

Installing torch may require following the official instructions at https://pytorch.org/ if you need CUDA support or specific wheels for Windows.
This is a teaching example: the model is structurally similar to real LLMs but far smaller (real models have billions of parameters and often 30 or more stacked Transformer blocks).
The model is untrained, so its predictions are effectively random until it is trained on actual data.
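
One way to sanity-check an untrained model is to compare its loss against the value expected from uniform guessing. The sketch below computes that baseline for the example vocabulary size of 1000. A freshly initialized network is not exactly uniform, so treat this as an approximate reference point.

```python
import math

vocab_size = 1000  # example value from the Key Parameters section

# A model that assigns equal probability 1/vocab_size to every token
# has cross-entropy loss ln(vocab_size), regardless of the target.
uniform_loss = -math.log(1.0 / vocab_size)
print(f"loss under uniform guessing: {uniform_loss:.3f}")
```

A training loss that falls clearly below this value suggests the model has started to learn something from the data.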
