Merged
69 changes: 62 additions & 7 deletions spec/header-files.md
@@ -21,29 +21,52 @@ A header file contains:

Each line must be under 255 characters, and fields are separated by spaces or tabs (except where otherwise noted).

Detailed documentation on header files can be found at: [https://physionet.org/physiotools/wag/header-5.htm](https://physionet.org/physiotools/wag/header-5.htm)

---

## Record Line

The first non-comment line is the **record line**, which provides metadata about the overall record. It includes:

| Field | Description |
-|:------|:------------|
+|:------|:-----------|
| Record name | Identifier for the record (letters, digits, underscores only). |
| Number of segments (optional) | If present, appended to the record name as `/n`, where `n` is the segment count. Indicates a multi-segment record. |
| Number of signals | Number of signals described in the header. |
| Sampling frequency (optional) | Samples per second per signal. Defaults to 250 if omitted. |
| Counter frequency (optional) | Secondary clock frequency, separated from sampling frequency by a `/`. |
| Base counter value (optional) | Offset value for counter, enclosed in parentheses. |
-| Number of samples (optional) | Total samples per signal. |
+| Number of samples (optional) | Total samples per signal when `samps_per_frame`=1; total frames otherwise. |
| Base time (optional) | Start time of the recording (`HH:MM:SS`). |
| Base date (optional) | Start date (`DD/MM/YYYY`). |

-**Example:**
+### Examples

**Basic header (uniform sampling frequency):**
Record 100 from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/content/mitdb):

```text
100 2 360 650000
```

- `100`: Record name (must match the filename prefix).
- `2`: Number of signals in the record.
- `360`: Sampling frequency in Hz.
- `650000`: Number of samples for each signal.

**Multi-frequency header (unique sampling rates per channel):**

```text
-100 2 360 650000 12:00:00 01/01/2000
+12345 3 62.5 625 12:00:00 30/01/1989
```

- `12345`: Record name (must match the filename prefix).
- `3`: Number of signals in the record.
- `62.5`: Sampling frequency in Hz.
- `625`: Number of samples. For a multi-frequency record this counts frames rather than samples; each signal contributes one or more samples to every frame.
- `12:00:00`: Base time.
- `30/01/1989`: Base date.
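
The record-line layout described above can be sketched as a small parser. This is a minimal illustration, not the reference WFDB implementation: `RecordLine` and `parse_record_line` are hypothetical names, the optional fields are assumed to appear in the order listed in the table, and base time and date are kept as raw strings rather than validated.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordLine:
    name: str
    n_segments: Optional[int]      # None for a single-segment record
    n_signals: int
    fs: float                      # sampling frequency in Hz
    counter_freq: Optional[float]  # None when not given
    base_counter: Optional[float]  # None when not given
    n_samples: Optional[int]       # samples, or frames for multi-frequency records
    base_time: Optional[str]       # raw HH:MM:SS string
    base_date: Optional[str]       # raw DD/MM/YYYY string


# Sampling frequency, optionally followed by /counter_freq and (base_counter).
_FREQ_FIELD = re.compile(r"^([^/()]+)(?:/([^/()]+))?(?:\(([^()]+)\))?$")


def parse_record_line(line: str) -> RecordLine:
    fields = line.split()  # fields are separated by spaces or tabs
    if len(fields) < 2:
        raise ValueError("record line needs a record name and a signal count")

    # An optional "/n" suffix on the record name gives the segment count.
    name, _, n_seg = fields[0].partition("/")

    fs, counter_freq, base_counter = 250.0, None, None  # 250 Hz is the default
    if len(fields) > 2:
        match = _FREQ_FIELD.match(fields[2])
        if match is None:
            raise ValueError(f"malformed frequency field: {fields[2]!r}")
        fs = float(match.group(1))
        counter_freq = float(match.group(2)) if match.group(2) else None
        base_counter = float(match.group(3)) if match.group(3) else None

    return RecordLine(
        name=name,
        n_segments=int(n_seg) if n_seg else None,
        n_signals=int(fields[1]),
        fs=fs,
        counter_freq=counter_freq,
        base_counter=base_counter,
        n_samples=int(fields[3]) if len(fields) > 3 else None,
        base_time=fields[4] if len(fields) > 4 else None,
        base_date=fields[5] if len(fields) > 5 else None,
    )


if __name__ == "__main__":
    rec = parse_record_line("12345 3 62.5 625 12:00:00 30/01/1989")
    # Duration in seconds: frames divided by frames per second.
    print(rec.name, rec.n_signals, rec.n_samples / rec.fs)
```

Dividing the number of samples (or frames) by the sampling frequency gives the record duration, so the second example above spans `625 / 62.5 = 10` seconds.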

---

## Signal Specification Lines
@@ -67,12 +90,43 @@ Each signal has its own line immediately following the record line (for single-s
| Block size (optional) | Number of samples per block (for formats supporting block I/O). |
| Description (optional) | Free-text description of the signal (e.g., lead name `ECG Lead II`). |

-**Example:**
+### Examples

**Basic header (uniform sampling frequency):**
Record 100 from the [MIT-BIH Arrhythmia Database](https://www.physionet.org/content/mitdb):

```text
-100.dat 212 200 11 1024 995 0 MLII
-100.dat 212 200 11 1024 995 0 V5
+100.dat 212 200 11 1024 995 -22131 0 MLII
+100.dat 212 200 11 1024 1011 20052 0 V5
```

- `100.dat`: File name of the signal file.
- `212`: Storage format of the samples (12-bit two's complement values, with two samples packed into three bytes).
- `200`: ADC gain (i.e., number of ADC units per physical unit). No unit is given, so the default of mV applies.
- `11`: ADC resolution (bits).
- `1024`: ADC zero value.
- `995`, `1011`: Initial value.
- `-22131`, `20052`: Checksum (sum of all samples in the signal, modulo 2^16, stored as a signed 16-bit value).
- `0`: Block size.
- `MLII`, `V5`: Description (e.g., signal lead names).
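
A checksum such as `-22131` can only be negative if the modulo-2^16 sum is read as a signed 16-bit integer. The sketch below shows that wrap-around under this assumption; `checksum16` is a hypothetical helper name, and the sample values in the usage lines are invented for illustration (they are not data from record 100).

```python
from typing import Iterable


def checksum16(samples: Iterable[int]) -> int:
    """Sum the samples modulo 2^16 and reinterpret the result as signed 16-bit."""
    total = sum(samples) & 0xFFFF          # keep the low 16 bits of the sum
    if total >= 0x8000:                    # top bit set: value is negative
        total -= 0x10000
    return total


if __name__ == "__main__":
    # Invented samples whose sum (43405) wraps to a negative checksum.
    print(checksum16([20000, 20000, 3405]))
    # A small sum is returned unchanged.
    print(checksum16([1, 2, 3]))
```

A reader can compare the value computed from the decoded samples with the checksum field to detect a corrupted or mismatched signal file.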

**Multi-frequency header (unique sampling rates per channel):**
```text
12345.dat 16x4 200/μV 12 0 0 2178 0 ECG
12345.dat 16x2 16/mmHg 12 0 0 3497 0 ICP
12345.dat 16x1 2500/Ohm 12 0 0 1366 0 RESP
```

- `12345.dat`: File name of the signal file.
- `16`: Storage format of the samples (16-bit two's complement integers).
- `x4`, `x2`, `x1`: 4, 2, and 1 samples per frame, respectively. Each signal's sample count is the frame count from the record line multiplied by its samples per frame: with 625 frames, the ECG signal has `625 * 4 = 2500` samples, the ICP signal has 1250 samples, and the RESP signal has 625 samples.
- `200/μV`, `16/mmHg`, `2500/Ohm`: ADC gain (i.e., number of digital values per physical unit).
- `12`: ADC resolution (bits).
- `0`: ADC zero value.
- `0`: Initial value.
- `2178`, `3497`, `1366`: Checksum (sum of all samples in the signal, modulo 2^16, stored as a signed 16-bit value).
- `0`: Block size.
- `ECG`, `ICP`, `RESP`: Description (e.g., signal lead names).
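
The field breakdown above can be turned into a short parser. This is a sketch under simplifying assumptions: `SignalSpec` and `parse_signal_line` are hypothetical names, all nine fields are assumed to be present (several are optional in real headers), and only the `x` samples-per-frame modifier and the `/units` suffix are handled. The `mV` default for a missing unit follows the WFDB header documentation linked earlier rather than this excerpt.

```python
from dataclasses import dataclass


@dataclass
class SignalSpec:
    file_name: str
    fmt: int                 # storage format code, e.g. 212 or 16
    samples_per_frame: int   # 1 unless the format field carries an "xN" suffix
    gain: float              # ADC units per physical unit
    units: str
    adc_resolution: int
    adc_zero: int
    initial_value: int
    checksum: int
    block_size: int
    description: str


def parse_signal_line(line: str) -> SignalSpec:
    # Split into at most nine fields so a description may contain spaces.
    fields = line.strip().split(maxsplit=8)
    if len(fields) < 9:
        raise ValueError("this sketch expects all nine fields to be present")

    fmt, _, spf = fields[1].partition("x")      # "16x4" -> ("16", "4")
    gain, _, units = fields[2].partition("/")   # "200/μV" -> ("200", "μV")

    return SignalSpec(
        file_name=fields[0],
        fmt=int(fmt),
        samples_per_frame=int(spf) if spf else 1,
        gain=float(gain),
        units=units or "mV",
        adc_resolution=int(fields[3]),
        adc_zero=int(fields[4]),
        initial_value=int(fields[5]),
        checksum=int(fields[6]),
        block_size=int(fields[7]),
        description=fields[8],
    )


if __name__ == "__main__":
    header_lines = [
        "12345.dat 16x4 200/μV 12 0 0 2178 0 ECG",
        "12345.dat 16x2 16/mmHg 12 0 0 3497 0 ICP",
        "12345.dat 16x1 2500/Ohm 12 0 0 1366 0 RESP",
    ]
    n_frames = 625  # frame count from the record line example
    for spec in map(parse_signal_line, header_lines):
        # Total samples for a signal: frames times samples per frame.
        print(spec.description, n_frames * spec.samples_per_frame)
```

Keeping `samples_per_frame` as its own field makes the per-signal sample count a single multiplication against the frame count from the record line.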

---

## Comments and Info Strings
@@ -87,4 +141,5 @@ Each signal has its own line immediately following the record line (for single-s

- A header file may describe signals stored in multiple files or multiple signals in a single file.
- Fields like sampling frequency, counter frequency, and base time/date improve time-aligned analysis but are optional.
- Storage format options and details can be found at: [https://physionet.org/physiotools/wag/signal-5.htm](https://physionet.org/physiotools/wag/signal-5.htm)
- Multi-segment records use a slightly different structure (described separately).