This project implements a simplified barcode decoder for images (in .ppm format) of the barcodes on products being purchased. With it, a store obtains the EAN-13 number and, by looking that number up in a database, identifies the product sold and calculates its price.
A barcode is a visual representation of data, easily readable by devices. Usually, the data describes properties of the object on which the barcode is located.
Traditional barcodes represent data by varying the widths of parallel lines and spaces. This representation is linear, or 1D. There are also 2D representations, such as QR codes, which arrange data in a matrix and can encode more of it.
A barcode represents a number as a sequence of black bars and white bars (spaces). Usually, black bars encode 1 bits, and white bars encode 0 bits. Several adjacent equal bits form a thicker bar. In the image below, you can see the barcode for the digits 5901234123457, according to the European Article Number (EAN-13) standard.
An EAN-13 barcode describes 13 digits, but only 12 digits are actually drawn as bars because the first digit is encoded indirectly. Each digit is encoded using 4 bars of varying widths. Besides the bars that encode digits, there is also a start sequence of 3 bars, a stop sequence of 3 bars, and a center sequence of 5 bars (the bars of these special sequences are longer). In total, there are 59 bars (12 * 4 + 3 + 3 + 5).
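The bar count can be checked with a short Python sketch that groups equal adjacent bits into bars. The names `bits_to_bars` and `EXAMPLE_BITS` are illustrative and not part of the project; the bit groups are those of the 5901234123457 example whose binary representation is given later in this document.

```python
from itertools import groupby

def bits_to_bars(bits: str) -> list[tuple[str, int]]:
    """Group consecutive identical bits into (bit, width) runs, one per bar."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]

# Bits of the 5901234123457 example: start, 6 digits, center, 6 digits, end.
EXAMPLE_BITS = "".join([
    "101",
    "0001011", "0100111", "0110011", "0010011", "0111101", "0011101",
    "01010",
    "1100110", "1101100", "1000010", "1011100", "1001110", "1000100",
    "101",
])

bars = bits_to_bars(EXAMPLE_BITS)
# Prints the number of bits, the number of bars, and the widest bar.
print(len(EXAMPLE_BITS), len(bars), max(width for _, width in bars))  # 95 59 4
```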
The sequence of 59 bars contains 95 bits. From left to right, they are organized as follows:
- 3 bits `101` to mark the start
- 42 bits (7 for each digit) for the digits in positions 2→7 (the first digit is encoded indirectly)
- 5 bits `01010` to mark the center
- 42 bits (7 for each digit) for the digits in positions 8→13
- 3 bits `101` to mark the end
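The layout above maps directly onto fixed slice offsets. The following is a minimal sketch, assuming the 95 bits are already available as a string of `0` and `1` characters; the function name `split_ean13_bits` is hypothetical.

```python
def split_ean13_bits(bits: str) -> tuple[list[str], list[str]]:
    """Return the six left and six right 7-bit groups of a 95-bit string."""
    if len(bits) != 95:
        raise ValueError(f"expected 95 bits, got {len(bits)}")
    start, center, end = bits[:3], bits[45:50], bits[92:]
    if (start, center, end) != ("101", "01010", "101"):
        raise ValueError("start, center, or end marker not found")
    left = [bits[i:i + 7] for i in range(3, 45, 7)]    # digits in positions 2→7
    right = [bits[i:i + 7] for i in range(50, 92, 7)]  # digits in positions 8→13
    return left, right
```

Rejecting the input as soon as a marker is missing keeps later decoding steps from working on misaligned 7-bit groups.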
We will consider the digits numbered from left to right by position, starting from position 1.
The barcode digits are divided into 3 parts:
- The first digit, called the parity digit
- The first group is represented by the following 6 digits, in positions 2→7
- The second group is represented by the last 6 digits, in positions 8→13
We say that the encoding of a digit has even parity if it contains an even number of 1 bits. Otherwise, it is odd.
For each digit in the first group (positions 2→7), there are two possible encodings: one with an odd number of 1 bits (L encoding, odd parity) and one with an even number of 1 bits (G encoding, even parity).
For each digit in the second group (positions 8→13), there is only one possible encoding (R encoding, even parity).
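| Digit | L encoding (odd) | G encoding (even) | R encoding (even) |
|---|---|---|---|
| 0 | 0001101 | 0100111 | 1110010 |
| 1 | 0011001 | 0110011 | 1100110 |
| 2 | 0010011 | 0011011 | 1101100 |
| 3 | 0111101 | 0100001 | 1000010 |
| 4 | 0100011 | 0011101 | 1011100 |
| 5 | 0110001 | 0111001 | 1001110 |
| 6 | 0101111 | 0000101 | 1010000 |
| 7 | 0111011 | 0010001 | 1000100 |
| 8 | 0110111 | 0001001 | 1001000 |
| 9 | 0001011 | 0010111 | 1110100 |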
- The barcode starts with an odd-encoded digit and ends with an even-encoded digit, so barcode scanners can determine the orientation and read the code both left to right and right to left.
- R-encodings are the bitwise complements of L-encodings.
- G-encodings are the reverse of R-encodings.
- G and L encodings start with `0` and end with `1`. R encodings start with `1` and end with `0`. Thus, each digit will be represented by 4 alternating bars of different widths.
- The maximum width of a bar will be 4. The bars of two digits do not mix.
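These relations mean that only the L encodings need to be stored; the other two sets can be derived. The sketch below assumes the standard EAN-13 L code values (listed here so the example is self-contained) and uses illustrative names that are not taken from the project.

```python
# Standard EAN-13 L codes, indexed by digit (an assumption of this sketch).
L_CODES = ["0001101", "0011001", "0010011", "0111101", "0100011",
           "0110001", "0101111", "0111011", "0110111", "0001011"]

# R encodings are the bitwise complements of the L encodings.
R_CODES = ["".join("1" if b == "0" else "0" for b in code) for code in L_CODES]

# G encodings are the R encodings read in reverse.
G_CODES = [code[::-1] for code in R_CODES]

def has_even_parity(code: str) -> bool:
    """True if the 7-bit code contains an even number of 1 bits."""
    return code.count("1") % 2 == 0

for digit in range(10):
    assert not has_even_parity(L_CODES[digit])  # L: odd parity
    assert has_even_parity(G_CODES[digit])      # G: even parity
    assert has_even_parity(R_CODES[digit])      # R: even parity
```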
As mentioned earlier, the first digit is called the parity digit. The parity digit is not represented directly by a sequence of bars and spaces, but is encoded indirectly by choosing a combination of L or G encoding modes for the first group of 6 digits, according to the table below. Practically, it is sufficient to know which encoding was used for each of the 6 digits in the first group to determine the associated parity digit. If the found combination is not associated with a digit from the table below, the code is invalid.
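| Parity digit | Encodings for positions 2→7 |
|---|---|
| 0 | LLLLLL |
| 1 | LLGLGG |
| 2 | LLGGLG |
| 3 | LLGGGL |
| 4 | LGLLGG |
| 5 | LGGLLG |
| 6 | LGGGLL |
| 7 | LGLGLG |
| 8 | LGLGGL |
| 9 | LGGLGL |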
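Recovering the parity digit is then a plain lookup. The sketch below assumes the standard EAN-13 pattern assignments and that the L/G pattern of the first group has already been determined; `PARITY_PATTERNS` and `parity_digit` are hypothetical names.

```python
# Standard EAN-13 L/G patterns for positions 2→7, keyed to the parity digit
# (an assumption of this sketch).
PARITY_PATTERNS = {
    "LLLLLL": 0, "LLGLGG": 1, "LLGGLG": 2, "LLGGGL": 3, "LGLLGG": 4,
    "LGGLLG": 5, "LGGGLL": 6, "LGLGLG": 7, "LGLGGL": 8, "LGGLGL": 9,
}

def parity_digit(pattern: str) -> int:
    """Return the first digit for a 6-letter L/G pattern, or raise ValueError."""
    if pattern not in PARITY_PATTERNS:
        raise ValueError(f"invalid code: no parity digit for pattern {pattern!r}")
    return PARITY_PATTERNS[pattern]

print(parity_digit("LGGLLG"))  # 5
```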
The last digit of an EAN-13 barcode is called the check digit. It is used to confirm the correct reading of a code. Each digit in the barcode (excluding the check digit) has a weight of 1 or 3 depending on its position when calculating the check digit (those in odd positions have a weight of 1, and those in even positions have a weight of 3, see the table below).
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 |
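The formula for the check digit is the following:

$$c = \left(10 - \left(\sum_{i=1}^{12} w_i \cdot d_i\right) \bmod 10\right) \bmod 10$$

Where $d_i$ is the digit in position $i$ and $w_i$ is the weight of position $i$ from the table above.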
Let's return to the image for the 5901234123457 barcode from the introduction.
The binary representation of these bars is the following: 10100010110100111011001100100110111101001110101010110011011011001000010101110010011101000100101
The first digit, meaning the parity digit, has the value 5. According to the table above, this will lead to the encoding of the next 6 digits in the LGGLLG format. The last 6 digits are always encoded in the RRRRRR format.
| Digit | start | 9 | 0 | 1 | 2 | 3 | 4 | center | 1 | 2 | 3 | 4 | 5 | 7 | end |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Encoding | - | L | G | G | L | L | G | - | R | R | R | R | R | R | - |
| Representation | 101 | 0001011 | 0100111 | 0110011 | 0010011 | 0111101 | 0011101 | 01010 | 1100110 | 1101100 | 1000010 | 1011100 | 1001110 | 1000100 | 101 |
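Putting the pieces together, the whole example can be decoded from its bits. This is a sketch rather than the project's implementation: it assumes the standard EAN-13 code tables, all names are illustrative, and it does not verify the check digit.

```python
# Standard EAN-13 L codes, indexed by digit (an assumption of this sketch).
L_CODES = ["0001101", "0011001", "0010011", "0111101", "0100011",
           "0110001", "0101111", "0111011", "0110111", "0001011"]
# R codes are the bitwise complements of L codes; G codes are reversed R codes.
R_CODES = [code.translate(str.maketrans("01", "10")) for code in L_CODES]
G_CODES = [code[::-1] for code in R_CODES]

# Map each 7-bit group to (encoding letter, digit) or to a digit.
LEFT_LOOKUP = {code: ("L", d) for d, code in enumerate(L_CODES)}
LEFT_LOOKUP.update({code: ("G", d) for d, code in enumerate(G_CODES)})
RIGHT_LOOKUP = {code: d for d, code in enumerate(R_CODES)}

# L/G pattern of positions 2→7, indexed by the parity digit.
PARITY_PATTERNS = ["LLLLLL", "LLGLGG", "LLGGLG", "LLGGGL", "LGLLGG",
                   "LGGLLG", "LGGGLL", "LGLGLG", "LGLGGL", "LGGLGL"]

def decode_ean13(bits: str) -> str:
    """Decode a 95-bit string into its 13 digits, or raise ValueError."""
    if (len(bits) != 95 or bits[:3] != "101"
            or bits[45:50] != "01010" or bits[92:] != "101"):
        raise ValueError("bad length or missing start/center/end marker")
    left = [bits[i:i + 7] for i in range(3, 45, 7)]
    right = [bits[i:i + 7] for i in range(50, 92, 7)]
    try:
        left_decoded = [LEFT_LOOKUP[group] for group in left]
        right_digits = [RIGHT_LOOKUP[group] for group in right]
        pattern = "".join(letter for letter, _ in left_decoded)
        first = PARITY_PATTERNS.index(pattern)
    except (KeyError, ValueError):
        raise ValueError("unknown digit encoding or parity pattern") from None
    digits = [first] + [d for _, d in left_decoded] + right_digits
    return "".join(map(str, digits))

example = "".join([
    "101",
    "0001011", "0100111", "0110011", "0010011", "0111101", "0011101",
    "01010",
    "1100110", "1101100", "1000010", "1011100", "1001110", "1000100",
    "101",
])
print(decode_ean13(example))  # 5901234123457
```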
Now let's verify if the check digit 7 is correct.
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 |
| Digit | 5 | 9 | 0 | 1 | 2 | 3 | 4 | 1 | 2 | 3 | 4 | 5 |
The sum of the products between the digits and their associated weights is 83. The check digit is (10 - (83 % 10)) % 10 = 7, so it is correct.
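The same verification can be written as a short function. This is a minimal sketch; the names `weighted_sum` and `ean13_check_digit` are hypothetical, and the input is assumed to be the first 12 digits as a string.

```python
def weighted_sum(first12: str) -> int:
    """Sum digit * weight, with weight 1 at odd positions and 3 at even ones."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 decimal digits")
    # enumerate() is 0-based, so index 0 is position 1 (odd, weight 1).
    return sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))

def ean13_check_digit(first12: str) -> int:
    """Compute the check digit (position 13) from the first 12 digits."""
    return (10 - weighted_sum(first12) % 10) % 10

print(weighted_sum("590123412345"))       # 83
print(ean13_check_digit("590123412345"))  # 7
```

The outer `% 10` matters when the weighted sum is already a multiple of 10: without it the result would be 10 instead of 0.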