Next release uk language-packs check up #129

@tomatolog

Proposal:

Follow-up to #127.

Before the next Docker release, verify that the release Docker ref/package set is prepared from current master after:

The goal is to make sure the next release image does not reintroduce the legacy Ukrainian lemmatizer stack that was removed from master.

Background

Docker release 25.0.0 was made from a release ref/tag, not directly from current master.

For 25.0.0, the release Dockerfile used:

manticore_25.0.0-26032712-ce3c27828__ARCH_64.deb
manticore-lemmatizer-uk_1.0.3__ARCH_64.deb

That was correct for that release at the time, but it means the next release preparation step must not copy the old manticore-lemmatizer-uk line forward.

The current Docker master has removed:

  • manticore-lemmatizer-uk
  • /usr/local/lib/manticore/lemmatize_uk.so
  • Python 3.9
  • pymorphy2
  • pymorphy2-dicts-uk
  • PYTHONWARNINGS=ignore::UserWarning:pymorphy2.analyzer

The native Ukrainian lemmatizer is provided by Manticore itself and requires uk.pak.

Relevant fixes:

Release Preparation Checks

When preparing the next Docker release ref/tag from master, inspect the release Dockerfile:

git show <new-release-ref>:Dockerfile | rg 'manticore-lemmatizer-uk|python3\.9|pymorphy2|PYTHONWARNINGS'

Expected result:

(no output)
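
To turn this inspection into a pass/fail gate, the same pattern can be wrapped in a small function. This is only a minimal sketch: the function name check_no_legacy_uk is hypothetical, and it uses grep -E instead of rg so that it also runs where ripgrep is not installed.

```shell
# Hypothetical helper: reads Dockerfile text on stdin and fails
# if any legacy Ukrainian lemmatizer reference is present.
check_no_legacy_uk() {
  pattern='manticore-lemmatizer-uk|python3\.9|pymorphy2|PYTHONWARNINGS'
  # grep exits 0 when at least one line matches, which is the failure case here.
  if matches=$(grep -nE "$pattern"); then
    echo "legacy references found:" >&2
    echo "$matches" >&2
    return 1
  fi
  return 0
}

# Intended usage (ref name left as a placeholder):
#   git show <new-release-ref>:Dockerfile | check_no_legacy_uk
```

A non-zero exit status then marks a Dockerfile that still carries one of the removed lines, and the matching line numbers are printed to stderr.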

If the release Dockerfile uses the Manticore bundle package, it should use only the new bundle package, for example:

manticore_<next-version>-<date>-<commit>__ARCH_64.deb

and must not add:

manticore-lemmatizer-uk_1.0.3__ARCH_64.deb

If the release Dockerfile uses split packages instead of the bundle package, the package set must include:

manticore-language-packs >= 1.0.15

or another package that owns:

/usr/share/manticore/uk.pak
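
For the split-package case, the minimum version can be checked mechanically. The sketch below assumes GNU sort with -V is available; version_ge is a hypothetical helper name. On a Debian-based image, dpkg --compare-versions is the more authoritative comparison for package versions with epochs or revisions.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on GNU "sort -V" (version sort): if the smaller of the two
# versions is $2, then $1 is at least $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Intended usage inside the image (assumes the split package is installed):
#   installed=$(dpkg-query -W -f='${Version}' manticore-language-packs)
#   version_ge "$installed" 1.0.15 || echo "manticore-language-packs too old"
```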

Release Image Validation

Build the release/default image from the release ref:

docker build \
  --build-arg DEV=0 \
  --platform linux/amd64 \
  -t manticoresoftware/manticore:release-uk-check \
  .

Verify image contents:

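docker run --rm --entrypoint sh manticoresoftware/manticore:release-uk-check -lc '
  set -e
  # uk.pak must exist and be owned by an installed package.
  test -f /usr/share/manticore/uk.pak
  dpkg -S /usr/share/manticore/uk.pak
  # set -e ignores the exit status of commands negated with "!", so fail explicitly.
  if dpkg -s manticore-lemmatizer-uk >/dev/null 2>&1; then
    echo "legacy manticore-lemmatizer-uk package is installed"
    exit 1
  fi
  test ! -e /usr/local/lib/manticore/lemmatize_uk.so
  if command -v python3.9 >/dev/null 2>&1; then
    echo "python3.9 found"
    exit 1
  fi
  if find /usr/local/lib /usr/lib \( -iname "*pymorphy2*" -o -iname "*pymorphy2-dicts-uk*" -o -iname "*pymorphy2_dicts_uk*" \) 2>/dev/null | grep -q .; then
    echo "legacy pymorphy2 files found"
    exit 1
  fi
'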

Note: use dpkg -S /usr/share/manticore/uk.pak, not only dpkg -L manticore-language-packs, because in a bundle-based release image uk.pak may be owned by the manticore bundle package.

Verify runtime Ukrainian morphology:

docker rm -f manticore-release-uk-check >/dev/null 2>&1 || true

docker run -d \
  --name manticore-release-uk-check \
  manticoresoftware/manticore:release-uk-check

timeout 60 sh -c '
  until docker logs manticore-release-uk-check 2>&1 | grep -q "accepting connections"; do
    sleep 1
  done
'

docker exec manticore-release-uk-check mysql -h0 -P9306 -e "
  CREATE TABLE test_uk (
    id bigint,
    content text
  )
  rt_mem_limit = '256M'
  morphology = 'lemmatize_uk'
  charset_table = '0..9, A..Z->a..z, _, a..z, U+0410..U+042F->U+0430..U+044F, U+0430..U+044F, U+0454, U+0456, U+0457, U+0491';
"

docker exec manticore-release-uk-check mysql -h0 -P9306 -e "
  INSERT INTO test_uk (id, content) VALUES (1, 'бігаю');
"

docker exec manticore-release-uk-check mysql -h0 -P9306 -e "
  CALL KEYWORDS('бігаю', 'test_uk');
"

docker rm -f manticore-release-uk-check

Expected normalized form:

бігати
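
The comparison against the expected form can also be scripted. This is a sketch under two assumptions: the mysql client is run with -N -B so rows come out tab-separated without a header, and the normalized form is the third column of the CALL KEYWORDS result (qpos, tokenized, normalized). The helper name has_normalized_form is hypothetical. The check has to run before the container is removed.

```shell
# Hypothetical helper: reads tab-separated CALL KEYWORDS rows on stdin
# and succeeds only if some row has the expected value in column 3
# (assumed to be the "normalized" column).
has_normalized_form() {
  awk -F '\t' -v want="$1" '$3 == want { found = 1 } END { exit found ? 0 : 1 }'
}

# Intended usage, while the container is still running:
#   docker exec manticore-release-uk-check mysql -h0 -P9306 -N -B -e \
#     "CALL KEYWORDS('бігаю', 'test_uk');" | has_normalized_form 'бігати'
```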

Acceptance Criteria

  • Release Dockerfile does not reference manticore-lemmatizer-uk.
  • Release Dockerfile does not install Python 3.9 / pymorphy2 / pymorphy2-dicts-uk.
  • Final release image contains /usr/share/manticore/uk.pak.
  • /usr/share/manticore/uk.pak is owned by an installed package.
  • Final release image does not contain /usr/local/lib/manticore/lemmatize_uk.so.
  • morphology = 'lemmatize_uk' works in the release image.
  • The same checks pass for expected release architectures, at least linux/amd64 and linux/arm64.
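
The per-architecture criterion can be covered by running the same build and checks once per platform. The wrapper below is a sketch only: for_each_platform and the per-architecture tag suffix are hypothetical, and building linux/arm64 on an amd64 host typically needs emulation (for example QEMU via binfmt) or a native arm64 builder.

```shell
# Hypothetical wrapper: runs the given command once per platform with
# PLATFORM and TAG exported, and reports every platform that failed.
for_each_platform() {
  failed=""
  for PLATFORM in linux/amd64 linux/arm64; do
    # e.g. linux/arm64 -> ...:release-uk-check-arm64 (tag naming is an assumption)
    TAG="manticoresoftware/manticore:release-uk-check-${PLATFORM#linux/}"
    export PLATFORM TAG
    "$@" || failed="$failed $PLATFORM"
  done
  if [ -n "$failed" ]; then
    echo "checks failed for:$failed" >&2
    return 1
  fi
}

# Intended usage (the image checks themselves are the ones listed above):
#   for_each_platform sh -c \
#     'docker build --build-arg DEV=0 --platform "$PLATFORM" -t "$TAG" .'
```

Collecting failures instead of stopping at the first one shows in a single run whether a problem is specific to one architecture.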

Notes

The current/dev Docker path and downstream test-kit path were already validated after #128. This ticket is only for the next release preparation/release image check.

Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation reviewed
  • Changelog updated
