Add CI workflow, AI/ML format recovery, TrID sig generation, ext4 fix #205

Open
johndpope wants to merge 5 commits into cgsecurity:master from johndpope:master
@johndpope

Summary

  • CI workflow (build-release.yml): builds Linux and Windows binaries on push/tag, publishes release artifacts
  • AI/ML file format recovery: adds PhotoRec modules for GGML (.ggml), GGUF (.gguf), NumPy (.npy), PyTorch (.pt/.pth), and SafeTensors (.safetensors) — increasingly common file types to recover from damaged drives
  • TrID→photorec.sig converter (trid_to_photorec.py): generates photorec.sig from the latest TrID XML definitions (digipres.github.io), bundled as a CI artifact
  • ext4 build fix: CI now installs e2fsprogs libext2fs-dev so configure enables ext4 filesystem support (was silently disabled)
  • Windows packaging fix: replaced MSYS2 7z (not installed → exit 127) with PowerShell Compress-Archive
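For reviewers less familiar with these formats, the header checks that such recovery modules rely on can be sketched in Python. This is only an illustration of the on-disk magic values, not the C code in this PR: `classify` and `looks_like_safetensors` are hypothetical names, and the 100 MB header bound is an arbitrary sanity limit chosen for the sketch. PyTorch `.pt` is left out because current checkpoints are ZIP archives and share the generic ZIP signature, so a magic-byte test alone cannot identify them.

```python
import struct

# Magic values as they appear on disk. The GGML-family magics are uint32
# constants written little-endian, so the bytes read reversed on disk
# (0x67676d6c "ggml" is stored as b"lmgg").
MAGICS = [
    (b"GGUF", "gguf"),
    (b"lmgg", "ggml"),        # 0x67676d6c, unversioned GGML
    (b"fmgg", "ggml"),        # 0x67676d66, GGMF
    (b"tjgg", "ggml"),        # 0x67676a74, GGJT
    (b"\x93NUMPY", "npy"),    # followed by major and minor version bytes
]


def looks_like_safetensors(buf: bytes, max_header: int = 100 * 1024 * 1024) -> bool:
    """SafeTensors has no magic string: it starts with a little-endian uint64
    giving the JSON header length, followed by the JSON header itself."""
    if len(buf) < 9:
        return False
    (header_len,) = struct.unpack_from("<Q", buf, 0)
    # max_header is an assumed sanity bound for this sketch.
    return 2 <= header_len <= max_header and buf[8:9] == b"{"


def classify(buf: bytes):
    """Return a guessed extension for the start of a file, or None."""
    for magic, ext in MAGICS:
        if buf.startswith(magic):
            return ext
    if looks_like_safetensors(buf):
        return "safetensors"
    return None
```

The SafeTensors check is the weakest of the five because it depends on a plausible length field rather than a fixed signature, so a real module would likely want to validate more of the JSON header before accepting a match.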

Test plan

  • CI passes on both Linux and Windows runners
  • testdisk /dev/sdX on an ext4 drive no longer shows "Support for this filesystem wasn't enabled during compilation"
  • PhotoRec recovers .gguf, .safetensors, .npy, .pt files from test image
  • photorec.sig artifact is generated and includes AI/ML signatures
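The last item in the test plan could be checked automatically rather than by eye. A minimal sketch, assuming each non-comment line of photorec.sig starts with the extension followed by whitespace, and that `#` starts a comment; `extensions_in_sig` and `missing_extensions` are hypothetical helpers, not part of this PR:

```python
def extensions_in_sig(sig_text: str) -> set:
    """Collect the first field (the extension) of every signature line."""
    exts = set()
    for line in sig_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and assumed '#' comments
        exts.add(line.split(None, 1)[0].lower())
    return exts


def missing_extensions(sig_text: str,
                       wanted=("ggml", "gguf", "npy", "pt", "safetensors")):
    """Return the wanted extensions that have no signature line, sorted."""
    return sorted(set(wanted) - extensions_in_sig(sig_text))
```

A CI step could read the generated file, call `missing_extensions`, and fail the job if the returned list is non-empty.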

🤖 Generated with Claude Code

johndpope and others added 3 commits May 8, 2026 19:24
- .github/workflows/build-release.yml: GitHub Actions CI building Linux and
  Windows binaries, cloning digipres/digipres.github.io to get the latest
  TrID XML definitions, generating photorec.sig, and publishing all three as
  release assets on tag pushes (v*).
- trid_to_photorec.py: Python converter from TrID XML definitions to PhotoRec
  photorec.sig format (39,523 unique signatures generated from 26,369 defs).
- photorec.sig: pre-generated signature file from current TrID dataset.
- src/file_ggml.c: PhotoRec module for GGML/GGMF/GGJT LLM model formats.
- src/file_gguf.c: PhotoRec module for GGUF LLM model format (v1-3).
- src/file_npy.c: PhotoRec module for NumPy .npy array format (v1-3).
- src/file_pt.c: PhotoRec module for PyTorch .pt pickle-based models.
- src/file_safetensors.c: PhotoRec module for HuggingFace SafeTensors.
- autogen.sh: use 'mkdir -p config' so re-runs don't fail when config/ exists
- build-release.yml: add gettext and autopoint to apt deps so AM_ICONV m4
  macro is available during autoreconf
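To make the conversion step concrete, here is a rough sketch of turning one TrID definition into photorec.sig lines. It is not the `trid_to_photorec.py` from this PR. It assumes a TrID definition stores the extension in `Info/Ext` (slash-separated when there are several) and its leading byte patterns in `FrontBlock/Pattern` with `Bytes` (hex) and `Pos` children, and that a photorec.sig line has the form `extension offset 0xHEX`. The function name and the `max_len` cap are hypothetical.

```python
import xml.etree.ElementTree as ET


def trid_to_sig_lines(xml_text: str, max_len: int = 64):
    """Emit one 'ext offset 0xhex' line per (extension, front pattern) pair."""
    root = ET.fromstring(xml_text)
    ext_field = (root.findtext("Info/Ext") or "").strip()
    exts = [e.lower() for e in ext_field.split("/") if e]
    lines = []
    for pat in root.findall("FrontBlock/Pattern"):
        hex_bytes = (pat.findtext("Bytes") or "").strip().lower()
        pos_text = (pat.findtext("Pos") or "0").strip()
        # Skip patterns that are empty, odd-length, or have a non-numeric offset.
        if not hex_bytes or len(hex_bytes) % 2 or not pos_text.isdigit():
            continue
        for ext in exts:
            line = f"{ext} {int(pos_text)} 0x{hex_bytes[:max_len * 2]}"
            if line not in lines:  # drop exact duplicates
                lines.append(line)
    return lines


# Hand-written sample in the assumed TrID layout (not taken from the dataset).
SAMPLE = """<TrID ver="2.00">
  <Info><FileType>GGUF model</FileType><Ext>GGUF</Ext></Info>
  <FrontBlock>
    <Pattern><Bytes>47475546</Bytes><Pos>0</Pos></Pattern>
  </FrontBlock>
</TrID>"""

print(trid_to_sig_lines(SAMPLE))  # ['gguf 0 0x47475546']
```

Emitting one line per extension and pattern pair is one way a converter can end up with more signatures than definitions, though the actual script may choose patterns differently.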
…s packaging

- Linux: add e2fsprogs + libext2fs-dev so configure detects ext2fs and
  compiles in ext4 filesystem support (was silently disabled)
- Windows: replace MSYS2 7z (not installed) with PowerShell
  Compress-Archive which is always available on Windows runners

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
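Since ext4 support was dropped without any build failure, a guard in CI might help catch a regression. A minimal sketch, assuming configure records the library check in `config.h` as an autoconf-style `#define HAVE_LIBEXT2FS 1`; the macro name is an assumption here and is passed as a parameter so it can be adjusted, and `macro_enabled` is a hypothetical helper:

```python
import re


def macro_enabled(config_h_text: str, macro: str = "HAVE_LIBEXT2FS") -> bool:
    """True if config.h contains an active '#define MACRO 1' line.

    Autoconf writes '/* #undef MACRO */' when a check fails, which this
    pattern does not match because the line must start with '#'.
    """
    pattern = rf"^\s*#\s*define\s+{re.escape(macro)}\s+1\b"
    return re.search(pattern, config_h_text, flags=re.MULTILINE) is not None
```

Running this against the generated `config.h` after `./configure` and failing the job when it returns `False` would turn the silent omission into a visible error.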
@johndpope
Author

this has 6500 file types

johndpope and others added 2 commits May 9, 2026 08:10
- Generate photorec.sig from fresh TrID definitions before packaging
  (was generated after, so tarballs got the stale repo copy)
- Include photorec.sig in both Linux tarball and Windows zip so
  PhotoRec finds it at ./photorec.sig when run from the extracted dir
- Windows uses the checked-out photorec.sig (avoids needing Python on runner)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>