
PatchRequest/Proteus


Proteus

LLVM IR obfuscation toolkit. Takes LLVM bitcode or IR from any language (C, C++, Rust, Go, etc.), applies obfuscation passes, and outputs transformed IR that compiles to functionally identical but heavily obfuscated binaries.

Named after the shape-shifting Greek sea god -- your code transforms beyond recognition.

Ships as both a standalone CLI tool (proteus) and an LLVM pass plugin (libproteus.dylib on macOS, libproteus.so on Linux) that integrates directly into compiler pipelines.

Quick Start

1. Build

cmake -B build -DLLVM_DIR=$(llvm-config --cmakedir)
cmake --build build

2. Mark functions to protect

// Only functions you annotate get obfuscated.
// Everything else compiles normally with zero overhead.
#include <stddef.h>
#include <stdint.h>

__attribute__((annotate("proteus")))
void decrypt_payload(uint8_t *buf, size_t len, const uint8_t *key) {
    // XOR each byte with a repeating 16-byte key
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % 16];
}

__attribute__((annotate("proteus")))
int verify_license(const char *key) {
    // ... license check logic
    return 0;  // placeholder so the example compiles
}

void fast_network_loop(void) {
    // not annotated -- untouched, zero overhead
}

3. Compile

Pass plugin (recommended):

# One step -- annotated functions get obfuscated, everything else untouched
COBRA_ANNOTATE_ONLY=1 clang -fpass-plugin=build/libproteus.dylib -O0 program.c -o program

# C++
COBRA_ANNOTATE_ONLY=1 clang++ -fpass-plugin=build/libproteus.dylib -O0 program.cpp -o program

Standalone tool:

clang -emit-llvm -c program.c -o program.bc
./build/proteus program.bc -o program_obf.bc --passes all --annotate-only --seed 42
clang -O0 program_obf.bc -o program

Note: Compile obfuscated IR with -O0. LLVM's optimizer conflicts with heavily obfuscated IR, and at higher optimization levels it can simplify away the inserted transformations. The -O0 in the commands above is intentional.

Obfuscate everything (no annotations)

If you want to obfuscate every function (e.g. a small standalone tool where everything is sensitive), omit --annotate-only on the CLI, or leave COBRA_ANNOTATE_ONLY unset for the plugin:

# CLI
proteus input.bc -o output.bc --passes all --seed 42

# Plugin
clang -fpass-plugin=build/libproteus.dylib -O0 program.c -o program

Performance Impact

Obfuscation adds runtime overhead. Use --annotate-only to limit it to sensitive functions.

Configuration                                   Slowdown   Binary Size   Use Case
CFF only                                        ~3x        ~1x           Light protection, minimal impact
Selective passes (CFF + MBA + string-encrypt)   ~5-8x      ~1.5x         Good balance for most use cases
All 16 passes                                   ~18x       ~2x           Maximum protection
All passes, --iterations 2                      ~1200x     ~4x+          Don't do this

Bottom line: Proteus is well-suited for C2 agents, license checks, anti-cheat modules, DRM stubs, and anything where protecting the logic matters more than raw speed. It is not meant for hot loops, real-time systems, or high-throughput data processing. Use --annotate-only to protect only what matters.

Obfuscation Passes

16 passes across 3 categories:

Arithmetic & Data

Pass                Description
insn-substitution   Replace arithmetic with semantically equivalent alternatives (add -> sub(a, neg(b)), xor -> (a|b) & ~(a&b))
mba                 Mixed Boolean-Arithmetic -- replace operations with opaque expressions (a+b -> (a^b) + 2*(a&b))
constant-unfold     Replace compile-time constants with runtime computations (42 -> 6*7+0)
string-encrypt      XOR-encrypt string constants, decrypt inline at runtime with per-string keys
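
The identities quoted in the table above can be checked at the source level. The following is a minimal C sketch, not Proteus code: the function names and the check_identities helper are hypothetical, and the real passes rewrite LLVM IR instructions rather than C expressions. Unsigned 32-bit arithmetic is used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* insn-substitution style: add -> sub(a, neg(b)) */
static uint32_t add_via_sub(uint32_t a, uint32_t b) {
    return a - (0u - b);
}

/* insn-substitution style: xor -> (a|b) & ~(a&b) */
static uint32_t xor_via_and_or(uint32_t a, uint32_t b) {
    return (a | b) & ~(a & b);
}

/* mba style: a+b -> (a^b) + 2*(a&b) */
static uint32_t add_via_mba(uint32_t a, uint32_t b) {
    return (a ^ b) + 2u * (a & b);
}

/* Hypothetical helper: compares each rewrite with the plain operator
   on a handful of edge-case values, including wraparound cases. */
static void check_identities(void) {
    const uint32_t samples[] = {0u, 1u, 42u, 0x7fffffffu, 0x80000000u, 0xffffffffu};
    const int n = (int)(sizeof samples / sizeof samples[0]);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            uint32_t a = samples[i], b = samples[j];
            assert(add_via_sub(a, b) == a + b);
            assert(xor_via_and_or(a, b) == (a ^ b));
            assert(add_via_mba(a, b) == a + b);
        }
    }
}
```

Calling check_identities() from a test program exercises all three rewrites. Each one computes the same result as the original operator while hiding the simple add or xor from a reader of the disassembly.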

Control Flow

Pass             Description
cff              Control flow flattening -- replace structured control flow with a dispatcher loop. Randomly selects between three strategies: switch, if-else chain, or XOR-keyed lookup table
bogus-cf         Insert fake conditional branches using opaque predicates (always-true conditions that look dynamic)
bogus-loops      Insert do-while(false) back-edges that fool decompilers into showing fake loops
dead-code        Insert opaque predicate guards leading to unreachable blocks filled with plausible-looking junk code
junk-insertion   Scatter volatile dead stores and loads through every basic block
reg-pressure     Flood the register allocator with volatile spills, making decompiler output noisy
insn-shuffle     Randomly reorder adjacent independent instructions within basic blocks
ret-obfuscate    Compute return values through XOR chains that cancel out but obscure the value
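
To make the cff entry concrete, here is a hand-written C sketch of what flattening with a switch dispatcher looks like. It is an illustration under stated assumptions, not Proteus output: the function names, the state numbering, and the check_flattening helper are hypothetical, and the real pass works on LLVM basic blocks and may pick the if-else or lookup-table strategy instead.

```c
#include <assert.h>

/* Original structured control flow: sum of 0..n-1. */
static int sum_structured(int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Flattened version: every basic block becomes a case, and a state
   variable chosen by the dispatcher decides which block runs next. */
static int sum_flattened(int n) {
    int sum = 0, i = 0;
    int state = 0;                        /* entry block */
    for (;;) {                            /* dispatcher loop */
        switch (state) {
        case 0:                           /* loop condition */
            state = (i < n) ? 1 : 2;
            break;
        case 1:                           /* loop body */
            sum += i;
            i++;
            state = 0;
            break;
        case 2:                           /* exit block */
            return sum;
        default:                          /* unreachable in this sketch */
            return sum;
        }
    }
}

/* Hypothetical helper: both versions must agree for every input. */
static void check_flattening(void) {
    for (int n = 0; n <= 10; n++)
        assert(sum_flattened(n) == sum_structured(n));
}
```

The flattened function returns the same values, but its control flow graph is a single loop around a switch, so the original loop structure is no longer visible.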

Structural

Pass               Description
func-merge-split   Merge pairs of functions with matching signatures into a single dispatcher function with a selector parameter
indirect-branch    Replace direct function calls with loads from a global function pointer table
symbol-strip       Rename internal functions and globals to random hex names, strip all basic block and SSA value names
anti-tamper        Insert runtime integrity checks at function entry that call abort() if tampered with
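
The effect of indirect-branch and func-merge-split can be sketched in C as follows. This is a source-level illustration only: all names (add_one, twice, call_table, merged, check_structural) are hypothetical, and the table layout and selector encoding that Proteus actually emits are not documented here.

```c
#include <assert.h>

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

typedef int (*unary_fn)(int);

/* indirect-branch style: calls go through a global pointer table.
   volatile forces a real load, so the callee is not visible as a
   direct call target at the call site. */
static unary_fn volatile call_table[2] = { add_one, twice };

/* Equivalent to twice(add_one(x)), but with both calls indirect. */
static int call_indirect(int x) {
    return call_table[1](call_table[0](x));
}

/* func-merge-split style: two functions with the same signature
   folded into one body, picked by an extra selector argument. */
static int merged(int selector, int x) {
    return selector == 0 ? x + 1 : x * 2;
}

/* Hypothetical helper: the rewritten forms match the originals. */
static void check_structural(void) {
    for (int x = -5; x <= 5; x++) {
        assert(call_indirect(x) == twice(add_one(x)));
        assert(merged(0, x) == add_one(x));
        assert(merged(1, x) == twice(x));
    }
}
```

In both cases the call graph that a disassembler recovers no longer maps one-to-one onto the original functions.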

CLI Reference

proteus [OPTIONS] <input.bc|input.ll>

Options:
  -o <file>            Output file (default: stdout)
  --passes <list>      Comma-separated pass names or 'all' (default: all)
  --exclude <list>     Comma-separated passes to skip
  --seed <N>           RNG seed for reproducibility (default: random)
  --iterations <N>     Run the full pipeline N times (default: 1)
  --annotate-only      Only obfuscate __attribute__((annotate("proteus"))) functions
  --emit-ll            Output human-readable .ll instead of bitcode
  --print-stats        Print before/after module statistics
  --verbose            Print per-pass activity
  --version            Print version

Examples

# Protect only annotated functions (recommended)
proteus input.bc -o output.bc --passes all --annotate-only --seed 42

# Apply all passes to everything
proteus input.bc -o output.bc --passes all --seed 42

# Only CFF and string encryption
proteus input.bc -o output.bc --passes cff,string-encrypt --annotate-only

# Everything except junk insertion
proteus input.bc -o output.bc --passes all --exclude junk-insertion --annotate-only

# Inspect the obfuscated IR
proteus input.bc -o output.ll --emit-ll --passes cff --seed 42

# See statistics
proteus input.bc -o output.bc --passes all --print-stats

Pass Plugin Reference

The plugin is configured via environment variables. The COBRA_ prefix matches the cobra directory name used in the source tree:

Variable              Default   Description
COBRA_ANNOTATE_ONLY   0         Set to 1 to only obfuscate annotated functions
COBRA_SEED            random    RNG seed
COBRA_PASSES          all       Comma-separated pass list
COBRA_EXCLUDE         (none)    Passes to skip
COBRA_ITERATIONS      1         Pipeline iterations
COBRA_VERBOSE         0         Set to 1 for verbose output

# Recommended: only annotated functions
COBRA_ANNOTATE_ONLY=1 clang -fpass-plugin=libproteus.dylib -O0 foo.c -o foo

# With specific seed
COBRA_ANNOTATE_ONLY=1 COBRA_SEED=42 clang -fpass-plugin=libproteus.dylib -O0 foo.c -o foo

# Selective passes on annotated functions
COBRA_ANNOTATE_ONLY=1 COBRA_PASSES=cff,mba,string-encrypt \
    clang -fpass-plugin=libproteus.dylib -O0 foo.c -o foo

# See what's happening
COBRA_ANNOTATE_ONLY=1 COBRA_VERBOSE=1 clang -fpass-plugin=libproteus.dylib -O0 foo.c -o foo 2>&1
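
For readers wondering how settings like these are typically consumed, the sketch below reads a few of the variables with getenv and applies the defaults from the table. It is an assumption-laden illustration, not the code in Plugin.cpp: the plugin_config struct and both function names are hypothetical, and only the variable names and default values come from this README.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical config holder; field names are made up for this sketch. */
struct plugin_config {
    int annotate_only;    /* COBRA_ANNOTATE_ONLY, default 0 */
    int verbose;          /* COBRA_VERBOSE, default 0 */
    long iterations;      /* COBRA_ITERATIONS, default 1 */
    const char *passes;   /* COBRA_PASSES, default "all" */
};

/* A flag counts as enabled only when the variable is exactly "1". */
static int env_flag(const char *name) {
    const char *value = getenv(name);
    return value != NULL && strcmp(value, "1") == 0;
}

/* Reads the environment once and falls back to the documented defaults. */
static struct plugin_config read_config(void) {
    struct plugin_config cfg;
    const char *iterations = getenv("COBRA_ITERATIONS");
    const char *passes = getenv("COBRA_PASSES");

    cfg.annotate_only = env_flag("COBRA_ANNOTATE_ONLY");
    cfg.verbose = env_flag("COBRA_VERBOSE");
    cfg.iterations = iterations != NULL ? strtol(iterations, NULL, 10) : 1;
    cfg.passes = passes != NULL ? passes : "all";
    return cfg;
}
```

Because the values come from the process environment, they must be set on the same command line as clang (or exported beforehand), as in the examples above.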

Annotation Reference

Mark functions with __attribute__((annotate("proteus"))) to select them for obfuscation:

// C
__attribute__((annotate("proteus")))
void my_secret_function() { ... }

// With a macro for convenience
#define PROTEUS __attribute__((annotate("proteus")))

PROTEUS void decrypt(uint8_t *data, size_t len) { ... }
PROTEUS int check_license(const char *key) { ... }
void normal_function() { ... }  // untouched

// C++
#define PROTEUS __attribute__((annotate("proteus")))

class Crypto {
    PROTEUS void encrypt(Buffer &buf);  // obfuscated
    void getSize();                     // untouched
};

When --annotate-only is active:

  • Annotated functions: All enabled passes are applied
  • Unannotated functions: Skipped by every function-level pass, so their bodies pass through untouched (zero overhead from those passes)
  • Module passes (string-encrypt, func-merge-split, indirect-branch, symbol-strip): Still operate on the whole module, so their effects can reach unannotated code

Pass Ordering

Passes run in this order within each iteration:

1.  string-encrypt        (module -- encrypt before other passes obscure the IR)
2.  constant-unfold       (function -- expand constants before further transforms)
3.  insn-substitution     (function)
4.  mba                   (function)
5.  insn-shuffle          (function -- reorder before CFG transforms)
6.  reg-pressure          (function -- spill slots before block manipulation)
7.  bogus-cf              (function)
8.  dead-code             (function)
9.  junk-insertion        (function)
10. bogus-loops           (function -- fake loops after all block insertion)
11. cff                   (function -- flatten the already-obfuscated control flow)
12. ret-obfuscate         (function -- XOR chains on returns)
13. anti-tamper           (function -- integrity checks on the final obfuscated code)
14. func-merge-split      (module)
15. indirect-branch       (module)
16. symbol-strip          (module -- rename everything last)

With --iterations 2, the entire sequence runs twice for layered obfuscation.

Building

Requirements

  • LLVM 17+ (tested with LLVM 20)
  • CMake 3.20+
  • C++17 compiler

macOS (Homebrew)

brew install llvm cmake
cmake -B build -DLLVM_DIR=$(brew --prefix llvm)/lib/cmake/llvm
cmake --build build

Produces:

  • build/proteus -- standalone CLI tool
  • build/libproteus.dylib -- LLVM pass plugin

Linux

apt install llvm-20-dev cmake
cmake -B build -DLLVM_DIR=/usr/lib/llvm-20/lib/cmake/llvm
cmake --build build

Produces build/proteus and build/libproteus.so.

Language Support

Language            Standalone Tool                                                 Pass Plugin
C                   Full support                                                    Full support (clang -fpass-plugin=...)
C++                 Full support                                                    Full support (clang++ -fpass-plugin=...)
Rust                Emit BC with rustc --emit=llvm-bc, obfuscate, link with clang   Requires building against rustc's LLVM version
Any LLVM language   Full support via BC/IR                                          Depends on compiler's plugin support

Rust Workflow (standalone tool)

rustc --emit=llvm-bc main.rs -o main.bc
proteus main.bc -o main_obf.bc --passes all --seed 42
clang -O0 main_obf.bc -o main    # linking with clang alone works for #![no_std] programs

Testing

The suite contains 350 end-to-end tests. They cover 13 C programs and 3 C++ programs, each of the 16 passes individually, all passes combined, multiple seeds, multiple iterations, pairwise pass combinations, and stress tests.

# Run the full E2E test suite (350 tests)
./test/e2e/run_e2e.sh

# Quick smoke test
clang -emit-llvm -c test/e2e/arith.c -o /tmp/arith.bc
./build/proteus /tmp/arith.bc -o /tmp/arith_obf.bc --passes all --seed 42
clang -O0 /tmp/arith_obf.bc -o /tmp/arith_obf -lm
/tmp/arith_obf   # should print: add=13 sub=7 xor=9

Architecture

proteus/
├── include/cobra/
│   ├── CobraConfig.h       # Configuration + annotation checking
│   ├── Passes.h            # All pass class declarations
│   ├── PassPipeline.h      # Pipeline runner interface
│   ├── RNG.h               # Seeded PRNG
│   ├── OpaquePredicates.h  # Shared opaque predicate builder
│   └── Stats.h             # Module statistics
├── src/
│   ├── main.cpp            # Standalone CLI entry point
│   ├── Plugin.cpp          # LLVM pass plugin entry point
│   ├── CobraConfig.cpp     # Annotation checking implementation
│   ├── PassPipeline.cpp    # Pipeline construction and execution
│   ├── Stats.cpp           # Statistics implementation
│   ├── passes/             # One file per obfuscation pass (16 passes)
│   └── utils/
│       └── OpaquePredicates.cpp
└── test/
    ├── passes/             # Per-pass .ll tests
    └── e2e/                # End-to-end correctness tests (C, C++)

License

MIT
