Reverse Engineering

Polymorphic Bytecode: Why the Dumped Instruction Stream Lies

Polymorphic bytecode is virtual-machine code that rewrites its own instructions at runtime before executing them, so the statically dumped instruction stream is intentionally misleading. A naive dump shows NOPs where real operations belong, familiar opcodes sitting next to unknown LOAD_* variants, and control flow that jumps around incoherently, because the program assembles itself during execution. The key realization when devirtualizing a stacked virtualizer, where an outer deserialization VM unpacks the program for the real VM, is that you do not need to reverse the deserialization VM at all: both VMs run on the same handler, so you can treat the outer VM as a black box and intercept the fully decoded IR right before the real VM consumes it. This follows birk.blog's Lua Virtualization Part 5, the payoff of the Luraph series.

Quick facts

Definition: Bytecode that patches its own instructions at runtime before they execute
Why dumps lie: Static stream is full of NOPs and decoy jumps; real ops are written in at runtime
Shortcut: Intercept deserialized_execution_data before the real VM and skip reversing the outer VM
Recovery: Replay only the patch/mutation opcodes to resolve the true instruction stream
Automation: Semi-automated: find offsets and the interception point manually, then script dump/lift/replay

Intercept instead of reverse

Reversing a deserialization VM instruction-by-instruction is tedious and, it turns out, unnecessary. Because the same handler (h_funcs["VM"]) runs both the deserialization VM and the real VM, the value deserialized_execution_data - the fully unpacked IR, already through every decryption and transformation - exists in plain form for an instant between them. You insert a hook at that exact point and dump the entire IR to disk (e.g. JSON) without understanding how the outer VM produced it. The per-instruction fields (op, A, B, C, dec_const, func_proto, const) come straight out, and even a glance is revealing: strings like "print" and "Hello World!" sit in plain view, so you can already guess the original program. The "magic" offsets into the IR table differ per sample and must be resolved by hand.
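
Once the IR is on disk, a first pass can simply list every plain string it contains. The following is a minimal sketch, assuming the dump is a JSON list of per-instruction records that use the field names above; the exact layout, the sample records, and the numeric op values are hypothetical and will differ per sample.

```python
import json

# Hypothetical dump layout: one JSON object per instruction, using the field
# names from the text. The numeric "op" values here are placeholders.
SAMPLE_DUMP = json.dumps([
    {"op": 7, "A": 1, "B": 0, "C": 0, "func_proto": "print"},
    {"op": 0, "A": 0, "B": 0, "C": 148, "dec_const": "Hello World!"},
    {"op": 12, "A": 0, "B": 1, "C": 0},
])

# Fields that may carry readable strings in the intercepted IR.
STRING_FIELDS = ("dec_const", "func_proto", "const")


def plain_strings(dump_text):
    """Return (index, field, value) for every string-valued field in the dump."""
    found = []
    for index, inst in enumerate(json.loads(dump_text)):
        for field in STRING_FIELDS:
            value = inst.get(field)
            if isinstance(value, str):
                found.append((index, field, value))
    return found


for index, field, value in plain_strings(SAMPLE_DUMP):
    print(f"[{index}] {field} = {value!r}")
```

This only reads the dumped data; nothing from the sample is executed.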

Why the dump is a decoy

Mapping each numeric op to its lifted opcode produces something that looks scrambled: NOPs where real instructions should be, OP_GETGLOBAL next to unknown LOAD_* variants, jumps bouncing around. That is deliberate - the stream is not meant to be read statically. Stepping through execution shows the trick: the VM loads its own instruction table onto the stack and patches entries. Instruction 16 begins life as a NOP; an earlier instruction writes 45 into Insts[16] (turning it into LOAD_DECRYPTED_STRING) and another writes 2 into its C operand - so by the time execution reaches instruction 16 it has become a string load. Only after these runtime mutations does the true control flow emerge, and the program resolves cleanly to print("Hello World!").
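
The lifting step itself is a table lookup. Here is a rough sketch, assuming each dumped instruction is a dict with op, A, B and C; only the id 45 for LOAD_DECRYPTED_STRING comes from the text, and every other id in the map is a placeholder that would have to be recovered per sample.

```python
# Hypothetical opcode map. Only 45 -> LOAD_DECRYPTED_STRING is taken from the
# text; the remaining ids are illustrative placeholders.
OPCODE_NAMES = {
    0: "NOP",
    7: "OP_GETGLOBAL",
    12: "OP_CALL",
    45: "LOAD_DECRYPTED_STRING",
}


def lift(insts):
    """Render each dumped instruction as '[idx] MNEMONIC A B C'."""
    lines = []
    for index, inst in enumerate(insts):
        # Unmapped ids stay visible instead of being silently dropped.
        name = OPCODE_NAMES.get(inst["op"], f"UNKNOWN_{inst['op']}")
        lines.append(f"[{index:2d}] {name} {inst['A']} {inst['B']} {inst['C']}")
    return lines


static_dump = [
    {"op": 7, "A": 1, "B": 0, "C": 0},
    {"op": 0, "A": 0, "B": 0, "C": 148},
    {"op": 12, "A": 0, "B": 1, "C": 0},
]
print("\n".join(lift(static_dump)))
```

Lifting the static dump this way still shows the NOP in the middle, which is exactly the decoy effect described above: the listing is faithful to the file, not to what runs.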

Recovering it, and the limits of automation

The correct way to defeat self-modification is to replay only the mutation opcodes: implement handlers for the patch instructions and simulate execution of the dumped IR so you apply the rewrites without running any untrusted program logic, recovering the fully resolved stream. Full automation is possible in theory but non-trivial: a pipeline must identify deserialized_execution_data, find the interception point, resolve the per-sample magic offsets, map opcode ids, and resolve the self-modifying behaviour. Opcode recovery can be partly automated by pattern-matching against compiled luac output, but locating the interception point and offsets resists automation due to per-sample variability - so a semi-automated workflow (manual offsets, scripted dump/lift/replay) is the practical balance. Later versions move the constant pool out of the intercepted IR, so the instruction stream still recovers but constants need a separate solve. The same defensive lesson applies to browser anti-bot VMs: rather than chase a polymorphic client script, a managed API like Scrappey lets the script run as intended server-side and returns the result.
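
The replay idea can be sketched in a few lines. This is a simplified illustration, not the series' implementation: the two mutation opcode ids (100 and 101), their operand layout (A is the target index, B the new value), and the straight-line walk are all assumptions. A real replay would need the sample's actual patch opcodes and would have to follow the order in which the VM reaches them.

```python
import copy

# Hypothetical ids for the two kinds of patch instruction.
PATCH_OP, PATCH_C = 100, 101


def patch_op(insts, inst):
    """Overwrite the opcode of the target instruction (A = target, B = new op)."""
    insts[inst["A"]]["op"] = inst["B"]


def patch_c(insts, inst):
    """Overwrite the C operand of the target instruction (A = target, B = new C)."""
    insts[inst["A"]]["C"] = inst["B"]


MUTATION_HANDLERS = {PATCH_OP: patch_op, PATCH_C: patch_c}


def replay_mutations(dumped):
    """Apply only the patch instructions; every other opcode is skipped."""
    insts = copy.deepcopy(dumped)  # leave the original dump untouched
    pc = 0
    while pc < len(insts):
        inst = insts[pc]  # read at visit time, so earlier patches are honoured
        handler = MUTATION_HANDLERS.get(inst["op"])
        if handler is not None:
            handler(insts, inst)
        pc += 1
    return insts


dumped = [
    {"op": PATCH_C, "A": 2, "B": 2, "C": 0},
    {"op": PATCH_OP, "A": 2, "B": 45, "C": 0},
    {"op": 0, "A": 0, "B": 0, "C": 148},  # NOP in the static dump
]
resolved = replay_mutations(dumped)
print(resolved[2])
```

Because the handler table contains nothing but the mutation opcodes, no call, global lookup, or other program logic from the sample is ever simulated.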

Code example

# Dumped IR (decoy): NOPs and jumps where real ops should be
[15] OP_GETGLOBAL 1 0 0   ; func_proto = "print"
[16] NOP          0 0 148  ; dec_const = "Hello World!"
[17] OP_CALL      0 1 0

# Stepping execution shows instruction 16 PATCHES ITSELF before it runs:
[ 9] Stk[3] = Insts          -- load the instruction table onto the stack
[10] Insts[16] = 45          -- rewrite op 16 -> LOAD_DECRYPTED_STRING
[ 8] REG_C[16] = 2           -- patch op 16's C operand
[16] Stk[2] = "Hello World!" -- now a real string load
[17] Stk[1](Stk[2])          -- print("Hello World!")

# Recover by REPLAYING only the patch opcodes - never the program logic.

Frequently asked questions

What is polymorphic or self-modifying bytecode?

It is VM bytecode that rewrites its own instructions at runtime before executing them. The instruction stream you dump statically is a decoy, full of NOPs and misleading jumps; the real operations are written in by earlier "patch" instructions during execution, so only the running program reveals the true logic.

How do you devirtualize self-modifying VM bytecode without running it?

You intercept the fully decoded IR at the point the outer VM hands it to the real VM, then replay only the mutation opcodes - the instructions that patch other instructions - in a simulator. That applies the runtime rewrites and resolves the true instruction stream without executing any of the untrusted program logic.

Can devirtualizing a virtualizer be fully automated?

Only partially. Opcode recovery can be automated by pattern-matching against compiled luac output, but identifying the IR, the interception point, and the per-sample magic offsets resists automation because they vary between samples. A semi-automated workflow - manual offsets, then scripted dumping, opcode mapping, and mutation replay - is the practical approach.

Last updated: 2026-04-21