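From Bytecode to ASM: A Source-Code Walkthrough of gadgetinspector's Deserialization Chain Discovery

Posted on 2026-4-11 06:10:07

In CTF competitions today, hunting for gadget chains in Java deserialization vulnerabilities mostly relies on static analysis tools such as codeql and tabby. Of these, tabby's bytecode-based analysis is the more precise. gadgetinspector is an earlier automated gadget chain discovery tool that analyzes bytecode with the ASM framework. It sees little use in real attack and defense work, but its design is well worth studying, and community feature additions and forks have improved its accuracy. This article reads through a forked version of gadgetinspector to examine how the author uses ASM to process bytecode and track taint flow for automated analysis.

Project address of the forked GadgetInspector: https://github.com/threedr3am/gadgetinspector

Bytecode Analysis

We use the following simple Java class as the running example: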

package com.y1zh3e7.Test;

public class ClassTest {
    public static void main(String[] args) {
        String sayHello = "Hello World!";
    }
}

Compile it and open the resulting ClassTest.class in a hex editor to see the raw bytes:

CA FE BA BE 00 00 00 34 00 18 0A 00 04 00 14 08 00 15 07 00 16 07 00 17 01 00 06 3C 69 6E 69 74 3E 01 00 03 28 29 56 01 00 04 43 6F 64 65 01 00 0F 4C 69 6E 65 4E 75 6D 62 65 72 54 61 62 6C 65 01 00 12 4C 6F 63 61 6C 56 61 72 69 61 62 6C 65 54 61 62 6C 65 01 00 04 74 68 69 73 01 00 1C 4C 63 6F 6D 2F 79 31 7A 68 33 65 37 2F 54 65 73 74 2F 43 6C 61 73 73 54 65 73 74 3B 01 00 04 6D 61 69 6E 01 00 16 28 5B 4C 6A 61 76 61 2F 6C 61 6E 67 2F 53 74 72 69 6E 67 3B 29 56 01 00 04 61 72 67 73 01 00 13 5B 4C 6A 61 76 61 2F 6C 61 6E 67 2F 53 74 72 69 6E 67 3B 01 00 08 73 61 79 48 65 6C 6C 6F 01 00 12 4C 6A 61 76 61 2F 6C 61 6E 67 2F 53 74 72 69 6E 67 3B 01 00 0A 53 6F 75 72 63 65 46 69 6C 65 01 00 0E 43 6C 61 73 73 54 65 73 74 2E 6A 61 76 61 0C 00 05 00 06 01 00 0C 48 65 6C 6C 6F 20 57 6F 72 6C 64 21 01 00 1A 63 6F 6D 2F 79 31 7A 68 33 65 37 2F 54 65 73 74 2F 43 6C 61 73 73 54 65 73 74 01 00 10 6A 61 76 61 2F 6C 61 6E 67 2F 4F 62 6A 65 63 74 00 21 00 03 00 04 00 00 00 00 00 02 00 01 00 05 00 06 00 01 00 07 00 00 00 2F 00 01 00 01 00 00 00 05 2A B7 00 01 B1 00 00 00 02 00 08 00 00 00 06 00 01 00 00 00 03 00 09 00 00 00 0C 00 01 00 00 00 05 00 0A 00 0B 00 00 00 09 00 0C 00 0D 00 01 00 07 00 00 00 3C 00 01 00 02 00 00 00 04 12 02 4C B1 00 00 00 02 00 08 00 00 00 0A 00 02 00 00 00 05 00 03 00 06 00 09 00 00 00 16 00 02 00 00 00 04 00 0E 00 0F 00 00 00 03 00 01 00 10 00 11 00 01 00 01 00 12 00 00 00 02 00 13

Per the JVM specification, a class file follows a strict structure:

ClassFile {
    u4             magic;
    u2             minor_version;
    u2             major_version;
    u2             constant_pool_count;
    cp_info        constant_pool[constant_pool_count-1];
    u2             access_flags;
    u2             this_class;
    u2             super_class;
    u2             interfaces_count;
    u2             interfaces[interfaces_count];
    u2             fields_count;
    field_info     fields[fields_count];
    u2             methods_count;
    method_info    methods[methods_count];
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
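As a minimal sketch of how the first fields of this structure line up with the hex dump, the following reads magic, minor_version, major_version and constant_pool_count from the first ten bytes. The class and method names (ClassHeaderSketch, readHeader) are made up for illustration and are not part of gadgetinspector or ASM.

```java
import java.nio.ByteBuffer;

final class ClassHeaderSketch {
    /** Returns {minor_version, major_version, constant_pool_count}. */
    static int[] readHeader(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();                 // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF;      // u2 minor_version
        int major = buf.getShort() & 0xFFFF;      // u2 major_version
        int cpCount = buf.getShort() & 0xFFFF;    // u2 constant_pool_count
        return new int[] {minor, major, cpCount};
    }

    public static void main(String[] args) {
        // First ten bytes of the ClassTest.class dump: CA FE BA BE 00 00 00 34 00 18
        byte[] head = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x34, 0, 0x18};
        int[] h = readHeader(head);
        System.out.println("minor=" + h[0] + " major=" + h[1] + " constant_pool_count=" + h[2]);
    }
}
```

For the example file this gives major version 52 (0x34, the Java 8 class format) and a constant_pool_count of 24, meaning the pool holds entries #1 through #23.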

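Constant Pool

The constant pool is the core of a class file. It stores symbolic references such as string literals, class names, method names and field names. Each entry starts with a tag byte that identifies its type, for example:

0A: CONSTANT_Methodref (reference to a method)
08: CONSTANT_String (string literal)
07: CONSTANT_Class (symbolic reference to a class or interface)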
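To make the tag-driven layout concrete, here is a small sketch that decodes the first five constant pool entries of the example file (the bytes right after constant_pool_count). It handles only the three tags listed above plus CONSTANT_Utf8 (tag 01); every other tag is rejected. ConstantPoolSketch and describe are illustrative names, not an existing API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolSketch {
    /** Decodes the given number of consecutive cp_info entries from the start of pool. */
    static List<String> describe(byte[] pool, int entries) {
        ByteBuffer buf = ByteBuffer.wrap(pool); // big-endian, like the class file
        List<String> out = new ArrayList<>();
        for (int i = 0; i < entries; i++) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: { // CONSTANT_Utf8: u2 length, then that many bytes
                    byte[] raw = new byte[buf.getShort() & 0xFFFF];
                    buf.get(raw);
                    // Modified UTF-8 equals plain UTF-8 for the ASCII names used here
                    out.add("Utf8 " + new String(raw, StandardCharsets.UTF_8));
                    break;
                }
                case 7: // CONSTANT_Class: u2 name_index
                    out.add("Class #" + (buf.getShort() & 0xFFFF));
                    break;
                case 8: // CONSTANT_String: u2 string_index
                    out.add("String #" + (buf.getShort() & 0xFFFF));
                    break;
                case 10: { // CONSTANT_Methodref: u2 class_index, u2 name_and_type_index
                    int classIndex = buf.getShort() & 0xFFFF;
                    int nameAndTypeIndex = buf.getShort() & 0xFFFF;
                    out.add("Methodref #" + classIndex + ".#" + nameAndTypeIndex);
                    break;
                }
                default:
                    throw new IllegalArgumentException("tag not handled in this sketch: " + tag);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] firstEntries = {
            0x0A, 0x00, 0x04, 0x00, 0x14,                        // #1 Methodref
            0x08, 0x00, 0x15,                                    // #2 String
            0x07, 0x00, 0x16,                                    // #3 Class
            0x07, 0x00, 0x17,                                    // #4 Class
            0x01, 0x00, 0x06, 0x3C, 0x69, 0x6E, 0x69, 0x74, 0x3E // #5 Utf8 "<init>"
        };
        describe(firstEntries, 5).forEach(System.out::println);
    }
}
```

Because entries have different sizes, the pool can only be read sequentially: the tag decides how many bytes to consume before the next entry begins.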

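Access Flags

access_flags is a 16-bit mask describing the access permissions and properties of the class or interface. For example:

ACC_PUBLIC (0x0001): public class, accessible from outside its package.
ACC_FINAL (0x0010): final class, cannot be subclassed.
ACC_SUPER (0x0020): tells the invokespecial instruction to treat superclass method calls specially (almost all modern compilers set this flag).
ACC_INTERFACE (0x0200): marks an interface.
ACC_ABSTRACT (0x0400): abstract class, cannot be instantiated.
ACC_SYNTHETIC (0x1000): synthetic, not present in the source code.
ACC_ANNOTATION (0x2000): annotation type.
ACC_ENUM (0x4000): enum type.

Figure: class access and property modifier flags in Java bytecode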
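Since access_flags is a bit mask, decoding it is a matter of testing each bit. The sketch below does this for the eight flags listed above; in the example dump the access_flags value is 00 21, the two bytes that follow the constant pool. AccessFlagsSketch is an illustrative name only.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsSketch {
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    /** Returns the names of all class-level flags set in accessFlags, lowest bit first. */
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // 0x0021 is the access_flags value of ClassTest.class
        System.out.println(decode(0x0021)); // prints [ACC_PUBLIC, ACC_SUPER]
    }
}
```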

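Method Table (Methods)

The methods table stores the structure of every method in the class, constructors included. Each method contains:

access_flags: the method's access flags (such as ACC_PUBLIC, ACC_PRIVATE).
name_index: an index into the constant pool giving the method name.
descriptor_index: an index into the constant pool giving the method descriptor (parameter types and return type).
attributes_count and attributes[]: the attribute table. The most important attribute is Code, which holds the method's bytecode instructions.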

Code Attribute

The Code attribute is the bytecode implementation of the method body. Its structure is:

Code_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 max_stack;
    u2 max_locals;
    u4 code_length;
    u1 code[code_length];
    u2 exception_table_length;
    {   u2 start_pc;
        u2 end_pc;
        u2 handler_pc;
        u2 catch_type;
    } exception_table[exception_table_length];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}

max_stack: maximum depth of the operand stack.
max_locals: number of local variable slots the method needs.
code[]: the actual array of bytecode instructions.
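To show what code[] holds, the sketch below disassembles the two code arrays of the example class. The constructor's code is 2A B7 00 01 B1 and the main method's code is 12 02 4C B1 (its code_length field is 00 00 00 04). Only the five opcodes that occur in this example are handled; CodeBytesSketch is an illustrative name.

```java
import java.util.ArrayList;
import java.util.List;

final class CodeBytesSketch {
    /** Turns a code[] array into "pc: mnemonic" lines for the opcodes used in ClassTest. */
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x12: // ldc: one-byte constant pool index follows
                    out.add(pc + ": ldc #" + (code[pc + 1] & 0xFF));
                    pc += 2;
                    break;
                case 0x2A: // aload_0: push local variable 0 (this)
                    out.add(pc + ": aload_0");
                    pc += 1;
                    break;
                case 0x4C: // astore_1: pop a reference into local variable 1
                    out.add(pc + ": astore_1");
                    pc += 1;
                    break;
                case 0xB1: // return from a void method
                    out.add(pc + ": return");
                    pc += 1;
                    break;
                case 0xB7: { // invokespecial: two-byte constant pool index follows
                    int index = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                    out.add(pc + ": invokespecial #" + index);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + opcode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(disassemble(new byte[] {0x2A, (byte) 0xB7, 0x00, 0x01, (byte) 0xB1}));
        System.out.println(disassemble(new byte[] {0x12, 0x02, 0x4C, (byte) 0xB1}));
    }
}
```

In main, ldc #2 pushes the String constant at pool entry #2 ("Hello World!") and astore_1 stores it into the local variable sayHello.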

SourceFile Attribute

The SourceFile attribute records the name of the source file the class was compiled from, which is useful for debugging and decompiling.

Its structure is:

SourceFile_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 sourcefile_index;
}

In our example:

0x00 00 00 02: attribute_length, the attribute body is 2 bytes long.
0x00 13: sourcefile_index, pointing at constant pool entry 19 (0x13), whose content is ClassTest.java.

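gadgetInspector Analysis

0x01 Tool Overview

gadgetInspector is built on the ASM framework. Working at the bytecode level, it reads every class on the classpath of the supplied jar or war and records, in turn, class information, method information and the call relationships between them. It then mines this data for potential deserialization gadget chains. Its core components are:

GadgetInspector: the main entry point; handles configuration setup and data preparation.
MethodDiscovery: discovers class and method data together with the inheritance relationships (subclasses and superclasses).
PassthroughDiscovery: analyzes which method parameters can directly influence the return value and collects these "passthrough" relationships.
CallGraphDiscovery: records how arguments flow between each caller and its target, building the full call graph.
SourceDiscovery: searches for entry-point methods (sources), i.e. methods with particular characteristics such as a readObject implementation.
GadgetChainDiscovery: combines all of the above and reports a gadget chain when a call chain ends in a known dangerous operation (sink).

0x02 Main Entry Point - GadgetInspector

The GadgetInspector class is the starting point of the tool. It mainly sets up configuration, for example creating the files that store results in a static initializer. The main method first checks the command-line arguments, then configures logging, manages the historical data files (.dat files that cache class, method and other information), and selects the deserialization type to analyze (JDK native, Jackson, and so on) along with the classpath.

The part to focus on is how the deserialization type is chosen: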

else if (arg.equals("--config")) {
    // The --config argument selects which deserialization type to analyze
    argIter.next();
    configName = argIter.next();

    switch (configName) {
        case "jdk":
            config = new JdkDeserializationConfig();
            break;
        case "jackson":
            config = new JacksonDeserializationConfig();
            break;
        default:
            throw new IllegalArgumentException("Unknown config name: " + configName);
    }

    // Create a dedicated output directory for the current config
    outputDir = Paths.get("out_" + config.getName());
    Files.createDirectories(outputDir);

    // Initialize the discovery components
    serializableDecider = config.getSerializableDecider(classpath);
    inheritanceMap = config.getInheritanceMap(classpath);
    implementationFinder = config.getImplementationFinder(serializableDecider, inheritanceMap);
    sourceDiscovery = config.getSourceDiscovery();
    slinkDiscovery = config.getSlinkDiscovery();
}

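0x03 Configuration - the GIConfig Interface

The GIConfig interface defines the configuration strategy for each deserialization scenario. It consists of factory methods that return the scenario-specific discovery components: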

public interface GIConfig {
    String getName(); // called as config.getName() above and overridden by each implementation
    SerializableDecider getSerializableDecider(Collection<String> classpath);
    InheritanceMap getInheritanceMap(Collection<String> classpath);
    ImplementationFinder getImplementationFinder(SerializableDecider serializableDecider, InheritanceMap inheritanceMap);
    SourceDiscovery getSourceDiscovery();
    SlinkDiscovery getSlinkDiscovery();
}

Take JacksonDeserializationConfig, which implements GIConfig, as an example (the excerpt does not show its getInheritanceMap implementation):

public class JacksonDeserializationConfig implements GIConfig {

    @Override
    public String getName() {
        return "jackson";
    }

    @Override
    public SerializableDecider getSerializableDecider(Collection<String> classpath) {
        return new JacksonSerializableDecider();
    }

    @Override
    public ImplementationFinder getImplementationFinder(SerializableDecider serializableDecider, InheritanceMap inheritanceMap) {
        return new JacksonImplementationFinder((JacksonSerializableDecider) serializableDecider);
    }

    @Override
    public SourceDiscovery getSourceDiscovery() {
        return new JacksonSourceDiscovery();
    }

    @Override
    public SlinkDiscovery getSlinkDiscovery() {
        return new JacksonSlinkDiscovery();
    }
}

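The later analysis stages call these methods to decide which classes and methods are in scope for Jackson deserialization.

0x04 Method Discovery - MethodDiscovery

MethodDiscovery scans every class file and builds the basic class and method database.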

public void discover(final ClassResourceEnumerator classResourceEnumerator) throws Exception {
    for (ClassResourceEnumerator.ClassResource classResource : classResourceEnumerator.getAllClasses()) {
        try (InputStream in = classResource.getInputStream()) {
            ClassReader cr = new ClassReader(in);
            try {
                // Use ASM's ClassVisitor and MethodVisitor (visitor pattern) to scan and record every class and method
                cr.accept(new MethodDiscoveryClassVisitor(), ClassReader.EXPAND_FRAMES);
            } catch (Exception e) {
                LOGGER.error("Exception analyzing: " + classResource.getName(), e);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

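ASM and the Visitor Pattern

ASM is designed around the visitor pattern, which lets you operate flexibly on the elements of a class (the class itself, its methods, its fields) without modifying the class structure. Think of a tour guide (ClassReader) leading a group of tourists (ClassVisitor and its subclasses) through a resort (the Java class file). The resort has a fixed list of sights (the class structure), and each tourist decides what to do at each one (record it, analyze it, and so on).

The key is ClassReader.accept(). It walks the class file in the order defined by the JVM class file structure and calls the matching visitXxx() method on the ClassVisitor at each step. By extending ClassVisitor and overriding those methods, we can run our own logic during the traversal.

At the core of MethodDiscoveryClassVisitor is a map, methodCalls, which records which other methods each method calls.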
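The division of labor can be illustrated without ASM itself. In the toy sketch below, ToyClassReader owns the traversal order and ToyClassVisitor supplies the callbacks; both are invented stand-ins for ClassReader and ClassVisitor, with far fewer callbacks than the real classes. The collector also shows one common way to remember the enclosing class name: store it in visit() for use in later callbacks.

```java
import java.util.ArrayList;
import java.util.List;

final class VisitorSketch {
    /** Callbacks fired by the reader. The defaults do nothing, so subclasses override only what they need. */
    static class ToyClassVisitor {
        void visit(String className) {}
        void visitMethod(String name, String desc) {}
        void visitEnd() {}
    }

    /** Stand-in for ClassReader: knows the structure and the order, nothing about the analysis. */
    static final class ToyClassReader {
        private final String className;
        private final String[][] methods; // {name, descriptor} pairs

        ToyClassReader(String className, String[][] methods) {
            this.className = className;
            this.methods = methods;
        }

        void accept(ToyClassVisitor visitor) {
            visitor.visit(className);             // class header first
            for (String[] m : methods) {
                visitor.visitMethod(m[0], m[1]);  // then each method in file order
            }
            visitor.visitEnd();                   // end of class last
        }
    }

    /** One concrete "tourist": records every method it is shown. */
    static final class MethodCollector extends ToyClassVisitor {
        final List<String> seen = new ArrayList<>();
        private String owner;

        @Override
        void visit(String className) {
            owner = className; // remembered for the visitMethod calls that follow
        }

        @Override
        void visitMethod(String name, String desc) {
            seen.add(owner + "." + name + desc);
        }
    }

    static List<String> collect() {
        ToyClassReader reader = new ToyClassReader("com/y1zh3e7/Test/ClassTest", new String[][] {
            {"<init>", "()V"},
            {"main", "([Ljava/lang/String;)V"}
        });
        MethodCollector collector = new MethodCollector();
        reader.accept(collector);
        return collector.seen;
    }

    public static void main(String[] args) {
        collect().forEach(System.out::println);
    }
}
```

Swapping in a different visitor subclass changes the analysis without touching the traversal code, which is the property gadgetinspector relies on when it runs several discovery passes over the same classes.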

// private final Map<MethodReference.Handle, Set<MethodReference.Handle>> methodCalls = new HashMap<>();

@Override
public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
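    // A new method was found: create an empty HashSet for the methods it calls
    this.calledMethods = new HashSet<>();
    // Map (current class, method name, descriptor) to that empty call set.
    // `owner` is assumed to be the current class name saved earlier in visit(...); the excerpt does not show it.
    methodCalls.put(new MethodReference.Handle(new ClassReference.Handle(owner), name, desc), calledMethods);
    // Return a MethodVisitor that goes on to analyze the bytecode inside this method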
    return super.visitMethod(access, name, desc, signature, exceptions);
}

visitMethodInsn is the key method. ASM invokes it on the MethodVisitor whenever it reads a method invocation instruction (INVOKEVIRTUAL, INVOKESPECIAL, and so on):

@Override
public void visitMethodInsn(int opcode, String owner, String name, String desc, boolean itf) {
    // Add the callee (uniquely identified by owner, name, desc) to the call set of the method being analyzed
    calledMethods.add(new MethodReference.Handle(new ClassReference.Handle(owner), name, desc));
    super.visitMethodInsn(opcode, owner, name, desc, itf);
}

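This is how MethodDiscovery builds a preliminary call graph.

0x05 Call Graph Construction and Analysis

With the preliminary call relationships in hand, CallGraphDiscovery analyzes them in more depth, in particular dealing with the complexity that polymorphism introduces.

Polymorphism and Implementation Lookup

Because of Java polymorphism, when an interface or abstract class method is called, gadgetInspector cannot tell during static analysis which implementation will actually run. ImplementationFinder therefore finds every possible implementing class.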

public class JacksonImplementationFinder implements ImplementationFinder {

    private final SerializableDecider serializableDecider;

    public JacksonImplementationFinder(SerializableDecider serializableDecider) {
        this.serializableDecider = serializableDecider;
    }

    @Override
    public Set<MethodReference.Handle> getImplementations(MethodReference.Handle target) {
        Set<MethodReference.Handle> allImpls = new HashSet<>();

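        // 1. Get the class that declares the target method
        ClassReference.Handle targetClass = target.getClassReference();
        // 2. Look up all implementing classes and subclasses of that class
        //    (inheritanceMap is assumed to be available to this class; the excerpt does not declare it)
        Set<ClassReference.Handle> subClasses = inheritanceMap.getSubClasses(targetClass);
        // 3. For each subclass, build a handle for a method with the same name and descriptor
        for (ClassReference.Handle subClass : subClasses) {
            MethodReference.Handle implMethod = new MethodReference.Handle(subClass, target.getName(), target.getDesc());
            // 4. Keep it only if SerializableDecider says the subclass is serializable, one condition for a usable gadget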
            if (serializableDecider.isSerializable(subClass)) {
                allImpls.add(implMethod);
            }
        }
        return allImpls;
    }
}
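The subclass lookup in step 2 can be sketched independently of the project. Assuming the inheritance data is a map from each class to its direct superclass and interfaces, the code below collects every class that reaches a given target through that map. SubtypeSketch, subtypesOf and the Shape/Circle hierarchy are invented for illustration; this is not the project's InheritanceMap API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class SubtypeSketch {
    /** Returns every class that has target as a direct or indirect supertype, sorted by name. */
    static Set<String> subtypesOf(String target, Map<String, List<String>> directSupers) {
        Set<String> result = new TreeSet<>();
        for (String cls : directSupers.keySet()) {
            if (!cls.equals(target) && reaches(cls, target, directSupers, new HashSet<>())) {
                result.add(cls);
            }
        }
        return result;
    }

    /** Walks upward from cls through superclasses and interfaces looking for target. */
    private static boolean reaches(String cls, String target,
                                   Map<String, List<String>> directSupers, Set<String> seen) {
        if (cls.equals(target)) {
            return true;
        }
        if (!seen.add(cls)) {
            return false; // already explored on this walk
        }
        for (String parent : directSupers.getOrDefault(cls, Collections.emptyList())) {
            if (reaches(parent, target, directSupers, seen)) {
                return true;
            }
        }
        return false;
    }

    /** A made-up hierarchy: UnitCircle extends Circle, Circle implements Shape. */
    static Map<String, List<String>> demoHierarchy() {
        Map<String, List<String>> supers = new LinkedHashMap<>();
        supers.put("Shape", Arrays.asList("java/lang/Object"));
        supers.put("Circle", Arrays.asList("java/lang/Object", "Shape"));
        supers.put("UnitCircle", Arrays.asList("Circle"));
        supers.put("Unrelated", Arrays.asList("java/lang/Object"));
        return supers;
    }

    public static void main(String[] args) {
        System.out.println(subtypesOf("Shape", demoHierarchy())); // prints [Circle, UnitCircle]
    }
}
```

A call to Shape.area() would then be expanded into candidate calls on Circle and UnitCircle, which is why over-approximation (and false positives) is inherent in this style of analysis.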

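Reverse Topological Sort

To make sure that every method a given method calls has already been analyzed by the time that method is analyzed, PassthroughDiscovery sorts the method call graph in reverse topological order.

List<MethodReference.Handle> sortedMethods = topologicallySortMethodCalls();

The sort is a depth-first search (DFS). It tracks the nodes on the current search path so that cycles in the call graph (recursion, for example) do not cause endless looping, and it emits each method only after its callees.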
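A minimal sketch of such a sort, using plain strings as method names, looks like this. It is written from the description above rather than copied from the project, so the names (TopoSortSketch, sort, demoGraph) and details are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TopoSortSketch {
    /** Orders methods so that callees come before callers, ignoring edges that close a cycle. */
    static List<String> sort(Map<String, List<String>> calls) {
        List<String> sorted = new ArrayList<>();
        Set<String> done = new HashSet<>();   // fully processed methods
        Set<String> onPath = new HashSet<>(); // methods on the current DFS path
        for (String root : calls.keySet()) {
            visit(root, calls, done, onPath, sorted);
        }
        return sorted;
    }

    private static void visit(String method, Map<String, List<String>> calls,
                              Set<String> done, Set<String> onPath, List<String> sorted) {
        if (done.contains(method) || !onPath.add(method)) {
            return; // already emitted, or a back edge into the current path (a cycle)
        }
        for (String callee : calls.getOrDefault(method, Collections.emptyList())) {
            visit(callee, calls, done, onPath, sorted);
        }
        onPath.remove(method);
        done.add(method);
        sorted.add(method); // emitted only after all reachable callees
    }

    /** a calls b and c, b calls c, and c calls back into a (a cycle). */
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> calls = new LinkedHashMap<>();
        calls.put("a", Arrays.asList("b", "c"));
        calls.put("b", Arrays.asList("c"));
        calls.put("c", Arrays.asList("a"));
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(sort(demoGraph())); // prints [c, b, a]
    }
}
```

The back edge from c to a is simply skipped, so when c is analyzed the result for a is not yet known. That is an accepted imprecision of this ordering: cycles cannot be fully resolved in a single pass.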

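Taint Flow Analysis (Passthrough Discovery)

PassthroughDiscovery analyzes the bytecode inside each method and traces how parameters (the taint) flow into the return value. This also relies on ASM's MethodVisitor.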

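import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.Type;
import org.objectweb.asm.commons.AdviceAdapter;

import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Excerpt: the taint helpers used below (setLocalTaint, getLocalTaint, setStackTaint,
// getStackTaint, isFieldTransient) are not shown in this listing.
public class PassthroughDiscoveryMethodVisitor extends AdviceAdapter {

    private final Map<MethodReference.Handle, Set<Passthrough>> passthroughDataflow;
    private final MethodReference.Handle method;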

    public PassthroughDiscoveryMethodVisitor(MethodVisitor mv, int access, String name, String desc, Map<MethodReference.Handle, Set<Passthrough>> passthroughDataflow, MethodReference.Handle method) {
        super(ASM9, mv, access, name, desc);
        this.passthroughDataflow = passthroughDataflow;
        this.method = method;
    }

    @Override
    public void visitCode() {
        super.visitCode();
        int localIndex = 0;
        int argIndex = 0;
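        // methodAccess and methodDesc are the protected fields that AdviceAdapter exposes
        if ((methodAccess & Opcodes.ACC_STATIC) == 0) {
            // Instance method: local variable slot 0 holds 'this', treated as argument 0
            setLocalTaint(localIndex, argIndex);
            localIndex += 1;
            argIndex += 1;
        }
        for (Type argType : Type.getArgumentTypes(methodDesc)) {
            // Map each parameter's local variable slot to its argument index
            setLocalTaint(localIndex, argIndex);
            localIndex += argType.getSize(); // long and double occupy two slots
            argIndex++;
        }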
        }
    }

    @Override
    public void visitVarInsn(int opcode, int var) {
        // Handle LOAD/STORE instructions: propagate taint between local variables and the operand stack
        if (opcode >= ILOAD && opcode <= ALOAD) {
            Set<String> taint = getLocalTaint(var);
            setStackTaint(0, taint);
        } else if (opcode >= ISTORE && opcode <= ASTORE) {
            Set<String> taint = getStackTaint(0);
            setLocalTaint(var, taint);
        }
        super.visitVarInsn(opcode, var);
    }

    @Override
    public void visitFieldInsn(int opcode, String owner, String name, String desc) {
        Set<String> newTaint = new HashSet<>();
        if (opcode == GETFIELD) {
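            // GETFIELD reads a field from the object on top of the stack.
            // First check whether the field is transient; if it is, taint is not propagated
            Boolean isTransient = isFieldTransient(owner, name);
            if (!Boolean.TRUE.equals(isTransient)) {
                // Not transient: the field of the stack-top object (e.g. arg0.name for arg0) becomes the new taint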
                for (String s : getStackTaint(0)) {
                    newTaint.add(s + "." + name);
                }
            }
        }
        super.visitFieldInsn(opcode, owner, name, desc);
        setStackTaint(0, newTaint);
    }

    @Override
    public void visitInsn(int opcode) {
        if (opcode == ARETURN || opcode == IRETURN || opcode == LRETURN || opcode == DRETURN || opcode == FRETURN) {
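            // On a value-returning instruction, work out where the returned value's taint came from
            Set<String> returnTaint = getStackTaint(0);
            for (String taint : returnTaint) {
                // A taint label is "arg0" or "arg0.field1.field2": the source argument, then an optional field path.
                // Split at the FIRST dot so that the whole field path is kept together.
                int dotIndex = taint.indexOf('.');
                // "arg" is 3 characters long, so the argument index starts at offset 3 ("arg0" -> 0)
                int srcArgIndex = Integer.parseInt(taint.substring(3, dotIndex == -1 ? taint.length() : dotIndex));
                String srcArgPath = dotIndex == -1 ? "" : taint.substring(dotIndex + 1);
                // Record the passthrough: the return value of this method depends on argument srcArgIndex via srcArgPath
                passthroughDataflow.computeIfAbsent(method, k -> new HashSet<>()).add(new Passthrough(srcArgIndex, srcArgPath));
            }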
        }
        super.visitInsn(opcode);
    }
}
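The visitor above is easier to follow with a stripped-down model of what it simulates. The sketch below keeps a taint set per local variable and per operand stack entry and interprets a handful of instructions written as {mnemonic, operand} pairs. It is a toy under stated assumptions: the instruction encoding, TaintSketch, returnTaint and sourceArg are all invented, and only reference-typed instructions are covered.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TaintSketch {
    /** Simulates the instructions and returns the taint labels on the returned value. */
    static Set<String> returnTaint(String[][] insns, int argCount) {
        Map<Integer, Set<String>> locals = new HashMap<>();
        for (int i = 0; i < argCount; i++) {
            Set<String> label = new TreeSet<>();
            label.add("arg" + i); // each argument slot starts out tainted by itself
            locals.put(i, label);
        }
        Deque<Set<String>> stack = new ArrayDeque<>();
        for (String[] insn : insns) {
            switch (insn[0]) {
                case "aload": { // copy a local's taint onto the stack
                    Set<String> taint = locals.get(Integer.parseInt(insn[1]));
                    stack.push(taint == null ? new TreeSet<String>() : new TreeSet<>(taint));
                    break;
                }
                case "astore": // move the stack-top taint into a local
                    locals.put(Integer.parseInt(insn[1]), stack.pop());
                    break;
                case "ldc": // constants carry no taint
                    stack.push(new TreeSet<String>());
                    break;
                case "getfield": { // a field read extends each label with the field name
                    Set<String> derived = new TreeSet<>();
                    for (String label : stack.pop()) {
                        derived.add(label + "." + insn[1]);
                    }
                    stack.push(derived);
                    break;
                }
                case "areturn": // the stack top is the return value
                    return stack.pop();
                default:
                    throw new IllegalArgumentException("not handled in this sketch: " + insn[0]);
            }
        }
        return Collections.emptySet();
    }

    /** Extracts the argument index from a label such as "arg0" or "arg0.a.b". */
    static int sourceArg(String label) {
        int dot = label.indexOf('.');
        return Integer.parseInt(label.substring(3, dot == -1 ? label.length() : dot));
    }

    public static void main(String[] args) {
        // Models an instance getter: String getName() { return this.name; }
        String[][] getter = {{"aload", "0"}, {"getfield", "name"}, {"areturn", ""}};
        System.out.println(returnTaint(getter, 1)); // prints [arg0.name]
    }
}
```

For the getter, the return value carries the label arg0.name, so the method would be recorded as passing argument 0 through to its return value. A method that returns a constant ends with an empty taint set and produces no passthrough entry.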

With this machinery, gadgetInspector can trace how data moves along method call chains and assemble the results into candidate deserialization gadget chains.


Java, ASM, bytecode, deserialization, CTF