JS Obfuscation Techniques and AST-Based Deobfuscation

2025-05-03


Reference article: https://juejin.cn/post/7019155025666506788

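Constant obfuscation

Obfuscating the constants in the code (strings, booleans, numbers and so on) is the most basic form of JS obfuscation. It can be combined with transforms that introduce more constants into the code, which widens the reach of constant obfuscation.

String constant obfuscation

Obfuscate string constants with an encoding or encryption function. Avoid built-in functions such as atob where possible and prefer your own implementation: built-ins are easy to hook, and a hook leads an analyst straight to the interesting values.

ASCII code obfuscation

ASCII-related APIs

// Character to ASCII code
<string>.charCodeAt(<index>)  // returns the code of the character at the given index as a decimal integer

// ASCII codes to string
String.fromCharCode(int, int, ...)  // builds a string from the given character codes
String.fromCharCode.apply(null, [int, int, ...]) // apply passes the codes as an array; no `this` object is needed

// Reaching String through a constructor for further obfuscation
"a"["constructor"]["fromCharCode"]()  // the constructor of a string value is function String
window["String"]["fromCharCode"]()  // reaches function String through window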

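Converting a string constant into its escaped ASCII form lowers readability; for example, \x27 stands for the character '. The conversion code follows; note the escaping of the backslash.

function str2ascii(str) {
  let res = ''
  // A string has no forEach of its own, so borrow Array.prototype.forEach via call
  ;[].forEach.call(  // the line starts with `[`, so a leading semicolon is required
    // c is the current character; pad to two hex digits so codes below 0x10 stay valid
    str, c => {res += "\\x" + c.charCodeAt(0).toString(16).padStart(2, '0')}
  )
  return res
}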

Unicode escape obfuscation

For example, the string aaa becomes the Unicode-escaped string '\u0061\u0061\u0061'. Compared with \x escapes, this covers a wider range of characters (Chinese included).

function str2unicode(str) {
  let res = '';
  for (let i = 0; i < str.length; i++) {
    res += "\\u" + str.charCodeAt(i).toString(16).padStart(4, '0');  // pad to 4 hex digits
  }
  return res
}
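The two encoders can be checked with a round trip. The sketch below is only an illustration: `encodeEscapes` and `decodeEscapes` are hypothetical names, not part of the original article, and the decoder handles only the `\xNN` and `\uNNNN` forms discussed here.

```javascript
// Hypothetical helper: emits \xNN for codes below 0x100 and \uNNNN otherwise.
function encodeEscapes(str) {
  let res = '';
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    res += code < 0x100
      ? '\\x' + code.toString(16).padStart(2, '0')
      : '\\u' + code.toString(16).padStart(4, '0');
  }
  return res;
}

// Hypothetical helper: turns the textual escapes back into characters.
function decodeEscapes(escaped) {
  return escaped.replace(
    /\\x([0-9a-f]{2})|\\u([0-9a-f]{4})/gi,
    (_, x, u) => String.fromCharCode(parseInt(x || u, 16))
  );
}

const encoded = encodeEscapes("it's 中文");
const decoded = decodeEscapes(encoded);
console.log(encoded);
console.log(decoded);
```

Decoding the text form by hand like this is mainly useful outside an AST tool; inside Babel the same effect comes from regenerating the literal from its value.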

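Identifiers may also be written with Unicode escapes. As noted earlier, dot-notation property access accepts Chinese property names; likewise, a property name in dot notation can be spelled with Unicode escapes whether or not it is Chinese. This is a form of identifier obfuscation. For example:

a = {aaa: 1}
a.aaa  // 1
a.\u0061\u0061\u0061  // 1
\u0061.\u0061\u0061\u0061  // 1

Deobfuscating ASCII / Unicode escapes

This pass also undoes the radix encoding of numeric constants. It visits number and string literals, tests each for the relevant pattern, and rewrites the ones that match.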

const simplifyLiteral = {
    // Visit every numeric literal; `{node}` destructures the node from the path argument
    NumericLiteral({node}) {
        // node.extra.raw holds the literal as written in the source, e.g. "0xff"
        // node.value holds its actual numeric value, e.g. 255
        // Test whether the raw text starts with "0o" (octal), "0b" (binary) or "0x" (hexadecimal)
        if (node.extra && /^0[obx]/i.test(node.extra.raw)) {
            node.extra = undefined
        }
    },
    // Visit every string literal
    StringLiteral({node}) {
        // Likewise, test whether the raw text contains \uXXXX or \xXX escape sequences
        if (node.extra && /\\[ux]/i.test(node.extra.raw)) {
            // With extra cleared, Babel regenerates the literal from node.value,
            // so "\x48\x65\x6c\x6c\x6f" is printed as "Hello"
            // (for non-ASCII text, the generator option jsescOption: {minimal: true} keeps characters unescaped)
            node.extra = undefined
        }
    }
}
traverse(ast, simplifyLiteral);

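Property access style

Changing how object properties are accessed introduces more string constants into the code, which gives string obfuscation more to work with.

The two styles: in JS, an object property can be accessed in two ways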

function People(name) {
  this.name = name
}
People.prototype.introduce = function() {
    console.log(`Hi, I'm ${this.name}.`)
}
var p = new People('Otto')

// Dot notation
// Chinese property names are supported, e.g. a = {'我': 1}  a.我 // 1
console.log(p.name); // Otto
p.introduce() // Hi, I'm Otto.
// Node structure
Node {
  type: 'MemberExpression',
  object: Node {...},  // the object that owns the property
  computed: false,  // false for dot notation
  property: Node {  // the property is an Identifier node
    type: 'Identifier',
    name: 'prop1'
  }
}

// Bracket notation (hash-table style access)
console.log(p['name']); // Otto
p['introduce'](); // Hi, I'm Otto.
// Node structure
Node {
  type: 'MemberExpression',
  object: Node {...},
  computed: true,  // true for bracket notation
  property: Node {  // the property is a StringLiteral node
    type: 'StringLiteral',
    extra: {...},
    value: 'prop2'
  }
}

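Rewriting the access style. Bracket notation clearly raises the number of strings in the code.

// The node structures show two things to change
// Change 1: computed - turns a.b into a[b]
// Change 2: replace the Identifier with a StringLiteral - turns a[b] into a['b']

// Visit member expressions
MemberExpression(path) {
  // Only dot-notation accesses: in an existing a[b] the identifier b is a variable and must be left alone
  if (!path.node.computed && t.isIdentifier(path.node.property)) {
    // Build a StringLiteral from the property name
    path.node.property = t.stringLiteral(path.node.property.name)
    path.node.computed = true  // switch to bracket notation
  }
}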

Deobfuscation code. Source: https://lzc6244.github.io/2021/08/02/Babel%E5%B0%86a-'bb'-%E8%BD%AC%E6%8D%A2%E4%B8%BAa.bb.html

MemberExpression(path) {
  const { computed } = path.node
  // Get the sub-path of the property
  const property = path.get('property')
  // Only convert names that are valid identifiers, so that a['b-c'] is left untouched
  if (computed && types.isStringLiteral(property.node) && types.isValidIdentifier(property.node.value)) {
    property.replaceWith(types.identifier(property.node.value))
    path.node.computed=false
  }
}

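Standard built-in objects

Access standard built-ins such as Date and Math through window (window['Date']). This raises the number of strings in the code, which helps string obfuscation, and it doubles as a check that a window object exists.

Identifier(path) {
  const name = path.node.name;
  // Skip property names, declarations and other positions that are not references
  if (!path.isReferencedIdentifier()) return;
  // Wrap both the list and the name in `|` so that a short name such as 'a' cannot match as a substring
  if ('|eval|parseInt|encodeURIComponent|Object|Function|Boolean|Number|Math|Date|String|RegExp|Array|'.indexOf(`|${name}|`) != -1) {
    path.replaceWith(  // replace with a member expression
      t.memberExpression(  // arguments: object, property (string, identifier, ...), computed
        t.identifier('window'),
        t.stringLiteral(name),
        true
      )
    );
  }
}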

Numeric constant obfuscation

Some numbers are recognizable on sight, for example:

The four magic numbers used as constants in MD5

A: 0x67452301
B: 0xefcdab89
C: 0x98badcfe
D: 0x10325476

A: 1732584193
B: 4023233417
C: 2562383102
D: 271733878

Magic numbers in SHA1

uint32_t H0 = 0x67452301;   // 0x01, 0x23, 0x45, 0x67
uint32_t H1 = 0xEFCDAB89;   // 0x89, 0xAB, 0xCD, 0xEF
uint32_t H2 = 0x98BADCFE;   // 0xFE, 0xDC, 0xBA, 0x98
uint32_t H3 = 0x10325476;   // 0x76, 0x54, 0x32, 0x10
uint32_t H4 = 0xC3D2E1F0;   // 0xF0, 0xE1, 0xD2, 0xC3

/* The first four values match the MD5 magic numbers
A: 1732584193
B: 4023233417
C: 2562383102
D: 271733878
E: 3285377520
*/

These signatures let an analyst locate the key logic quickly.
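As a rough illustration of how such signatures are spotted, the sketch below scans a source string for the constants listed above, in either hexadecimal or decimal form. `findMagicNumbers` and the label strings are hypothetical names chosen for this example, and a real scanner would need a far longer table.

```javascript
// Known initial values, keyed by numeric value (labels are illustrative only).
const MAGIC = new Map([
  [0x67452301, 'MD5 A / SHA1 H0'],
  [0xefcdab89, 'MD5 B / SHA1 H1'],
  [0x98badcfe, 'MD5 C / SHA1 H2'],
  [0x10325476, 'MD5 D / SHA1 H3'],
  [0xc3d2e1f0, 'SHA1 H4'],
]);

// Returns one entry per numeric token that matches a known constant.
function findMagicNumbers(source) {
  const hits = [];
  for (const match of source.matchAll(/\b(0x[0-9a-f]+|\d+)\b/gi)) {
    const value = Number(match[1]); // Number() parses both "0x..." and decimal text
    if (MAGIC.has(value)) {
      hits.push({ text: match[1], label: MAGIC.get(value) });
    }
  }
  return hits;
}

const hits = findMagicNumbers('var a = 1732584193, e = 0xC3D2E1F0, n = 7;');
console.log(hits);
```

A plain text search like this stops working as soon as the constants are split or re-encoded.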

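Numeric constants therefore need obfuscating as well. Common techniques:

- Radix conversion. The deobfuscation code for radix conversion is given in the ASCII / Unicode deobfuscation section above.
- Bitwise XOR. When $a \oplus b = c$, it follows that $b \oplus c = a$. An initial value in a cryptographic algorithm can therefore be stored in XORed form: since 0xC3D2E1F0 ^ 0x12345678 = 0xD1E6B788, the expression 0xD1E6B788 ^ 0x12345678 can stand in for the SHA1 magic number H4, 0xC3D2E1F0. Here 0x12345678 plays the role of a key. The idea resembles DES, where in modes such as CBC the IV, plaintext and key are XORed with one another before or after encryption, so for DES-like algorithms the XOR sites are good places to instrument. Note that JS bitwise operators work on signed 32-bit integers, so for values above 0x7FFFFFFF the XOR result comes out negative unless it is converted back with >>> 0.
- Obfuscation algorithm: visit each NumericLiteral, whose value is the numeric constant. Pick a random number as the secret and XOR the constant with it to get the payload. The NumericLiteral can then be replaced by a BinaryExpression that XORs payload with secret.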
```
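NumericLiteral(path) {
  const value = path.node.value;
  // ^ operates on signed 32-bit integers; skip fractions and out-of-range values, which would not round-trip
  if (!Number.isInteger(value) || value < 0 || value > 0x7fffffff) return;
  const secret = parseInt(Math.random() * (999999-100000) + 100000, 10);  // an integer in [100000, 999998]
  const payload = value ^ secret;
  path.replaceWith(  // replace with a binary expression
    t.binaryExpression(  // arguments: operator, left node (encrypted number), right node (key)
      '^',
      t.numericLiteral(payload),
      t.numericLiteral(secret)
    )
  )
  // The replacement contains new NumericLiteral nodes of the visited type; skip them to avoid an infinite loop
  path.skip()
}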
```
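  Tips: when to use path.skip()
  - After a node is replaced, the traverser treats the replacement as a fresh subtree and visits it from the top
  - When the replacement (or any of its children) contains a node type that the current visitor would match again, path.skip() is needed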
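The XOR identity can be checked without Babel. The sketch below uses hypothetical helpers `xorSplit` and `xorJoin`; it converts results with `>>> 0` because JS bitwise operators return signed 32-bit integers, which matters for constants such as the SHA1 value 0xC3D2E1F0.

```javascript
// Hypothetical helper: splits an unsigned 32-bit constant into payload and secret.
function xorSplit(value, secret) {
  if (!Number.isInteger(value) || value < 0 || value > 0xffffffff) {
    throw new RangeError('only unsigned 32-bit integers round-trip through ^');
  }
  const payload = (value ^ secret) >>> 0; // >>> 0 reinterprets the result as unsigned
  return { payload, secret, source: `(${payload} ^ ${secret}) >>> 0` };
}

// Hypothetical helper: recovers the original constant.
function xorJoin(payload, secret) {
  return (payload ^ secret) >>> 0;
}

const h4 = 0xC3D2E1F0;
const parts = xorSplit(h4, 0x12345678);
const restored = xorJoin(parts.payload, parts.secret);

console.log(parts.source);
console.log(restored === h4);
console.log((h4 ^ 0x12345678) < 0); // without >>> 0 the raw result is negative
```

An obfuscator that emits the `source` string in place of the literal would keep the program's value intact for the full unsigned 32-bit range.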

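Array obfuscation

Move the constants in the code into an array, and at each original constant site fetch the value from the array by index.

Example

var arr = ['atob', 'RGF0ZQ=='];  // the array may hold values of any type
var self = window;
var t = new self[ self[arr[0]](arr[1]) ]();  // new self[ self['atob']('RGF0ZQ==') ]()

Array rotation: to keep the indices from being usable directly during decryption, a rotation function is usually added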

// Rotate right
(function(arr, num) {
  var shuffer = function(nums) {
    while(--nums) {
      arr.unshift(arr.pop())
    }
  };
  shuffer(++num);
}(arr, 0x2));

// Rotate left (the semicolon above keeps this IIFE from being parsed as a call on the previous one)
(function(arr, num) {
  var shuffer = function(nums) {
    while(--nums) {
      arr.push(arr.shift())
    }
  };
  shuffer(++num)
})(arr, 0x2);
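The two rotations are inverses of each other, which is what lets the shipped code restore the array at run time. A minimal sketch, with hypothetical helper names `rotateLeft` and `rotateRight`:

```javascript
// Moves the first element to the end, `count` times.
function rotateLeft(list, count) {
  for (let i = 0; i < count; i++) list.push(list.shift());
  return list;
}

// Moves the last element to the front, `count` times.
function rotateRight(list, count) {
  for (let i = 0; i < count; i++) list.unshift(list.pop());
  return list;
}

const table = ['atob', 'RGF0ZQ==', 'split', '|'];
const shipped = rotateLeft([...table], 2);    // what the obfuscator would write into the file
const restoredTable = rotateRight([...shipped], 2); // what the rotation IIFE does at run time

console.log(shipped);
console.log(restoredTable);
```

In the IIFEs above, `shuffer(++num)` combined with `while(--nums)` performs exactly `num` rotations, so `0x2` corresponds to `count = 2` here.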

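Obfuscation code: move the constants into an array and reference them by index. Build an extra array, push each string literal into it, take the index of the pushed element, and replace the original site.

A string constant becomes atob(arr[0]) (a CallExpression whose argument is a MemberExpression)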

// The array
let arr = []

// Replace the constants
traverse(ast, {
  StringLiteral(path) {
    const b64str = btoa(path.node.value)  // base64-encode with btoa (btoa throws on characters outside Latin-1)
    let index
    // Is this value already in the array?
    if (arr.indexOf(b64str) == -1) {
      // No: add it
      const len = arr.push(b64str)  // push returns the new array length
      index = len - 1  // the new element is the last one
    } else {
      // Yes: reuse its index
      index = arr.indexOf(b64str)
    }
    path.replaceWith(  // replace with a CallExpression wrapping a MemberExpression
      t.callExpression(  // arguments: callee (the function), arguments (array of arguments)
        t.identifier('atob'),
        [t.memberExpression(  // build the member expression
          t.identifier('arr'),
          t.numericLiteral(index),
          true
        )]
      )
    )
  }
})

// Emit the array into the code
arr = arr.map(i => t.stringLiteral(i))  // build the string literal nodes of the array
arr = t.variableDeclarator(t.identifier('arr'), t.arrayExpression(arr))  // array initializer node
arr = t.variableDeclaration('var', [arr])  // array declaration node
ast.program.body.unshift(arr)  // place it at the very start of the code

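Array rotation. Rotate the array with pop + unshift or shift + push, so that an analyst cannot simply substitute values by index when restoring the code.

arr = arr.map(i => t.stringLiteral(i));  // semicolon required: the next statement starts with `(`
// Rotate the array left at build time - the parameter name must differ from the outer array's name
(function(ar, num) {
  var shuffer = function(nums) {
    while(--nums) {
      ar.push(ar.shift())
    }
  };
  shuffer(++num)
})(arr, 0x1);

// The emitted code rotates right to restore the array - again the parameter name must differ from the outer array's name
const arrShiftCode = `
(function(ar, num) {
  var shuffer = function(nums) {
    while(--nums) {
      ar.unshift(ar.pop())
    }
  }
  shuffer(++num)
})(arr, 0x1)
`
const shiftAst = parser.parse(arrShiftCode)  // parse the rotation code into an AST
ast.program.body.unshift(...shiftAst.program.body)   // prepend its statements (shiftAst itself is a File node)
// Then place the rotated array declaration above the rotation code
arr = t.variableDeclarator(t.identifier('arr'), t.arrayExpression(arr))
arr = t.variableDeclaration('var', [arr])
ast.program.body.unshift(arr)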

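Obfuscating the rotation code: array obfuscation cannot be applied to the rotation code itself (it has to run before the array is usable), so its source is obfuscated with character escapes instead

// Obfuscate only the rotation code
const shiftAst = parser.parse(arrShiftCode)  // parse the rotation code into an AST
traverse(shiftAst, {
  MemberExpression(path) {
    if (!path.node.computed && t.isIdentifier(path.node.property)) {
      const name = path.node.property.name;
      path.node.property = t.stringLiteral(str2ascii(name));  // bracket notation plus \x escapes; Unicode escapes work the same way
      path.node.computed = true
    }
  }
})
// `code` is the generated source string: its backslashes come out doubled (\\x), so turn them back into real \x escapes
code = code.replace(/\\\\x/g, '\\x')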

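OB obfuscation: array obfuscation is commonly paired with a decryption function, which is called with specific arguments to fetch constants from the array. An example follows

// The array is wrapped in a function
function o() {
  var I8 = ['UjVkF', 'fromCharCode', 'uKuSb', 'kClGo', 'lCntN', '3|5|4|2|1|0', 'oEyoT', 'yZwGd', 'c', ...];
  o = function() {
    return I8;
  }
  ;
  return o();
}
// Decryption function
function B(A, s) {
    var m = o(); // m receives the initial string array returned by o()
    B = function(Q, G) { // key point: B redefines itself here, a common obfuscation trick
        Q = Q - 0x18a; // apply an offset to the incoming index
        var c = m[Q]; // fetch the value at that index from m
        return c;
    }
    ;
    return B(A, s); // calls B again, but this time the redefined B
}
// Rotate left
(function(A, s) {   // the array function and the loop's stop value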
    var mb = {
        A: 0x22c,
        s: 0x337
    };
    var ox = B;  // the function that fetches an element by offset index
    var m = A();  // the original array
    while (!![]) {
        try {
            var Q = -parseInt(ox(0x1b2)) / 0x1 + parseInt(ox(mb.A)) / 0x2 + parseInt(ox(0x27b)) / 0x3 + -parseInt(ox(0x1db)) / 0x4 + parseInt(ox(mb.s)) / 0x5 + parseInt(ox(0x268)) / 0x6 + parseInt(ox(0x32d)) / 0x7 * (-parseInt(ox(0x2d0)) / 0x8);
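            if (Q === s) {  // the checksum matches: the array has reached its target rotation
                break;  // stop rotating
            } else {
                m['push'](m['shift']());
            }
        } catch (G) {
            m['push'](m['shift']());  // if evaluating the condition (the parseInt calls) throws, rotate anyway
        }
    }
}(o, 0x7fdfa));  // the array function and the loop's stop value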
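The same three-part layout can be reproduced at toy scale to see how the pieces interact. Everything in this sketch is made up for illustration: the names `getStrings` and `decode`, the four-entry table, the offset 0x10 and the stop value 84 are assumptions, not values from a real obfuscator.

```javascript
// Part 1: the string table, wrapped in a self-replacing function.
function getStrings() {
  const table = ['log', 'hello', '42abc', 'world'];
  getStrings = function () { return table; }; // later calls return the same array
  return getStrings();
}

// Part 2: the decoder, which subtracts an offset and redefines itself on first use.
function decode(index) {
  const table = getStrings();
  decode = function (i) { return table[i - 0x10]; };
  return decode(index);
}

// Part 3: rotate the table until a checksum built from decoded entries matches.
(function (getTable, target) {
  const table = getTable();
  while (true) {
    try {
      const check = parseInt(decode(0x10)) * 2; // NaN until '42abc' is first
      if (check === target) break;
      table.push(table.shift());
    } catch (e) {
      table.push(table.shift());
    }
  }
})(getStrings, 84);

const word = decode(0x12);
console.log(word);
```

After the loop the table is rotated so that '42abc' sits at index 0, which is why index 0x12 resolves to 'log' rather than to the third entry of the original table.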

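Removing OB obfuscation: this example has a single global decryption function. The first three statements of the code are the array declaration, the array rotation and the decryption function. Approach: eval the decryption function together with everything it depends on, so that the restoration script can call it to obtain the plaintext and substitute it in.

Loading the decryption function

// In this example the decryption function is _0x53b5; calls of the form _0x53b5('0xXXX') appear throughout the code
const stringDecryptFuncAst = ast.program.body[2]
const stringDecryptFunName = stringDecryptFuncAst.declarations[0].id.name
// Create an AST and add the three parts of the original code to its body (array declaration, array rotation, decryption function)
const newAst = parser.parse('')
newAst.program.body.push(ast.program.body[0])
newAst.program.body.push(ast.program.body[1])
newAst.program.body.push(stringDecryptFuncAst)
const {code: stringDecryptFunc} = generator(newAst, {compact: true})  // compact output avoids tripping formatting detection
// eval performs the array declaration, the rotation and the decryption function declaration in this environment
// (this executes the sample's own code, so run it only in an isolated environment)
// Afterwards the environment has the decryption function _0x53b5
eval(stringDecryptFunc)
// Check that decryption works
console.log(_0x53b5('0x15'))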

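Replacing the decryption calls. The decryption function's binding.referencePaths locates every call site (which avoids replacing calls to an unrelated function with the same name)

traverse(ast, {
  // Visit variable definitions - the decryption function is a VariableDeclarator
  VariableDeclarator(path) {
    if (path.node.id.name == stringDecryptFunName) {  // found the definition of the decryption function
      const binding = path.scope.getBinding(stringDecryptFunName)  // binding of the decryption function
      // Every place that references the identifier - each item is the path of an Identifier
      binding && binding.referencePaths.forEach(item => {
        // If the identifier is being called, this site decrypts a string and needs replacing
        item.parentPath.isCallExpression() && item.parentPath.replaceWith(
          t.stringLiteral(
            // Run the call to obtain the decrypted value
            eval(item.parentPath + '')  // the path is implicitly converted to its source string, then executed
          )
        )
      })
    }
  }
})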

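With the array obfuscation undone, remove the array declaration, the array rotation and the string decryption code from the top of the file

ast.program.body.shift()
ast.program.body.shift()
ast.program.body.shift()
// In the sample used here, the related code sits at the top of the file
// Remove as appropriate. The code can also be deleted by hand instead of through the AST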

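Obfuscating string constants and code with eval

There are two common ways to obfuscate with eval

Obfuscating string constants

A decoding function restores the encoded or encrypted string constant at run time. The obfuscation code follows

StringLiteral(path) {
  path.replaceWith(
    t.callExpression(  // arguments: callee (the function), arguments (array of arguments)
      t.identifier('atob'),
      [t.stringLiteral(btoa(path.node.value))]  // base64-encode with btoa
    )
  );
  // The replacement contains a new StringLiteral of the visited type; skip it to avoid an infinite loop
  path.skip()
}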

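Obfuscating JS code with eval

Running source strings through eval puts more strings into the code, so the string-constant techniques above can then hide the key code. Because the pattern is so conspicuous, usually only the key lines are encrypted, so that an analyst cannot find the core code by searching for keywords in the console

Running an encoded source string with eval

eval(atob('bmV3IHdpbmRvdy5EYXRlKCk='))  // btoa('new window.Date()')

// res = [];
//for (i = 0; i< 'new window.Date()'.length; i++) {
//  res.push('new window.Date()'.charCodeAt(i))}
// fromCharCode takes the codes as separate arguments, so pass the array through apply
eval(window.String.fromCharCode.apply(null, [110, 101, 119, 32, 119, 105, 110, 100, 111, 119, 46, 68, 97, 116, 101, 40, 41]))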

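Other helper functions can splice and format a code string to reconstruct the original source, which eval then executes (the logic of this example resembles the instructions of a VMP-style virtual machine)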

eval(
  function(p,a,c,k,e,r){
    e=String;if(!''.replace(/^/,String)){while(c--)r[c]=k[c]||c;k=[function(e){return r[e]}];e=function(){return'\\w+'};c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p
  }('0.1("2");',3,3,'console|log|CC11001100'.split('|'),0,{})
  // The IIFE returns the plaintext source as a string; the packing is undone inside the IIFE
)
// Tool used for this example: https://jsrei.github.io/eval-decoder/
// Original source of this code: console.log("CC11001100");
// In 0.1("2"); each number is replaced by the entry of console|log|CC11001100 at that index
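A common way to inspect such a payload is to run the packer function without the surrounding eval and look at the string it returns. The sketch below reuses the article's packed example unchanged; only the variable name `unpacked` is introduced here.

```javascript
// The same IIFE as above, with its result captured instead of passed to eval.
const unpacked = (function (p, a, c, k, e, r) {
  e = String;
  if (!''.replace(/^/, String)) {
    while (c--) r[c] = k[c] || c;               // r maps each index to its word
    k = [function (e) { return r[e]; }];        // one replacer that looks words up in r
    e = function () { return '\\w+'; };         // match any word token
    c = 1;
  }
  while (c--) {
    if (k[c]) p = p.replace(new RegExp('\\b' + e(c) + '\\b', 'g'), k[c]);
  }
  return p;
})('0.1("2");', 3, 3, 'console|log|CC11001100'.split('|'), 0, {});

console.log(unpacked);
```

Printing the string rather than evaluating it keeps the unpacked code from running during analysis.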

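eval code obfuscation: when eval is used to obfuscate JS code, it must run after identifier obfuscation, so that the identifiers referenced inside the encoded code strings already exist under their final names

eval-base64 obfuscation of function bodies: eval executes a base64-encoded string

Obfuscating every line

// Function body obfuscation: FunctionExpression, FunctionDeclaration
FunctionExpression(path) {
  const blockStatement = path.node.body
  // Build the eval-obfuscated function body
  const statements = blockStatement.body.map(line => {  // visit each statement of the body
    if (t.isReturnStatement(line))  // return cannot be obfuscated
      return line
    const {code} = generator(line)
    const decrypt = t.callExpression(  // the decoding call: function and argument array
      t.identifier('atob'),
      [t.stringLiteral(btoa(code))]  // the argument is the encoded source string
    )
    return t.expressionStatement(  // the new statement that replaces the line
      t.callExpression(  // the eval call
        t.identifier('eval'),
        [decrypt]
      )
    )
  })  // every statement except return is now wrapped in eval
  path.get('body').replaceWith(t.blockStatement(statements));  // replace the whole function body
}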

Obfuscating selected lines: add the trailing comment eval-atob to each line that should be obfuscated

// A trailing comment on the target line appears on the node as
// Node.trailingComments = [{
//   type: 'CommentLine',
//   value: '' // comment text
// }]

// Process the lines that carry an 'eval-atob' comment
FunctionExpression(path) {
  const blockStatement = path.node.body;
  const Statements = blockStatement.body.map(line => {
    if (t.isReturnStatement(line))
      return line
    // trim(): the value of `// eval-atob` includes the leading space
    if (!(line.trailingComments && line.trailingComments[0].value.trim() == 'eval-atob'))
      return line  // skip lines without the marker comment
    delete line.trailingComments  // keep the comment out of the base64 string
    const {code} = generator(line)
    const decrypt = t.callExpression(
      t.identifier('atob'),
      [t.stringLiteral(btoa(code))]
    )
    return t.expressionStatement(
      t.callExpression(
        t.identifier('eval'),
        [decrypt]
      )
    )
  })
  path.get('body').replaceWith(t.blockStatement(Statements))
}
// The output still carries the comments; pass a generator option to drop them
// const {code} = generator(ast, {comments: false})

eval-ascii obfuscation of function bodies: in the same way, build eval(String.fromCharCode(xx, xx, xx, xx....))

FunctionExpression(path) {
  const blockStatement = path.node.body;
  const Statements = blockStatement.body.map(line => {
    if (t.isReturnStatement(line))
      return line
    // trim(): the value of `// eval-ascii` includes the leading space
    if (!(line.trailingComments && line.trailingComments[0].value.trim() == 'eval-ascii'))
      return line  // skip lines without the marker comment
    delete line.trailingComments  // keep the comment out of the encoded string
    const {code} = generator(line)  // source string
    const asciiArray = [].map.call(  // build the array of character codes
      code, 
      c => t.numericLiteral(c.charCodeAt(0))  // each character becomes its decimal code
    )  // a string has no map of its own, so borrow Array.prototype.map via call
    const decryptFuncName = t.memberExpression( // build the `String.fromCharCode` member access
      t.identifier('String'),   // object
      t.identifier('fromCharCode')  // property
    )
    const decrypt = t.callExpression(  // build the String.fromCharCode call
      decryptFuncName,  // target function
      asciiArray  // argument list (n arguments): String.fromCharCode(xx,xx,xx,...)
    )
    return t.expressionStatement(
      t.callExpression(
        t.identifier('eval'),
        [decrypt]
      )
    )
  })
  path.get('body').replaceWith(t.blockStatement(Statements))
}
// As before, pass a generator option to drop the comments
// const {code} = generator(ast, {comments: false})

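Junk code

Make the code more complex and larger. Wrap simple logic such as addition, subtraction, shifts, XOR and function calls in functions; the extra layer of calls lowers readability.

// Original code
c = a + b

// After obfuscation
function _0x2b02af (_0x83e2f1, _0x6a9e53) {
  return _0x83e2f1 + _0x6a9e53
}
c = _0x2b02af(a, b)


// Junk code that performs a call
function _0xeca66f (_0x701735, _0x22f4b0, _0x512ba8) {
  return _0x701735(_0x22f4b0, _0x512ba8)  // calls the first argument as a function
}


// Multi-layer junk code
function _0x8e90a3 (_0x934a5f, _0x3287ae) {
  return _0x2b02af(_0x934a5f, _0x3287ae)  // calls another junk function
}
function _0x2b02af (_0x83e2f1, _0x6a9e53) {
  return _0x83e2f1 + _0x6a9e53
}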

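Junk code for binary expressions

Obfuscation logic: turn every binary expression in the code into a junk function, and place that function at the start of the function that contains the expression

Target shape. Parsing this code in AST Explorer helps in understanding the node structure

var a = 1
var b = 2

var func = function(a, b) {
  function x(a, b) {  // the junk function's declaration, placed at the start of the body that held the binary expression
    return a + b
  }
  return x(a, b)  // the original binary expression becomes a call to the junk function
}

var origin_func = function(a, b) {
  return a + b
}

Obfuscation code: visit every binary expression, declare a junk function at the start of the enclosing body (the beginning of the BlockStatement), then replace the expression with a call to it (BinaryExpression -> CallExpression)

Since the point of junk code is to inflate the code, there is no need to check whether a junk function for this operator already exists (every addition encountered can get its own addition function)

Tips: junk code introduces new identifiers, so this pass is best run before identifier obfuscation

BinaryExpression(path) {
  const operator = path.node.operator;
  const left = path.node.left;
  const right = path.node.right;
  const a = t.identifier('a');
  const b = t.identifier('b');
  const funcNameIdentifier = path.scope.generateUidIdentifier('_0x251b');
  const func = t.functionDeclaration(  // create the function node
    funcNameIdentifier,  // function name
    [a, b],  // parameter list
    t.blockStatement([  // function body
      t.returnStatement(
        t.binaryExpression(operator, a, b)
      )
    ])
  );
  const BlockStatement = path.findParent(p => p.isBlockStatement())  // the body that contains the expression
  if (!BlockStatement) return  // top-level expression with no enclosing block: leave it alone
  BlockStatement.node.body.unshift(func);  // put the junk function at the start of the body
  path.replaceWith(t.callExpression(funcNameIdentifier, [left, right]));  // replace with the call
}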

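Removing junk code

The example uses junk code stored in an object; adjust for the case at hand

    var $n = {
        OTbSq: Gn(n, o) + "|0",  // Gn is the OB decryption function; after OB removal this is a string constant
        PFQbW: function(e, t) {
            return e === t  // equality test
        },
        IfwWq: $n[Gn(1312, i)],  // refers to another junk entry, so the junk code is multi-layered
        KPTzH: function(e, t) {
            return e(t)  // function call
        },
        ...
    }

To handle multi-layer junk code, visit each entry recursively: when the value is itself a member expression, it refers to another junk entry, so keep descending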

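Junk code made of string constants

Building dict: collect every ObjectExpression in the code into one object (when the junk code lives in many functions instead of one object, visit FunctionDeclaration and add the functions whose identifiers match the junk-code pattern to this global object)

const dict = {}  // stores the objects for lookup during restoration
function generatorDict(ast) {
  traverse(ast, {
    VariableDeclarator(path) {
      if (t.isObjectExpression(path.node.init)) {  // found an object definition
        // name of the object's identifier
        const objName = path.node.id.name
        // create the entry in dict unless it already exists
        objName && (dict[objName] = dict[objName] || {})
        // store the property nodes - a same-named property of a same-named object overwrites the earlier one
        dict[objName] && path.node.init.properties.forEach(  // each property has item.key and item.value
          // the key is either a string or an identifier: {'a': 1}  {a: 1}
          item => dict[objName][t.isStringLiteral(item.key) ? item.key.value : item.key.name] = item.value
        )
      }
    }
  })
}
generatorDict(ast)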

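Collapsing multi-layer junk code into a single layer: in the sample, a multi-layer entry is a property access (MemberExpression) whose object is itself a junk-code object

function findRealValue(node){
  // the value of this key-value pair is a property access
  if(t.isMemberExpression(node)) {
    const objName = node.object.name
    // a simple multi-layer case: var a = {xxx: 'xxx'}  var b = {xxx: a['xxx']}
    // after array obfuscation has been undone, the access inside b has no further obfuscation
    const propName = node.property.value
    // the value appears in the dictionary
    if (dict[objName] && dict[objName][propName]) {
      // look it up recursively until it is no longer a property access
      return findRealValue(dict[objName][propName])
    }
  } else {
    // end of recursion
    // not a property access: return the real value
    return node
  }
}
traverse(ast, {
  VariableDeclarator(path) {
    if(t.isObjectExpression(path.node.init)) {
      // visit the key-value pairs of the object
      path.node.init.properties.forEach(item => {
        // for a multi-layer entry this returns the real value
        const realNode = findRealValue(item.value)
        // collapse to a single layer
        realNode && (item.value = realNode)
      })
    }
  }
})
generatorDict(ast)  // rebuild the dictionary (the function returns nothing, so do not assign its result to ast); the entries are now single-layer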

Replacing the junk code by dictionary lookup

MemberExpression(path) {
  const objName = path.node.object.name
  const propName = path.node.property.value
  // test for the expected shape, here that the entry is a string
  if (dict[objName] && t.isStringLiteral(dict[objName][propName])) {
    // replace the access with the node stored in the dictionary
    path.replaceWith(dict[objName][propName])
  } 
}

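Junk code made of functions

This is a simple multi-layer example built from functions; parsing it in AST Explorer helps in understanding the node structure

var a = {xxx: function(e, t){return e + t}}
var b = {yyy: function(e, t){return a['xxx'](e, t)}}
console.log(b['yyy'](1, 2))
console.log(a['xxx'](3, 4))

Build the global dict with the same code as in the string-constant junk code section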

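Collapsing multi-layer junk code into a single layer

function findRealFunc(node) {
  // the node is a function whose body holds a single return statement
  if (t.isFunctionExpression(node) && node.body.body.length == 1 && t.isReturnStatement(node.body.body[0])) {
    // take the callee of the returned call, if the return value is a call
    const arg = node.body.body[0].argument
    const exp = arg && arg.callee
    // a member-expression callee means another junk layer
    // (this assumes the wrapper forwards its arguments unchanged and in order)
    if (t.isMemberExpression(exp)) {
      // the inner junk object - 'a' in the example
      const objName = exp.object.name
      // the inner junk key - 'xxx' in the example
      const propName = exp.property.value
      // the object appears in the dictionary
      if(dict[objName]) {
        // look it up recursively until the return value is no longer a property access
        return findRealFunc(dict[objName][propName])
      }
    }
    // return the FunctionExpression reached by the recursion
    return node
  } else {
    return node
  }
}

traverse(ast, {
  // visit object declarations
  VariableDeclarator(path) {
    if(t.isObjectExpression(path.node.init)) {
      // visit the key-value pairs of the object
      path.node.init.properties.forEach(item => {
        // for a multi-layer entry this returns the innermost function expression
        const realNode = findRealFunc(item.value)
        // collapse to a single layer (the chain of calls becomes the innermost function expression)
        realNode && (item.value = realNode)
      })
    }
  }
})
    
generatorDict(ast)  // rebuild the dictionary (the function returns nothing, so do not assign its result to ast); the entries are now single-layer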

By dictionary lookup, restore each function-style junk entry to a binary expression (function(e,t){return e+t}) or a function call (function(e,t){return e(t)}), substituting the actual arguments of the call for the parameters

traverse(ast, {
  // visit call sites
  CallExpression(path) {
    // a call whose callee is a member expression is a candidate
    if (!t.isMemberExpression(path.node.callee)) return
    // the junk object name 'a' and the junk key 'xxx'
    const objName = path.node.callee.object.name
    const propertyName = path.node.callee.property.value
    // look it up; only function entries have a body to inspect
    if (dict[objName] && t.isFunctionExpression(dict[objName][propertyName])) {
      // the junk function
      const func = dict[objName][propertyName]
      // its content, i.e. what follows return
      const returnExp = func.body.body[0].argument
      
      // a binary expression
      if (t.isBinaryExpression(returnExp)) {
        // build the real binary expression from the actual arguments
        const binExp = t.binaryExpression(returnExp.operator, path.node.arguments[0], path.node.arguments[1])
        // replace the junk call with it
        path.replaceWith(binExp)
      
      // a function call
      } else if (t.isCallExpression(returnExp)) {
        // the arguments after the first form the new argument list (the first one is the target function)
        const paramsArray = path.node.arguments.slice(1)
        // build the call, with the first argument as the function
        const callExp = t.callExpression(path.node.arguments[0], paramsArray)
        // replace the junk call with the real call
        path.replaceWith(callExp)
      }
    }
  }
})

Removing the object declarations. Doing this by hand avoids deleting objects that are not junk code

VariableDeclarator(path) {
  if (t.isObjectExpression(path.node.init)) {
    path.remove()
  }
}

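Control flow obfuscation

Control flow flattening

The code's flow is routed through a switch so that the order in which statements appear no longer matches the order in which they run. An array acts as the dispatcher, and a loop executes the cases in the order the array gives

This obfuscation can be nested: a case of a flattened block can itself contain flattened code (array declaration, loop, switch-case). It can also be combined with junk code, that is, many cases that never execute

The loop over the array may be a for or a while, and the switch may or may not have a default as its last step. The example below uses a for loop without default

s = '6|3|5|4|2|1'
l = s.split('|')
for (i = 0; ;) {  // no loop condition is set in this form
  switch (l[i++]) {
    case '1':
      console.log('step 6');
      continue  // continue, not break: a break here would only leave the switch and fall into the break below, ending the loop
    case '2':
      console.log('step 5');
      continue
    case '3':
      console.log('step 2');
      continue
    case '4':
      console.log('step 4');
      continue
    case '5':
      console.log('step 3');
      continue
    case '6':
      console.log('step 1');
      continue
  }
  // once the index runs past the end, the switch value is undefined, no case matches, and execution arrives here
  break  // leave the loop
}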
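Reading such a block by hand amounts to replaying the dispatcher string against the case labels. A minimal sketch, assuming the cases have already been collected into a plain object (`linearize` and `cases` are hypothetical names):

```javascript
// Returns the case bodies in the order the dispatcher string visits them.
function linearize(order, cases) {
  return order.split('|').map(label => {
    if (!(label in cases)) throw new Error(`no case for label ${label}`);
    return cases[label];
  });
}

// Case bodies from the example above, keyed by their case label.
const cases = {
  '1': 'step 6',
  '2': 'step 5',
  '3': 'step 2',
  '4': 'step 4',
  '5': 'step 3',
  '6': 'step 1',
};

const recovered = linearize('6|3|5|4|2|1', cases);
console.log(recovered);
```

The AST-based passes later in this section do the same thing with statement nodes instead of strings.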

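Flattening obfuscation

The code's flow is converted to the following shape, which scrambles the order in which the code appears

array = ''.split('|')  // array holding the real order
while(!![]) {  // jsf*ck style: !![] evaluates to true
  switch (+array[i++]){  // + converts the string to a number
    case 0:  // cases are numbered 0, 1, 2, 3, 4 by default
      ... 
      continue
    case 1:
      ...
      continue
  }
  break  // no matching case: leave the loop
}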

Visit each statement of the function body and shuffle the order. The obfuscation code follows

FunctionExpression(path){
  const blockStatement = path.node.body
  const statements = blockStatement.body.map((item, index) => {return {index: index, value: item}})  // record the real order of the statements
  // shuffle the statements
  for(let i = statements.length - 1; i > 0 ; i--) {  // i walks back from the last element
    const j = Math.floor(Math.random() * (i+1))  // an index in [0, i]; when j equals i nothing moves
    // swap; each item of statements carries index (real position) and value (the statement)
    ;[statements[j], statements[i]] = [statements[i], statements[j]]  // destructuring swap; the line starts with `[`, so a leading semicolon is required
  }
  // build the dispatcher and the array of switch cases
  const dispenserArr = []  // array holding the real order
  const cases = statements.map((line, index) => {  // turn the shuffled statements into cases
    dispenserArr[line.index] = index  // the case number goes at the statement's real position
    return t.switchCase(  // build the case
      t.numericLiteral(index),  // SwitchCase.test: the case label, starting from 0
      [line.value, t.continueStatement()]  // SwitchCase.consequent: the case body
    )
  })
  // initialization
  const dispenserStr = dispenserArr.join('|')
  // generateUidIdentifier only reserves a name; the variables are declared by the `let` built below
  const array = path.scope.generateUidIdentifier('array')  // array variable name
  const index = path.scope.generateUidIdentifier('index')  // index variable name
  const callee = t.memberExpression(t.stringLiteral(dispenserStr), t.identifier('split'))
  const arrayInit = t.callExpression(callee, [t.stringLiteral('|')])  // the split call
  const varArray = t.variableDeclarator(array, arrayInit)  // array initializer
  const varIndex = t.variableDeclarator(index, t.numericLiteral(0))  // index initializer
  const dispenser = t.variableDeclaration('let', [varArray, varIndex])  // declaration of array and index
  // loop
  const updExp = t.updateExpression('++', index)  // advance the index
  const memExp = t.memberExpression(array, updExp, true)  // fetch the current case from the array
  const discriminant = t.unaryExpression('+', memExp)  // numeric conversion: +array[index++]
  const switchSta = t.switchStatement(discriminant, cases)  // build the switch from the discriminant and the cases
  const unaExp = t.unaryExpression('!', t.unaryExpression('!', t.arrayExpression()))  // !![]
  const whileSta = t.whileStatement(unaExp, t.blockStatement([switchSta, t.breakStatement()]))  // while loop

  path.get('body').replaceWith(t.blockStatement([dispenser, whileSta]))  // replace the body with the initialization and the loop
}
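The bookkeeping in this pass (shuffle, then record each case number at the statement's real position) can be tried on plain strings. The sketch below mirrors that logic without Babel; `flatten` and `replay` are hypothetical names, and the statements are placeholder strings rather than AST nodes.

```javascript
// Shuffles the statements and builds the dispatcher string, as the visitor above does.
function flatten(statements, random = Math.random) {
  const tagged = statements.map((value, index) => ({ index, value }));
  for (let i = tagged.length - 1; i > 0; i--) {   // Fisher-Yates shuffle
    const j = Math.floor(random() * (i + 1));
    [tagged[j], tagged[i]] = [tagged[i], tagged[j]];
  }
  const dispatch = [];
  const cases = tagged.map((item, caseNo) => {
    dispatch[item.index] = caseNo;                // real position -> case number
    return item.value;                            // body of `case caseNo`
  });
  return { order: dispatch.join('|'), cases };
}

// Executes the cases in dispatcher order, like the while/switch loop would.
function replay({ order, cases }) {
  return order.split('|').map(n => cases[+n]);
}

const original = ['a = 1', 'b = 2', 'c = a + b', 'return c'];
const flattened = flatten(original);
const replayed = replay(flattened);
console.log(flattened.order, flattened.cases);
console.log(replayed);
```

Whatever permutation the shuffle produces, replaying the dispatcher yields the statements in their original order, which is the property the obfuscation relies on.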

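Removing flattening

Deobfuscation for the code produced above (while loop, no default, one statement per case)

Get the execution order. A member expression whose object is a string and whose property is split is the dispatcher. (Because the access uses bracket notation with a string, path.evaluate() cannot be used here to test whether the returned object's confident is true)

Parse the switch-case. Store the cases by label so that the actual statements can be fetched in order later

Assemble the blockStatement in execution order

MemberExpression(path) {
  // filter: the object is a string and the property is `split`
  if (types.isStringLiteral(path.node.object) && types.isStringLiteral(path.node.property, {value: 'split'})) {
    // locate the declaration statement
    const varPath = path.findParent(p => types.isVariableDeclaration(p))
    // the while statement is the next sibling of the declaration
    const whilePath = varPath.getSibling(varPath.key + 1)
    // filter: the statement after the declaration must be a while
    if (!types.isWhileStatement(whilePath.node)) return
    // the switch statement inside the while
    const switchPath = whilePath.node.body.body[0]
    // filter: the first statement in the while must be a switch
    if (!types.isSwitchStatement(switchPath)) return

    // build the case map - an array is used here, so default and multi-statement cases are not supported !!!
    const StaArr = []
    // store each case body at the position given by its case label
    switchPath.cases.forEach(
      // only the first statement of each case is treated as logic !!!
      c => StaArr[c.test.value] = c.consequent[0]
    )
    // in effect, the first statement of each case is stored in label order (labels are usually '0' '1' '2' ...)

    // the real execution order
    const shufferArr = path.node.object.value.split('|')

    // emit the statements in execution order
    const parentPath = whilePath.parent
    varPath.remove()  // remove the order declaration
    whilePath.remove()  // remove the while
    shufferArr.forEach(line => parentPath.body.push(StaArr[line]))
    
    // stop after handling one block, to avoid problems with nested flattening
    path.stop()
  }
}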

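Run the traversal several times so that the flattened blocks are handled one at a time. With nested flattening (a flattened block inside a case), handling everything in a single traverse may go wrong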
```
for (let i = 0; i < 10; i++) {
  traverse(ast, {
    ...
  })
}
```

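A while loop with a default branch, where each case may hold several statements. The code is taken from an external source

Target code

var array = '1|0|2|3'['split']('|'), index = 0;

while (true) {
    switch (array[index++]) {  // past the end of the array the value is undefined, so execution reaches default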
        case '0':
            console.log('This is case 0');
            continue;
        case '1':
            console.log('This is case 1');
            continue;
        case '2':
            console.log('This is case 2');
            continue;
        case '3':
            console.log('This is case 3');
            continue;
        default:
            console.log('This is case [default], exit loop.');
    }
    break;
}

Switch the access to dot notation so that '1|0|2|3'['split']('|') becomes '1|0|2|3'.split('|'); path.evaluate() can then be used, and the returned object provides confident and value. The code is taken from an external source

MemberExpression(path) {
  const { computed } = path.node
  const property = path.get('property')
  // bracket notation with a string: convert to dot notation
  if (computed && types.isStringLiteral(property)) {
    property.replaceWith(types.identifier(property.node.value));
    path.node.computed=false;
  }
}

Visit the while statements. path.evaluate() now works, so the actual execution order can be obtained from inside the while. The code is taken from an external source

WhileStatement(path) {
  const { body } = path.node
  // the first statement in the while
  const switchStatement = body.body[0]
  // filter: the first statement must be a switch
  if (!types.isSwitchStatement(switchStatement)) return
  const { discriminant, cases } = switchStatement
  // filter: the discriminant must be the member expression array[index++], whose property is an update expression
  if (!types.isMemberExpression(discriminant) || !types.isUpdateExpression(discriminant.property)) return

  // find where array is defined and compute its final value with path.evaluate()
  const { confident, value } = path.scope.getBinding(discriminant.object.name).path.get('init').evaluate()

  // filter: the dispatcher must evaluate to a constant
  if (!confident) return

  // array - the real execution order  caseMap - maps each case label to its statements
  const array = value, caseMap = {}
  let result = []

  // build caseMap
  cases.forEach(c => {  // visit each case
    const { consequent, test } = c  // the case body and label
    const test_value = test ? test.value : 'default_case'
    const statementArray = consequent.filter(line => !types.isContinueStatement(line))
    caseMap[test_value] = statementArray
  })

  // concatenate the statements in execution order
  array.forEach(i => {
    result = result.concat(caseMap[i])
  })
  // append the default case
  if (caseMap.hasOwnProperty('default_case')) {
    result = result.concat(caseMap['default_case'])
  }
  path.replaceWithMultiple(result)
  // refresh the scope by hand so that the next plugin is not affected
  path.scope.crawl()
}

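Comma expressions and return statements

Run several statements on a single line by joining them with commas

Turning a block into a single compound statement: the commas make the braces unnecessary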
```
for (i = 0; i < 10;)
  i++,console.log(i)
```

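Comma expressions inside return. The result looks like Python's multiple return values, but JS has no multiple return: the earlier expressions only assign or compute, and the last item is the actual return value

var c = {
  create: function (){
    return 1
  }
}
function a() {
  ...
  return false || (r = c.create()), r  // `&&` and `||` combined with `,` run several expressions; the assignment needs parentheses
  return r = (true, c.create)(), r  // the inner comma expression yields the function, which is then called
}

function b() {  // only declarations come before the return; every assignment and computation happens inside it
  var v1, v2;
  // several ways of writing the assignment
  return v2 = (v1 = 1, v1 + 2), v2  // the inner expression yields v1 + 2
  return v2 = ((v1 = 1, v1) + 2), v2  // the innermost expression yields v1
}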

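Adding junk code: meaningless assignments and computations that pad the return expression

function b() {
  var v1, v2, v3;
  return v2 = (v1 = 1, v1 + 2), v3 = v1 + v2, v3++, v2  // every computation involving v3 is meaningless
}

function b(v1, v2, v3) {  // the variable declarations are moved into the parameters
  return v2 = (v1 = 1, v1 + 2), v3 = v1 + v2, v3++, v2
}

function add(a, b) {
  return a+b
}
function b(v1, v2, v3) {
  return v2 = (v1 = 1, (v1, true, add)(v1, 2)), v3 = (v2, v1, add)(v1, v2), v3++, v2  // when a junk function is called through a comma expression, everything before the function name is meaningless
}

Tips: in JS every parameter that is not passed by the caller has the value undefined, so it behaves like a variable declared with var inside the function

Restoring: in general, handling comma expressions requires getting the parenthesis levels right. The innermost level runs first, and the value an outer level receives is the last item of the inner expression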
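To check a manual restoration, it helps to run the obfuscated form and the straightened form side by side. The sketch below reuses the article's last example; `obfuscated` and `restoredForm` are names introduced here for the comparison.

```javascript
function add(a, b) {
  return a + b;
}

// The article's comma-expression form: v3 is junk, only v2 is returned.
function obfuscated(v1, v2, v3) {
  return v2 = (v1 = 1, (v1, true, add)(v1, 2)), v3 = (v2, v1, add)(v1, v2), v3++, v2;
}

// Hand-restored equivalent: innermost parentheses first, junk removed.
function restoredForm() {
  const v1 = 1;
  const v2 = add(v1, 2);
  return v2;
}

console.log(obfuscated(), restoredForm());
```

Both functions return the same value, which is a cheap sanity check before replacing the obfuscated version.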

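Comma expression obfuscation

Obfuscation logic: join the expressions inside a function with commas. The following kinds of statement cannot be joined directly
- Variable declarations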
```
var a = 1;
var b = 2;

// joining directly is an error
// var a = 1, var b = 2
// correct form
var a = 1, b = 2;
```
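  How: collect every declaration in the scope into the declarations array of a single VariableDeclaration. The declarations can also be hoisted into the function parameters, which then serve as the declarations, with each initializer inside the function turned into an assignment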
```
function func() {
  var a = 1;
}
// the declaration is hoisted into a parameter, giving
function func(a) {
  a = 1;
}
```
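- return joined with other statements. A return yields the last item of the comma expression, so the return statement is split in two: the return keyword moves to the front and the returned value is appended to the end of the comma expression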
```
function func(a, b) {
  a = a + b
  return a
}

// joining directly is an error
// a = a + b, return a
// correct form
function func(a, b) {
  return a = a + b, a
}
```

Obfuscation code

FunctionExpression(path) {
  const blockStatement = path.node.body
  const blockStatementLength = blockStatement.body.length
  if (blockStatementLength < 2)
    return
  
  // 1. Variables - hoist the declarations into the function parameters
  path.traverse({
    VariableDeclaration(p) {  // each declaration statement inside the function
      const declarations = p.node.declarations
      const statements = []
      declarations.forEach(item => {
        path.node.params.push(item.id)  // add the declared Identifier to the function parameters
        item.init && statements.push(  // if the variable has an initializer
          t.expressionStatement(t.assignmentExpression('=', item.id, item.init))  // build an assignment statement
        )
      })
      p.replaceInline(statements)  // replace the declaration with the assignments, once, after the loop
    }
  })

  // 2. 函数中部分语句会包裹在 `ExpressionStatement` 中（外边套了一层 `ExpressionStatement`）。在遍历函数内每条语句的类型时，会影响该语句的类型判断，需将该语句的内容从 `ExpressionStatement` 中提取出来
  // 拼接 return <逗号表达式>
  let result = blockStatement.body[0]  // 取出首条语句，作为 return 的初始值
  // 遍历后续语句，去除 ExpressionStatement 包裹
  for (let i = 1; i < blockStatementLength; i++) {
    const statement = blockStatement.body[i]
    const nextSta = t.isExpressionStatement(statement) ? statement.expression : statement

    // 3. 处理 return 语句
    // 遍历到原函数中的返回语句时，生成拼接后的返回语句
    if (t.isReturnStatement(nextSta)) {
      result = t.returnStatement( // 构造 return 语句
        // 逗号语句拼接之前的结果与当前语句
        t.toSequenceExpression([result, nextSta.argument])  // 返回内容 ReturnStatement.argument
      )

    // 4. 处理 赋值 语句
    // 遍历到赋值语句时
    } else if (t.isAssignmentExpression(nextSta)) {
      // 判断该赋值的右侧是否为函数调用
      if (t.isCallExpression(nextSta.right)) {
        const callee = nextSta.right.callee
        callee.object = t.toSequenceExpression([result, callee.object])  // 将调用者替换为逗号表达式
        result = nextSta  // 将最终的赋值设置为result
  
      // 5. 其他普通语句
      // 常规处理：由于逗号表达式默认返回最后一项，直接将值拼接到后方，将 result 重构为赋值语句
      // Eg: `a = 1; b = 2;`  处理后得到   `b = (a = 1, 2);` 即 `v = (xxx, value)`
      } else {
        nextSta.right = t.toSequenceExpression([result, nextSta.right])  // next为赋值 a=xxx,1
        result = nextSta  // 将最终的赋值设置为result
      }    

    // 普通语句，直接逗号拼接到后方 (xxx,xxx), xxx; !*** 函数内出现函数声明时，此处拼接返回undefined ***
    } else {
      result = t.toSequenceExpression([result, nextSta])
    }
  }
  
  // 函数体替换为拼接后的单行 return 语句
  path.get('body').replaceWith(t.blockStatement([result]))
}

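Tips:

Avoid combining comma-expression obfuscation with other expression-level obfuscation; the mix easily triggers errors
If a function contains a function declaration or definition, it cannot simply be comma-joined into the return and needs extra handling. The code here does not cover that case
- The array-restoration code defines functions inside a self-executing function, so this plugin does not apply to it
- When the code has no nested functions, try this plugin first and check that the obfuscated output still runs correctly, then apply the other obfuscation passes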

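Identifier obfuscation

Hide the meaning carried by identifiers to make reverse engineering harder

Simple identifier obfuscation: every identifier gets a distinct name

Identifier(path) {
  // Scope rename: change the given identifier name within the scope to the target name
  path.scope.rename(path.node.name, path.scope.generateUidIdentifier('_0x18b2').name)
}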

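Local variables in different scopes reuse the same identifier names. Use scope.getOwnBinding to obtain the bindings owned by a function (FunctionDeclaration|FunctionExpression) or by the global scope (Program), then rename the variable identifiers of that function or global scope

function renameOwnBinding(path) {
  const OwnBindingObj = {}, globalBindingObj = {};
  path.traverse({
    Identifier(p) {  // visit every identifier under the current scope
      const name = p.node.name;
      const binding = p.scope.getOwnBinding(name);  // undefined when the name is not bound in its own scope
      // Classify the identifiers found under the current scope
      // Names not bound in their own scope go into globalBindingObj
      // globalBindingObj does not mean "global": it holds names bound elsewhere, possibly in an enclosing function, possibly globally
      // Nor does it hold every global: only identifiers referenced under the current scope are visited, so a global that is never referenced here is not seen
      binding ? (OwnBindingObj[name] = binding) : (globalBindingObj[name] = 1);
    }
  }); // classify identifiers by whether they are an own binding
  
  let i = 0;
  for(let oldName in OwnBindingObj) {
    let newName;
    // An own binding must not share a name with a globalBindingObj entry; generate a name that does not clash
    do {
      newName = '_0x18b2' + i++;  // every scope names its variables with the _0x18b2 prefix
    } while (globalBindingObj[newName]);  // the name is bound outside and referenced in this scope, so pick another
    
    OwnBindingObj[oldName].scope.rename(oldName, newName);  // rename via Binding.scope.rename
  }
}

traverse(ast, {
  'Program|FunctionExpression|FunctionDeclaration'(path) {
    // The global scope and each function scope are renamed independently
    // Starting every scope from the same prefix makes locals in different scopes share identifier names
    renameOwnBinding(path);
  }
});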

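Some variables cannot be renamed until the AST is rebuilt. For example, array obfuscation and junk code create identifiers through the types package. Those are bare Node objects with no Path, so Path.scope.rename cannot reach them. Generating code and parsing it again gives them Paths

let ast = parser.parse(jscode);
// Other obfuscation passes
// ...

let {code} = generator(ast);
ast = parser.parse(code);  // do not save the code; parse it back into an AST

// Identifier renaming
// function renameOwnBinding(path)  ...

code = generator(ast).code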

Generating look-alike identifier names from the three characters O, o and 0

newName = generatorIdentifier(i++);

function generatorIdentifier(decNum) {
  const flag = ['O', 'o', '0'];
  const retval = [];
  while(decNum > 0) {  // collect the base-3 digits
    retval.push(decNum % 3);  // least significant digit first
    decNum = Math.floor(decNum / 3);
  }
  let Identifier = retval.reverse().map(v => flag[v]).join('');  // digits 0, 1, 2 map to O, o, 0
  Identifier.length < 6 ? (Identifier = ('OOOOOO' + Identifier).slice(-6)) :  // pad to length 6
  	Identifier[0] == '0' && (Identifier = 'O' + Identifier)  // a JS identifier cannot start with a digit
  return Identifier
}

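Tips: the order of flag matters, because different inputs must produce different names. For example, if the letter O sat at index 1, the following would happen

Input 0: the while loop does not run, and padding gives OOOOOO
Input 1: the result is the letter O, which after padding and slicing is still OOOOOO

The first two variables in the same scope would then always share a name!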
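
The collision is easy to reproduce. The sketch below restates the same base-3 scheme with the alphabet passed in as a parameter; the name `makeName` and its two-argument signature are assumptions made for this comparison.

```javascript
// Same base-3 scheme, with the alphabet passed in (hypothetical signature).
function makeName(decNum, flag) {
  const digits = [];
  while (decNum > 0) {
    digits.push(decNum % 3);
    decNum = Math.floor(decNum / 3);
  }
  let name = digits.reverse().map(v => flag[v]).join('');
  if (name.length < 6) {
    name = ('OOOOOO' + name).slice(-6); // left-pad with the letter O
  } else if (name[0] === '0') {
    name = 'O' + name; // identifiers cannot start with a digit
  }
  return name;
}

const good = ['O', 'o', '0']; // the padding character sits at index 0
const bad = ['o', 'O', '0'];  // the padding character sits at index 1

console.log(makeName(0, good), makeName(1, good)); // OOOOOO OOOOOo
console.log(makeName(0, bad), makeName(1, bad));   // OOOOOO OOOOOO
```

With the padding character at index 0, the padding acts like leading zeros of the base-3 number, so distinct inputs keep distinct names.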

jsf*ck

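Encodes the target code using only the 6 characters ( ) ! + [ ]

Principle: a + placed in front of a value (used as a unary operator) coerces the value to a number, for example

+'9' gives 9
+[] gives 0
+[3] gives 3
+[1,3,5] gives NaN
+undefined gives NaN
+{} gives NaN
!+[] gives true. +[] is 0, and ! negates the boolean value: 0 is first converted to the boolean false, then negated to true
!+undefined gives true. Likewise, NaN converts to false, which negates to true
!![] + !![] gives 2. [] converts to true, so !![] is true; adding two true values first converts each to the number 1, giving 2
[] == ![] gives true. First, ![] is false.
- []==false: the operand types differ and one side is a boolean, so the boolean is converted to a number and false becomes 0
- []==0: when an object (here []) is compared with a number, the object is first converted to a primitive by calling toString(), giving ""
- ""==0: "" converts to the number 0, so the comparison is 0==0, which is true
(!![]+[])[+[]] gives 't' (see the reference article)
- !![] is the boolean true, so the expression becomes (true+[])[+[]]
- !![]+[] is the string "true", so the expression becomes ("true")[+[]]
- +[] is 0, so the expression finally becomes ("true")[0]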
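
The coercions listed above can be verified in any JavaScript engine. The following sketch collects them into a table of actual and expected values; the variable names are illustrative only.

```javascript
// Each pair mirrors one line of the list above: [actual, expected].
const checks = [
  [+'9', 9],
  [+[], 0],
  [+[3], 3],
  [!+[], true],
  [!+undefined, true],
  [!![] + !![], 2],
  [[] == ![], true],
  [(!![] + [])[+[]], 't'],
];
// NaN never equals itself, so these cases are tested with Number.isNaN.
const nanCases = [+[1, 3, 5], +undefined, +{}];

const allMatch = checks.every(([actual, expected]) => actual === expected);
const allNaN = nanCases.every(v => Number.isNaN(v));
console.log(allMatch, allNaN); // true true
```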

Tips: in JS the 7 values false, undefined, null, 0, -0, NaN and "" all count as false in conditional logic; the BigInt zero 0n is falsy as well

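Restoration: parentheses must come in matched pairs, so use them to work out the structure of the code and evaluate it segment by segment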
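
As a small worked example of evaluating segment by segment, the sketch below decodes the fragment `(![]+[])[+!+[]]` one bracket group at a time. The fragment was chosen for illustration and the `steps` object is a hypothetical container for the intermediate values.

```javascript
// Decoding (![]+[])[+!+[]] one bracket group at a time.
const steps = {
  notArray: ![],             // [] is truthy, so this is false
  asString: ![] + [],        // false + [] concatenates to the string "false"
  index: +!+[],              // +[] is 0, !0 is true, +true is 1
  result: (![] + [])[+!+[]], // "false"[1]
};
console.log(steps.result); // a
```

Evaluating each group in a console and substituting its value back into the expression reduces even long jsf*ck payloads to ordinary string and index operations.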

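Memory exhaustion

The code checks for a non-browser environment, for reformatted source and so on, and on detection enters a branch that normal execution never reaches. Inside that branch, an endless loop or similar construct runs until memory overflows

// Pseudocode for the formatting check
function a(){return 'dev'}  // shipped on one line, unformatted
reg = /^function a\(\)\{return 'dev'\}$/  // illustrative pattern (an assumption): matches only the compact source
res = reg.test(a.toString())  // does it match "function a(){return 'dev'}"?
// After formatting, toString() outputs "function a() {\n    return 'dev';\n  }"
// toString() on a function defined in JS returns its source text, so the regex reveals whether the function was reformatted
if (!res) {  // check failed
  arr = [1, 1]
  for(i = 1, l = arr.length; i < l; i++){
    arr.push(1)
    l = arr.length  // keep moving the end condition so the loop never ends and the array keeps growing
  }
}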
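
The detection half of this technique can be tried safely without the endless loop. In the sketch below, `probe`, `compactSource` and `looksFormatted` are hypothetical names, and the pattern is an assumption that matches only the exact one-line source.

```javascript
// The function is written on one line, as it would be in minified output.
function probe(){return 'dev'}

// Illustrative pattern (an assumption): matches only the compact, unformatted source.
const compactSource = /^function probe\(\)\{return 'dev'\}$/;

// Function.prototype.toString returns the source text as written,
// so reformatting the file changes what the pattern sees.
function looksFormatted(sourceText) {
  return !compactSource.test(sourceText);
}

const formatted = "function probe() {\n  return 'dev';\n}";
console.log(looksFormatted(probe.toString())); // false
console.log(looksFormatted(formatted));        // true
```

When analysing such a sample, the usual countermeasure is to locate this check and neutralise it, for example by keeping the original compact text for the probed function, before running the formatted code.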
