Protection analysis of TikTok's .so layer



Last time I found two key .so files: sscronet and metasec_ml. I tried a JNI trace to capture key information such as the load order, the parameters, and the addresses of the JNI functions. The result was disappointing: it captured nothing. Thinking it over, there are two possible reasons: either the trace is useless against these libraries, or the relevant data is encrypted.

Next I opened metasec_ml in IDA and found that a large number of function names in the export list are encrypted.

Many strings are also encrypted:

The init_array section contains a large number of functions:

  1. Curiosity drove me to check them one by one, but the first one already failed: when I pressed F5 to view the pseudocode, IDA popped up the error "3BC6E: positive sp value has been found", the same problem I hit last time when decompiling JNI_OnLoad. Looking at the code carefully, the problem is here:
.text:0003BC10 000                 PUSH            {R4-R7,LR}
.text:0003BC12 014                 ADD             R7, SP, #0xC
.text:0003BC14 014                 PUSH.W          {R8-R11}
.text:0003BC18 024                 SUB             SP, SP, #0x6C
.text:0003BC1A 090                 MOV.W           R0, #0x172

It starts out normally: the registers are saved on the stack, and then SUB SP, SP, #0x6C reserves stack space for parameters, local variables, and so on.

But the branches that follow look like this: an ADD instruction suddenly increases SP by a large amount, the branch never uses the stack, and the stack balance is not restored at the end of the branch.

.text:0003BC6C     loc_3BC6C                               ; CODE XREF: sub_3BC10+3CC↓j
.text:0003BC6C 090                 ADD             SP, SP, #0x180
.text:0003BC6E -F0                 NOP
.text:0003BC70 -F0                 LDR             R0, =0x5F1D4716
.text:0003BC72 -F0                 B               loc_3BFD6
.text:0003BD42     loc_3BD42                               ; CODE XREF: sub_3BC10+3FA↓j
.text:0003BD42 -F0                 ADD             SP, SP, #0xDC
.text:0003BD44 -1CC                LDR             R0, =0xBDA61CFA
.text:0003BD46 -1CC                B               loc_3BFD6
.text:0003BD72     loc_3BD72                               ; CODE XREF: sub_3BC10+40A↓j
.text:0003BD72 -1CC                ADD             SP, SP, #0x15C
.text:0003BD74 -328                MOV             R0, R2
.text:0003BD76 -328                B               loc_3BFD6
.text:0003BDA0     loc_3BDA0                               ; CODE XREF: sub_3BC10+41A↓j
.text:0003BDA0 -328                ADD             SP, SP, #0x15C
.text:0003BDA2 -484                NOP
.text:0003BDA4 -484                LDR             R0, =0x743ECA69
.text:0003BDA6 -484                B               loc_3BFD6
.text:0003BDA8     loc_3BDA8                               ; CODE XREF: sub_3BC10+422↓j
.text:0003BDA8 -484                ADD             SP, SP, #0x104
.text:0003BDAA -588                NOP
.text:0003BDAC -588                MOV             R0, R9
.text:0003BDAE -588                B               loc_3BFD6
.text:0003BDC8     loc_3BDC8                               ; CODE XREF: sub_3BC10+432↓j
.text:0003BDC8 -588                ADD             SP, SP, #0xF4
.text:0003BDCA -67C                NOP
.text:0003BDCC -67C                LDR             R0, =0x2E70D3EB
.text:0003BDCE -67C                B               loc_3BFD6

SP keeps being increased all the way to the end of the function, where the registers pushed at the entry are popped; the stack is never rebalanced with a SUB SP.

.text:0003C08E -67C                BNE             loc_3BFD6
.text:0003C090 -67C                ADD             SP, SP, #0x6C
.text:0003C092 -6E8                POP.W           {R8-R11}
.text:0003C096 -6F8                POP             {R4-R7,PC}
.text:0003C096     ; End of function sub_3BC10
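
The stack-depth column in the listings above can be reproduced with simple arithmetic. The following Python sketch is only a naive linear model (it adds up the SP adjustments in address order, which is what the depth column appears to follow); the function name track_sp and the tuple format are my own invention for illustration, not anything IDA exposes.

```python
def track_sp(instructions):
    """Return the stack depth after each instruction.

    Depth grows when SP decreases (PUSH / SUB SP) and shrinks when SP
    increases (POP / ADD SP). A negative depth means SP is above its
    value at function entry, which a decompiler reports as a positive
    SP value.
    """
    depth = 0
    trace = []
    for mnemonic, amount in instructions:
        if mnemonic in ("push", "sub"):
            depth += amount
        elif mnemonic in ("pop", "add"):
            depth -= amount
        else:
            raise ValueError(f"unsupported mnemonic: {mnemonic}")
        trace.append(depth)
    return trace


prologue = [("push", 5 * 4),   # PUSH {R4-R7,LR}
            ("push", 4 * 4),   # PUSH.W {R8-R11}
            ("sub", 0x6C)]     # SUB SP, SP, #0x6C
junk = [("add", n) for n in (0x180, 0xDC, 0x15C, 0x15C, 0x104, 0xF4)]
epilogue = [("add", 0x6C),     # ADD SP, SP, #0x6C
            ("pop", 4 * 4),    # POP.W {R8-R11}
            ("pop", 5 * 4)]    # POP {R4-R7,PC}

with_junk = track_sp(prologue + junk + epilogue)
without_junk = track_sp(prologue + epilogue)

print([hex(d) for d in with_junk])     # goes negative at the first junk ADD
print([hex(d) for d in without_junk])  # returns to 0 at the end
```

With the six junk ADD instructions included, the depth drops to -0xF0 right after the first one, matching the address in the error message; without them, the prologue and epilogue cancel out exactly.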

So here is a summary of the protection method:

Add some useless branches, break the stack balance inside them, and then jump back to the real branch to continue execution.

Moving on to the other init_array functions: from the second function onward, F5 decompiles without trouble; only the first one fails. A function that gets this much protection is probably important, so the first function deserves careful tracing.

Code analysis shows that these ADD SP branches are referenced from other code, but every reference sits behind a CMP condition that never holds. In other words, none of these ADD SP branches is ever executed; they exist purely to defeat IDA. Thinking about it, no ordinary compiler would emit code like this, so it must have been inserted deliberately. To rebalance the stack, I used 010 Editor to NOP out every extra ADD, as follows:

010 Editor is considerate here: every place I changed by hand is marked in red. I made 6 changes in total, replacing each ADD with a NOP.
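
The same patch can be scripted instead of done by hand. The sketch below is a minimal illustration under assumptions: it works on an in-memory byte buffer, the helper names is_add_sp_imm and nop_add_sp are hypothetical, and it assumes the offsets you pass are file offsets (whether they equal the addresses IDA shows depends on how the .so is laid out). It only relies on two encodings: the 16-bit Thumb ADD SP, SP, #imm is stored as the bytes (imm/4, 0xB0), and the Thumb NOP is 00 BF, as seen in the listings.

```python
THUMB_NOP = bytes.fromhex("00bf")  # 16-bit Thumb NOP, little-endian 0xBF00


def is_add_sp_imm(halfword: bytes) -> bool:
    """True if the two bytes encode 16-bit Thumb `ADD SP, SP, #imm`.

    Encoding is 1011 0000 0 imm7: high byte 0xB0, low byte below 0x80
    (a low byte of 0x80 or more would be SUB SP, SP, #imm instead).
    """
    return len(halfword) == 2 and halfword[1] == 0xB0 and halfword[0] < 0x80


def nop_add_sp(image: bytearray, offsets):
    """Overwrite each `ADD SP, SP, #imm` at the given offsets with a NOP.

    Offsets that do not hold such an instruction are left untouched,
    so a wrong offset cannot silently corrupt other code.
    Returns the list of offsets that were actually patched.
    """
    patched = []
    for off in offsets:
        if is_add_sp_imm(bytes(image[off:off + 2])):
            image[off:off + 2] = THUMB_NOP
            patched.append(off)
    return patched


# Synthetic buffer for demonstration: ADD SP, SP, #0x180 (60 B0),
# followed by a NOP and one more halfword that must not be modified.
demo = bytearray.fromhex("60b000bf0548")
patched = nop_add_sp(demo, [0, 2])
print(patched, demo.hex())
```

For the real file one would read the .so into a bytearray, pass the six offsets of the junk ADD instructions, and write the buffer back out. Checking the encoding before patching is a deliberate choice here, since a mistyped offset is easy to make.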

After all these ADD SP instructions are NOPed out, F5 works on the first function. Part of the output is shown below; it turns out to be OLLVM obfuscation again.

signed int sub_3BC10()
  int v0; // r1
  signed int result; // r0
  int v2; // r1
  bool v3; // zf
  signed int v4; // r1
  char v5; // nf
  int v6; // r1
  int v7; // r1
  int v8; // r1
  char v9; // r11
  int v10; // r1
  char v11; // r0
  int v12; // r1
  int v13; // r1
  signed int v14; // r1
  char v15; // [sp+61h] [bp-27h]
  char v16; // [sp+63h] [bp-25h]
  int v17; // [sp+64h] [bp-24h]
  char v18; // [sp+6Bh] [bp-1Dh]

  sub_82B38(547604, 19);
  if ( v0 )
    LOBYTE(v0) = 1;
  v15 = v0;
  result = -224184235;
    while ( 1 )
        while ( 1 )
          while ( 1 )
            while ( 1 )
                while ( 1 )
                  while ( 1 )
                    while ( 1 )
                      while ( 1 )
                        while ( 1 )
                          while ( 1 )
                            while ( 1 )
                              while ( 1 )
                                while ( 1 )
                                    while ( 1 )
                                      while ( 1 )
                                        while ( 1 )
                                          v14 = result;
                                          if ( result != -1786743035 )
                                          result = 1595754262;
                                        if ( result != -1766343261 )
                                        *(_DWORD *)((char *)R2bC6xH3fE6sH5rZ6gG
                                                  + ((((~(unsigned int)sub_3BC10 | 0xA021040) & 0xA061440)
                                                    + ((unsigned int)sub_40400 & (unsigned int)sub_3BC10 | 0x1010104)) ^ 0xF5F5F57C)) = 563;
                                        sub_82B38(49730, 7);
                                        result = 1701695347;
                                      if ( result != -1732828906 )
                                      sub_82B38(53825, 7);
                                      v3 = v2 == 0;
                                      v4 = -1113187078;
                                      result = -964039141;
                                      if ( !v3 )
                                        result = -1113187078;
                                      v5 = 1;
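
For readers who have not met this pattern, the nested while ( 1 ) loops and the magic values assigned to result are typical of control-flow flattening. The Python sketch below is only a toy illustration of the idea, not a reconstruction of sub_3BC10: the functions gcd_plain and gcd_flattened are hypothetical, and the state constants are borrowed from the listings purely as opaque labels (0x5F1D4716 is the same number as the 1595754262 in the pseudocode).

```python
def gcd_plain(a, b):
    """Ordinary structured version."""
    while b:
        a, b = b, a % b
    return a


def gcd_flattened(a, b):
    """Same logic after flattening: every basic block becomes a case in
    one dispatcher loop, and a state variable holding a meaningless
    constant decides which block runs next."""
    state = 0x5F1D4716
    while True:
        if state == 0x5F1D4716:        # loop condition block
            state = 0x743ECA69 if b != 0 else 0x2E70D3EB
        elif state == 0x743ECA69:      # loop body block
            a, b = b, a % b
            state = 0x5F1D4716
        elif state == 0x2E70D3EB:      # exit block
            return a


print(gcd_plain(48, 18), gcd_flattened(48, 18))
```

Both functions compute the same result, but in the flattened one the original loop structure is no longer visible; it has to be recovered by following how the state variable changes.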

I started by clicking into some of the called functions to see what they do, and found that F5 on sub_40400 reports the same error. Here again I could only NOP the ADD SP instruction, and I even removed the POP instructions, because there is no PUSH at the function entry. As follows:

This time the stack is balanced, but F5 decompilation still fails. Thinking back, weren't they worried that adding this protection would change the original business logic? Going back to the code, I found that this branch also sits behind a CBZ condition that never holds, so the branch is never executed. It is junk code, just like the one above.

.text:00040D7E 000 30 46                       MOV             R0, R6
.text:00040D80 000 00 21                       MOVS            R1, #0
.text:00040D82 000 CE F7 BB FF                 BL              sub_FCFC
.text:00040D86 000 B0 B3                       CBZ             R0, loc_40DF6
.text:00040DB6 000 30 46                       MOV             R0, R6
.text:00040DB8 000 00 21                       MOVS            R1, #0
.text:00040DBA 000 CE F7 F3 FE                 BL              sub_FBA4
.text:00040DBE 000 D0 B1                       CBZ             R0, loc_40DF6
.text:0003A76C 108 0D 9A                       LDR             R2, [SP,#0x100+var_CC]
.text:0003A76E 108 12 68                       LDR             R2, [R2]
.text:0003A770 108 51 1A                       SUBS            R1, R2, R1
.text:0003A772 108 02 BF                       ITTT EQ
.text:0003A774 108 39 B0                       ADDEQ           SP, SP, #0xE4
.text:0003A776 024 BD E8 00 0F                 POPEQ.W         {R8-R11}
.text:0003A77A 014 F0 BD                       POPEQ           {R4-R7,PC}
.text:0003A77C 000 00 BF                       NOP
.text:0003A77E 000 00 BF                       NOP
.text:0003A77E                 ; END OF FUNCTION CHUNK FOR JNI_OnLoad

At this point, static analysis has basically hit a dead end, for two reasons: (1) The stack is deliberately unbalanced in many places to defeat IDA's F5. Hunting these down one by one is very tedious, and I don't have that much time. (2) Even when F5 succeeds, I still face OLLVM's control-flow obfuscation and string encryption. Under these conditions static analysis is not practical.

Since static analysis doesn't work, I tried dynamic debugging. But IDA kept popping up a window saying it had caught an exception (as shown in the figure below) and asking me how to handle it. As a result, I couldn't even run an instruction trace or a basic block trace.

Well, with both static analysis and dynamic debugging failing, these are the options I have left:

Use Frida to hook the key strings and see what they decrypt to in memory.
Use emulation tools such as unicorn, AndroidNativeEmu, and unidbg to run the .so file.
Modify the ART source to trace the execution order of functions such as registerNative, prettyMethod, JniMethodStart, and other methods inside the artMethod class.
