This post walks through the exercise from chapter 5 of the textbook “Practical Binary Analysis: Build Your Own Linux Tools for Binary Instrumentation, Analysis, and Disassembly” by Dennis Andriesse.
The post will be divided into 2 parts.
Many tools will be mentioned in this post.
The textbook asks us to find a flag in the payload file.
Let’s do it step by step.
Check the file type
First of all, let’s check the identity of the file.
$ file payload
payload: ASCII text
The file is ASCII text, so we can look at its content with the head command.
$ head payload
...AZdZZ92z + XrS733fu993v/ v/ vnt/ bqmVfNNkBlq0cCFyy6KFZiUHKi1buMhMLAvMi0oXWSzlZYtA...
The character set (letters, digits, + and /) suggests that it is base64-encoded text.
Decode the content and write the result to an output file.
base64 -d payload > decoded_payload
Then, check the identity of the decoded file.
file decoded_payload
decoded_payload: gzip compressed data
Now it turns out to be a gzip file!
We can peek inside the compressed data with the -z option of file.
file -z decoded_payload
decoded_payload: POSIX tar archive (GNU)
The compressed data contains a tar archive.
Extracting it (for example with tar xzf decoded_payload) gives two files: ctf and 67b8601.
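The layers peeled so far (base64, then gzip, then tar) can also be checked programmatically. The following is a minimal Python sketch, not the book's method. The helper name peel_layers is made up for illustration, and it assumes the input really is base64 text wrapping a gzip-compressed tar archive.

```python
import base64
import gzip
import io
import tarfile


def peel_layers(encoded: bytes) -> list:
    """Decode base64, gunzip the result, and list the tar member names."""
    # b64decode ignores newlines in the input by default.
    raw = base64.b64decode(encoded)
    # gzip streams always begin with the magic bytes 1f 8b.
    if raw[:2] != b"\x1f\x8b":
        raise ValueError("decoded data is not gzip")
    tar_bytes = gzip.decompress(raw)
    # "r:" opens the archive as an uncompressed tar.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return tar.getnames()
```

Calling peel_layers(open("payload", "rb").read()) on the real payload would be expected to list the same members that tar extracts.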
file ctf
ctf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=29aeb60bcee44b50d1db3a56911bd1de93cd2030, stripped
The ctf file is dynamically linked and stripped.
file 67b8601
67b8601: PC bitmap, Windows 3.x format, 512 x 512 x 24, image size 786434, resolution 7872 x 7872 px/m, 1165950976 important colors, cbSize 786488, bits offset 54
And 67b8601 turns out to be a BMP file.
Let’s investigate these two files further.
Using ldd (Checking library dependencies)
Since ctf is an ELF file, we can try executing it.
When executed, it fails with an error saying that lib5ae9b7f.so is missing.
We can check the library dependencies of the binary with ldd.
ldd ctf
linux-vdso.so.1 (0x00007ffcdbbe2000)
lib5ae9b7f.so => not found
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007efca7200000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007efca75e2000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efca701b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efca7500000)
/lib64/ld-linux-x86-64.so.2 (0x00007efca761f000)
The output confirms that lib5ae9b7f.so is not found.
So this step showed that one of the dependencies is missing.
Next, let’s investigate the file contents.
Using xxd (Seeing binaries)
To discover the file’s contents, it is a good idea to analyze it at the byte level.
xxd lets us do that.
xxd 67b8601 | head -n 15
00000000: 424d 3800 0c00 0000 0000 3600 0000 2800 BM8.......6...(.
00000010: 0000 0002 0000 0002 0000 0100 1800 0000 ................
00000020: 0000 0200 0c00 c01e 0000 c01e 0000 0000 ................
00000030: 0000 0000 7f45 4c46 0201 0100 0000 0000 .....ELF........
00000040: 0000 0000 0300 3e00 0100 0000 7009 0000 ......>.....p...
00000050: 0000 0000 4000 0000 0000 0000 7821 0000 ....@.......x!..
00000060: 0000 0000 0000 0000 4000 3800 0700 4000 ........@.8...@.
00000070: 1b00 1a00 0100 0000 0500 0000 0000 0000 ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000090: 0000 0000 f40e 0000 0000 0000 f40e 0000 ................
000000a0: 0000 0000 0000 2000 0000 0000 0100 0000 ...... .........
000000b0: 0600 0000 f01d 0000 0000 0000 f01d 2000 .............. .
000000c0: 0000 0000 f01d 2000 0000 0000 6802 0000 ...... .....h...
000000d0: 0000 0000 7002 0000 0000 0000 0000 2000 ....p......... .
000000e0: 0000 0000 0200 0000 0600 0000 081e 0000 ................
We can see the ELF magic bytes (7f 45 4c 46) at offset 0x34, which is 52 in decimal.
It means the ELF file starts at offset 0x34, but we don’t know where it ends.
However, we know that the executable header of a 64-bit ELF file is exactly 64 bytes long.
Therefore, we can dump just the ELF header by copying 64 bytes from the start offset.
dd skip=52 count=64 if=67b8601 of=elf_header bs=1
64+0 records in
64+0 records out
64 bytes copied, 0.00296444 s, 21.6 kB/s
Check the content with xxd.
xxd elf_header
00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
00000010: 0300 3e00 0100 0000 7009 0000 0000 0000 ..>.....p.......
00000020: 4000 0000 0000 0000 7821 0000 0000 0000 @.......x!......
00000030: 0000 0000 4000 3800 0700 4000 1b00 1a00 ....@.8...@.....
Nice! Only the ELF header was extracted.
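The two steps above, locating the magic bytes and cutting out the 64-byte header, can be sketched in Python as well. This is only an illustration under the assumption that the first occurrence of the magic bytes marks the embedded ELF file; the function name carve_elf_header is hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
ELF64_HEADER_SIZE = 64  # size of the executable header in a 64-bit ELF file


def carve_elf_header(blob: bytes) -> tuple:
    """Return (offset, header) for the first ELF magic found in blob."""
    offset = blob.find(ELF_MAGIC)
    if offset < 0:
        raise ValueError("no ELF magic found")
    header = blob[offset:offset + ELF64_HEADER_SIZE]
    if len(header) < ELF64_HEADER_SIZE:
        raise ValueError("data ends before a full ELF64 header")
    return offset, header
```

On the BMP file from this exercise, the returned offset should match the 0x34 seen in the xxd output.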
Using readelf (Checking details)
We can check the details of the ELF header with the readelf tool.
The -h option tells readelf to print only the executable header.
readelf -h elf_header
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
readelf: Error: Too many program headers - 0x7 - the file is not that big
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x970
Start of program headers: 64 (bytes into file)
Start of section headers: 8568 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 7
Size of section headers: 64 (bytes)
Number of section headers: 27
Section header string table index: 26
readelf: Error: Reading 1728 bytes extends past end of file for section headers
readelf: Error: Too many program headers - 0x7 - the file is not that big
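To make the readelf output less of a black box, here is a small Python sketch that unpacks the same 64 bytes with the struct module. The field names follow the ELF64 header layout; the names ELF64_EHDR, Elf64Header and parse_elf64_header are my own, and the sketch assumes a little-endian file, as readelf reports above.

```python
import struct
from collections import namedtuple

# Field layout of the ELF64 executable header (little-endian), 64 bytes in total.
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
Elf64Header = namedtuple(
    "Elf64Header",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)


def parse_elf64_header(data: bytes) -> Elf64Header:
    """Unpack the first 64 bytes of data as an ELF64 executable header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic")
    return Elf64Header._make(ELF64_EHDR.unpack(data[:ELF64_EHDR.size]))


# The 64 bytes shown in the xxd dump of elf_header.
HEADER_HEX = (
    "7f454c46020101000000000000000000"
    "03003e00010000007009000000000000"
    "40000000000000007821000000000000"
    "0000000040003800070040001b001a00"
)
header = parse_elf64_header(bytes.fromhex(HEADER_HEX))
print(header.e_shoff, header.e_shentsize, header.e_shnum)  # prints: 8568 64 27
```

The three printed values are the start, entry size, and count of the section headers, the same numbers readelf reports.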
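In this output, three values are needed to work out the total size of the hidden ELF library.
- Start of section headers (e_shoff): the offset, from the beginning of the ELF file, at which the section header table starts.
- Size of section headers (e_shentsize): the size of each section header.
- Number of section headers (e_shnum): how many section headers the table holds.
The section header table is typically the last thing in an ELF file, so assuming that holds here, the end of the table is also the end of the library.
With this information, we can finally calculate the total size of the hidden library in the BMP file.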
size = e_shoff + (e_shnum * e_shentsize)
= 8568 + (27 * 64)
= 10296
Knowing the size, we can extract the entire library using dd.
dd skip=52 count=10296 if=67b8601 of=lib5ae9b7f.so bs=1
10296+0 records in
10296+0 records out
10296 bytes (10 kB, 10 KiB) copied, 0.100686 s, 102 kB/s
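For comparison, the size calculation and the dd extraction can be combined in a few lines of Python. This is a sketch with made-up helper names (elf64_size_from_header, carve_elf), and it relies on the same assumption as the formula: the section header table is the last thing in the embedded ELF file.

```python
import struct


def elf64_size_from_header(header: bytes) -> int:
    """Compute e_shoff + e_shnum * e_shentsize from a 64-byte ELF64 header."""
    # e_shoff is an 8-byte field at offset 0x28.
    (e_shoff,) = struct.unpack_from("<Q", header, 0x28)
    # e_shentsize and e_shnum are consecutive 2-byte fields starting at 0x3a.
    e_shentsize, e_shnum = struct.unpack_from("<HH", header, 0x3A)
    return e_shoff + e_shnum * e_shentsize


def carve_elf(blob: bytes, start: int) -> bytes:
    """Cut the ELF file that begins at offset start out of blob."""
    size = elf64_size_from_header(blob[start:start + 64])
    return blob[start:start + size]
```

With the contents of 67b8601 as blob and 52 as start, the result should be 10296 bytes long, matching what dd copied.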
Then, check the details with readelf again.
readelf -hs lib5ae9b7f.so
Using nm (Parsing symbols)
The readelf output shows some functions defined in the library. However, the names are mangled, so it is difficult to guess what they do.
nm is a tool that lists symbols and can also demangle them.
The -D option is used because the file is stripped, so nm needs to parse the dynamic symbol table instead.
nm -D lib5ae9b7f.so
0000000000202058 B __bss_start
w __cxa_finalize@GLIBC_2.2.5
0000000000202058 D _edata
0000000000202060 B _end
0000000000000d20 T _fini
w __gmon_start__
00000000000008c0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
U malloc@GLIBC_2.2.5
U memcpy@GLIBC_2.14
U __stack_chk_fail@GLIBC_2.4
0000000000000c60 T _Z11rc4_decryptP11rc4_state_tPhi
0000000000000c70 T _Z11rc4_decryptP11rc4_state_tRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
0000000000000b40 T _Z11rc4_encryptP11rc4_state_tPhi
0000000000000bc0 T _Z11rc4_encryptP11rc4_state_tRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
0000000000000cb0 T _Z8rc4_initP11rc4_state_tPhi
U _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm@GLIBCXX_3.4.21
U _ZSt19__throw_logic_errorPKc@GLIBCXX_3.4
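Now the symbols are listed, although the C++ names are still mangled.
I can see the words encrypt and decrypt, along with rc4, so this library is probably related to RC4 encryption and decryption.
The names can be made readable with the --demangle option.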
nm -D --demangle lib5ae9b7f.so
0000000000202058 B __bss_start
w __cxa_finalize@GLIBC_2.2.5
0000000000202058 D _edata
0000000000202060 B _end
0000000000000d20 T _fini
w __gmon_start__
00000000000008c0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
U malloc@GLIBC_2.2.5
U memcpy@GLIBC_2.14
U __stack_chk_fail@GLIBC_2.4
0000000000000c60 T rc4_decrypt(rc4_state_t*, unsigned char*, int)
0000000000000c70 T rc4_decrypt(rc4_state_t*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
0000000000000b40 T rc4_encrypt(rc4_state_t*, unsigned char*, int)
0000000000000bc0 T rc4_encrypt(rc4_state_t*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
0000000000000cb0 T rc4_init(rc4_state_t*, unsigned char*, int)
U std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long)@GLIBCXX_3.4.21
U std::__throw_logic_error(char const*)@GLIBCXX_3.4
OK, we can safely assume this library implements RC4 encryption and decryption.
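The mangled names were already partly readable because GCC's name mangling stores a simple function name as _Z, then the length of the name in decimal, then the name itself, followed by the encoded parameter types. The Python sketch below pulls out just that name. It is a deliberately tiny illustration, not a real demangler: the function name mangled_function_name is hypothetical, and nested or namespaced names (such as _ZN... or _ZSt...) are simply rejected.

```python
import re


def mangled_function_name(symbol: str):
    """Return the plain function name of a simple mangled symbol, or None."""
    match = re.match(r"_Z(\d+)", symbol)
    if match is None:
        return None  # not mangled, or a nested name this sketch does not handle
    length = int(match.group(1))
    start = match.end()
    name = symbol[start:start + length]
    # Reject symbols that are shorter than the length prefix claims.
    return name if len(name) == length else None


print(mangled_function_name("_Z11rc4_decryptP11rc4_state_tPhi"))  # prints: rc4_decrypt
```

Decoding the parameter types as well is what nm --demangle does for us.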
Executing the ctf file again
Now that we have extracted the hidden library from the BMP file, all the pieces are in place, and running the ctf file should work this time.
Before that, the directory containing the extracted library must be added to the library search path.
export LD_LIBRARY_PATH=`pwd`
./ctf
echo $?
1
It runs!
$? holds the exit status of the last command. A non-zero value such as 1 means an error occurred.
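If you prefer to script this step, the following Python sketch does the same two things: it sets LD_LIBRARY_PATH for the child process and reads back the exit status. The helper name run_with_library_path is made up for illustration.

```python
import os
import subprocess


def run_with_library_path(argv, library_dir):
    """Run argv with LD_LIBRARY_PATH set to library_dir and return the exit status."""
    # Copy the current environment and override only LD_LIBRARY_PATH.
    env = dict(os.environ, LD_LIBRARY_PATH=library_dir)
    completed = subprocess.run(argv, env=env)
    return completed.returncode
```

For this exercise the call would look like run_with_library_path(["./ctf"], os.getcwd()), and the returned value corresponds to what echo $? prints in the shell.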
Let’s investigate the binary in the hope of finding some hints.
strings ctf
...
[]A\A]A^A_
DEBUG: argv[1] = %s
checking '%s'
show_me_the_flag
>CMb
-v@P^:
flag = %s
guess again!
It's kinda like Louisiana. Or Dagobah. Dagobah - Where Yoda lives!
;*3$"
zPLR
GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
.shstrtab
.interp
...
The output contains some suspicious strings, such as show_me_the_flag and flag = %s.
The DEBUG: argv[1] = %s string also suggests that we need to pass an argument when running the file.
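What strings does is conceptually simple: it scans the file for runs of printable characters. A rough Python approximation is sketched below. It assumes plain ASCII (bytes 0x20 to 0x7e) and a minimum run length of 4, so its output may differ slightly from the real tool; the function name printable_strings is my own.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of at least min_len printable ASCII characters found in data."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.group().decode("ascii")


sample = b"\x00\x01flag = %s\x00ab\x00guess again!\xff"
print(list(printable_strings(sample)))  # prints: ['flag = %s', 'guess again!']
```

Short runs such as "ab" in the sample are dropped, which is why strings output is mostly free of random two- or three-byte noise.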