| 1 | | -# go_moduledata_parser |
| 1 | +# go_moduledata_parser |
| 2 | + |
| 3 | +Personal project to parse function and type metadata from Go binaries and emit it
| 4 | +as JSON. Because the parser output is JSON, it can be integrated with other tools.
| 5 | + |
| 6 | +Currently, only generation of an IDC for annotating an IDA disassembly is supported. This IDC
| 7 | +can be applied to the disassembly to rename functions and types. |
| 8 | + |
| 9 | +[[_TOC_]] |
| 10 | + |
| 11 | +# Example Usage |
| 12 | + |
| 13 | +```bash |
| 14 | +pip3 install pyelftools pefile |
| 15 | +python3 parser.py win64s.exe > win64s.json |
| 16 | +``` |
| 17 | + |
| 18 | +A sample JSON can be viewed [here](integrations/ida/sample.json) |
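
Because the parser writes plain JSON, other tools can read it with any JSON library. The sketch below is a minimal illustration that assumes only that the output file is valid JSON. It does not assume any particular schema, and `summarize` and `summarize_file` are hypothetical helpers that are not part of this project. It lists the top-level keys and value types so you can inspect the structure before writing an integration.

```python
import json


def summarize(node, depth=0, max_depth=2):
    """Return indented lines describing the shape of a parsed JSON value."""
    indent = "  " * depth
    if isinstance(node, dict):
        lines = []
        for key, value in node.items():
            lines.append(f"{indent}{key}: {type(value).__name__}")
            if depth + 1 < max_depth:
                lines.extend(summarize(value, depth + 1, max_depth))
        return lines
    if isinstance(node, list):
        lines = [f"{indent}[{len(node)} items]"]
        # Only the first element is inspected, as a representative sample.
        if node and depth + 1 < max_depth:
            lines.extend(summarize(node[0], depth + 1, max_depth))
        return lines
    return []  # scalars are already described by their parent's line


def summarize_file(path):
    """Load a JSON file (for example the parser output) and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize(json.load(handle))


# Example: print("\n".join(summarize_file("win64s.json")))
```
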
| 19 | + |
| 20 | +## IDA integration |
| 21 | +```bash
| 22 | +python3 integrations/ida/generate_go_idc.py win64s.json annotate_win64s.idc |
| 23 | +``` |
| 24 | + |
| 25 | +To use the IDC (tested on IDA Freeware only!):
| 26 | +* **File > Load File > Parse C header file ...** *(Ctrl+F9)* and select `go_32.h` or `go_64.h`, depending on the bitness of the binary, to import the necessary structs
| 27 | +* **View > Open subviews > Local types** *(Shift+F1)*, select all types *(Ctrl+A)*, then right-click and choose **Synchronize to IDB**
| 28 | +* **File > Script File ...** *(Alt+F7)* and select the IDC file
| 29 | + |
| 30 | +### Before |
| 31 | + |
| 32 | + |
| 33 | + |
| 34 | +### After |
| 35 | + |
| 36 | +Types and functions are annotated. |
| 37 | + |
| 38 | + |
| 39 | + |
| 40 | +# Limitations |
| 41 | + |
| 42 | +* Assumes that only one moduledata struct is in use; Go binaries can contain more than one
| 43 | +* Assumes that there is only one text section
| 44 | +* Assumes that the architecture is little-endian
| 45 | +* Assumes that the binary is built with a recent Go version (currently 1.15); the moduledata struct differs in binaries built with earlier versions
| 46 | +* Only Windows and Linux (both x86/x64) are supported (the same architectures and OSes supported by IDA Freeware)
| 47 | +* Code is not very Pythonic :sweat_smile:
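
To make the little-endian and bitness assumptions concrete, here is a minimal sketch of how a pointer-sized field could be read from raw bytes. The helper name `read_uintptr` is hypothetical and is not taken from `parser.py`. It only shows the general technique, using Python's standard `struct` module with an explicit `<` (little-endian) format.

```python
import struct


def read_uintptr(data, offset, ptr_size):
    """Read one little-endian, pointer-sized unsigned integer.

    ptr_size is 4 for 32-bit binaries and 8 for 64-bit binaries.
    """
    fmt = {4: "<I", 8: "<Q"}[ptr_size]  # "<" forces little-endian
    (value,) = struct.unpack_from(fmt, data, offset)
    return value


# The same four bytes read as a 32-bit little-endian value:
raw = bytes.fromhex("78563412")
value = read_uintptr(raw, 0, 4)  # 0x12345678
```

On a big-endian target the same bytes would decode to a different value, which is why the parser's results are only meaningful for little-endian binaries.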
| 48 | + |
| 49 | +# Project Organization |
| 50 | + |
| 51 | +## root |
| 52 | + |
| 53 | +All the Python code needed for parsing the moduledata and related structures
| 54 | +within a Golang binary. Start from `parser.py`.
| 55 | + |
| 56 | +## go_files |
| 57 | + |
| 58 | +* [custom.go](go_files/custom.go) contains the sample Go code to test the parsing |
| 59 | +and annotation |
| 60 | +* [build.sh](go_files/build.sh) builds the Go code into stripped and unstripped versions for x86/x64 Windows and Linux
| 61 | +* Prebuilt binaries. Stripped binaries are suffixed with **s**. |
| 62 | + |
| 63 | +## integrations |
| 64 | + |
| 65 | +Currently only IDA is supported. |
| 66 | + |
| 67 | +### ida |
| 68 | + |
| 69 | +* [generate_go_idc.py](integrations/ida/generate_go_idc.py) generates an IDC script from JSON for types and functions (useful for stripped binaries)
| 70 | +* [generate_go_idc_types.py](integrations/ida/generate_go_idc_types.py) generates an IDC script from JSON for types only (useful for unstripped binaries, so that you don't rename what IDA already generated for you)
| 71 | +* [go_32.h](integrations/ida/go_32.h) contains the 32-bit version of the Golang structs
| 72 | +* [go_64.h](integrations/ida/go_64.h) contains the 64-bit version of the Golang structs
| 73 | +* [go_structs.h](integrations/ida/go_structs.h) contains the bitness-independent
| 74 | + structs for selected Golang types. Primitive data types are defined in `go_32.h` and `go_64.h`
| 75 | +* [sample](integrations/ida/sample) contains the generated JSON and IDC for a stripped Windows x64 binary |
| 76 | + |
| 77 | +# Background |
| 78 | + |
| 79 | +This started out as an attempt to convert the structs in Golang's source code (mainly in runtime/type.go, runtime/symtab.go, runtime/runtime2.go, reflect/type.go and reflect/value.go) into a header file usable by IDA and then manually identify the types from the disassembly and apply each struct by hand. |
| 80 | + |
| 81 | +A good way to learn this stuff was to write a simple Go program using the different features of the language (see [custom.go](go_files/custom.go)) and then build it with and without stripping the symbols (see [build.sh](go_files/build.sh)).
| 82 | + |
| 83 | +By comparing the disassembly of both binaries side by side, it is possible to manually identify the main functions, recognize types, etc.
| 84 | + |
| 85 | +After spending enough time manually annotating a stripped binary, I decided to automate most of these tasks and hence this project :smiley: |
| 86 | + |
| 87 | +Some of the excellent articles and tools that I referred to during this project are listed below. |
| 88 | + |
| 89 | +# References |
| 90 | + |
| 91 | +## Reverse Engineering Articles |
| 92 | + |
| 93 | +* https://lekstu.ga/tags/go/ |
| 94 | +* https://x0r19x91.gitlab.io/categories/golang/ |
| 95 | + |
| 96 | +## Other Good :thumbsup: Tools |
| 97 | + |
| 98 | +* https://go-re.tk/redress/ (Windows and Linux, r2 integration) |
| 99 | +* https://github.com/alexander-hanel/gopep (Windows only, good notes and references as well) |
| 100 | +* https://github.com/getCUJO/ThreatIntel/tree/master/Scripts/Ghidra (Ghidra integration) |
| 101 | + |