Software Composition Analysis (SCA) for binaries helps a lot with vulnerability discovery and program analysis. Related fields include symbol recovery for stripped binaries and open-source component detection.
Recover Symbols for Stripped Binaries
Stripping symbols is a good way to reduce the size of binaries and to hide program information. Use strip, or sstrip from ELF Kickers, to remove symbols and sections.
But for program analysts, recovering binary symbols helps a lot.
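Before attempting recovery, it can help to confirm that a binary really has lost its symbol table. The following is a minimal sketch, assuming a little-endian ELF64 file and using only the standard library: it walks the section header table looking for a SHT_SYMTAB entry. The function name has_symtab is hypothetical, and a file with no section headers at all is simply reported as having no symbol table.

```python
import struct

SHT_SYMTAB = 2  # section type of the full symbol table (.symtab)


def has_symtab(data: bytes) -> bool:
    """Return True if a little-endian ELF64 image has a SHT_SYMTAB section."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    # ELF64 header fields: e_shoff at 0x28, e_shentsize at 0x3A, e_shnum at 0x3C
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum = struct.unpack_from("<HH", data, 0x3A)
    for index in range(e_shnum):
        # sh_type is the second 32-bit field of each section header
        (sh_type,) = struct.unpack_from("<I", data, e_shoff + index * e_shentsize + 4)
        if sh_type == SHT_SYMTAB:
            return True
    return False


# Usage (the path is a placeholder):
# with open("/path/to/binary", "rb") as fh:
#     print(has_symtab(fh.read()))
```

A dynamically linked binary usually keeps its dynamic symbol table even after stripping, so a False result here only says that the full symbol table is gone.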
IDA Pro FLIRT
The traditional technique, FLIRT (Fast Library Identification and Recognition Technology), uses signatures to identify library functions that were statically linked into a stripped binary.
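To make the idea of signature matching concrete, here is a toy sketch, not IDA's actual algorithm or file format. It assumes a signature is a hex string in which ".." marks a byte that may vary (for example a call displacement that changes at link time), and it compares the leading bytes of a function against each signature. The table SIGNATURES and the name hypothetical_wrapper are made up for illustration.

```python
def parse_pattern(pattern: str):
    """Turn '5548..' into a list of byte values, with None for '..' wildcards."""
    pairs = [pattern[i:i + 2] for i in range(0, len(pattern), 2)]
    return [None if pair == ".." else int(pair, 16) for pair in pairs]


def matches(code: bytes, pattern: str) -> bool:
    """Check whether the first bytes of code fit the pattern."""
    expected = parse_pattern(pattern)
    if len(code) < len(expected):
        return False
    return all(want is None or want == have for want, have in zip(expected, code))


# Hypothetical signature: push rbp; mov rbp, rsp; call <rel32>; pop rbp; ret
SIGNATURES = {"554889E5E8........5DC3": "hypothetical_wrapper"}


def identify(code: bytes):
    """Return the name of the first matching signature, or None."""
    for pattern, name in SIGNATURES.items():
        if matches(code, pattern):
            return name
    return None


# The same function linked at two different addresses still matches,
# because the call displacement is wildcarded.
print(identify(bytes.fromhex("554889E5E8112233445DC3")))
print(identify(bytes.fromhex("554889E5E8AABBCCDD5DC3")))
```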
Prepare Sig Files
In /path/to/IDA/ida_plugin/ there is a flair70.zip (for IDA 7.0; backup here). Unzip it to get the binaries we want to use (e.g. those for Linux), and remember to give them executable permission:
$ pwd
To prepare a signature database, first choose the materials (compiled static archives, like libc.a), then use pelf (for the Linux ELF format; pcf is for the Windows COFF format) and sigmake to generate a sig file:
# choose one small lib as an example, in *.a format rather than *.so
There are databases for matching already:
Usage
Press Shift + F5 and right-click to choose Apply new signature, or use File -> Load file -> FLIRT signature file to add a sig file.
After the sig is added, we can see matched results like:
To make a comparison, before and after:
With so many sig files around, how do we quickly find which one to use?
- Use file or strings to match version information first.
- Use lscan. Check this DEMO out:
$ pip install pyelftools pefile
$ git clone https://github.com/maroueneboubakri/lscan.git
$ cd lscan
$ python lscan.py -S amd64/sig -f stripped
No symbol table found bin binary
amd64/sig/libm-2.13.sig 6/445 (1.35%)
amd64/sig/libpthread-2.13.sig 18/319 (5.64%)
amd64/sig/libc-2.23.sig 447/2869 (15.58%)
amd64/sig/libc-2.22.sig 420/2859 (14.69%)
amd64/sig/libssl-1.0.2h.sig 0/665 (0.00%)
amd64/sig/libm-2.23.sig 5/600 (0.83%)
amd64/sig/libc-2.13.sig 133/3369 (3.95%)
amd64/sig/libm-2.22.sig 5/582 (0.86%)
amd64/sig/libpthread-2.22.sig 18/262 (6.87%)
amd64/sig/libcrypto-1.0.2h.sig 3375/5057 (66.74%)
amd64/sig/libpcre-8.38.sig 1/150 (0.67%)
amd64/sig/libpthread-2.23.sig 19/258 (7.36%)
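With dozens of sig files the listing gets long, so it may be convenient to sort it. The sketch below assumes the output format shown above (path, matched/total, percentage) and ranks the signature files by reported match rate. The function name rank_signatures and the SAMPLE excerpt are only for illustration.

```python
import re

# One result line looks like: amd64/sig/libc-2.23.sig 447/2869 (15.58%)
LINE_RE = re.compile(r"^(\S+\.sig)\s+(\d+)/(\d+)\s+\(([\d.]+)%\)$")


def rank_signatures(output: str):
    """Return (path, matched, total, percent) tuples, best match first."""
    results = []
    for line in output.splitlines():
        found = LINE_RE.match(line.strip())
        if found:
            path, matched, total, percent = found.groups()
            results.append((path, int(matched), int(total), float(percent)))
    return sorted(results, key=lambda item: item[3], reverse=True)


SAMPLE = """\
amd64/sig/libc-2.23.sig 447/2869 (15.58%)
amd64/sig/libc-2.22.sig 420/2859 (14.69%)
amd64/sig/libssl-1.0.2h.sig 0/665 (0.00%)
amd64/sig/libcrypto-1.0.2h.sig 3375/5057 (66.74%)
"""

for path, matched, total, percent in rank_signatures(SAMPLE):
    print(f"{percent:6.2f}%  {matched}/{total}  {path}")
```

In the demo above, libcrypto-1.0.2h.sig stands out at 66.74%, and libc-2.23.sig scores slightly higher than libc-2.22.sig, which gives a hint about the libc version.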
Finger
The Best by far
Finger is a tool for recognizing function symbols, developed by Alibaba Cloud · Cloud Security Technology Lab.
Let’s take a look at the before-and-after comparison:
Rizzo
Rizzo also generates signatures for matching, like FLIRT.
Unlike FLIRT, however, “Formal” signatures, “Fuzzy” signatures, String-based signatures and Immediate-based signatures can be generated separately, to suit scenarios that require different levels of accuracy.
Use Rizzo-IDA for IDA 7.4+ and IDA7-Rizzo for IDA 7.0; the original is easy to find in tacnetsol/ida.
To generate signatures for functions in your current IDB:
To apply generated signatures to your current IDB:
BinDiff and Diaphora
BinDiff and Diaphora are both binary code similarity detection tools that use hashing for signature matching.
In practice, what all of these tools need is a complete hash/signature database.
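The role of the database can be sketched in a few lines. This is a deliberately simplified illustration, assuming functions are compared by an exact SHA-256 of their bytes; real tools combine many fuzzier features, and all names below are hypothetical.

```python
import hashlib


def function_hash(code: bytes) -> str:
    """Exact-match fingerprint of a function body."""
    return hashlib.sha256(code).hexdigest()


def build_database(known: dict) -> dict:
    """Map fingerprint -> name for functions whose names are known."""
    return {function_hash(code): name for name, code in known.items()}


def match_functions(database: dict, unknown: dict) -> dict:
    """Map address -> recovered name for every unknown function found in the database."""
    recovered = {}
    for address, code in unknown.items():
        name = database.get(function_hash(code))
        if name is not None:
            recovered[address] = name
    return recovered


# Made-up byte strings standing in for function bodies
database = build_database({"lib_func_a": b"\x55\x48\x89\xe5\xc3", "lib_func_b": b"\x31\xc0\xc3"})
print(match_functions(database, {0x401000: b"\x31\xc0\xc3", 0x401010: b"\x90\x90\xc3"}))
```

The lookup is only as good as the database: a function that was never hashed into it, or that differs by a single byte, is not recovered by this exact scheme.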
BinaryAI
BinaryAI: The Neural Search Engine for Binaries.
It used to be an IDA Pro plugin for binary-source matching with fairly good performance, open-sourced at binaryai/sdk.
Researchers can install and register by following the BinaryAI Docs, which cover installation steps, usage, and a video demo.
BinaryAI now focuses on SCA. Users upload binary files and BinaryAI returns a detailed online report that includes basic file information, software composition analysis, string information and more, which helps users find a starting point for security analysis and work more efficiently.
Try it at the BinaryAI Binary Analysis Platform. See more docs at BinaryAI Documents.
SCA (Software Composition Analysis) Tools
Software composition analysis (SCA) is a process of identifying the third party and open source components in the applications of an organization. This analysis leads to the discovery of security risk, quality of code and license compliance of the components.
Karta
“Karta” (Russian for “Map”) is an IDA Python plugin that identifies and matches open-sourced libraries in a given binary.
See RTD (Read the Docs) for its documentation.
Installation:
$ pip install elementals sark
Usage:
File -> Script File... to choose scripts in https://github.com/CheckPointSW/Karta/tree/master/src
- thumbs_up/thumbs_up_firmware.py and thumbs_up/thumbs_up_ELF.py for analysis on ARM code/data segments (dealing with ARM/THUMB code transitions); needs the scikit-learn library
- karta_identifier.py identifies the existence of supported open source projects and fingerprints the exact library version, like:
Karta Identifier - printer_firmware.bin:
========================================
Identified Open Sources:
------------------------
libpng: 1.2.29
zlib: 1.2.3
OpenSSL: 1.0.1j
gSOAP: 2.7
mDNSResponder: unknown
Identified Closed Sources:
--------------------------
Treck: unknown
Missing Open Sources:
---------------------
OpenSSH: Was not found
net-snmp: Was not found
libxml2: Was not found
libtiff: Was not found
MAC-Telnet: Was not found
Final Note - Karta
------------------
If you encountered any bug, or wanted to add a new extension / feature, don't hesitate to contact us on GitHub:
https://github.com/CheckPointSW/Karta
- karta_manual_anchor.py for defining “manual anchors”:
$ python karta_manual_anchor.py --help
usage: karta_manual_anchor.py [-h] [-D] [-W] bin lib-name lib-version configs
Enables the user to manually defined matches, acting as manual anchors, later
to be used by Karta's Matcher.
positional arguments:
bin path to the disassembler's database for the wanted binary
lib-name name (case sensitive) of the relevant open source library
lib-version version string (case sensitive) as used by the identifier
configs path to the *.json "configs" directory
optional arguments:
-h, --help show this help message and exit
-D, --debug set logging level to logging.DEBUG
-W, --windows signals that the binary was compiled for Windows
- other scripts: karta_manual_identifier.py, karta_matcher.py, …
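The identifier report shown above is plain text, so feeding it into other tooling needs a small parser. The sketch below assumes exactly the layout of that sample (a heading ending in "Sources:", a dashed rule, then "name: value" lines). The function parse_karta_report is hypothetical and not part of Karta.

```python
def parse_karta_report(text: str) -> dict:
    """Group 'name: value' lines of an identifier report under their section heading."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or set(line) <= {"-", "="}:
            continue  # blank lines and ruler lines carry no data
        if line.endswith("Sources:"):
            current = line[:-1]
            sections[current] = {}
        elif line.startswith("Final Note"):
            current = None  # the trailing note is not a component list
        elif current is not None and ": " in line:
            name, value = line.split(": ", 1)
            sections[current][name] = value
    return sections


SAMPLE = """\
Karta Identifier - printer_firmware.bin:
========================================
Identified Open Sources:
------------------------
libpng: 1.2.29
zlib: 1.2.3
Identified Closed Sources:
--------------------------
Treck: unknown
Missing Open Sources:
---------------------
OpenSSH: Was not found
Final Note - Karta
------------------
"""

report = parse_karta_report(SAMPLE)
print(report["Identified Open Sources"])
```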
OpenSCA-Cli
OpenSCA is intended for scanning the third-party component dependencies and vulnerabilities.
Download it from the GitHub releases or Gitee releases page.
For detecting the component information only:
$ opensca-cli -path ${project_path}
For connecting to the cloud platform (visit opensca.xmirror.cn for token first):
$ opensca-cli -url ${url} -token ${token} -path ${project_path}
Or for using the local vulnerability database:
$ opensca-cli -db db.json -path ${project_path}
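When the scan is driven from a script, the three invocations above can be assembled by one helper. This sketch only builds the argument list, using just the flags shown above (-path, -url, -token, -db); the helper name opensca_command is hypothetical, and running it requires opensca-cli to be on PATH.

```python
def opensca_command(path, url=None, token=None, db=None):
    """Build the opensca-cli argument list for one of the three modes shown above."""
    command = ["opensca-cli"]
    if url and token:
        command += ["-url", url, "-token", token]  # cloud platform mode
    if db:
        command += ["-db", db]  # local vulnerability database mode
    command += ["-path", path]
    return command


print(opensca_command("/path/to/project"))
print(opensca_command("/path/to/project", db="db.json"))

# To actually run a scan:
# import subprocess
# subprocess.run(opensca_command("/path/to/project"), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the project path contains spaces.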
Pigaios
Pigaios (‘πηγαίος’, Greek for ‘source’ as in ‘source code’) is a tool for diffing/matching source codes directly against binaries.
It can match binaries against source code regardless of whether the source is compilable, using the Python Clang bindings for matching, and it can import the resulting symbols into IDA. However, if the source can be compiled easily, Diaphora might be preferred.
Install requirements first (my env: kali-rolling, python 3.10, LLVM-14, IDA 7.7):
$ sudo apt-get install clang python3-clang libclang-dev python-colorama python-sklearn
Taking Zlib 1.2.13 as an example, first generate the code signature database.
Step 1: generate project file sbd.project
$ wget https://zlib.net/zlib-1.2.13.tar.gz
Step 2: generate sqlite database zlib-1.2.13.sqlite
$ python /path/to/srcbindiff.py --no-parallel -export # parallel also works
We can open this sqlite db file:
$ sqlite3 zlib-1.2.13.sqlite
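The same inspection can be done from Python with the standard sqlite3 module. The sketch below makes no assumption about the schema that Pigaios writes: it just lists whatever tables exist and how many rows each holds. The function name list_tables is hypothetical.

```python
import sqlite3


def list_tables(db_path: str) -> dict:
    """Return {table_name: row_count} for every table in the database."""
    connection = sqlite3.connect(db_path)
    try:
        names = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return {
            name: connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for (name,) in names
        }
    finally:
        connection.close()


# Usage, with the database generated in step 2:
# print(list_tables("zlib-1.2.13.sqlite"))
```

Note that sqlite3.connect creates an empty database if the path does not exist, so an empty result may simply mean the file name is wrong.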
Finally, use IDA to match the binary against the source. Take busybox installed from the kali-rolling apt repo as an example, as it uses functions from zlib.
File -> Script File... to choose the script sourceimp_ida.py, choose the right sqlite file, and we can see the result
We can even visually diff the pseudo-code of functions, like function gzopen
We can also see the match reasons
CycloneDX + Dependency-Track, not for Binary but Open source
Nowadays SCA tools are mostly commercial rather than free and open source, but we can get support from the community and OWASP.
As companies develop in multiple languages, SCA tools should cover languages such as Java, Golang, Python and NodeJS.
By analysing files like pom.xml, go.mod, requirements.txt and yarn.lock and extracting the name/version of third-party components from key parameters, we can achieve software composition analysis.
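As a rough illustration of that extraction step, the sketch below handles two of the manifest types in their simplest form: exactly pinned lines ("name==version") in requirements.txt, and require directives in go.mod. It is a simplification under those assumptions, not a complete parser for either format, and the function names are hypothetical.

```python
import re

PINNED_RE = re.compile(r"([A-Za-z0-9._-]+)==([^\s;]+)")


def parse_requirements(text: str) -> dict:
    """Collect name -> version for exactly pinned lines of a requirements.txt."""
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        pinned = PINNED_RE.fullmatch(line)
        if pinned:
            found[pinned.group(1)] = pinned.group(2)
    return found


def parse_go_mod(text: str) -> dict:
    """Collect module -> version from require directives of a go.mod."""
    found = {}
    in_block = False
    for line in text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop comments such as '// indirect'
        if line == "require (":
            in_block = True
        elif line == ")":
            in_block = False
        elif in_block or line.startswith("require "):
            if line.startswith("require "):
                line = line[len("require "):]
            parts = line.split()
            if len(parts) == 2:
                found[parts[0]] = parts[1]
    return found


REQUIREMENTS = "requests==2.31.0\n# a comment\nflask>=2.0\nnumpy==1.26.4  # pinned\n"
GO_MOD = (
    "module example.com/app\n\ngo 1.21\n\n"
    "require (\n\tgithub.com/gin-gonic/gin v1.9.1\n\tgolang.org/x/text v0.14.0 // indirect\n)\n\n"
    "require github.com/pkg/errors v0.9.1\n"
)

print(parse_requirements(REQUIREMENTS))
print(parse_go_mod(GO_MOD))
```

Version ranges such as flask>=2.0 are skipped here because no single version can be read from them; a real tool would resolve them against a lock file.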
SCA tools should also be easy to embed in an existing DevOps process, and they need a continuously updated vulnerability database, built by crawling sources such as OSSIndex, NVD, NPM and CPE.
CycloneDX and Dependency-Track are both from OWASP.
OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. In a word, it creates a BOM for projects.
OWASP Dependency-Track is a continuous SBOM analysis platform that allows organizations to identify and reduce risk in the software supply chain. It accepts a BOM and performs vulnerability analysis.
CycloneDX
CycloneDX has its own GitHub repos and various plugins for projects in different languages, such as CycloneDX Maven Plugin for Java and cyclonedx-gomod for Golang. Usage can be found in each README, and every plugin outputs an SBOM for the project.
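For a feel of what such an SBOM contains, the sketch below builds a minimal CycloneDX-style JSON document by hand. It assumes specVersion 1.4 and only fills the top-level fields plus a flat component list; the official plugins emit far more metadata, and the helper make_bom and the component list are made up for illustration.

```python
import json


def make_bom(components):
    """Build a minimal CycloneDX-style BOM dict from (name, version) pairs."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",  # assumed version for this sketch
        "version": 1,
        "components": [
            {"type": "library", "name": name, "version": version}
            for name, version in components
        ],
    }


bom = make_bom([("zlib", "1.2.13"), ("openssl", "1.0.2h")])
print(json.dumps(bom, indent=2))
```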
Dependency-Track
Dependency-Track also has its own GitHub repos, and we can set it up according to the docs. Both Docker container and executable war deployments are available; here I run the jar bundle.
First download it from the releases page. dependency-track-apiserver.jar has to be used together with a frontend, so I use dependency-track-bundled.jar, which can be used independently.
$ java -jar dependency-track-bundled.jar
After logging in, set up a project for testing.
Upload a BOM in the Components tab to analyze; here we upload the bom.json from the releases page.
Finally we get the SCA result.
It includes all recognized components and indexes their potential vulnerabilities.
More
For DevOps, Jenkins for example has a Dependency-Track plugin, and adding CycloneDX commands to the build routine generates an SBOM every time you build a release.
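Where no plugin is available, the BOM upload can also be scripted against the Dependency-Track REST API. The sketch below only constructs the request and does not send it. It assumes the API accepts a PUT to /api/v1/bom with a JSON body holding the project UUID and the base64-encoded BOM, authenticated by an X-Api-Key header; the base URL, key and UUID are placeholders, and the endpoint details should be checked against the documentation of your server version.

```python
import base64
import json
import urllib.request


def build_bom_upload(base_url, api_key, project_uuid, bom_bytes):
    """Prepare (but do not send) a BOM upload request for Dependency-Track."""
    payload = {
        "project": project_uuid,
        "bom": base64.b64encode(bom_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/bom",  # assumed endpoint, see lead-in
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="PUT",
    )


request = build_bom_upload(
    "http://localhost:8080",       # placeholder server address
    "REPLACE_WITH_API_KEY",        # placeholder API key
    "REPLACE_WITH_PROJECT_UUID",   # placeholder project UUID
    b'{"bomFormat": "CycloneDX"}',
)
print(request.get_method(), request.full_url)

# To send it for real:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```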
Dependency-Check is also a Software Composition Analysis (SCA) tool that attempts to detect publicly disclosed vulnerabilities contained within a project’s dependencies. It does this by determining if there is a Common Platform Enumeration (CPE) identifier for a given dependency. If found, it will generate a report linking to the associated CVE entries.
It also has a Jenkins plugin.
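The CPE identifiers mentioned above are structured strings, which is what makes the CVE lookup possible. The sketch below splits a CPE 2.3 formatted string into its named fields and does a naive vendor/product/version comparison. It ignores escaped colons and is not how Dependency-Check itself gathers evidence; the function names and the example dependency are illustrative.

```python
CPE_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its eleven named fields."""
    parts = cpe.split(":")  # naive: does not handle escaped ':' inside values
    if len(parts) != 13 or parts[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


def cpe_matches(vendor: str, product: str, version: str, cpe: str) -> bool:
    """True if the dependency fits the CPE; '*' in the CPE version matches anything."""
    fields = parse_cpe23(cpe)
    return (
        fields["vendor"] == vendor
        and fields["product"] == product
        and fields["version"] in ("*", version)
    )


CPE = "cpe:2.3:a:openssl:openssl:1.0.1j:*:*:*:*:*:*:*"
print(parse_cpe23(CPE)["product"], parse_cpe23(CPE)["version"])
print(cpe_matches("openssl", "openssl", "1.0.1j", CPE))
```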
Snyk is another SCA scanner, which provides more accurate results than Dependency-Check, but it is not open source and needs token authentication.