package bap-std


Overview

Layered Architecture

The BAP library has a layered architecture consisting of four layers. Although the layers are not really observable from outside the library, they make the library easier to learn, as they introduce new concepts sequentially. On top of these layers sits the Project module, which consolidates all information about a target of analysis. The Project module may be viewed as the entry point to the library.

        +-----------------------------------------------------+
        | +--------+   +-----------------------------------+  |
        | |        |   |                                   |  |
        | |        |   |       Foundation Library          |  |
        | |        |   |                                   |  |
        | |        |   +-----------------------------------+  |
        | |   P    |                                          |
        | |        |   +-----------------------------------+  |
        | |   R    |   |                                   |  |
        | |        |   |          Memory Model             |  |
        | |   O    |   |                                   |  |
        | |        |   +-----------------------------------+  |
        | |   J    |                                          |
        | |        |   +-----------------------------------+  |
        | |   E    |   |                                   |  |
        | |        |   |           Disassembly             |  |
        | |   C    |   |                                   |  |
        | |        |   +-----------------------------------+  |
        | |   T    |                                          |
        | |        |   +-----------------------------------+  |
        | |        |   |                                   |  |
        | |        |   |        Semantic Analysis          |  |
        | |        |   |                                   |  |
        | +--------+   +-----------------------------------+  |
        +-----------------------------------------------------+

The Foundation library defines the BAP Instruction Language (BIL) data types, as well as other useful data structures, like Value, Trie, Vector, etc. The Memory model layer is responsible for loading and parsing binary objects and representing them in memory. It also defines a few useful data structures that are used extensively by later layers, e.g., Table and Memmap. The next layer performs disassembly and lifting to BIL. Finally, the semantic analysis layer transforms a binary into an intermediate representation (IR) that is suitable for writing analyses.

Plugin Architecture

The standard library tries to be as extensible as possible. We are aware that some problems have no good solutions, so we don't want to force our way of doing things. In short, we're trying to provide mechanisms, not policies. We achieve this by employing the dependency injection principle. By inverting the dependency we allow the library to depend on user code. For example, user code can teach the library how to disassemble the binary or even how to reconstruct the CFG. In fact, the library by itself doesn't contain a disassembler, a lifter, or any architecture-specific code. Everything is injected later by the corresponding plugins.

The library defines a fixed set of extension points. (Other libraries that constitute the Platform and follow the same principle can define their own extension points, so the following set is not complete):

The Regular.Std library, which forms the foundation of the BAP Standard Library, also follows the dependency injection principle, so every data type that implements the regular interface can be dynamically extended with:

  • pretty printing function;
  • serialization subroutines;
  • caching.
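The dependency injection principle described in this section can be illustrated with a toy model written against the OCaml standard library only. This is a sketch, not BAP's actual plugin machinery: the names decoder, register_decoder, and disassemble are hypothetical and exist only for this illustration. The point is that the "library" part never mentions a concrete decoder; it only consults what user code has injected.

```ocaml
(* Library side: a hypothetical extension point keyed by architecture name.
   None of these names belong to the BAP API. *)
type decoder = bytes -> string list

let decoders : (string, decoder) Hashtbl.t = Hashtbl.create 8

(* Plugins call this to inject their implementation. *)
let register_decoder ~arch decode = Hashtbl.replace decoders arch decode

(* Library code depends only on whatever has been injected. *)
let disassemble ~arch data =
  match Hashtbl.find_opt decoders arch with
  | None -> Error (Printf.sprintf "no decoder registered for %s" arch)
  | Some decode -> Ok (decode data)

(* Plugin side: user code teaches the library how to decode a made-up
   architecture in which every byte is one instruction. *)
let () =
  register_decoder ~arch:"toy" (fun data ->
      Bytes.to_seq data
      |> Seq.map (fun c -> Printf.sprintf "op_%02x" (Char.code c))
      |> List.of_seq)
```

Calling disassemble for an architecture without a registered decoder returns an Error value instead of failing, which mirrors the idea that the library itself carries no architecture-specific code.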

Writing the analysis

A common use case is to write an analysis that takes the program in some representation and then either outputs the result in a human- or machine-readable way, or transforms the program in a way that can be employed by other analyses. Following the naming convention of the more established community of compiler writers, we call such an analysis a _pass_.

The library itself doesn't run any analysis; running it is part of the job of a frontend. In particular, the bap frontend runs analyses based on a command line specification. See bap --help for more information.

We use the Project data structure to represent a program and all associated knowledge that we were able to infer. To learn how to use the project data structure, continue to Working with project.

Foundation Library

At this layer we define BIL (Binary Instruction Language) and a few other useful data structures:

  • arch - describes computer architecture;
  • size - word and register sizes;
  • var - BIL variable;
  • typ - BIL type system;
  • exp - BIL expression sub-language;
  • stmt - BIL statements;
  • bitvector - a bitvector data structure to represent immediate data, usually used via its aliases word and addr;
  • value - an extensible variant type;
  • dict - an extensible record;
  • vector - an array that can grow;
  • Trie - prefix trees.

Most of the types implement the Regular interface. This interface is very similar to Core's Identifiable, and is supposed to represent a type that is as common as a built-in type. One should expect to find any function that is implemented for such types as int, string, char, etc. Namely, this interface includes:

  • comparison functions: (<, >, <=, >=, compare, between, ...);
  • each type defines a polymorphic Map with keys of type t;
  • each type provides a Set with values of type t;
  • hashtable is exposed via Table module;
  • hashset is available under the Hash_set name;
  • sexpable and binable interface;
  • to_string, str, pp, ppo, pps functions for pretty-printing.

It is a convention that for each type there is a module with the same name that implements its interface. For example, type exp is a type abbreviation for Exp.t, and module Exp contains all functions and types related to type exp. For instance, to create a hashtable keyed by expressions, just type:

let table = Exp.Table.create ()

If a type is a variant type (i.e., it defines constructors) then for each constructor named Name, there exists a corresponding function named name (also called a _functional constructor_) that accepts the same number of arguments as the arity of the constructor. For example, a Bil.Int can be constructed with the Bil.int function, which has type word -> exp. If a constructor has several arguments of the same type we usually disambiguate using labels, e.g., Bil.Load of (exp,exp,endian,size) has the function Bil.load with type: mem:exp -> addr:exp -> endian -> size -> exp.
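The functional constructor convention can be shown on a small stand-in type. The sketch below does not use Bap.Std: the types exp, endian, and size are simplified toy definitions (for instance, Int carries a plain int rather than a word), chosen only so that the example compiles with the standard library alone.

```ocaml
(* Toy stand-ins for BIL types; the real definitions live in Bil. *)
type endian = LittleEndian | BigEndian
type size = [ `r8 | `r16 | `r32 | `r64 ]

type exp =
  | Int of int
  | Var of string
  | Load of exp * exp * endian * size

(* One lowercase function per constructor, with the same arity. *)
let int w = Int w
let var name = Var name

(* Two arguments share the type exp, so labels disambiguate them. *)
let load ~mem ~addr endian size = Load (mem, addr, endian, size)

(* Labeled arguments make the call site self-describing. *)
let e = load ~mem:(var "mem") ~addr:(int 0x1000) LittleEndian `r32
```

Because functional constructors are ordinary functions, they can be partially applied and passed to higher-order functions, which data constructors cannot.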

Value

Universal values can be viewed as extensible variants on steroids. Not only can they be extended, but they can also be serialized, compared with a user-defined comparison function, and even pretty printed.

Dict

Just as value is an extensible sum type, dict can be viewed as an extensible product type. Dict is a sequence of values of type value, with tags used as field names. Of course, fields are unique.
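To make the sum/product analogy concrete, here is a minimal sketch of a universal type and a dictionary over it, built on OCaml's extensible variants. It is an illustration under simplifying assumptions, not the implementation of Value or Dict: the functions register, create, get, set, and find are defined locally for this example, and serialization, comparison, and printing are left out.

```ocaml
(* A minimal universal type built on extensible variants. *)
type value = ..

type 'a tag = {
  name : string;
  inject : 'a -> value;
  project : value -> 'a option;
}

(* Each call extends [value] with a fresh constructor known only to the tag. *)
let register (type a) name : a tag =
  let module M = struct type value += Case of a end in
  { name;
    inject = (fun x -> M.Case x);
    project = (function M.Case x -> Some x | _ -> None) }

let create tag x = tag.inject x
let get tag v = tag.project v

(* A dict modeled as a list of values, with tags acting as field names. *)
type dict = value list

(* Setting a field drops any previous value of the same tag, so fields
   stay unique. *)
let set tag x (d : dict) : dict =
  tag.inject x :: List.filter (fun v -> Option.is_none (tag.project v)) d

let find tag (d : dict) = List.find_map tag.project d

let color : string tag = register "color"
let weight : float tag = register "weight"
let d = [] |> set color "red" |> set weight 0.5 |> set color "blue"
```

Here find color d yields the most recent binding, and get applied with the wrong tag yields None, which is what makes the lookup type safe.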

Vector

Vector is an implementation of C++ STL-like vectors with logarithmic push back.

Tries

The Foundation library also defines a prefix tree data structure that proves to be useful for binary analysis applications. Trie in BAP is a functor that derives a polymorphic trie data structure for a given Key.

For convenience we support instantiating tries for most of our data structures. For example, Word has several tries inside.

For the common string trie, there's Trie.String.
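The idea of deriving a trie from a key description can be sketched with a small functor. This is a toy model under stated assumptions: the Key signature below (a key that can be split into a list of tokens) and the module names Make_trie and String_trie are hypothetical and simpler than the interface BAP's Trie actually expects.

```ocaml
(* Hypothetical key signature: a key is a sequence of hashable tokens. *)
module type Key = sig
  type t
  type token
  val tokens : t -> token list
end

module Make_trie (K : Key) = struct
  (* Each node optionally stores data and maps a token to a child node. *)
  type 'a t = { mutable data : 'a option; kids : (K.token, 'a t) Hashtbl.t }

  let create () = { data = None; kids = Hashtbl.create 4 }

  (* Walk down the trie along the key, creating missing nodes. *)
  let add trie ~key ~data =
    let step node tok =
      match Hashtbl.find_opt node.kids tok with
      | Some kid -> kid
      | None ->
        let kid = create () in
        Hashtbl.add node.kids tok kid;
        kid in
    let node = List.fold_left step trie (K.tokens key) in
    node.data <- Some data

  (* Data attached to the longest stored prefix of [key], if any. *)
  let longest_match trie key =
    let rec walk node best = function
      | [] -> best
      | tok :: rest ->
        match Hashtbl.find_opt node.kids tok with
        | None -> best
        | Some kid ->
          let best = match kid.data with None -> best | Some _ as d -> d in
          walk kid best rest in
    walk trie trie.data (K.tokens key)
end

(* Instantiate the functor for strings tokenized into characters. *)
module String_trie = Make_trie (struct
    type t = string
    type token = char
    let tokens s = List.of_seq (String.to_seq s)
  end)

let t : string String_trie.t = String_trie.create ()
let () =
  String_trie.add t ~key:"\x55" ~data:"push";
  String_trie.add t ~key:"\x55\x48" ~data:"prologue"
```

Prefix queries of this kind are what make tries handy for binary analysis, for example when matching byte sequences against known patterns.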

Memory model

This layer is responsible for the representation of binaries. It provides interfaces for the memory objects:

  • mem - a contiguous array of bytes, indexed with absolute addresses;
  • 'a table - a mapping from memory regions to arbitrary data (no duplicates or intersections);
  • 'a memmap - a mapping from memory regions to arbitrary data with duplicates and intersections allowed, aka segment tree or interval map;
  • image - represents a binary object with all its symbols, segments, sections and other meta information.
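The difference between the table and memmap contracts listed above can be sketched with plain lists. This is only a model of the semantics: integers stand in for addresses, the names region, table_add, memmap_add, and lookup are hypothetical, and a linear scan replaces the balanced trees a real implementation would use.

```ocaml
(* A region is a closed address interval [lo, hi]; ints stand in for addr. *)
type region = { lo : int; hi : int }

let contains r a = r.lo <= a && a <= r.hi
let intersects r s = r.lo <= s.hi && s.lo <= r.hi

(* Table-like: a region that intersects an existing one is rejected. *)
let table_add tab r x =
  if List.exists (fun (s, _) -> intersects r s) tab
  then Error "region intersects an existing one"
  else Ok ((r, x) :: tab)

(* Memmap-like: intersections and duplicates are accepted. *)
let memmap_add map r x = (r, x) :: map

(* All values whose region covers the address, most recently added first. *)
let lookup map a =
  List.filter_map (fun (r, x) -> if contains r a then Some x else None) map

let text = { lo = 0x1000; hi = 0x1fff }
let func = { lo = 0x1100; hi = 0x11ff }
let map = memmap_add (memmap_add [] text ".text") func "main"
```

With the memmap-like structure a single address can carry several annotations at once (here a section and a symbol), whereas the table-like structure guarantees at most one.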

The Image module uses the plugin system to load binary objects. In order to add a new loader, one should implement the Backend.t loader function and register it with the Image.register_backend function.

Disassembler

This layer defines the interfaces for disassemblers. Two interfaces are provided:

  • Disasm - a regular interface that hides all complexities, but may not always be very flexible.
  • Disasm_expert - an expert interface that provides access to a low-level representation. It is very flexible and fast, but harder to use.

To disassemble files or data with the regular interface, use one of the following functions:

All these functions perform disassembly by recursive descent, reconstruct the control flow graph, and perform lifting.

The result of disassembly is represented by an abstract value of type disasm. The two main data structures that are used to represent a disassembled program are:

  • insn - a machine instruction;
  • block - a basic block, i.e., a linear sequence of instructions.

The following figure shows the relationship between basic data structures of the disassembled program.

        +-----------------+
        | +-------------+ |
        | |   disasm    | |
        | +-------------+ |
        |        |        |
        |        | *      |
        | +-------------+ |
        | |    block    | |
        | +-------------+ |
        |        |        |
        |        | *      |
        | +-------------+ |
        | |     insn    | |
        | +-------------+ |
        |        |        |
        |        | *      |
        | +-------------+ |
        | |     stmt    | |
        | +-------------+ |
        +-----------------+

A disassembled program is represented as a set of interconnected basic blocks, called the whole program control flow graph (CFG). It is indeed represented as a graph, Graphs.Cfg. See graphlib for more information on graphs.

Each block is a container for a sequence of machine instructions. It is guaranteed that there's at least one instruction in the block, thus the Block.leader and Block.terminator functions are total.

Each machine instruction is represented by its opcode, name and array of operands (these are machine and disassembler specific), a set of predicates (that approximates instruction semantics on a very high level), and a sequence of BIL statements that precisely define the semantics of the instruction.

The expert interface is a low-level interface that provides facilities for building custom implementations of disassemblers. The interface to the disassembler backend is exposed via the Disasm_expert.Basic module. New backends can be added by implementing the 'disasm.hpp' interface.

Modules of type CPU provide a high level abstraction of the machine CPU and allow one to reason about the instruction semantics independently from the target platform. The module type Target brings CPU and ABI together. To get an instance of this module, you can use the target_of_arch function. Architecture specific implementations of the Target interface may (and usually do) provide more information, see corresponding support libraries for ARM and x86 architectures.

Semantic Analysis

At the semantic level the disassembled program is lifted into the BAP Intermediate Representation (BIR). BIR is a semi-graphical representation of BIL (where BIL represents a program as an Abstract Syntax Tree). BIR provides mechanisms to express richer relationships between program terms, and it is also easier to use for most use cases, especially for data dependency analysis.

A program in IR is built of terms. In fact, the program itself is also a term. There are only 7 kinds of terms:

  • program - the program in whole;
  • sub - subroutine;
  • arg - subroutine argument;
  • blk - basic block;
  • def - definition of a variable;
  • phi - phi-node in the SSA form;
  • jmp - a transfer of control.

Unlike expressions and statements in BIL, IR terms are concrete entities. A concrete entity is an entity that can change in time and space, as well as come in and out of existence. By contrast, an abstract entity is eternal and unchangeable. Identity denotes the sameness of a concrete entity as it changes in time. Abstract entities don't have an identity since they are immutable.

A program is built of concrete entities called terms. Terms have attributes that can change in time without affecting the identity of a term. Attributes are abstract entities. At each particular point of space and time a term is represented by a snapshot of all its attributes, colloquially called a value. Functions that change the value of a term in fact return a new value with a different set of attributes.

For example, a def term has two attributes: the left hand side (lhs), which associates the definition with an abstract variable, and the right hand side (rhs), which associates the def with an abstract expression. Suppose that the definition was:

# let d_1 = Def.create x Bil.(var y + var z);;
val d_1 : Def.t = 00000001: x := y + z

To change the right hand side of a definition we use Def.with_rhs, which returns the same definition but with a different value:

# let d_2 = Def.with_rhs d_1 Bil.(int Word.b1);;
val d_2 : Def.t = 00000001: x := true

d_1 and d_2 are different values

# Def.equal d_1 d_2;;
- : bool = false

of the same term

# Term.same d_1 d_2;;
- : bool = true

The identity of these terms is denoted by the term identifier (tid). In the textual representation term identifiers are printed as ordinal numbers.
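The distinction between the value of a term and its identity can be modeled in a few lines of standard OCaml. This is a toy model, not the Def or Term modules: the record type, the integer tid, and the functions create, with_rhs, equal, and same are simplified stand-ins defined here, with strings in place of variables and expressions.

```ocaml
(* Toy term: the tid gives identity, the remaining fields are attributes. *)
type def = { tid : int; lhs : string; rhs : string }

(* Every newly created term gets a fresh identifier. *)
let fresh_tid =
  let counter = ref 0 in
  fun () -> incr counter; !counter

let create lhs rhs = { tid = fresh_tid (); lhs; rhs }

(* Functional update: a new value of the same term, the tid is kept. *)
let with_rhs d rhs = { d with rhs }

(* Value equality looks at the whole snapshot. *)
let equal (d1 : def) d2 = d1 = d2

(* Identity looks only at the term identifier. *)
let same d1 d2 = d1.tid = d2.tid

let d_1 = create "x" "y + z"
let d_2 = with_rhs d_1 "true"
```

In this model d_1 and d_2 differ under equal but agree under same, while a second call to create with identical attributes produces a term that is not the same as d_1.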

Terms can contain other terms. But unlike BIL expressions or statements, this relation is not truly recursive, since the structure of a program term is fixed: arg, phi, def, and jmp are leaf terms; sub can only contain arg's or blk's; blk consists of phi, def and jmp sequences of terms, as pictured in the figure below. Although the term structure is closed to changes, you can still extend a particular term with attributes, using the set_attr and get_attr functions of the Term module. These functions use an extensible variant type to encode attributes.

        +--------------------------------------------------------+
        |                +-------------------+                   |
        |                |      program      |                   |
        |                +---------+---------+                   |
        |                          |*                            |
        |                +---------+---------+                   |
        |                |        sub        |                   |
        |                +---------+---------+                   |
        |                          |                             |
        |        +-----------------+---------------+             |
        |        |*                                |*            |
        |  +-----+-------+                 +-------+-------+     |
        |  |    arg      |                 |      blk      |     |
        |  +-------------+                 +-------+-------+     |
        |                                          |             |
        |           +---------------+--------------+             |
        |           |*              |*             | *           |
        |     +-----+-----+   +-----+-----+   +----+-----+       |
        |     |    phi    |   |    def    |   |   jmp    |       |
        |     +-----------+   +-----------+   +----------+       |
        +--------------------------------------------------------+

Working with project

There are two general approaches to obtaining a value of type project:

  • create it manually using the Project.create function;
  • write a plugin to the bap frontend.

Although the first approach is simple and gives you full control, we still recommend using the latter.

To write a program analysis plugin (or pass in short) you need to implement a function with one of the following interfaces:

Once loaded from the bap frontend (see bap --help) this function will be invoked with a value of type project that provides access to all information gathered from the input source. If the registered function returns a non-unit type, then it can functionally update the project state, e.g., add annotations, discover new symbols, transform the program representation, etc.

Example

The following plugin prints all sections in a file:

open Core_kernel[@@warning "-D"]
open Bap.Std
open Format

let print_sections p =
  Project.memory p |> Memmap.to_sequence |> Seq.iter ~f:(fun (mem,x) ->
      Option.iter (Value.get Image.section x) ~f:(fun name ->
          printf "Section: %s@.%a@." name Memory.pp mem))

let () = Project.register_pass' print_sections

Note: this functionality is provided by the print plugin.

Passing information between passes

To pass data from one pass to another in a type-safe manner, we use universal values. Values can be attached to a particular memory region, to IR terms, or put into the storage dictionary. For the first case we use the memmap data structure. It is an interval tree containing all the memory regions that are used during analysis. For the storage we use the Dict data structure. Also, each program term has its own dictionary.

Memory annotations

By default the memory is annotated with the following attributes:

  • section -- for regions of memory that had a particular name in the original binary. For example, in ELF, sections have names that annotate a corresponding memory region. If the project was created from a memory object, then the overall memory will be marked as a "bap.user" section.
  • segment -- if the binary data was loaded from a binary format that contains segments, then the corresponding memory regions are marked. Segments provide access to permission information.

BAP API

module Integer : sig ... end

Abstract integral type.

module Seq : module type of Regular.Std.Seq with type 'a t = 'a Base.Sequence.t

Lazy sequence

type 'a seq = 'a Seq.t

type abbreviation for 'a Sequence.t

val compare_seq : ('a -> 'a -> int) -> 'a seq -> 'a seq -> int
val sexp_of_seq : ('a -> Ppx_sexp_conv_lib.Sexp.t) -> 'a seq -> Ppx_sexp_conv_lib.Sexp.t
val seq_of_sexp : (Ppx_sexp_conv_lib.Sexp.t -> 'a) -> Ppx_sexp_conv_lib.Sexp.t -> 'a seq
module Trie : sig ... end

Constructs a trie

module Interval_tree : sig ... end

Balanced Interval Tree.

type value
val bin_shape_value : Core_kernel.Bin_prot.Shape.t
val __bin_read_value__ : (int -> value) Core_kernel.Bin_prot.Read.reader
val compare_value : value -> value -> int
val sexp_of_value : value -> Ppx_sexp_conv_lib.Sexp.t
val value_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> value
type dict
val bin_shape_dict : Core_kernel.Bin_prot.Shape.t
val __bin_read_dict__ : (int -> dict) Core_kernel.Bin_prot.Read.reader
val compare_dict : dict -> dict -> int
val sexp_of_dict : dict -> Ppx_sexp_conv_lib.Sexp.t
val dict_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> dict
type word

Type to represent machine word

val bin_shape_word : Core_kernel.Bin_prot.Shape.t
val __bin_read_word__ : (int -> word) Core_kernel.Bin_prot.Read.reader
val compare_word : word -> word -> int
val sexp_of_word : word -> Ppx_sexp_conv_lib.Sexp.t
val word_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> word
type addr = word

A synonym for word, that should be used for words that are addresses

val bin_shape_addr : Core_kernel.Bin_prot.Shape.t
val __bin_read_addr__ : (int -> addr) Core_kernel.Bin_prot.Read.reader
val compare_addr : addr -> addr -> int
val sexp_of_addr : addr -> Ppx_sexp_conv_lib.Sexp.t
val addr_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> addr
module Size : sig ... end

Type safe operand and register sizes.

type size = Size.t

size of operand

val bin_shape_size : Core_kernel.Bin_prot.Shape.t
val __bin_read_size__ : (int -> size) Core_kernel.Bin_prot.Read.reader
val compare_size : size -> size -> int
val sexp_of_size : size -> Ppx_sexp_conv_lib.Sexp.t
val size_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> size
type addr_size = [ `r32 | `r64 ] Size.p

size of address

val bin_shape_addr_size : Core_kernel.Bin_prot.Shape.t
val __bin_read_addr_size__ : (int -> addr_size) Core_kernel.Bin_prot.Read.reader
val compare_addr_size : addr_size -> addr_size -> int
val sexp_of_addr_size : addr_size -> Ppx_sexp_conv_lib.Sexp.t
val addr_size_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> addr_size
module Bitvector : sig ... end

Bitvector -- an integer with modular arithmetic.

type endian = Bitvector.endian =
  | LittleEndian
  | BigEndian

Expose endian constructors to Bap.Std namespace

val sexp_of_endian : endian -> Ppx_sexp_conv_lib.Sexp.t
val endian_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> endian
val bin_shape_endian : Core_kernel.Bin_prot.Shape.t
val __bin_read_endian__ : (int -> endian) Core_kernel.Bin_prot.Read.reader
val compare_endian : endian -> endian -> int
module Word : module type of Bitvector with type t = word and type endian = endian and type comparator_witness = Bitvector.comparator_witness

Shortcut for bitvectors that represent words

module Addr : sig ... end

Shortcut for bitvectors that represent addresses

module Bil : sig ... end

Main BIL module.

type typ = Bil.typ
val bin_shape_typ : Core_kernel.Bin_prot.Shape.t
val __bin_read_typ__ : (int -> typ) Core_kernel.Bin_prot.Read.reader
val compare_typ : typ -> typ -> int
val sexp_of_typ : typ -> Ppx_sexp_conv_lib.Sexp.t
val typ_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> typ
type var = Bil.var
val bin_shape_var : Core_kernel.Bin_prot.Shape.t
val __bin_read_var__ : (int -> var) Core_kernel.Bin_prot.Read.reader
val compare_var : var -> var -> int
val sexp_of_var : var -> Ppx_sexp_conv_lib.Sexp.t
val var_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> var
type bil = Bil.t
val bin_shape_bil : Core_kernel.Bin_prot.Shape.t
val __bin_read_bil__ : (int -> bil) Core_kernel.Bin_prot.Read.reader
val compare_bil : bil -> bil -> int
val sexp_of_bil : bil -> Ppx_sexp_conv_lib.Sexp.t
val bil_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> bil
type binop = Bil.binop
val bin_shape_binop : Core_kernel.Bin_prot.Shape.t
val __bin_read_binop__ : (int -> binop) Core_kernel.Bin_prot.Read.reader
val compare_binop : binop -> binop -> int
val sexp_of_binop : binop -> Ppx_sexp_conv_lib.Sexp.t
val binop_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> binop
type cast = Bil.cast
val bin_shape_cast : Core_kernel.Bin_prot.Shape.t
val __bin_read_cast__ : (int -> cast) Core_kernel.Bin_prot.Read.reader
val compare_cast : cast -> cast -> int
val sexp_of_cast : cast -> Ppx_sexp_conv_lib.Sexp.t
val cast_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> cast
type exp = Bil.exp
val bin_shape_exp : Core_kernel.Bin_prot.Shape.t
val __bin_read_exp__ : (int -> exp) Core_kernel.Bin_prot.Read.reader
val compare_exp : exp -> exp -> int
val sexp_of_exp : exp -> Ppx_sexp_conv_lib.Sexp.t
val exp_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> exp
type stmt = Bil.stmt
val bin_shape_stmt : Core_kernel.Bin_prot.Shape.t
val __bin_read_stmt__ : (int -> stmt) Core_kernel.Bin_prot.Read.reader
val compare_stmt : stmt -> stmt -> int
val sexp_of_stmt : stmt -> Ppx_sexp_conv_lib.Sexp.t
val stmt_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> stmt
type unop = Bil.unop

val bin_shape_unop : Core_kernel.Bin_prot.Shape.t
val __bin_read_unop__ : (int -> unop) Core_kernel.Bin_prot.Read.reader
val compare_unop : unop -> unop -> int
val sexp_of_unop : unop -> Ppx_sexp_conv_lib.Sexp.t
val unop_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> unop
module Type : sig ... end

The type of a BIL expression.

Each BIL expression is either an immediate value of a given width, or a chunk of memory of a given size. The following predefined constructors are brought into scope:

val bool_t : typ

one bit

val reg8_t : typ

8-bit width value

val reg16_t : typ

16-bit width value

val reg32_t : typ

32-bit width value

val reg64_t : typ

64-bit width value

val reg128_t : typ

128-bit width value

val reg256_t : typ

256-bit width value

val mem32_t : size -> typ

mem32_t size creates a type for memory with 32-bit addresses and elements of the given size.

val mem64_t : size -> typ

mem64_t size creates a type for memory with 64-bit addresses and elements of the given size.

module Var : sig ... end

BIL variable.

module Context : sig ... end

Base class for evaluation contexts.

module Type_error : module type of Type.Error with type t = Type.Error.t
type type_error = Type_error.t

A BIL type error

val bin_shape_type_error : Core_kernel.Bin_prot.Shape.t
val __bin_read_type_error__ : (int -> type_error) Core_kernel.Bin_prot.Read.reader
val compare_type_error : type_error -> type_error -> int
val sexp_of_type_error : type_error -> Ppx_sexp_conv_lib.Sexp.t
val type_error_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> type_error
module Eval : sig ... end

Basic and generic expression evaluator.

module Expi : sig ... end

Expression Language Interpreter.

class 'a expi : 'a Expi.t

Expression interpreter

module Bili : sig ... end

BIL Interpreter.

class 'a bili : 'a Bili.t
module Eff : sig ... end

Effect analysis.

module Exp : sig ... end

Regular interface for BIL expressions

module Stmt : sig ... end

Regular interface for BIL statements

module Arch : sig ... end

Architecture

type arch = Arch.t

architecture

val bin_shape_arch : Core_kernel.Bin_prot.Shape.t
val __bin_read_arch__ : (int -> arch) Core_kernel.Bin_prot.Read.reader
val compare_arch : arch -> arch -> int
val sexp_of_arch : arch -> Ppx_sexp_conv_lib.Sexp.t
val arch_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> arch
module Value : sig ... end

Universal Values.

type 'a tag = 'a Value.tag
module Dict : sig ... end

Universal Heterogeneous Map.

type 'a vector
module Vector : sig ... end

Resizable Array.

type 'a term

BAP IR.

Program is a tree of terms.

val compare_term : ('a -> 'a -> int) -> 'a term -> 'a term -> int
val sexp_of_term : ('a -> Ppx_sexp_conv_lib.Sexp.t) -> 'a term -> Ppx_sexp_conv_lib.Sexp.t
val term_of_sexp : (Ppx_sexp_conv_lib.Sexp.t -> 'a) -> Ppx_sexp_conv_lib.Sexp.t -> 'a term
type program
val bin_shape_program : Core_kernel.Bin_prot.Shape.t
val __bin_read_program__ : (int -> program) Core_kernel.Bin_prot.Read.reader
val compare_program : program -> program -> int
val sexp_of_program : program -> Ppx_sexp_conv_lib.Sexp.t
val program_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> program
type sub
val bin_shape_sub : Core_kernel.Bin_prot.Shape.t
val __bin_read_sub__ : (int -> sub) Core_kernel.Bin_prot.Read.reader
val compare_sub : sub -> sub -> int
val sexp_of_sub : sub -> Ppx_sexp_conv_lib.Sexp.t
val sub_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> sub
type arg
val bin_shape_arg : Core_kernel.Bin_prot.Shape.t
val __bin_read_arg__ : (int -> arg) Core_kernel.Bin_prot.Read.reader
val compare_arg : arg -> arg -> int
val sexp_of_arg : arg -> Ppx_sexp_conv_lib.Sexp.t
val arg_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> arg
type blk
val bin_shape_blk : Core_kernel.Bin_prot.Shape.t
val __bin_read_blk__ : (int -> blk) Core_kernel.Bin_prot.Read.reader
val compare_blk : blk -> blk -> int
val sexp_of_blk : blk -> Ppx_sexp_conv_lib.Sexp.t
val blk_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> blk
type phi
val bin_shape_phi : Core_kernel.Bin_prot.Shape.t
val __bin_read_phi__ : (int -> phi) Core_kernel.Bin_prot.Read.reader
val compare_phi : phi -> phi -> int
val sexp_of_phi : phi -> Ppx_sexp_conv_lib.Sexp.t
val phi_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> phi
type def
val bin_shape_def : Core_kernel.Bin_prot.Shape.t
val __bin_read_def__ : (int -> def) Core_kernel.Bin_prot.Read.reader
val compare_def : def -> def -> int
val sexp_of_def : def -> Ppx_sexp_conv_lib.Sexp.t
val def_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> def
type jmp
val bin_shape_jmp : Core_kernel.Bin_prot.Shape.t
val __bin_read_jmp__ : (int -> jmp) Core_kernel.Bin_prot.Read.reader
val compare_jmp : jmp -> jmp -> int
val sexp_of_jmp : jmp -> Ppx_sexp_conv_lib.Sexp.t
val jmp_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> jmp
type nil
val bin_shape_nil : Core_kernel.Bin_prot.Shape.t
val __bin_read_nil__ : (int -> nil) Core_kernel.Bin_prot.Read.reader
val compare_nil : nil -> nil -> int
val sexp_of_nil : nil -> Ppx_sexp_conv_lib.Sexp.t
val nil_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> nil
val bin_shape_tid : Core_kernel.Bin_prot.Shape.t
val __bin_read_tid__ : (int -> tid) Core_kernel.Bin_prot.Read.reader
val compare_tid : tid -> tid -> int
val sexp_of_tid : tid -> Ppx_sexp_conv_lib.Sexp.t
val tid_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> tid
type call
val bin_shape_call : Core_kernel.Bin_prot.Shape.t
val __bin_read_call__ : (int -> call) Core_kernel.Bin_prot.Read.reader
val compare_call : call -> call -> int
val sexp_of_call : call -> Ppx_sexp_conv_lib.Sexp.t
val call_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> call
type label =
  | Direct of tid  (* direct jump *)
  | Indirect of exp  (* indirect jump *)

target of control transfer

val bin_shape_label : Core_kernel.Bin_prot.Shape.t
val __bin_read_label__ : (int -> label) Core_kernel.Bin_prot.Read.reader
val compare_label : label -> label -> int
val sexp_of_label : label -> Ppx_sexp_conv_lib.Sexp.t
val label_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> label
type jmp_kind =
  | Call of call  (* call to subroutine *)
  | Goto of label  (* jump inside subroutine *)
  | Ret of label  (* return from call to label *)
  | Int of int * tid  (* interrupt and return to tid *)

control transfer variants

val bin_shape_jmp_kind : Core_kernel.Bin_prot.Shape.t
val __bin_read_jmp_kind__ : (int -> jmp_kind) Core_kernel.Bin_prot.Read.reader
val compare_jmp_kind : jmp_kind -> jmp_kind -> int
val sexp_of_jmp_kind : jmp_kind -> Ppx_sexp_conv_lib.Sexp.t
val jmp_kind_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> jmp_kind
type intent =
  | In  (* input argument *)
  | Out  (* output argument *)
  | Both  (* input/output *)

argument intention

val bin_shape_intent : Core_kernel.Bin_prot.Shape.t
val __bin_read_intent__ : (int -> intent) Core_kernel.Bin_prot.Read.reader
val compare_intent : intent -> intent -> int
val sexp_of_intent : intent -> Ppx_sexp_conv_lib.Sexp.t
val intent_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> intent
type ('a, 'b) cls

Term type classes

val program_t : (nil, program) cls

program

val sub_t : (program, sub) cls

sub

val arg_t : (sub, arg) cls

arg

val blk_t : (sub, blk) cls

blk

val phi_t : (blk, phi) cls

phi

val def_t : (blk, def) cls

def

val jmp_t : (blk, jmp) cls

jmp

module Biri : sig ... end

BIR Interpreter

class 'a biri : 'a Biri.t

Some predefined tags

type color = [
  | `black
  | `red
  | `green
  | `yellow
  | `blue
  | `magenta
  | `cyan
  | `white
  | `gray
]
val bin_shape_color : Core_kernel.Bin_prot.Shape.t
val __bin_read_color__ : (int -> color) Core_kernel.Bin_prot.Read.reader
val compare_color : color -> color -> int
val sexp_of_color : color -> Ppx_sexp_conv_lib.Sexp.t
val color_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> color
val __color_of_sexp__ : Ppx_sexp_conv_lib.Sexp.t -> color
val color : color tag

Color something with a color

val foreground : color tag

Print the marked entity with the specified color (the same as color, but the pretty-printing function will output the ANSI escape sequence of the corresponding color).

val background : color tag

print marked entity with specified color. See foreground.

val comment : string tag

A human readable comment

val python : string tag

A command in python language

val shell : string tag

A command in shell language

val mark : unit tag

Mark something as marked

val weight : float tag

Give a weight

val address : addr tag

A virtual address of an entity

val filename : string tag

A name of a file

type image

an image loaded into memory

type mem

opaque memory

val sexp_of_mem : mem -> Ppx_sexp_conv_lib.Sexp.t
type 'a table

a table from memory to 'a

val sexp_of_table : ('a -> Ppx_sexp_conv_lib.Sexp.t) -> 'a table -> Ppx_sexp_conv_lib.Sexp.t
type 'a memmap

interval trees from memory regions to 'a

val sexp_of_memmap : ('a -> Ppx_sexp_conv_lib.Sexp.t) -> 'a memmap -> Ppx_sexp_conv_lib.Sexp.t
module type Memory_iterators = sig ... end

Iterators lifted into monad

module Memory : sig ... end

Memory region

module Table : sig ... end

Table.

module Location : sig ... end

A location of a chunk of memory

type location = Location.t

memory location

val bin_shape_location : Core_kernel.Bin_prot.Shape.t
val __bin_read_location__ : (int -> location) Core_kernel.Bin_prot.Read.reader
val compare_location : location -> location -> int
val sexp_of_location : location -> Ppx_sexp_conv_lib.Sexp.t
val location_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> location
module Backend : sig ... end

A backend interface.

module Image : sig ... end

Binary Image.

module Memmap : sig ... end

Memory maps. A memory map is an associative data structure that maps memory regions to values. Unlike in the Table, memory regions in the Memmap can intersect in arbitrary ways. This data structure is also known as an Interval Tree.
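
The practical consequence of allowing intersecting regions is that a single address can resolve to several values. The sketch below models this with a plain list of closed intervals over int addresses. It is deliberately naive (linear lookup instead of an interval tree), and the type and function names are hypothetical rather than taken from the Memmap module.

```ocaml
(* Each entry is a closed address interval [lo, hi] with its value. *)
type 'a memmap = (int * int * 'a) list

let empty : 'a memmap = []

(* Regions may overlap freely; nothing is replaced or rejected. *)
let add (map : 'a memmap) ~lo ~hi v : 'a memmap = (lo, hi, v) :: map

(* All values whose region contains [addr], most recently added first. *)
let lookup (map : 'a memmap) addr =
  List.filter_map
    (fun (lo, hi, v) -> if lo <= addr && addr <= hi then Some v else None)
    map

let map =
  add (add empty ~lo:0x1000 ~hi:0x1fff "text") ~lo:0x1800 ~hi:0x27ff "overlay"

(* 0x1900 falls into both regions; 0x1000 only into the first. *)
let both = lookup map 0x1900
let only_text = lookup map 0x1000
```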

type symbolizer

Symbolizer defines a method for assigning symbolic names to addresses

type rooter

Rooter defines a method for finding function starts in a program

type brancher

Brancher defines a method for resolving branch instructions

type reconstructor

Reconstructor defines a method for reconstructing symbol tables

type disasm

A value of type disasm is the result of disassembling a memory region.

type insn

Values of type insn represent machine instructions decoded from a given piece of memory.

val bin_shape_insn : Core_kernel.Bin_prot.Shape.t
val __bin_read_insn__ : (int -> insn) Core_kernel.Bin_prot.Read.reader
val compare_insn : insn -> insn -> int
val sexp_of_insn : insn -> Ppx_sexp_conv_lib.Sexp.t
val insn_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> insn
type block

A block is a region of memory that, to the best of our knowledge, is a basic block of the control flow graph.

val compare_block : block -> block -> int
val sexp_of_block : block -> Ppx_sexp_conv_lib.Sexp.t
type cfg
val compare_cfg : cfg -> cfg -> int
type jump = [
  | `Jump  (* unconditional jump *)
  | `Cond  (* conditional jump *)
]

A jump kind. A jump to another block can be conditional or unconditional.

val compare_jump : jump -> jump -> int
val sexp_of_jump : jump -> Ppx_sexp_conv_lib.Sexp.t
val jump_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> jump
val __jump_of_sexp__ : Ppx_sexp_conv_lib.Sexp.t -> jump
type edge = [
  | jump
  | `Fall
]

This type defines a relation between two basic blocks.
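
Because edge includes jump as a sub-variant, every jump is also a valid edge, and a match on an edge can handle all jump cases at once with the #jump pattern. The following self-contained sketch re-declares the two types locally to show this; the describe and widen helpers are hypothetical, and reading `Fall as a fall-through edge is an assumption based on its name.

```ocaml
type jump = [ `Jump | `Cond ]
type edge = [ jump | `Fall ]

(* #jump matches any tag of [jump] and narrows [j] to that type. *)
let describe : edge -> string = function
  | #jump as j ->
    (match j with
     | `Jump -> "unconditional jump"
     | `Cond -> "conditional jump")
  | `Fall -> "fall-through"

(* A jump can always be widened to an edge with a coercion. *)
let widen (j : jump) : edge = (j :> edge)
```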

val compare_edge : edge -> edge -> int
val sexp_of_edge : edge -> Ppx_sexp_conv_lib.Sexp.t
val edge_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> edge
val __edge_of_sexp__ : Ppx_sexp_conv_lib.Sexp.t -> edge
module Kind : sig ... end

Kinds of instructions

type reg

abstract and opaque register

val bin_shape_reg : Core_kernel.Bin_prot.Shape.t
val __bin_read_reg__ : (int -> reg) Core_kernel.Bin_prot.Read.reader
val compare_reg : reg -> reg -> int
val sexp_of_reg : reg -> Ppx_sexp_conv_lib.Sexp.t
val reg_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> reg
type imm

opaque immediate value

val bin_shape_imm : Core_kernel.Bin_prot.Shape.t
val __bin_read_imm__ : (int -> imm) Core_kernel.Bin_prot.Read.reader
val compare_imm : imm -> imm -> int
val sexp_of_imm : imm -> Ppx_sexp_conv_lib.Sexp.t
val imm_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> imm
type fmm

floating point value

val bin_shape_fmm : Core_kernel.Bin_prot.Shape.t
val __bin_read_fmm__ : (int -> fmm) Core_kernel.Bin_prot.Read.reader
val compare_fmm : fmm -> fmm -> int
val sexp_of_fmm : fmm -> Ppx_sexp_conv_lib.Sexp.t
val fmm_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> fmm
type kind = Kind.t

kind of instruction

val bin_shape_kind : Core_kernel.Bin_prot.Shape.t
val __bin_read_kind__ : (int -> kind) Core_kernel.Bin_prot.Read.reader
val compare_kind : kind -> kind -> int
val sexp_of_kind : kind -> Ppx_sexp_conv_lib.Sexp.t
val kind_of_sexp : Ppx_sexp_conv_lib.Sexp.t -> kind
module Reg : sig ... end

Register.

module Imm : sig ... end

Integer immediate operand

module Fmm : sig ... end

Floating point immediate operand

module Op : sig ... end

Operand

type op = Op.t
val bin_shape_op : Core_kernel.Bin_prot.Shape.t
val __bin_read_op__ : (int -> op) Core_kernel.Bin_prot.Read.reader
val compare_op : op -> op -> int
val sexp_of_op : op -> Ppx_sexp_conv_lib.Sexp.t
module Disasm_expert : sig ... end

Expert interface to disassembler.

module Insn : sig ... end

Assembly instruction.

module Block : sig ... end

Basic block.

module Graphs : sig ... end

BAP Common Graphs.

module Disasm : sig ... end

The interface to the disassembler level.

type symtab
module Symtab : sig ... end

Reconstructed symbol table.

module type CPU = sig ... end

A BIL model of CPU

module type Target = sig ... end

Abstract interface for all targets.

val target_of_arch : arch -> (module Target)

target_of_arch arch returns a module, packed into a value, that abstracts the target architecture. The returned module has type Target and can be unpacked locally with:

let module Target = (val target_of_arch arch) in
val register_target : arch -> (module Target) -> unit

Registers a new target architecture. If a target for the given arch already exists, it is superseded by the new target.
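
The pack, register, and unpack cycle can be tried out without BAP using OCaml's first-class modules. The sketch below is a toy model under stated assumptions: the Target signature here has only two hypothetical fields (the real one is much larger), the arch type is a made-up two-element variant, and the registry is a simple Hashtbl, which may differ from how BAP stores targets.

```ocaml
(* Hypothetical, drastically reduced stand-in for BAP's Target signature. *)
module type Target = sig
  val name : string
  val addr_size : int
end

(* Hypothetical architecture tags, not BAP's arch type. *)
type arch = [ `toy_a | `toy_b ]

let targets : (arch, (module Target)) Hashtbl.t = Hashtbl.create 8

(* Hashtbl.replace makes a later registration supersede an earlier one. *)
let register_target (arch : arch) (t : (module Target)) =
  Hashtbl.replace targets arch t

(* Raises Not_found when no target is registered for [arch]. *)
let target_of_arch (arch : arch) : (module Target) = Hashtbl.find targets arch

let () =
  register_target `toy_a
    (module struct let name = "toy-a" let addr_size = 32 end : Target);
  (* Registering again for the same arch replaces the previous target. *)
  register_target `toy_a
    (module struct let name = "toy-a-wide" let addr_size = 64 end : Target)

(* Unpack the module locally, as in the snippet above. *)
let addr_size_of arch =
  let module T = (val target_of_arch arch : Target) in
  T.addr_size
```

Unpacking inside a function body keeps the module local, so the same code can work with whichever target is registered at call time.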

module Tid : sig ... end

Term identifier.

module Live : sig ... end

Live Variables.

module Term : sig ... end

IR language term.

module Program : sig ... end

Program in Intermediate representation.

module Sub : sig ... end

Subroutine.

module Def : sig ... end

Definition.

module Jmp : sig ... end

A control transfer operation.

module Phi : sig ... end

PHI-node

module Blk : sig ... end

Basic block.

module Arg : sig ... end

Subroutine argument.

module Call : sig ... end

A control transfer to another subroutine.

module Label : sig ... end

Target of a control flow transfer.

module Source : sig ... end

Source of information.

module Taint : sig ... end

Abstract taint.

type 'a source = 'a Source.t
module Symbolizer : sig ... end

Symbolizer maps addresses to function names

module Rooter : sig ... end

Rooter finds starts of functions in the binary.

module Brancher : sig ... end

Brancher is responsible for resolving destinations of branch instructions.

module Reconstructor : sig ... end

Reconstructor is responsible for reconstructing a symbol table from a CFG. It should partition a CFG into a set of possibly intersecting functions. See the Symtab module for more information about symbol tables and functions.

module Event : sig ... end

Event subsystem.

type event = Event.t = ..
type project
module Toplevel : sig ... end

The interface to the BAP toplevel state.

module Project : sig ... end

Disassembled program.

module Self () : sig ... end

A self reflection.

module Log : sig ... end