Call stack representation and construction.
The call stack computation algorithm is the following:
There are two components: the current node (or stack frame) and the continuation (a list of stack frames). There's a "toplevel node" describing the execution at the toplevel of the interpreter. A node contains:
id
: a function call representation (an identifier)t
: the ticks elapsed during the calltime
: the time elapsed during the callsub
: the subcalls.
Note that for the rest of the algorithm, `time` will be eluded as its computation is equivalent to the ticks.
The algorithm starts with an empty toplevel and an empty continuation.
- on function call (id, current_tick, current_node, continuation): 1. create a node N_id: (id, t: current_tick, sub:) 2. update current_node N_curr with t:(current_tick - t) => the number of ticks is now the diff between the moment the call started and the subcall started. 3. push N_curr on the continuation 4. return N_id, continuation
- on function end (current_tick, current_node, continuation): 1. update current_node N_curr with t:(current_tick - t) 2. pop N_prev from the continuation 3. update N_prev: t:(current_tick - t) sub:(sub + N_curr) 4. return N_prev, continuation
Let's take an example: call: f () { ...... g () { .... h () { ...... } .......... } ......... } tick: 0 ----------> 10 -------> 30 ---------> 60 --------> 100 -----> 160 10 ticks
20 ticks
30 ticks
40 ticks
60 ticks
- `f` takes 10 + 60 = 70 ticks
- `g` takes 20 + 40 = 60 ticks
- `h` takes 30 ticks
T (nodes
) : toplevel K : continuation (list) N : current node (N(id) means it hasn't changed from previous step)
N, K |- exec
Start: T, |- f () { g () { h () { } } } ==> at tick 0 N (f, 0, ), T
|- g () { h () { } } } ==> at tick 10 N (g, 10, ), N (f, 10 - 0 = 10, []); T
|- h () { } } } ==> at tick 30 N (h, 30, ), N (g, 30 - 10 = 20, []); N(f); T
|- } } } ==> at tick 60 N (g, 60 - 20 = 40, N (h, 60 - 30 = 30, [])
), N(f); T
|- } } ==> at tick 100 N (f, 100 - 10 = 90, N (g, 100 - 40 = 60, [N(h)])
), T
|- } ==> at tick 160 T N (f, 160 - 90 = 70, [N(g, 60, [N(h, 30, [])])])
, |- _
end_function_call current_tick current_function call_stack
implements an ending call. Please refer to the prelude of the file.
call_function called_function current_tick current_function call_stack
implements a function start. Please refere to the prelude of the module.
Profiling the execution of the PVM
type function_call =
| Function of string
| CallDirect of int32
| CallRef of int32
| Internal of string
A function call can be either a direct call, a call through a reference or an internal step of the PVM.
initial_eval_call
is `kernel_run` function call.
update_on_decode current_tick current_call_state
starts and stop `internal` calls related to the Decode step of the PVM.
update_on_link current_tick current_call_state
starts and stop `internal` call to the Link step of the PVM.
update_on_init current_tick current_call_state
starts and stop `internal` call to the Init step of the PVM.
update_on_instr current_tick current_node call_stack
handle function calls during the evaluation.
update_on_eval current_tick current_call_state
handle function calls and end during the evaluation.
update_call_stack current_tick current_state_call symbols state
returns the call state changes for any state. Returns None
if no change happened.
module State : sig ... end
Flamegraph building
Flamegraph are an aggregation of all the same callstacks, thus there is no longer a notion of time. We can easily collapse all nodes into a single one.
collapse_stack ~max_depth pp_call call_stack
collapses a call stack into a valid flamegraph. Node deeper than max_depth
are not considered. pp_call
is used to print the identifiers.
Pretty printing and flamegraph output
pp_indent ppf depth
prints an indentation corresponding to the given depth
.
pp_stack ~max_depth ppf stack
pretty prints the stack. It should be used for debug only.
pp_callstack_as_flamegraph
if pp_stack
with the syntax of flamegraphs.
pp_flamegraph
collapses the stack and print it as a valid flamegraph.
pp_flamegraph ~collapsed ~max_depth pp_call ppf call_stack
outputs the given call_stack
with its `flamegraph` representation. If collapse =
true
, the stacks are collapsed. This can be useful to output smaller files, but the stack cannot be analyzed on a time basis (i.e. as a flamechart).
aggregate_toplevel_time_and_ticks ~call_stack
counts the time and ticks spent in each toplevel phases during an execution.