package gapi-ocaml

ASN.1 support functions

See below for a short introduction to ASN.1.

exception Out_of_range
exception Parse_error of int

Byte position in string

exception Header_too_short

module Type_name : sig ... end
module Value : sig ... end
val decode_ber : ?pos:int -> ?len:int -> string -> int * Value.value

Decodes a BER-encoded ASN.1 value. Note that DER is a subset of BER, and can also be decoded.

pos and len may select a substring for the decoder. By default, pos=0, and len extends to the end of the string.

The function returns the number of interpreted bytes, and the value. It is not considered an error if fewer than len bytes are consumed.

The returned value represents implicitly tagged values as Tagptr(class,tag,pc,s,pos,len). Here, s is the string being decoded, and pos and len denote the substring containing the contents. Use Netasn1.decode_ber_contents to further decode the value. You can use ITag to put the decoded value back into the tree.

A number of values are not verified (i.e. nonsense values can be returned):

  • for all string types it is not checked whether the constraints are satisfied (e.g. whether an UTF8String really contains UTF-8).
  • External, Embedded_PDV and Real are unchecked
  • Other values may only be checked on first access (e.g. GeneralizedTime).
val decode_ber_tstring : ?pos:int -> ?len:int -> Netsys_types.tstring -> int * Value.value

Same for tagged strings

val decode_ber_poly : ?pos:int -> ?len:int -> 's Netstring_tstring.tstring_ops -> 's -> int * Value.value

Polymorphic version

val decode_ber_contents : ?pos:int -> ?len:int -> ?indefinite:bool -> string -> Value.pc -> Type_name.type_name -> int * Value.value

Decodes the BER-encoded contents of a data field. The contents are assumed to have the type denoted by type_name.

pos and len may select a substring for the decoder. By default, pos=0, and len extends to the end of the string.

If indefinite is set, the extent of the contents region is considered indefinite, and the special end marker is required. This is only allowed when pc = Constructed.

The function returns the number of interpreted bytes, and the value. It is not considered an error if fewer than len bytes are consumed.

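You need to use this function to recursively decode tagged values. If you get a Tagptr(class,tag,pc,s,pos,len) value, how to proceed depends on the kind of the tag:

  • For explicit tags, invoke Netasn1.decode_ber again on s with the given pos and len.
  • For implicit tags, invoke this function with the type name that is known a priori for this tag.

The BER encoding doesn't record whether a tag is implicit or explicit, so the decoder cannot do the right thing by itself here.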

val decode_ber_contents_tstring : ?pos:int -> ?len:int -> ?indefinite:bool -> Netsys_types.tstring -> Value.pc -> Type_name.type_name -> int * Value.value

Same for tagged strings

val decode_ber_contents_poly : ?pos:int -> ?len:int -> ?indefinite:bool -> 's Netstring_tstring.tstring_ops -> 's -> Value.pc -> Type_name.type_name -> int * Value.value

Polymorphic version

val decode_ber_length : ?pos:int -> ?len:int -> string -> int

Like decode_ber, but returns only the length.

This function skips many consistency checks.

val decode_ber_length_tstring : ?pos:int -> ?len:int -> Netsys_types.tstring -> int

Same for tagged strings

val decode_ber_length_poly : ?pos:int -> ?len:int -> 's Netstring_tstring.tstring_ops -> 's -> int

Polymorphic version

val decode_ber_header : ?pos:int -> ?len:int -> ?skip_length_check:bool -> string -> int * Value.tag_class * Value.pc * int * int option

let (hdr_len, tc, pc, tag, len_opt) = decode_ber_header s: Decodes only the header:

  • hdr_len will be the length of the header in bytes
  • tc is the tag class
  • pc whether primitive or constructed
  • tag is the numeric tag value
  • len_opt is the length field, or None if the header selects indefinite length

If skip_length_check is set, the function does not check whether the string is long enough to hold the whole data part.

If the string contains only a valid but incomplete beginning of a header, the special exception Header_too_short is raised (instead of Parse_error).
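The following is a minimal stand-alone sketch of what decoding a BER header involves, following the layout in ITU X.690. It is not the Netasn1 implementation: decode_header, the local type definitions and the local exceptions are hypothetical names introduced for illustration, and the sketch never looks at the data part (roughly what skip_length_check:true asks for).

```ocaml
(* Hypothetical stand-alone sketch; NOT the Netasn1 implementation. *)
type tag_class = Universal | Application | Context | Private
type pc = Primitive | Constructed

exception Header_too_short
exception Parse_error of int

let decode_header (s : string) : int * tag_class * pc * int * int option =
  let n = String.length s in
  (* Fetch byte i, or signal that the header is cut off. *)
  let byte i = if i >= n then raise Header_too_short else Char.code s.[i] in
  let b0 = byte 0 in
  let tc =
    match b0 lsr 6 with
    | 0 -> Universal | 1 -> Application | 2 -> Context | _ -> Private in
  let pc = if b0 land 0x20 <> 0 then Constructed else Primitive in
  (* High-tag form: base-128 digits follow, bit 7 marks continuation. *)
  let rec high_tag i acc =
    let b = byte i in
    let acc = (acc lsl 7) lor (b land 0x7f) in
    if b land 0x80 <> 0 then high_tag (i + 1) acc else (acc, i + 1) in
  let tag, i =
    if b0 land 0x1f = 0x1f then high_tag 1 0 else (b0 land 0x1f, 1) in
  let l0 = byte i in
  if l0 = 0x80 then (i + 1, tc, pc, tag, None)          (* indefinite length *)
  else if l0 < 0x80 then (i + 1, tc, pc, tag, Some l0)  (* short form *)
  else begin
    (* Long form: the low 7 bits give the number of length octets.
       No overflow guard here; a real decoder needs one. *)
    let k = l0 land 0x7f in
    if k = 0x7f then raise (Parse_error i);              (* 0xFF is reserved *)
    let len = ref 0 in
    for j = 1 to k do len := (!len lsl 8) lor byte (i + j) done;
    (i + 1 + k, tc, pc, tag, Some !len)
  end
```

With this sketch, decode_header "\x04\x02hi" yields a 2-byte header for a primitive universal tag 4 with length 2, and a string that stops in the middle of the header raises Header_too_short.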

val decode_ber_header_tstring : ?pos:int -> ?len:int -> ?skip_length_check:bool -> Netsys_types.tstring -> int * Value.tag_class * Value.pc * int * int option

Same for tagged strings

val decode_ber_header_poly : ?pos:int -> ?len:int -> ?skip_length_check:bool -> 's Netstring_tstring.tstring_ops -> 's -> int * Value.tag_class * Value.pc * int * int option

Polymorphic version

val streamline_seq : (Value.tag_class * int * Type_name.type_name) list -> Value.value list -> Value.value option list

streamline_seq expected seq: This function can be called for the list seq of a Value.Seq seq value. It compares seq with the expected list, marks missing elements of the sequence, and recursively decodes the elements that do occur, using the type information from expected.

For example, if expected is

[Context,0,Integer; Context,1,Octetstring; Context,2,IA5String] 

and the passed seq is just

[Tagptr(Context,1,...)] 

the function assumes that the elements with tags 0 and 2 are optional and it assumes that the element with tag 1 is decoded as Octetstring, leading to

[None; Some(Octetstring ...); None] 

It is allowed to put Universal tags into the expected list. The tag number is ignored in this case (for simplicity).
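The matching step described above can be modelled on a toy representation. This is only a sketch of the idea, not the library code: streamline_seq_sketch and the simplified types are hypothetical, a tagged element is reduced to (class, tag, raw contents), the contents are paired with the expected type name instead of being decoded, and the special handling of Universal tags is not modelled.

```ocaml
(* Hypothetical toy model of the matching logic; NOT Netasn1 code. *)
type tag_class = Universal | Application | Context | Private
type type_name = Integer | Octetstring | IA5String

let streamline_seq_sketch
    (expected : (tag_class * int * type_name) list)
    (seq : (tag_class * int * string) list)
  : (type_name * string) option list =
  let rec go expected seq =
    match expected, seq with
    | [], [] -> []
    | [], _ :: _ -> failwith "unexpected trailing element"
    | (ec, et, ty) :: erest, (c, t, raw) :: srest when ec = c && et = t ->
        (* Tag matches: the element is present and gets this type. *)
        Some (ty, raw) :: go erest srest
    | _ :: erest, _ ->
        (* Tag does not match: treat the expected element as omitted. *)
        None :: go erest seq
  in
  go expected seq
```

Applied to the expected list from the example and a sequence holding only the element with context tag 1, the sketch returns None for tags 0 and 2 and Some for tag 1.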

val streamline_set : (Value.tag_class * int * Type_name.type_name) list -> Value.value list -> Value.value list

streamline_set typeinfo set: This function can be called for the list set of a Value.Set set value, and decodes the list with the type information from typeinfo.

For example, if typeinfo is

[Context,0,Integer; Context,1,Octetstring; Context,2,IA5String] 

and the passed set is just

[Tagptr(Context,1,...); Tagptr(Context,0,...)] 

the function decodes the elements as

[ Octetstring ...; Integer ... ] 
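For a set, the order of the elements is kept and each element is simply looked up by its tag. A rough sketch on the same kind of toy model follows; streamline_set_sketch and the simplified types are hypothetical and not part of Netasn1, and the raw contents are paired with the type name rather than decoded.

```ocaml
(* Hypothetical toy model; NOT Netasn1 code. *)
type tag_class = Universal | Application | Context | Private
type type_name = Integer | Octetstring | IA5String

let streamline_set_sketch
    (typeinfo : (tag_class * int * type_name) list)
    (set : (tag_class * int * string) list)
  : (type_name * string) list =
  List.map
    (fun (c, t, raw) ->
      (* List.find raises Not_found for a tag missing from typeinfo. *)
      let (_, _, ty) =
        List.find (fun (ec, et, _) -> ec = c && et = t) typeinfo in
      (ty, raw))
    set
```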

The Abstract Syntax Notation 1 (ASN.1)

ASN.1 allows you to represent structured values as octet streams. The values can be composed from a wide range of base types (e.g. numbers and many different kinds of strings) and can be arranged as sequences (records and arrays), sets, and tagged values (a concept fairly close to OCaml variant types). There is a definition language allowing you to define types and values. This language is not covered here (and there is no IDL compiler). See the ITU X.680 standard if you want to know more. We focus here on the octet representation, which is sufficient for parsing and printing ASN.1 values.

Encoding rules

There are three variants on the representation level:

  • BER: Basic Encoding Rules
  • CER: Canonical Encoding Rules
  • DER: Distinguished Encoding Rules

BER describes the basic way the octets are obtained, but leaves several details up to the sender of an ASN.1 message. CER and DER use stricter rules that are subsets of BER so that a given value can only be represented in a single way. CER targets large messages, whereas DER is optimized for small messages. This module includes a generic decoder for all BER messages, and Netasn1_encode supports DER encoding. The ASN.1 octet representations are described in ITU X.690.

The TLV representation

ASN.1 uses a type-length-value (TLV) style representation, i.e. there is a header containing type information and the length of the data, followed by the payload data. The data can be primitive (e.g. a number) or "constructed" (i.e. a composition of further values). For certain data types the user can choose whether to prefer a primitive representation or a construction from several part values (e.g. a very long string can be given as a sequence of string chunks). Because of this, there is a Netasn1.Value.pc bit in the representation so that this choice is available at runtime.

The type is given as a numeric tag (a small number), and a tag class (Netasn1.Value.tag_class). There are four tag classes:

  • Universal: These tags are used for types defined by the ASN.1 standard, and should not be used for anything else. For example the type OctetString gets the universal tag 4.
  • Application: These tags are intended for marking newly defined types. E.g. if you have a definition type filename = string and you would like to have filenames specially tagged to distinguish them from other uses of strings, the runtime representation of filenames could get an application tag (e.g. the number 8). In ASN.1 syntax:

    Filename ::= [APPLICATION 8] IA5String
  • Context-specific: These tags are intended for marking variants, i.e. tags that are local to a specific use. An example in ASN.1 syntax:

    CustomerRecord ::= SET { name            [0] VisibleString,
                             mailingAddress  [1] VisibleString,
                             accountNumber   [2] INTEGER,
                             balanceDue      [3] INTEGER }

    The numbers in brackets are the context-specific tags.

  • Private: These are reserved for site-specific extensions of standardized message formats.
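To make the header layout concrete, here is a small stand-alone sketch that builds a TLV triple by hand, as laid out in ITU X.690. It does not use Netasn1_encode; encode_length and tlv are hypothetical helper names, the tag class is passed as its 2-bit number (0 = universal, 1 = application, 2 = context-specific, 3 = private), and only tags below 31 and non-negative lengths are handled.

```ocaml
(* Hypothetical helpers for illustration; NOT part of Netasn1. *)

(* Definite length: one octet below 128, otherwise 0x80 + number of
   length octets, followed by the length in big-endian order. *)
let encode_length n =
  if n < 0x80 then String.make 1 (Char.chr n)
  else begin
    let rec octets n acc =
      if n = 0 then acc
      else octets (n lsr 8) (Char.chr (n land 0xff) :: acc) in
    let os = octets n [] in
    let b = Buffer.create 5 in
    Buffer.add_char b (Char.chr (0x80 lor List.length os));
    List.iter (Buffer.add_char b) os;
    Buffer.contents b
  end

(* Identifier octet: class in bits 7-6, constructed flag in bit 5,
   tag number in bits 4-0. *)
let tlv ~cls ~constructed ~tag contents =
  assert (cls >= 0 && cls <= 3 && tag >= 0 && tag < 31);
  let id = (cls lsl 6) lor (if constructed then 0x20 else 0) lor tag in
  String.make 1 (Char.chr id)
  ^ encode_length (String.length contents)
  ^ contents
```

For example, tlv ~cls:0 ~constructed:false ~tag:4 "hi" produces the four octets 04 02 68 69, i.e. a primitive OctetString with universal tag 4 and length 2.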

Conceptually, universal and application tags identify types, whereas context-specific tags identify variants (local cases). The two concepts are not cleanly separated, though. If you e.g. define a set of values, and one value variant is a string and another variant is an integer, there is no strict need to use context-specific tags, because the tags for the type "string" and for the type "integer" are already different. In ASN.1 syntax:

Example ::= SET { x VisibleString,
                  y INTEGER }

A VisibleString has universal tag 26, and an INTEGER has universal tag 2.

Note that the bracket notation includes a keyword "UNIVERSAL", "APPLICATION", or "PRIVATE" for these three classes, and that a plain number indicates context-specific tags.

Finally, there are two ways of applying tags: Explicit and implicit. Explicit tagging is used when the binary values should retain the complete type information: If a tag is applied to an existing value, another header with tag and length field is created, and the value is seen as the contents of this construction. In other words, tagging is an explicit construction like others (e.g. like a record).

Implicit tagging means that the tag of the existing value is replaced by the new tag. As tags also encode the types, this means that type information is lost, and you need a priori knowledge about the possible tags to decode such values (e.g. that an application tag 8 always means an IA5String).
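The difference is easiest to see on the octets. The sketch below tags the INTEGER 5 with context-specific tag 0 both ways; tlv and hex are hypothetical helpers written for this illustration (short-form lengths only), not Netasn1 functions.

```ocaml
(* Hypothetical helpers for illustration; NOT part of Netasn1. *)

(* Short-form TLV only: one identifier octet, contents under 128 bytes. *)
let tlv id contents =
  assert (String.length contents < 0x80);
  Printf.sprintf "%c%c%s"
    (Char.chr id) (Char.chr (String.length contents)) contents

let hex s =
  String.concat " "
    (List.map (fun c -> Printf.sprintf "%02X" (Char.code c))
       (List.of_seq (String.to_seq s)))

(* INTEGER 5 with its universal tag 2. *)
let plain = tlv 0x02 "\x05"

(* Explicit [0]: a new constructed header (0xA0) wraps the complete
   INTEGER encoding, so the universal tag is still visible inside. *)
let explicit_0 = tlv 0xA0 plain

(* Implicit [0]: the identifier octet is replaced (0x80, still
   primitive); only the contents octet of the INTEGER remains. *)
let implicit_0 = tlv 0x80 "\x05"

let () =
  Printf.printf "plain:    %s\nexplicit: %s\nimplicit: %s\n"
    (hex plain) (hex explicit_0) (hex implicit_0)
```

The explicit form comes out as A0 03 02 01 05 and the implicit form as 80 01 05, which shows why a decoder needs outside knowledge to interpret the latter as an INTEGER.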

How to decode values

The function Netasn1.decode_ber will happily decode any BER data and return a complex Netasn1.Value.value unless implicit tagging is used. Implicit tags cannot be decoded in one go because the type information is missing. Instead of completely decoding such tags, only a marker Tagptr(tag_class,tag,pc,data,pos,len) is created. Here, tag_class and tag describe the tag. The value to which the tag is applied is not yet parsed; only a "pointer" is returned, in the form of the string data, the position pos and the byte length len. This range inside data represents the inner value.

After determining the type of this value (by knowing which type is applicable for tag and tag_class), you can call Netasn1.decode_ber_contents to decode the value. This function is different from Netasn1.decode_ber because it doesn't start at the header of the BER representation but after the header. The type needs to be passed explicitly because it isn't retrieved from the header.
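The two-step procedure can be sketched without the library on a toy model. Everything here is hypothetical: the simplified value type, decode_contents, type_of_context_tag and decode_implicit are illustrative names, the mapping from context tags to types stands for the a priori knowledge of some imagined protocol, and only primitive context-specific tags with short-form lengths are handled.

```ocaml
(* Hypothetical toy model of the two-step decode; NOT Netasn1 code. *)
type type_name = Integer | Octetstring
type value = Int of int | Octets of string

(* Decode only a contents region, given its type. This is the analogue
   of starting after the header instead of at the header. *)
let decode_contents s ~pos ~len ty =
  match ty with
  | Octetstring -> Octets (String.sub s pos len)
  | Integer ->
      if len = 0 then invalid_arg "empty INTEGER";
      (* Two's complement, big-endian; the first octet carries the sign. *)
      let first = Char.code s.[pos] in
      let v = ref (if first >= 0x80 then first - 0x100 else first) in
      for i = pos + 1 to pos + len - 1 do
        v := (!v lsl 8) lor Char.code s.[i]
      done;
      Int !v

(* A priori knowledge of the imagined protocol. *)
let type_of_context_tag = function
  | 0 -> Integer
  | 1 -> Octetstring
  | t -> failwith (Printf.sprintf "unknown context tag %d" t)

let decode_implicit s =
  let id = Char.code s.[0] and len = Char.code s.[1] in
  (* Accept only: context class, primitive, low tag, short-form length. *)
  if id land 0xE0 <> 0x80 || len >= 0x80 then failwith "unsupported header";
  decode_contents s ~pos:2 ~len (type_of_context_tag (id land 0x1f))
```

In this model, the octets 80 01 05 decode to Int 5 because tag 0 is known to denote an INTEGER; the same contents under a different tag would be interpreted differently.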
