Indigresso Wiki

Open Source Stuff for DASH7


BinTex

BinTex is a library extension for OpenTag that provides parsing for a lightweight markup language of the same name. BinTex was created by JP Norair as a way to represent binary streams of mixed types (string, hex, decimal) in an efficient yet human-readable way. It is distributed with OpenTag, as well as with other products related to the OpenTag project, such as OTcom.

Modularity

BinTex relies on the Queue Module and on standard C libraries, but on nothing else. The Queue Module itself is mostly modular, so it is not difficult to add BinTex to any sort of C/C++ application. There are two variants of BinTex: bintex_ot, which is meant to be compiled together with OpenTag, and regular bintex, which is a single C file and can be compiled independently.

Porting BinTex to another language (Python, for example) should also be straightforward. Because BinTex can be useful inside a higher-level markup language such as XML, it may become important to port it to common language frameworks that implement XML parsers. C and Python are probably the most important of these.

BinTex Markup (Quickstart)

A more formal BinTex reference is under development.

Single Data Expressions

Single Data Expressions are of a consistent input format, and they are bounded by whitespace. Hex and Decimal input formats are available.

1. Variable Length Hex: (example, x098a52B248)

In this mode, an ASCII string that starts with a leading “x” and ends with whitespace will be parsed as a single hex data stream. Valid characters are 0-9, a-f, and A-F. Invalid characters will be converted to 0. The hex stream terminates when the next character read is whitespace or an end-bracket “]”.
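The rules above can be sketched in a few lines of Python. This is only an illustration of the markup, not the BinTex C code: the function name parse_hex_expression is hypothetical, and padding an odd number of digits with a trailing 0 nibble is an assumption that the text above does not confirm.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

def parse_hex_expression(text):
    """Parse a single 'x...' expression into bytes (illustrative sketch)."""
    if not text.startswith("x"):
        raise ValueError("a hex expression must start with 'x'")
    nibbles = []
    for ch in text[1:]:
        # The stream ends at whitespace or at an end-bracket.
        if ch.isspace() or ch == "]":
            break
        # Invalid characters are converted to 0.
        nibbles.append(int(ch, 16) if ch in HEX_DIGITS else 0)
    # Assumption: an odd digit count is padded with a trailing 0 nibble.
    if len(nibbles) % 2:
        nibbles.append(0)
    pairs = zip(nibbles[0::2], nibbles[1::2])
    return bytes((high << 4) | low for high, low in pairs)
```

With this sketch, the example x098a52B248 yields the five bytes 09 8a 52 b2 48.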

2. 8/16/32 bit integer: (examples, d9, d-13252, d8uc, d8l ...)

An ASCII string that starts with a leading “d” and ends with whitespace will be parsed as a single decimal number. Valid chars are digits (0-9) and minus sign (-). Supplied numerals are parsed into 8, 16, or 32 bits. The parser will use the minimum container to fit the supplied number unless you explicitly specify a type-code at the end of the number. The parser will consider the number terminated after it reads whitespace, an end-parenthesis “)” or a supported type-code.

Supported Integer Type-code characters:

  • u: unsigned (must be first)
  • c: char (8 bits)
  • s: short (16 bits)
  • l: long (32 bits)

Supported Integer Type-codes:

  • [none]: signed, implicit length
  • u: unsigned, implicit length
  • uc: unsigned 8 bits
  • us: unsigned 16 bits
  • ul: unsigned 32 bits
  • c: signed, 8 bits
  • s: signed, 16 bits
  • l: signed, 32 bits
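As a rough illustration of how the type-codes select a container, the sketch below parses a single decimal expression in Python. It is not the BinTex implementation. The names parse_decimal_expression and fits are hypothetical, big-endian byte order is assumed, and raising an error when a value does not fit its explicit container is also an assumption; none of these are stated on this page.

```python
import re

# Type-code letter -> container size in bytes.
SIZES = {"c": 1, "s": 2, "l": 4}

def fits(value, size, signed):
    """Return True if value is representable in size bytes."""
    bits = 8 * size
    if signed:
        return -(1 << (bits - 1)) <= value < (1 << (bits - 1))
    return 0 <= value < (1 << bits)

def parse_decimal_expression(token):
    """Parse a token such as 'd9', 'd-13252', 'd8uc' or 'd8l' into bytes."""
    match = re.fullmatch(r"d(-?\d+)(u?)([csl]?)", token)
    if match is None:
        raise ValueError("not a decimal expression: %r" % token)
    value = int(match.group(1))
    signed = match.group(2) != "u"
    if match.group(3):
        # Explicit type-code: use the requested container.
        size = SIZES[match.group(3)]
    else:
        # Implicit length: pick the smallest container that fits.
        size = next((s for s in (1, 2, 4) if fits(value, s, signed)), None)
    if size is None or not fits(value, size, signed):
        raise ValueError("value does not fit its container: %r" % token)
    # Assumption: bytes are emitted in big-endian order.
    return value.to_bytes(size, "big", signed=signed)
```

Under these assumptions d9 produces one byte, d-13252 produces two bytes, and d8l produces four bytes.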

Multiple data expressions

Multiple data expressions are bounded by open and close characters (such as [], (), and “”). Whitespace can exist inside the open and close characters.

1. Multiple Hex expression

Use the square brackets [] to enclose one or more hex sequences. Inside the brackets, the leading “x” is not included. Therefore, an example can be: [0346 83c6 35 2b89 28a860f3]. If you want to load items from a struct, this format can make it easier, since each element is separated.
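To show how the bracketed form keeps struct elements separate, here is a small Python sketch that returns one byte string per element. The name parse_hex_group is hypothetical, and the sketch assumes every element contains only valid hex digits in an even count (bytes.fromhex rejects anything else), which is stricter than the single-expression rules.

```python
def parse_hex_group(text):
    """Parse '[0346 83c6 ...]' into a list of byte strings, one per element."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("a multiple hex expression must be enclosed in []")
    # Whitespace separates the elements; no leading 'x' is used inside brackets.
    return [bytes.fromhex(element) for element in body[1:-1].split()]

fields = parse_hex_group("[0346 83c6 35 2b89 28a860f3]")
stream = b"".join(fields)  # the elements concatenated in the order written
```

Each entry of fields corresponds to one element of the struct, while stream is the flat binary output.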

2. Multiple Integer Expression

Use parentheses () to enclose one or more decimal integers. An example can be (84 13 -93s 25026ul).
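The parenthesized form can be sketched by mapping each type-code onto a format character of Python's struct module. This is an illustration only: the names parse_integer_group and smallest_size are hypothetical, and big-endian output is an assumption rather than something this page specifies.

```python
import re
import struct

# (signed, size in bytes) -> struct format character.
FORMATS = {(True, 1): "b", (True, 2): "h", (True, 4): "l",
           (False, 1): "B", (False, 2): "H", (False, 4): "L"}
SIZES = {"c": 1, "s": 2, "l": 4}

def smallest_size(value, signed):
    """Return the smallest container (1, 2 or 4 bytes) that holds value."""
    for size in (1, 2, 4):
        bits = 8 * size
        low = -(1 << (bits - 1)) if signed else 0
        high = (1 << (bits - 1)) if signed else (1 << bits)
        if low <= value < high:
            return size
    raise ValueError("value does not fit in 32 bits: %d" % value)

def parse_integer_group(text):
    """Parse '(84 13 -93s 25026ul)' into a single byte string."""
    body = text.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("a multiple integer expression must be enclosed in ()")
    out = b""
    for token in body[1:-1].split():
        match = re.fullmatch(r"(-?\d+)(u?)([csl]?)", token)
        if match is None:
            raise ValueError("not an integer: %r" % token)
        value = int(match.group(1))
        signed = match.group(2) != "u"
        code = match.group(3)
        size = SIZES[code] if code else smallest_size(value, signed)
        # Assumption: big-endian byte order (the '>' prefix).
        out += struct.pack(">" + FORMATS[(signed, size)], value)
    return out
```

Under these assumptions the example (84 13 -93s 25026ul) packs into eight bytes: two single-byte values, one 16-bit value, and one 32-bit value.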

3. ASCII string

Use the double-quotes “” to enclose an ASCII string. The escape character is the backslash \, and the input rules are the same as those from printf.
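A minimal Python sketch of the string form follows. The name parse_string_expression is hypothetical, the table covers only a handful of common printf-style escapes, and passing an unknown escaped character through unchanged is an assumption; the exact set that BinTex accepts is not listed on this page.

```python
# A few common printf-style escapes; the full BinTex set is not documented here.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "0": "\0", "\\": "\\", '"': '"'}

def parse_string_expression(text):
    """Parse a double-quoted BinTex string into ASCII bytes."""
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("a string expression must be enclosed in double quotes")
    out = []
    chars = iter(text[1:-1])
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)
            if escaped is None:
                raise ValueError("dangling backslash at end of string")
            # Assumption: unknown escapes pass the character through unchanged.
            out.append(ESCAPES.get(escaped, escaped))
        else:
            out.append(ch)
    return "".join(out).encode("ascii")
```

For instance, a quoted string ending in backslash-n produces the text followed by a single newline byte.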

Experimental Features

There is a binary input mode that takes input like “b010101”, etc. For the time being, it is undocumented.

Usage Examples

opentag/otlibext/bintex.txt · Last modified: 2012/08/19 00:01 by jpnorair