{
int: 3,
float: 3.14,
bool: true,
str: "a"
}
primitive types are built in types that represent the general simple data types found in every laguage.
1
-129
0x07Ff_07Ff
fixed size integers are whole numbers, signed or unsigned.
they come in deffirent sizes: 8 bit, 16 bit, 32 bit and 64 bit.
type | bits | bytes | sign | min | max |
---|---|---|---|---|---|
u8 | 8 | 1 | unsigned | 0 | 255 |
i8 | 8 | 1 | signed | -128 | 127 |
u16 | 16 | 2 | unsigned | 0 | 65535 |
i16 | 16 | 2 | signed | -32768 | 32767 |
u32 | 32 | 4 | unsigned | 0 | 4294967295 |
i32 | 32 | 4 | signed | -2147483648 | 2147483647 |
u64 | 64 | 8 | unsigned | 0 | 18446744073709551615 |
i64 | 64 | 8 | signed | -9223372036854775808 | 9223372036854775807 |
uint_value = nb
int_value = ["+" | "-"] nb
fixed size integers are witten with number literals.
+--------+ +---------+
| id | | value |
+--------+ +---------+
| 0x1x | | ux / ix |
+--------+ +---------+
fixed size integers are encoded in 2 complement little endian in 1, 2, 4 or 8 bytes depending on the type.
type | id | bytes |
---|---|---|
u8 | 0x10 | 1 |
u16 | 0x11 | 2 |
u32 | 0x12 | 4 |
u64 | 0x13 | 8 |
i8 | 0x14 | 1 |
i16 | 0x15 | 2 |
i32 | 0x16 | 4 |
i64 | 0x17 | 8 |
123
-123
1234567890_1234567890bint
variable size integers are integers, signed or unsigned, that takes a variable number of bytes.
vint
and vuint
are variable size integers that correspond to 64 bit signed and unsigned integers, taking 1 to 9 bytes.
bint
is variable size integer of arbitrary sizes, taking as many bytes as needed.
vint_value = signed_nb
vuint_value = nb
bint_value = signed_nb "bint"
variable size integers are written with number literals.
the suffix is required for bigint.
if a number literal value in not inferable and with no suffix, it is assumed to be a vuint
then vint
value.
vint / vuint
+--------+ +--------+
| id | | value |
+--------+ +--------+
| 0x1x | | vint |
+--------+ +--------+
bint value
+--------+ +--------+--------+
| id | | size | value |
+--------+ +--------+--------+
| 0x1e | | vint | x size |
+--------+ +--------+--------+
vint
/ vuint
are encoded in 2 complement in LEB128 encoding, taking as many bytes as needed, max 10 bytes.
bint
is encoded in 2 complement and in little endian encoded as an array of bytes.
in case a bint
is encoded in a field value, the size section is omitted and the length is infered from the len
section in the field encoding.
type | id |
---|---|
vuint | 0x1c |
vint | 0x1d |
bint | 0x1e |
123.456
3.3e-12
floating point numbers are IEEE 754 compliant numbers.
they can be of half, single or double precision.
float_exp = ("e" | "E") ["+" | "-"] dec_part
float_value =
["+" | "-"] (dec_part | [dec_part] "." dec_part) [float_exp] |
("nan" | (["+" | "-"] "inf"))
floating point numbers are written with float literals.
float literals are written in decimal base, with optional exponent.
float literals can be NaN, positive or negative infinity.
if a number literal value in not inferable, it is assumed to be f32
value.
+--------+ +--------+
| id | | value |
+--------+ +--------+
| 0x1x | | fn |
+--------+ +--------+
floating point numbers are encoded in IEEE 754 encoding in little endian, in 2, 4 or 8 bytes depending on the type.
type | id | bytes |
---|---|---|
f32 | 0x18 | 4 |
f64 | 0x19 | 8 |
true
false
boolean is a type that can take only two values: true
or false
.
"true" | "false"
+--------+ +--------+
| id | | value |
+--------+ +--------+
| 0x08 | | u8 |
+--------+ +--------+
booleans are encoded in 1 byte.
value | encoding |
---|---|
true | 0x01 |
false | 0x00 |
other values are not allowed.
"hallo world"
strings are UTF-8 encoded sequences of characters.
str
they are 2 types of strings:
for more information, read about string literals
+--------+
| id |
+--------+
| 0x20 |
+--------+
base value value in fields
+---------+--------+ +-------+
| len | chars | | chars |
+---------+--------+ +-------+
| varuint | len | | len |
+---------+--------+ +-------+
strings are encoded in UTF-8, they consists of a varuint specifing the length in bytes of the string, followed by the content of the string.
in case a string is encoded in a field value, the length section is omitted, and the length is inferred from the len
section in the field encoding.
any
any is a type that can take any value, it allow schemaless data, even inside schemaed data.
any_id = "any"
any_value = value
struct MyStruct {
custom_data: any,
}
MyStruct {
custom_data: 1 // vuint
}
id value
+--------+ +--------+--------+
| id | | id | value |
+--------+ +--------+--------+
| 0x01 | | typeid | N |
+--------+ +--------+--------+
any is encoded through a typeid, followed by a value encoding of that type.