Oracle’s internal formats.1

Posted on November 15, 2015

The Oracle database documentation is really helpful for the decompiler 🙂 There is a lot of information about the internal storage of datatypes (DATE and NUMBER). You can search for “Oracle native datatypes”.

The FMX files contain a lot of such byte arrays, and with this information they are really simple to translate. Currently I am trying to translate the values of static record groups, which can contain CHARACTER (simple strings), DATE and NUMBER values.

Date

Oracle uses an internal 7-byte format: century, year, month, day_of_month, hour, minutes, seconds. In my example files, century and year are each increased by a static value of 100. So you get an array:

78 = 120
73 = 115
04 = 4
0A = 10
01 = unused
01 = unused
01 = unused

for 10-APR-2015 (without a time part). I still have to check how the time part is set for complete datetimes. The offset of 100 could differ between BC and AD dates.
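The byte layout above can be turned into a small decoder. This is only a sketch in Python, not the decompiler's actual code, and the function name decode_oracle_date is made up. It assumes AD dates only and, as a further assumption that this post has not verified, that the three time bytes are stored with an offset of 1, so that 01 01 01 means midnight.

```python
from datetime import datetime

def decode_oracle_date(raw: bytes) -> datetime:
    """Decode a 7-byte DATE array (sketch, AD dates only)."""
    if len(raw) != 7:
        raise ValueError("expected exactly 7 bytes")
    century, year, month, day, hour, minute, second = raw
    # Century and year both carry the offset of 100 seen in the sample.
    full_year = (century - 100) * 100 + (year - 100)
    # Assumption: hour, minute and second are stored as value + 1.
    return datetime(full_year, month, day, hour - 1, minute - 1, second - 1)

sample = bytes.fromhex("78 73 04 0A 01 01 01")
print(decode_oracle_date(sample))
```

If the time bytes turn out to be encoded differently, only the last line of the function needs to change.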

Number

There is also a chapter about the internal numeric format on the same page in the Oracle documentation.

Oracle uses 1 byte to store the exponent and up to 20 bytes to store the mantissa.

For the value 100, there should be an exponent of 2 (10^2) and a mantissa of 1. But the byte array contains 2 bytes:

C2 = 1100 0010 = exponent of 2 and a bitmask in the upper half byte
02 = 0000 0010 = mantissa ?

For the value of 10, I get

C1 = 1100 0001 = exponent of 1 and the same bitmask
0B = 0000 1011 = mantissa ?

For the value of 1, I get

C1
02

For the value of 0, I get only one byte as described in the documentation:

80 = 1000 0000

For the value of 5, I get

C1 = 1100 0001
06 = 0000 0110

There seems to be a pattern: judging by the value 1, the mantissa byte has to be reduced by 1 to get the real value. This also works for the value 10, since 0x0B minus 1 is 0x0A = 10. Then I would also have to reduce the exponent by 1, because 10^1 * 10 = 100 and not 10 (whereas 10^0 * 10 = 10). But that does not work for 100: 10^1 * 1 = 10 and not 100.

I could try a bitmask, exponent & 0x3E, which would remove the lowest bit and the upper two bits if they are set. The upper two bits could mark positive/negative values, I will see.

Value of 412, the example in the Oracle documentation:

C2 = 1100 0010 = exponent & 0x3E = 2 = 100
05 = 0000 0101 = mantissa of 5 (-1 = 4)
0D = 0000 1101 = mantissa of 13 (-1 = 12)

This seems to work: the exponent is 2 (= 100) and the mantissa is 4.12, so 4.12 * 100 = 412.
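The hypothesis so far can be written down as a short sketch. It is in Python purely for illustration, the name decode_with_mask is made up, and the 0x3E mask is only the working guess from above. It has been checked against the byte arrays listed in this post and nothing else, so larger exponents may well break it.

```python
from decimal import Decimal

def decode_with_mask(raw: bytes) -> Decimal:
    """Decode a NUMBER byte array using the 'exponent & 0x3E' guess."""
    # Working hypothesis only: drop the lowest bit and the upper two bits.
    exponent = raw[0] & 0x3E
    mantissa = Decimal(0)
    for position, byte in enumerate(raw[1:]):
        # Each mantissa byte stores its value plus 1; every further byte
        # adds two more decimal places (4 and 12 become 4.12).
        mantissa += Decimal(byte - 1).scaleb(-2 * position)
    return mantissa * Decimal(10) ** exponent

# The samples collected so far: 100, 10, 1, 0, 5 and 412.
for sample in ("C202", "C10B", "C102", "80", "C106", "C2050D"):
    print(sample, decode_with_mask(bytes.fromhex(sample)))
```

Decimal is used instead of float so that values such as 4.12 stay exact.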

The question is at which value the encoding switches to the next byte. 12 could also be stored as 1 and 2, but it is stored in one byte as 0C (+1 = 0D).

Value of 499:

C2 = exponent 2 = 100
05 = (5-1) = 4
64 = 0110 0100 = (100-1) = 99

Value of 4999:

C2 = exponent 2 = 100
32 = (50-1) = 49
64 = (100-1) = 99

So it seems that every byte can store values up to 99. This also matches the highest possible number given in the Oracle documentation, 9.99…9 x 10^125.
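Seen from the other direction, this means a positive integer is cut into base-100 digits and each digit is stored as digit + 1. The following Python sketch reproduces the mantissa bytes of the samples above. The function name mantissa_bytes is hypothetical, only positive integers are handled, and dropping trailing zero digits is an assumption based on the value 100 being stored as the single byte 02.

```python
def mantissa_bytes(value: int) -> bytes:
    """Return the mantissa bytes for a positive integer (sketch)."""
    if value <= 0:
        raise ValueError("this sketch handles positive integers only")
    digits = []
    while value:
        # Peel off two decimal digits at a time (one base-100 digit).
        value, digit = divmod(value, 100)
        digits.append(digit)
    digits.reverse()
    # Assumption: trailing zero digits are not stored, the exponent
    # byte accounts for them (100 is stored as C2 02).
    while digits and digits[-1] == 0:
        digits.pop()
    # Every digit is stored increased by 1.
    return bytes(digit + 1 for digit in digits)

for number in (412, 499, 4999, 100, 10):
    print(number, mantissa_bytes(number).hex())
```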

Let us test some decimal numbers with some significant fractions.

Value of 412.56

c2 exponent = 2 = 100
05 mantissa 5-1 = 4
0d mantissa 13-1 = 12
39 mantissa 57-1 = 56

As expected.

Value of 0.00412

bf = 1011 1111 = 191?
2a = 0010 1010 = (42-1) = 41
15 = 0001 0101 = (21-1) = 20

Hm, the two mantissa bytes give 41.20, so we would need an exponent of -4 to get the original value. Our exponent byte (0xBF) should therefore contain a negative sign and the value 4. That is too hard for now, so let us use another value:

Value of 0.0412

c0
05 = (5-1) = 4
0d = (13-1) = 12

The mantissa is 4.12, we need an exponent of -2.

Value of 0.412

c0
2a = (42-1) = 41
15 = (21-1) = 20

The mantissa is 41.20, we need an exponent of -2 too.

So it seems there is no exponent 10^-1, and -2 is encoded as 0xC0. The exponent 10^1 is not used either, because the first mantissa byte can hold values from 1 to 99.
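The observation that only even exponents occur can be illustrated with a small Python sketch that splits a positive value into a mantissa between 1 and 99.99… and an even power of ten. The function name normalize_base100 is invented for this example, and the sketch only mirrors the samples above. It is not taken from the Oracle documentation.

```python
from decimal import Decimal
from typing import Tuple

def normalize_base100(value: Decimal) -> Tuple[Decimal, int]:
    """Split a positive value into (mantissa, even decimal exponent)."""
    if value <= 0:
        raise ValueError("this sketch handles positive values only")
    # adjusted() is the power of ten of the most significant digit.
    exp10 = value.adjusted()
    # Round down to the next even exponent, so -1 becomes -2 and 1 becomes 0.
    even_exp = exp10 - (exp10 % 2)
    mantissa = value.scaleb(-even_exp)
    return mantissa, even_exp

for text in ("412", "10", "0.412", "0.0412", "0.00412"):
    print(text, normalize_base100(Decimal(text)))
```

With this split, 0.412 and 0.0412 both end up with the exponent -2, which matches the shared first byte 0xC0 in the dumps above.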

What about 0.0000412?

be = 1011 1110
2a = (42-1) = 41
15 = (21-1) = 20

The exponent byte has changed by 1 (0xBE instead of 0xBF), but it must now stand for -6 instead of -4. So the exponent -5 does not exist.

If we remove the highest bit and define the second bit as negative flag (0=negative, 1=positive), then we have values of 3F (-4) and 3E (-6).

The next exponent switch must be with 0.000000412:

bd = 1011 1101 = (3D = -8)
2a = 41
15 = 20

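As an algorithm to calculate the real negative exponent (if the second-highest bit is 0), I could take the lower 6 bits of the exponent byte and:

1. shift the bits by 1 to the left
2. add 1
3. invert the resulting 7 bits
4. add 4

The result is the magnitude of the negative exponent. Written as an XOR with the lower 6 bits of 0xBF:

0xBF: 111111 XOR 111111 = 000000, << 1 = 0000000 = 0, + 4 = 4
0xBE: 111110 XOR 111111 = 000001, << 1 = 0000010 = 2, + 4 = 6
0xBD: 111101 XOR 111111 = 000010, << 1 = 0000100 = 4, + 4 = 8
0xBC: 111100 XOR 111111 = 000011, << 1 = 0000110 = 6, + 4 = 10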

This seems to work, with a special case for the exponent byte 0xC0 = -2.
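A Python sketch of this calculation follows. All function names are made up. The first function is the XOR calculation from the table above. The second one is an alternative reading that is not part of the original analysis: each step of the first byte away from 0xC1 moves the value by two decimal places. It is labeled as an assumption, but it gives the same result for every sample in this post, including 0xC0, so no special case is needed. Negative numbers and special values are not handled.

```python
from decimal import Decimal

def exponent_from_xor(first_byte: int) -> int:
    """Negative decimal exponent for 0xBF and below, as in the table."""
    return -(((first_byte ^ 0xBF) << 1) + 4)

def exponent_linear(first_byte: int) -> int:
    """Assumption: two decimal places per step away from 0xC1."""
    return 2 * (first_byte - 0xC1)

def decode_positive(raw: bytes) -> Decimal:
    """Decode zero or a positive NUMBER byte array (sketch)."""
    if raw == b"\x80":
        return Decimal(0)
    mantissa = Decimal(0)
    for position, byte in enumerate(raw[1:]):
        # Each mantissa byte is a base-100 digit stored as digit + 1.
        mantissa += Decimal(byte - 1).scaleb(-2 * position)
    return mantissa.scaleb(exponent_linear(raw[0]))

for sample in ("c2050d39", "c02a15", "c0050d", "bf2a15", "be2a15", "bd2a15"):
    print(sample, decode_positive(bytes.fromhex(sample)))
```

Both exponent functions agree for 0xBF, 0xBE, 0xBD and 0xBC. Whether the linear form also holds for larger positive exponents would still have to be checked against real data.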

Now there are some special numbers (NaN, +/-infinity) and also negative values left. But not today.
