feat: Expand bigint support to have more configurability #239

Closed
49 changes: 31 additions & 18 deletions README.md
Expand Up @@ -44,14 +44,15 @@ deepStrictEqual(decode(encoded), object);
- [`EncoderOptions`](#encoderoptions)
- [`decode(buffer: ArrayLike<number> | BufferSource, options?: DecoderOptions): unknown`](#decodebuffer-arraylikenumber--buffersource-options-decoderoptions-unknown)
- [`DecoderOptions`](#decoderoptions)
- [`IntMode`](#intmode)
- [`decodeMulti(buffer: ArrayLike<number> | BufferSource, options?: DecoderOptions): Generator<unknown, void, unknown>`](#decodemultibuffer-arraylikenumber--buffersource-options-decoderoptions-generatorunknown-void-unknown)
- [`decodeAsync(stream: ReadableStreamLike<ArrayLike<number> | BufferSource>, options?: DecoderOptions): Promise<unknown>`](#decodeasyncstream-readablestreamlikearraylikenumber--buffersource-options-decoderoptions-promiseunknown)
- [`decodeArrayStream(stream: ReadableStreamLike<ArrayLike<number> | BufferSource>, options?: DecoderOptions): AsyncIterable<unknown>`](#decodearraystreamstream-readablestreamlikearraylikenumber--buffersource-options-decoderoptions-asynciterableunknown)
- [`decodeMultiStream(stream: ReadableStreamLike<ArrayLike<number> | BufferSource>, options?: DecoderOptions): AsyncIterable<unknown>`](#decodemultistreamstream-readablestreamlikearraylikenumber--buffersource-options-decoderoptions-asynciterableunknown)
- [Reusing Encoder and Decoder instances](#reusing-encoder-and-decoder-instances)
- [Extension Types](#extension-types)
- [ExtensionCodec context](#extensioncodec-context)
- [Handling BigInt with ExtensionCodec](#handling-bigint-with-extensioncodec)
- [Handling BigInt](#handling-bigint)
- [The temporal module as timestamp extensions](#the-temporal-module-as-timestamp-extensions)
- [Decoding a Blob](#decoding-a-blob)
- [MessagePack Specification](#messagepack-specification)
Expand Down Expand Up @@ -113,7 +114,7 @@ Name|Type|Default
----|----|----
extensionCodec | ExtensionCodec | `ExtensionCodec.defaultCodec`
context | user-defined | -
useBigInt64 | boolean | false
useInt64 | boolean | false
maxDepth | number | `100`
initialBufferSize | number | `2048`
sortKeys | boolean | false
Expand Down Expand Up @@ -148,14 +149,27 @@ Name|Type|Default
extensionCodec | ExtensionCodec | `ExtensionCodec.defaultCodec`
context | user-defined | -
useBigInt64 | boolean | false
intMode | `IntMode` | `IntMode.BIGINT` if `useBigInt64` is `true`, otherwise `IntMode.UNSAFE_NUMBER`
maxStrLength | number | `4_294_967_295` (UINT32_MAX)
maxBinLength | number | `4_294_967_295` (UINT32_MAX)
maxArrayLength | number | `4_294_967_295` (UINT32_MAX)
maxMapLength | number | `4_294_967_295` (UINT32_MAX)
maxExtLength | number | `4_294_967_295` (UINT32_MAX)

You can use `max${Type}Length` to limit the length of each type decoded.

`intMode` determines whether decoded integers should be returned as numbers or bigints. The possible values are [described below](#intmode).

##### `IntMode`

The `IntMode` enum defines different options for decoding integers. They are described below:

- `IntMode.UNSAFE_NUMBER`: Always returns the value as a number. Be aware that there will be a loss of precision if the value is outside the range of `Number.MIN_SAFE_INTEGER` to `Number.MAX_SAFE_INTEGER`.
- `IntMode.SAFE_NUMBER`: Always returns the value as a number, but throws an error if the value is outside of the range of `Number.MIN_SAFE_INTEGER` to `Number.MAX_SAFE_INTEGER`.
- `IntMode.MIXED`: Returns all values inside the range of `Number.MIN_SAFE_INTEGER` to `Number.MAX_SAFE_INTEGER` as numbers and all values outside that range as bigints.
- `IntMode.BIGINT`: Always returns the value as a bigint, even if it is small enough to safely fit in a number.
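
The four modes can be sketched as follows. This is a minimal illustration, not the library's code: the `IntMode` enum is redeclared locally (its real member values are not shown in this diff), and `applyIntMode` is a hypothetical helper name.

```typescript
// Local stand-in for the library's IntMode enum (member values are an assumption).
enum IntMode {
  UNSAFE_NUMBER,
  SAFE_NUMBER,
  MIXED,
  BIGINT,
}

// Hypothetical helper: maps an exactly decoded integer to the requested mode.
function applyIntMode(value: bigint, mode: IntMode): number | bigint {
  const isSafe =
    value >= BigInt(Number.MIN_SAFE_INTEGER) && value <= BigInt(Number.MAX_SAFE_INTEGER);
  switch (mode) {
    case IntMode.UNSAFE_NUMBER:
      // Always a number; values outside the safe range are silently rounded.
      return Number(value);
    case IntMode.SAFE_NUMBER:
      // Always a number, but refuse values that cannot be represented exactly.
      if (!isSafe) {
        throw new RangeError(`Integer ${value} is outside the safe integer range`);
      }
      return Number(value);
    case IntMode.MIXED:
      // Number when safe, bigint otherwise.
      return isSafe ? Number(value) : value;
    case IntMode.BIGINT:
      // Always a bigint, even for small values.
      return value;
    default:
      throw new Error("Unknown IntMode");
  }
}
```

With `IntMode.MIXED`, callers have to be prepared for both `number` and `bigint` results, so a `typeof` check on the decoded value is usually needed.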

### `decodeMulti(buffer: ArrayLike<number> | BufferSource, options?: DecoderOptions): Generator<unknown, void, unknown>`

It decodes `buffer` that includes multiple MessagePack-encoded objects, and returns decoded objects as a generator. See also `decodeMultiStream()`, which is an asynchronous variant of this function.
Expand Down Expand Up @@ -352,7 +366,7 @@ const encoded = = encode({myType: new MyType<any>()}, { extensionCodec, context
const decoded = decode(encoded, { extensionCodec, context });
```

#### Handling BigInt with ExtensionCodec
#### Handling BigInt

This library does not handle BigInt by default, but you have two options to handle it:

Expand Down Expand Up @@ -488,28 +502,27 @@ Note that as of June 2019 there're no official "version" on the MessagePack spec

The following table shows how JavaScript values are mapped to [MessagePack formats](https://github.com/msgpack/msgpack/blob/master/spec.md) and vice versa.

The mapping of integers varies on the setting of `useBigInt64`.

The default, `useBigInt64: false` is:
The mapping of integers depends on the `intMode` setting.

Source Value|MessagePack Format|Value Decoded
----|----|----
null, undefined|nil|null (*1)
boolean (true, false)|bool family|boolean (true, false)
number (53-bit int)|int family|number
number (64-bit float)|float family|number
number (53-bit int)|int family|number or bigint (*2)
number (64-bit float)|float family|number (64-bit float)
bigint|int family|number or bigint (*2)
string|str family|string
ArrayBufferView |bin family|Uint8Array (*2)
ArrayBufferView |bin family|Uint8Array (*3)
Array|array family|Array
Object|map family|Object (*3)
Date|timestamp ext family|Date (*4)
bigint|N/A|N/A (*5)
Object|map family|Object (*4)
Date|timestamp ext family|Date (*5)
bigint|int family|bigint

* *1 Both `null` and `undefined` are mapped to `nil` (`0xC0`) type, and are decoded into `null`
* *2 Any `ArrayBufferView`s including NodeJS's `Buffer` are mapped to `bin` family, and are decoded into `Uint8Array`
* *3 In handling `Object`, it is regarded as `Record<string, unknown>` in terms of TypeScript
* *4 MessagePack timestamps may have nanoseconds, which will lost when it is decoded into JavaScript `Date`. This behavior can be overridden by registering `-1` for the extension codec.
* *5 bigint is not supported in `useBigInt64: false` mode, but you can define an extension codec for it.
* *2 MessagePack ints are decoded as either numbers or bigints depending on the [IntMode](#intmode) used during decoding.
* *3 Any `ArrayBufferView`s including NodeJS's `Buffer` are mapped to `bin` family, and are decoded into `Uint8Array`
* *4 In handling `Object`, it is regarded as `Record<string, unknown>` in terms of TypeScript
* *5 MessagePack timestamps may have nanosecond precision, which is lost when they are decoded into JavaScript `Date`. This behavior can be overridden by registering `-1` for the extension codec.
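
To see why note *2 matters, the following sketch reads a hand-built `uint 64` payload using only the standard `DataView` API, without going through this library. The byte sequence and variable names are illustrative assumptions.

```typescript
// MessagePack "uint 64": header byte 0xcf followed by 8 big-endian bytes.
// The payload is 2^53 + 1, the smallest positive integer a double cannot represent exactly.
const encoded = new Uint8Array([0xcf, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01]);
const view = new DataView(encoded.buffer, encoded.byteOffset, encoded.byteLength);

// Skip the header byte; DataView reads big-endian by default.
const exact: bigint = view.getBigUint64(1);
// Converting to a number rounds to the nearest double.
const lossy: number = Number(exact);

console.log(exact.toString()); // prints "9007199254740993"
console.log(lossy); // prints 9007199254740992
```

Decoding this value as a plain number silently changes it, which is the case the stricter and bigint-returning modes are meant to cover.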

If you set `useBigInt64: true`, the following mapping is used:

Expand All @@ -519,15 +532,15 @@ null, undefined|nil|null
boolean (true, false)|bool family|boolean (true, false)
**number (32-bit int)**|int family|number
**number (except for the above)**|float family|number
**bigint**|int64 / uint64|bigint (*6)
**bigint**|int64 / uint64|bigint (*5)
string|str family|string
ArrayBufferView |bin family|Uint8Array
Array|array family|Array
Object|map family|Object
Date|timestamp ext family|Date


* *6 If the bigint is larger than the max value of uint64 or smaller than the min value of int64, then the behavior is undefined.
* *5 If the bigint is larger than the max value of uint64 or smaller than the min value of int64, then the behavior is undefined.
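
Callers who do not want to rely on out-of-range behavior can check the value before encoding. A minimal sketch, assuming a hypothetical helper `int64FormatFor` that is not part of the library:

```typescript
const UINT64_MAX = BigInt("0xffffffffffffffff");
const INT64_MIN = -BigInt("0x8000000000000000");

// Hypothetical helper: reports which 64-bit MessagePack int format can hold the value,
// and throws if neither can.
function int64FormatFor(value: bigint): "uint 64" | "int 64" {
  if (value >= BigInt(0)) {
    if (value > UINT64_MAX) {
      throw new RangeError(`Bigint is too large for uint64: ${value}`);
    }
    return "uint 64";
  }
  if (value < INT64_MIN) {
    throw new RangeError(`Bigint is too small for int64: ${value}`);
  }
  return "int 64";
}
```

Non-negative values are checked against the unsigned range and negative values against the signed range, mirroring the `int64 / uint64` split in the table above.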

## Prerequisites

65 changes: 28 additions & 37 deletions src/Decoder.ts
@@ -1,6 +1,6 @@
import { prettyByte } from "./utils/prettyByte";
import { ExtensionCodec, ExtensionCodecType } from "./ExtensionCodec";
import { getInt64, getUint64, UINT32_MAX } from "./utils/int";
import { IntMode, getInt64, getUint64, convertSafeIntegerToMode, UINT32_MAX } from "./utils/int";
import { utf8Decode } from "./utils/utf8";
import { createDataView, ensureUint8Array } from "./utils/typedArrays";
import { CachedKeyDecoder, KeyDecoder } from "./CachedKeyDecoder";
Expand All @@ -16,10 +16,17 @@ export type DecoderOptions<ContextType = undefined> = Readonly<
* Depends on ES2020's {@link DataView#getBigInt64} and
* {@link DataView#getBigUint64}.
*
* Defaults to false.
* Defaults to false. If true, equivalent to intMode: IntMode.BIGINT.
*/
useBigInt64: boolean;

/**
* Allows more fine-grained control of BigInt handling. Overrides useBigInt64 when set.
*
* Defaults to IntMode.BIGINT if useBigInt64 is true or IntMode.UNSAFE_NUMBER otherwise.
*/
intMode?: IntMode;

/**
* Maximum string length.
*
Expand Down Expand Up @@ -194,7 +201,7 @@ const sharedCachedKeyDecoder = new CachedKeyDecoder();
export class Decoder<ContextType = undefined> {
private readonly extensionCodec: ExtensionCodecType<ContextType>;
private readonly context: ContextType;
private readonly useBigInt64: boolean;
private readonly intMode: IntMode;
private readonly maxStrLength: number;
private readonly maxBinLength: number;
private readonly maxArrayLength: number;
Expand All @@ -214,7 +221,7 @@ export class Decoder<ContextType = undefined> {
this.extensionCodec = options?.extensionCodec ?? (ExtensionCodec.defaultCodec as ExtensionCodecType<ContextType>);
this.context = (options as { context: ContextType } | undefined)?.context as ContextType; // needs a type assertion because EncoderOptions has no context property when ContextType is undefined

this.useBigInt64 = options?.useBigInt64 ?? false;
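// Parenthesized because `??` binds tighter than `?:`; without the parentheses an explicit intMode would be ignored.
this.intMode = options?.intMode ?? (options?.useBigInt64 ? IntMode.BIGINT : IntMode.UNSAFE_NUMBER);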
this.maxStrLength = options?.maxStrLength ?? UINT32_MAX;
this.maxBinLength = options?.maxBinLength ?? UINT32_MAX;
this.maxArrayLength = options?.maxArrayLength ?? UINT32_MAX;
Expand Down Expand Up @@ -371,11 +378,11 @@ export class Decoder<ContextType = undefined> {

if (headByte >= 0xe0) {
// negative fixint (111x xxxx) 0xe0 - 0xff
object = headByte - 0x100;
object = this.convertNumber(headByte - 0x100);
} else if (headByte < 0xc0) {
if (headByte < 0x80) {
// positive fixint (0xxx xxxx) 0x00 - 0x7f
object = headByte;
object = this.convertNumber(headByte);
} else if (headByte < 0x90) {
// fixmap (1000 xxxx) 0x80 - 0x8f
const size = headByte - 0x80;
Expand Down Expand Up @@ -418,36 +425,28 @@ export class Decoder<ContextType = undefined> {
object = this.readF64();
} else if (headByte === 0xcc) {
// uint 8
object = this.readU8();
object = this.convertNumber(this.readU8());
} else if (headByte === 0xcd) {
// uint 16
object = this.readU16();
object = this.convertNumber(this.readU16());
} else if (headByte === 0xce) {
// uint 32
object = this.readU32();
object = this.convertNumber(this.readU32());
} else if (headByte === 0xcf) {
// uint 64
if (this.useBigInt64) {
object = this.readU64AsBigInt();
} else {
object = this.readU64();
}
object = this.readU64();
} else if (headByte === 0xd0) {
// int 8
object = this.readI8();
object = this.convertNumber(this.readI8());
} else if (headByte === 0xd1) {
// int 16
object = this.readI16();
object = this.convertNumber(this.readI16());
} else if (headByte === 0xd2) {
// int 32
object = this.readI32();
object = this.convertNumber(this.readI32());
} else if (headByte === 0xd3) {
// int 64
if (this.useBigInt64) {
object = this.readI64AsBigInt();
} else {
object = this.readI64();
}
object = this.readI64();
} else if (headByte === 0xd9) {
// str 8
const byteLength = this.lookU8();
Expand Down Expand Up @@ -692,6 +691,10 @@ export class Decoder<ContextType = undefined> {
return this.extensionCodec.decode(data, extType, this.context);
}

private convertNumber(value: number): number | bigint {
return convertSafeIntegerToMode(value, this.intMode);
}

private lookU8() {
return this.view.getUint8(this.pos);
}
Expand Down Expand Up @@ -740,26 +743,14 @@ export class Decoder<ContextType = undefined> {
return value;
}

private readU64(): number {
const value = getUint64(this.view, this.pos);
this.pos += 8;
return value;
}

private readI64(): number {
const value = getInt64(this.view, this.pos);
this.pos += 8;
return value;
}

private readU64AsBigInt(): bigint {
const value = this.view.getBigUint64(this.pos);
private readU64(): number | bigint {
const value = getUint64(this.view, this.pos, this.intMode);
this.pos += 8;
return value;
}

private readI64AsBigInt(): bigint {
const value = this.view.getBigInt64(this.pos);
private readI64(): number | bigint {
const value = getInt64(this.view, this.pos, this.intMode);
this.pos += 8;
return value;
}
54 changes: 34 additions & 20 deletions src/Encoder.ts
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
import { utf8Count, utf8Encode } from "./utils/utf8";
import { ExtensionCodec, ExtensionCodecType } from "./ExtensionCodec";
import { setInt64, setUint64 } from "./utils/int";
import { ensureUint8Array } from "./utils/typedArrays";
import type { ExtData } from "./ExtData";
import type { ContextOf } from "./context";
import { setInt64, setUint64 } from "./utils/int";

export const DEFAULT_MAX_DEPTH = 100;
export const DEFAULT_INITIAL_BUFFER_SIZE = 2048;
Expand All @@ -13,14 +13,11 @@ export type EncoderOptions<ContextType = undefined> = Partial<
extensionCodec: ExtensionCodecType<ContextType>;

/**
* Encodes bigint as Int64 or Uint64 if it's set to true.
* {@link forceIntegerToFloat} does not affect bigint.
* Depends on ES2020's {@link DataView#setBigInt64} and
* {@link DataView#setBigUint64}.
* If set to true, encodes a `number` that does not fit in 32 bits as Int64 or Uint64; otherwise encodes it as float64.
*
* Defaults to false.
*/
useBigInt64: boolean;
useInt64: boolean;

/**
* The maximum depth in nested objects and arrays.
Expand All @@ -43,6 +40,7 @@ export type EncoderOptions<ContextType = undefined> = Partial<
* Defaults to `false`. If enabled, it spends more time in encoding objects.
*/
sortKeys: boolean;

/**
* If `true`, non-integer numbers are encoded in float32, not in float64 (the default).
*
Expand Down Expand Up @@ -74,7 +72,7 @@ export type EncoderOptions<ContextType = undefined> = Partial<
export class Encoder<ContextType = undefined> {
private readonly extensionCodec: ExtensionCodecType<ContextType>;
private readonly context: ContextType;
private readonly useBigInt64: boolean;
private readonly useInt64: boolean;
private readonly maxDepth: number;
private readonly initialBufferSize: number;
private readonly sortKeys: boolean;
Expand All @@ -90,7 +88,7 @@ export class Encoder<ContextType = undefined> {
this.extensionCodec = options?.extensionCodec ?? (ExtensionCodec.defaultCodec as ExtensionCodecType<ContextType>);
this.context = (options as { context: ContextType } | undefined)?.context as ContextType; // needs a type assertion because EncoderOptions has no context property when ContextType is undefined

this.useBigInt64 = options?.useBigInt64 ?? false;
this.useInt64 = options?.useInt64 ?? false;
this.maxDepth = options?.maxDepth ?? DEFAULT_MAX_DEPTH;
this.initialBufferSize = options?.initialBufferSize ?? DEFAULT_INITIAL_BUFFER_SIZE;
this.sortKeys = options?.sortKeys ?? false;
Expand Down Expand Up @@ -144,8 +142,6 @@ export class Encoder<ContextType = undefined> {
}
} else if (typeof object === "string") {
this.encodeString(object);
} else if (this.useBigInt64 && typeof object === "bigint") {
this.encodeBigInt64(object);
} else {
this.encodeObject(object, depth);
}
Expand Down Expand Up @@ -200,7 +196,7 @@ export class Encoder<ContextType = undefined> {
// uint 32
this.writeU8(0xce);
this.writeU32(object);
} else if (!this.useBigInt64) {
} else if (this.useInt64) {
// uint 64
this.writeU8(0xcf);
this.writeU64(object);
Expand All @@ -223,7 +219,7 @@ export class Encoder<ContextType = undefined> {
// int 32
this.writeU8(0xd2);
this.writeI32(object);
} else if (!this.useBigInt64) {
} else if (this.useInt64) {
// int 64
this.writeU8(0xd3);
this.writeI64(object);
Expand All @@ -248,15 +244,29 @@ export class Encoder<ContextType = undefined> {
}
}

private encodeBigInt64(object: bigint): void {
if (object >= BigInt(0)) {
// uint 64
this.writeU8(0xcf);
this.writeBigUint64(object);
private encodeBigInt(object: bigint) {
if (object >= 0) {
if (object < 0x100000000 || this.forceIntegerToFloat) {
// uint 32 or lower, or force to float
this.encodeNumber(Number(object));
} else if (object < BigInt("0x10000000000000000")) {
// uint 64
this.writeU8(0xcf);
this.writeBigUint64(object);
} else {
throw new Error(`Bigint is too large for uint64: ${object}`);
}
} else {
// int 64
this.writeU8(0xd3);
this.writeBigInt64(object);
if (object >= -0x80000000 || this.forceIntegerToFloat) {
// int 32 or lower, or force to float
this.encodeNumber(Number(object));
} else if (object >= BigInt(-1) * BigInt("0x8000000000000000")) {
// int 64
this.writeU8(0xd3);
this.writeBigInt64(object);
} else {
throw new Error(`Bigint is too small for int64: ${object}`);
}
}
}

Expand Down Expand Up @@ -296,6 +306,10 @@ export class Encoder<ContextType = undefined> {
const ext = this.extensionCodec.tryToEncode(object, this.context);
if (ext != null) {
this.encodeExtension(ext);
} else if (typeof object === "bigint") {
// this is here instead of in doEncode so that we can try encoding with an extension first,
// otherwise we would break existing extensions for bigints
this.encodeBigInt(object);
} else if (Array.isArray(object)) {
this.encodeArray(object, depth);
} else if (ArrayBuffer.isView(object)) {