MLE-22706 Added encode/decode support for vectors #1789

Merged
rjrudin merged 1 commit into develop from feature/22706-base64
Jul 10, 2025

Conversation

rjrudin
Contributor

@rjrudin rjrudin commented Jul 9, 2025

No description provided.

@Copilot Copilot AI review requested due to automatic review settings July 9, 2025 20:38
@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds support for Base64 encoding and decoding of float vectors in the Java client to match the MarkLogic server’s vec:base64-encode/vec:base64-decode behavior and verifies interoperability via new tests.

  • Introduces VectorUtil.encode(float...) and VectorUtil.decode(String) with versioned, little-endian serialization.
  • Adds JUnit tests in VectorUtilTest to validate round-trip encoding/decoding using both Java client and server evaluations.
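
As a rough illustration of the serialization described above, here is a minimal sketch. It is not the PR's `VectorUtil` source: the class name `VectorCodecSketch` and the `VERSION` constant are hypothetical, and the assumption that a 4-byte dimension count follows the 4-byte version is inferred from the 8-byte header in the reviewed code and from the expected Base64 string in the PR's test.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Base64;

public class VectorCodecSketch {

    // Hypothetical constant name; the reviewed code uses the literal 0.
    public static final int VERSION = 0;

    public static String encode(float... vector) {
        final int dimensions = vector.length;
        // 4 bytes for the version, 4 for the dimension count (assumed), 4 per float.
        ByteBuffer buffer = ByteBuffer.allocate(8 + 4 * dimensions);
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(VERSION);
        buffer.putInt(dimensions);
        for (float value : vector) {
            buffer.putFloat(value);
        }
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    public static float[] decode(String encoded) {
        ByteBuffer buffer = ByteBuffer.wrap(Base64.getDecoder().decode(encoded));
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        final int version = buffer.getInt();
        if (version != VERSION) {
            throw new IllegalArgumentException("Unsupported vector version: " + version);
        }
        final int dimensions = buffer.getInt();
        float[] vector = new float[dimensions];
        for (int i = 0; i < dimensions; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }

    public static void main(String[] args) {
        String encoded = encode(3.14f, 1.59f, 2.65f);
        // Prints AAAAAAMAAADD9UhAH4XLP5qZKUA= (the value asserted in the PR's test).
        System.out.println(encoded);
        System.out.println(Arrays.toString(decode(encoded)));
    }
}
```

Under these assumptions the sketch reproduces the Base64 string that the PR's test expects for the vector {3.14f, 1.59f, 2.65f}, which suggests the layout is version, dimension count, then the float values, all little-endian.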

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
marklogic-client-api/src/main/java/com/marklogic/client/util/VectorUtil.java | Adds static methods for Base64 vector serialization with version handling.
marklogic-client-api/src/test/java/com/marklogic/client/util/VectorUtilTest.java | Implements tests covering client-only, server-only, and cross-boundary encode/decode scenarios.

Comments suppressed due to low confidence (2)

marklogic-client-api/src/test/java/com/marklogic/client/util/VectorUtilTest.java:15

  • [nitpick] Test constants are generally defined as static final. Consider changing this to private static final float[] VECTOR to follow standard constant conventions and avoid redundant instance fields.
	private final float[] VECTOR = new float[]{3.14f, 1.59f, 2.65f};

marklogic-client-api/src/main/java/com/marklogic/client/util/VectorUtil.java:36

  • Add a test case to cover this IllegalArgumentException path by decoding a Base64 string with a non-zero version to ensure the exception is thrown as expected.
			throw new IllegalArgumentException("Unsupported vector version: " + version);
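
One way to produce input for such a test is to build a header-only payload with a non-zero version. The sketch below is self-contained and does not call the PR's code: `UnsupportedVersionSketch`, `headerOnly`, `checkVersion`, and `rejects` are hypothetical names, and `checkVersion` is a stand-in that mirrors the version check quoted in this review. It also assumes a 4-byte dimension count follows the version.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

public class UnsupportedVersionSketch {

    // Builds a Base64 payload with the given version and a dimension count of zero.
    public static String headerOnly(int version) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(version);
        buffer.putInt(0); // dimension count (assumed header field)
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    // Stand-in for the decoder's version check, mirroring the reviewed line.
    public static void checkVersion(String encoded) {
        ByteBuffer buffer = ByteBuffer.wrap(Base64.getDecoder().decode(encoded));
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        final int version = buffer.getInt();
        if (version != 0) {
            throw new IllegalArgumentException("Unsupported vector version: " + version);
        }
    }

    // Returns true when the payload is rejected with IllegalArgumentException.
    public static boolean rejects(String encoded) {
        try {
            checkVersion(encoded);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String versionOne = headerOnly(1);
        System.out.println(versionOne);          // AQAAAAAAAAA=
        System.out.println(rejects(versionOne)); // true
    }
}
```

In the actual test class, the resulting string could presumably be passed to `VectorUtil.decode` inside a JUnit `assertThrows(IllegalArgumentException.class, ...)` call, assuming the decoder reads the version before anything else.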

Comment on lines 18 to 27
static String encode(float... vector) {
final int dimensions = vector.length;
ByteBuffer buffer = ByteBuffer.allocate(8 + 4 * dimensions);
buffer.order(ByteOrder.LITTLE_ENDIAN);
buffer.putInt(0); // version

Copilot AI Jul 9, 2025

Extract the hard-coded version 0 into a named constant (e.g. VERSION = 0) to improve clarity and make future version changes easier.

Suggested change
- static String encode(float... vector) {
- final int dimensions = vector.length;
- ByteBuffer buffer = ByteBuffer.allocate(8 + 4 * dimensions);
- buffer.order(ByteOrder.LITTLE_ENDIAN);
- buffer.putInt(0); // version
+ // Define the version constant
+ public static final int VERSION = 0;
+ static String encode(float... vector) {
+ final int dimensions = vector.length;
+ ByteBuffer buffer = ByteBuffer.allocate(8 + 4 * dimensions);
+ buffer.order(ByteOrder.LITTLE_ENDIAN);
+ buffer.putInt(VERSION); // version

Copilot uses AI. Check for mistakes.

buffer.order(ByteOrder.LITTLE_ENDIAN);

final int version = buffer.getInt();
if (version != 0) {

Copilot AI Jul 9, 2025

Use the same named constant instead of the literal 0 when checking the version to maintain consistency with the encoding logic.

Suggested change
- if (version != 0) {
+ if (version != VERSION) {

String encoded = VectorUtil.encode(VECTOR);
assertEquals("AAAAAAMAAADD9UhAH4XLP5qZKUA=", encoded);

ArrayNode decoded = (ArrayNode) Common.newEvalClient().newServerEval()

Copilot AI Jul 9, 2025

Instead of casting from JsonNode, consider using .evalAs(ArrayNode.class) directly to improve type safety and remove the explicit cast.

BillFarber previously approved these changes Jul 9, 2025
@rjrudin rjrudin merged commit 317ffcb into develop Jul 10, 2025
2 checks passed
@rjrudin rjrudin deleted the feature/22706-base64 branch July 10, 2025 13:00

2 participants