feat(go/genkit): add DefineSchema and Dotprompt reference support #2769

Closed
wants to merge 0 commits into from

Conversation

@ihan211 commented Apr 16, 2025

This PR adds a schema definition system to genkit with Dotprompt reference support.

Key Components

  • Schema registration with Dotprompt framework
  • DefineSchema API for defining prompt data structures
  • Enhanced initialization process for schema registration
  • Schema resolution within templates

Example Usage

// Define a product schema
type ProductSchema struct {
    Name        string  `json:"name"`
    Description string  `json:"description"`
    Price       float64 `json:"price"`
    Category    string  `json:"category"`
    InStock     bool    `json:"inStock"`
}

// Register the schema with genkit
genkit.DefineSchema("ProductSchema", ProductSchema{})

// Reference the schema in a Dotprompt template
// promptContent example:
// ---
// input:
//   schema:
//     theme: string
// output:
//   schema: ProductSchema
// ---
// Generate a product that fits the {{theme}} theme.
// Make sure to provide a detailed description and appropriate pricing.

The implementation enables:
* Automatic schema registration during Genkit initialization
* Structured output parsing from model responses to Go structs
* Schema reference resolution within Dotprompt templates
* JSON Schema support for standardized type definitions
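
To make the "structured output parsing" bullet concrete, here is a minimal sketch that decodes a JSON model response into the ProductSchema struct from the example above, using only encoding/json. The parseProduct helper and the sample JSON are hypothetical and not part of this PR; the parsing path inside genkit may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProductSchema mirrors the struct from the PR description.
type ProductSchema struct {
	Name        string  `json:"name"`
	Description string  `json:"description"`
	Price       float64 `json:"price"`
	Category    string  `json:"category"`
	InStock     bool    `json:"inStock"`
}

// parseProduct is a hypothetical helper (not a genkit API). It decodes raw
// model output, assumed to be a single JSON object, into a ProductSchema.
func parseProduct(raw string) (ProductSchema, error) {
	var p ProductSchema
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		return ProductSchema{}, fmt.Errorf("parse product: %w", err)
	}
	return p, nil
}

func main() {
	// Sample response invented for illustration.
	raw := `{"name":"Lantern","description":"A brass camping lantern.","price":24.5,"category":"outdoor","inStock":true}`
	p, err := parseProduct(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s (%s): %.2f, in stock: %t\n", p.Name, p.Category, p.Price, p.InStock)
}
```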

Testing
Unit tests in schema_test.go
Working example in samples directory

Checklist (if applicable):
- [X] PR title is following https://www.conventionalcommits.org/en/v1.0.0/
- [X] Tested (manually, unit tested, etc.)
- [ ] Docs updated (updated docs or a docs bug required)

//
// personSchema := genkit.DefineSchema("Person", Person{})
func DefineSchema(name string, schema Schema) Schema {
if name == "" {
Contributor

Question: if core.RegisterSchema already has those validations, is it okay to have them here as well? Just learning :D

Author

You raise an interesting point. genkit.DefineSchema not only calls core.RegisterSchema but also adds the schema to pendingSchemas. Refactoring to remove the duplicated validation is a good idea; I'm thinking of moving pendingSchemas into the core package.
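
One way the refactor discussed here could look, as a rough sketch: validation lives in a single registration function, which also maintains the pending set, and the public wrapper only delegates. All names and signatures below (registerSchema, defineSchema, the error return) are assumptions made for illustration; the PR's DefineSchema returns only a Schema, and the real core.RegisterSchema is not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Schema stands in for the PR's Schema type, whose definition is not shown
// in the thread.
type Schema any

var (
	mu             sync.Mutex
	schemas        = map[string]Schema{}
	pendingSchemas = map[string]Schema{}
)

// registerSchema plays the role of core.RegisterSchema (signature assumed).
// It is the only place that validates its arguments.
func registerSchema(name string, s Schema) error {
	if name == "" {
		return errors.New("schema name cannot be empty")
	}
	if s == nil {
		return errors.New("schema cannot be nil")
	}
	mu.Lock()
	defer mu.Unlock()
	if _, exists := schemas[name]; exists {
		return fmt.Errorf("schema %q is already registered", name)
	}
	schemas[name] = s
	pendingSchemas[name] = s // queued for later registration with Dotprompt
	return nil
}

// defineSchema plays the role of genkit.DefineSchema. It performs no checks
// of its own and relies on registerSchema for validation.
func defineSchema(name string, s Schema) (Schema, error) {
	if err := registerSchema(name, s); err != nil {
		return nil, err
	}
	return s, nil
}

func main() {
	if _, err := defineSchema("Person", struct{ Name string }{}); err != nil {
		fmt.Println("error:", err)
		return
	}
	_, err := defineSchema("", struct{}{})
	fmt.Println("empty name rejected:", err != nil)
}
```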

}

// registerSchemaResolver registers a schema resolver with Dotprompt to handle schema lookups
func registerSchemaResolver(dp *dotprompt.Dotprompt) {
Contributor

What is the purpose of the private function? Maybe the public one (RegisterGlobalSchemaResolver) could call the private one (registerSchemaResolver) to factor out the shared code. What are your thoughts?

Author

Good observation! Your suggestion follows the DRY (Don't Repeat Yourself) principle. I'll expose it by adding func RegisterGlobalSchemaResolver(dp *dotprompt.Dotprompt) { registerSchemaResolver(dp) }

}

r.Dotprompt.DefineSchema(name, jsonSchema)
r.RegisterValue("schema/"+name, structType)
Contributor

what about adding 'schema/' as a const?

Author

Good suggestion! I'll add a const SchemaPrefix = "schema/".


r.Dotprompt.DefineSchema(name, jsonSchema)
r.RegisterValue("schema/"+name, schema)
r.setupSchemaLookupFunction()
Contributor

Maybe RegisterSchemaWithDotprompt could also call DefineSchema and execute r.setupSchemaLookupFunction() only after that call succeeds without an error.

Author

Good idea! I'll update the code to call setupSchemaLookupFunction() only after successful schema registration to ensure proper error handling.
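
The two changes agreed on in this part of the thread (a SchemaPrefix constant, and running the lookup setup only after a successful registration) can be sketched as follows. The registry type, its fields, and the method signatures are hypothetical stand-ins; only the names SchemaPrefix, setupSchemaLookupFunction, and RegisterSchemaWithDotprompt come from the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// SchemaPrefix is the registry key prefix proposed in the review.
const SchemaPrefix = "schema/"

// registry is a hypothetical stand-in for the PR's registry type.
type registry struct {
	values      map[string]any
	lookupReady bool
}

// defineSchema validates and stores the schema under the prefixed key.
func (r *registry) defineSchema(name string, schema any) error {
	if name == "" {
		return errors.New("schema name cannot be empty")
	}
	if schema == nil {
		return errors.New("schema cannot be nil")
	}
	r.values[SchemaPrefix+name] = schema
	return nil
}

// setupSchemaLookupFunction is reduced to a flag so the ordering is visible.
func (r *registry) setupSchemaLookupFunction() {
	r.lookupReady = true
}

// registerSchemaWithDotprompt installs the lookup function only after the
// schema has been registered without an error.
func (r *registry) registerSchemaWithDotprompt(name string, schema any) error {
	if err := r.defineSchema(name, schema); err != nil {
		return fmt.Errorf("register schema %q: %w", name, err)
	}
	r.setupSchemaLookupFunction()
	return nil
}

func main() {
	r := &registry{values: map[string]any{}}
	err := r.registerSchemaWithDotprompt("", struct{}{})
	fmt.Println("failed:", err != nil, "lookup installed:", r.lookupReady)
	err = r.registerSchemaWithDotprompt("Person", struct{}{})
	fmt.Println("failed:", err != nil, "lookup installed:", r.lookupReady)
}
```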

schema.Properties = orderedmap.New[string, *jsonschema.Schema]()
schema.Required = []string{}

for i := 0; i < t.NumField(); i++ {
Contributor

Maybe this part of the code could be factored out into a separate function so it can be called recursively? Then again, supporting more nesting levels could be an issue of its own. Let me know what you think! :D

Author

I agree this section could benefit from refactoring! I'll create a helper function to handle the field-to-schema conversion recursively, which would improve code organization and support nested struct types more elegantly.
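
A recursive helper of the kind described here might look like the sketch below. It is not the PR's code: schemaNode is a simplified, hypothetical stand-in for *jsonschema.Schema, it uses a plain map for properties (so field order is not preserved, unlike the orderedmap in the PR), and it would loop forever on self-referential struct types.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// schemaNode is a simplified, hypothetical JSON Schema node.
type schemaNode struct {
	Type       string                 `json:"type"`
	Properties map[string]*schemaNode `json:"properties,omitempty"`
	Required   []string               `json:"required,omitempty"`
	Items      *schemaNode            `json:"items,omitempty"`
}

// Address and Person are sample types used only for this illustration.
type Address struct {
	City string `json:"city"`
}

type Person struct {
	Name string   `json:"name"`
	Tags []string `json:"tags,omitempty"`
	Home *Address `json:"home"`
}

// schemaFor converts a Go type to a schemaNode, recursing into struct
// fields, slice and array elements, and pointer targets.
func schemaFor(t reflect.Type) (*schemaNode, error) {
	for t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	switch t.Kind() {
	case reflect.String:
		return &schemaNode{Type: "string"}, nil
	case reflect.Bool:
		return &schemaNode{Type: "boolean"}, nil
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return &schemaNode{Type: "integer"}, nil
	case reflect.Float32, reflect.Float64:
		return &schemaNode{Type: "number"}, nil
	case reflect.Slice, reflect.Array:
		items, err := schemaFor(t.Elem())
		if err != nil {
			return nil, err
		}
		return &schemaNode{Type: "array", Items: items}, nil
	case reflect.Struct:
		node := &schemaNode{Type: "object", Properties: map[string]*schemaNode{}}
		for i := 0; i < t.NumField(); i++ {
			f := t.Field(i)
			if !f.IsExported() {
				continue
			}
			name, opts, _ := strings.Cut(f.Tag.Get("json"), ",")
			if name == "-" {
				continue
			}
			if name == "" {
				name = f.Name
			}
			child, err := schemaFor(f.Type) // recursion handles nested structs
			if err != nil {
				return nil, fmt.Errorf("field %s: %w", f.Name, err)
			}
			node.Properties[name] = child
			// Assumption: fields without omitempty are treated as required.
			if !strings.Contains(opts, "omitempty") {
				node.Required = append(node.Required, name)
			}
		}
		return node, nil
	}
	return nil, fmt.Errorf("unsupported kind %s", t.Kind())
}

// schemaOf is a convenience wrapper that accepts a value instead of a type.
func schemaOf(v any) (*schemaNode, error) {
	if v == nil {
		return nil, fmt.Errorf("cannot derive a schema from nil")
	}
	return schemaFor(reflect.TypeOf(v))
}

func main() {
	node, err := schemaOf(Person{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := json.MarshalIndent(node, "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Because the struct case calls schemaFor on each field type, nesting depth is no longer a special case: a struct inside a slice inside a struct goes through the same path.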

@apascal07
Collaborator

Tests are not passing, can you resolve that first?

// SchemaType is the type identifier for schemas in the registry.
const SchemaType = "schema"

// schemaRegistry maintains registry of schemas.
Contributor

Is this comment needed here?

schemasMu.RLock()
defer schemasMu.RUnlock()

// First check local registry
Contributor

This code is pretty clear to read; these comments are not needed at all.
A good suggestion from Alex was that we should remove the comments that explain what and keep only the comments that explain why.


// GetPendingSchemas returns a copy of pending schemas that need to be
// registered with Dotprompt.
func GetPendingSchemas() map[string]Schema {
Contributor

Suggested change
- func GetPendingSchemas() map[string]Schema {
+ func PendingSchemas() map[string]Schema {

}

// GetSchemas returns a copy of all registered schemas.
func GetSchemas() map[string]any {
Contributor

Suggested change
- func GetSchemas() map[string]any {
+ func Schemas() map[string]any {

Contributor

In general, avoid having function names with Get/Set prefixes 👍🏽


// GetSchema retrieves a registered schema by name.
// It returns an error if no schema exists with that name.
func GetSchema(name string) (Schema, error) {
Contributor

What's the reason behind having GetSchema if it internally calls LookupSchema?


// LookupSchema retrieves a registered schema by name.
// It returns nil and false if no schema exists with that name.
func LookupSchema(name string) (Schema, bool) {
Contributor

Wouldn't it be cleaner and easier to use if this returned the Schema when found and nil when not found?
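
For reference, the accessor shapes under discussion can be sketched as below: a single lookup that uses Go's comma-ok form, and a Schemas function (named without the Get prefix) that returns a copy. This is an illustration under assumptions, not the PR's code: the Schema definition and the RegisterSchema signature are guesses, and only schemasMu, LookupSchema, and Schemas are names taken from the thread.

```go
package main

import (
	"fmt"
	"sync"
)

// Schema stands in for the PR's Schema type.
type Schema any

var (
	schemasMu sync.RWMutex
	schemas   = map[string]Schema{}
)

// RegisterSchema stores a schema by name (signature assumed for this sketch).
func RegisterSchema(name string, s Schema) {
	schemasMu.Lock()
	defer schemasMu.Unlock()
	schemas[name] = s
}

// LookupSchema reports whether a schema is registered under name, using the
// same comma-ok form as a map lookup.
func LookupSchema(name string) (Schema, bool) {
	schemasMu.RLock()
	defer schemasMu.RUnlock()
	s, ok := schemas[name]
	return s, ok
}

// Schemas returns a copy so callers cannot mutate the registry's map.
func Schemas() map[string]Schema {
	schemasMu.RLock()
	defer schemasMu.RUnlock()
	out := make(map[string]Schema, len(schemas))
	for name, s := range schemas {
		out[name] = s
	}
	return out
}

func main() {
	RegisterSchema("Person", struct{ Name string }{})
	if _, ok := LookupSchema("Person"); ok {
		fmt.Println("Person is registered")
	}
	if _, ok := LookupSchema("Missing"); !ok {
		fmt.Println("Missing is not registered")
	}
	fmt.Println("registered schemas:", len(Schemas()))
}
```

The single-value alternative suggested above (return the Schema, or nil when absent) is equivalent as long as nil is never a valid registered schema; the comma-ok form keeps "absent" distinguishable without relying on that.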

return nil
}

// Convert the schema to a JSON schema
Contributor

nit: not needed, same for comment in line 113


schema := &jsonschema.Schema{
Type: "object",
Properties: orderedmap.New[string, *jsonschema.Schema](),
Contributor

Curious, why do we need an ordered map?
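
The thread does not record an answer to this question. One property that may be relevant: encoding/json writes the keys of a plain Go map in sorted order, so a schema built on map[string]... cannot keep properties in struct declaration order, while an ordered map can. The sketch below demonstrates the difference with a tiny hypothetical orderedProps type; it is not the orderedmap package used in the PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// orderedProps is a hypothetical insertion-ordered string map.
type orderedProps struct {
	keys []string
	vals map[string]string
}

func newOrderedProps() *orderedProps {
	return &orderedProps{vals: map[string]string{}}
}

// Set records the key on first insertion so the original order is kept.
func (o *orderedProps) Set(key, val string) {
	if _, exists := o.vals[key]; !exists {
		o.keys = append(o.keys, key)
	}
	o.vals[key] = val
}

// MarshalJSON writes the entries in insertion order.
func (o *orderedProps) MarshalJSON() ([]byte, error) {
	var buf bytes.Buffer
	buf.WriteByte('{')
	for i, k := range o.keys {
		if i > 0 {
			buf.WriteByte(',')
		}
		kb, err := json.Marshal(k)
		if err != nil {
			return nil, err
		}
		vb, err := json.Marshal(o.vals[k])
		if err != nil {
			return nil, err
		}
		buf.Write(kb)
		buf.WriteByte(':')
		buf.Write(vb)
	}
	buf.WriteByte('}')
	return buf.Bytes(), nil
}

// encode marshals v and returns the JSON text, or the error message.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	ordered := newOrderedProps()
	ordered.Set("name", "string")
	ordered.Set("age", "integer")
	plain := map[string]string{"name": "string", "age": "integer"}

	fmt.Println("ordered:", encode(ordered)) // keys in insertion order
	fmt.Println("plain:  ", encode(plain))   // keys sorted by encoding/json
}
```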


// ClearSchemas removes all registered schemas.
// This is primarily for testing purposes.
func ClearSchemas() {
Contributor

Where's this function used?

}

// convertStructToJsonSchema converts a Go struct to a JSON schema
func convertStructToJsonSchema(structType any) (*jsonschema.Schema, error) {
Contributor

Wouldn't it be better if we use map[string]any as the type instead of just any?

Projects
Status: Done
4 participants