[Core feature] Simple Type Transformers for data backed by FlyteFile or FlyteDirectory #5688

thomasjpfan · 2024-08-23T15:15:25Z

Motivation: Why do you think this is important?

Currently, creating a type transformer requires learning the TypeTransformer API, which in turn requires learning about Flyte Literals, the Flyte Context, etc. For an example, see the TypeTransformer for np.ndarray.

For the simple case, most data are backed by just a FlyteFile or FlyteDirectory at rest. I propose we have a simplified interface for creating these type transformers. This way, we can make it very easy for users to define their own serializers and deserializers.

Goal: What should the final outcome look like, ideally?

I propose this simplified TypeTransformer API for data backed by FlyteFile:

from typing import Any, Protocol, runtime_checkable
from flytekit.types.file import FlyteFile

@runtime_checkable
class FlyteFileBackedTransformerProtocol(Protocol):
    name: str
    python_type: type

    def save(self, obj: Any) -> FlyteFile:
        """Serialize obj and return a FlyteFile."""

    def load(self, file: FlyteFile) -> Any:
        """Deserialize and return the object stored in file."""

The same API can then be applied to FlyteDirectory.
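As a rough illustration of the directory-backed variant, here is a minimal sketch. It uses `pathlib.Path` as a stand-in for `FlyteDirectory` so that it runs without flytekit; `DirectoryBackedTransformerProtocol` and `ShardedDictTransformer` are hypothetical names made up for this example and are not part of flytekit or of the proposal text.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Protocol, runtime_checkable

# Stand-in for FlyteDirectory so the sketch runs without flytekit (assumption).
DirectoryLike = Path


@runtime_checkable
class DirectoryBackedTransformerProtocol(Protocol):
    """Hypothetical directory-backed counterpart of the file-backed protocol."""

    name: str
    python_type: type

    def save(self, obj: Any) -> DirectoryLike:
        """Serialize obj into a directory and return it."""

    def load(self, directory: DirectoryLike) -> Any:
        """Deserialize and return the object stored in directory."""


class ShardedDictTransformer:
    """Hypothetical transformer: writes each dict entry to its own JSON file."""

    name: str = "ShardedDict"
    python_type: type = dict

    def save(self, obj: dict) -> DirectoryLike:
        out = Path(tempfile.mkdtemp())
        # Keys are assumed to be strings that are valid file names.
        for key, value in obj.items():
            (out / f"{key}.json").write_text(json.dumps(value))
        return out

    def load(self, directory: DirectoryLike) -> dict:
        return {p.stem: json.loads(p.read_text()) for p in directory.glob("*.json")}


transformer = ShardedDictTransformer()
# runtime_checkable lets a registry verify the shape of a transformer up front.
conforms = isinstance(transformer, DirectoryBackedTransformerProtocol)
restored = transformer.load(transformer.save({"a": [1, 2], "b": {"x": 1}}))
```

Note that the `isinstance` check only verifies that the attributes and methods exist; it does not validate their signatures or return types.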

For example, we can define the NumPy Transformer as:

import numpy as np
from flytekit.core.type_engine import TypeEngine

class NumPyTransformer:
    name: str = "NumPy"
    python_type: type = np.ndarray

    def save(self, obj: np.ndarray) -> FlyteFile:
        new_file = FlyteFile.new_remote_file()
        with new_file.open("wb") as f:
            np.save(f, obj)
        return new_file

    def load(self, file: FlyteFile) -> np.ndarray:
        with file.open("rb") as f:
            return np.load(f)

# Proposed behavior: TypeEngine.register would accept the simplified transformer.
TypeEngine.register(NumPyTransformer())
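One way the registration step could work is for the engine to wrap each simple transformer in an adapter that exposes the existing-style conversion methods. The sketch below is only an illustration of that idea under stated assumptions: `FileLiteral`, `SimpleTransformerAdapter`, `ToyTypeEngine`, and `FractionTransformer` are hypothetical stand-ins, not flytekit classes, and the real `to_literal` / `to_python_value` methods take additional arguments (such as the Flyte context) that are omitted here.

```python
import tempfile
from dataclasses import dataclass
from fractions import Fraction
from pathlib import Path
from typing import Any, Dict


@dataclass
class FileLiteral:
    """Stand-in for a literal that points at a file (not a flytekit class)."""

    uri: str


class FractionTransformer:
    """Hypothetical simple transformer with only save and load."""

    name: str = "Fraction"
    python_type: type = Fraction

    def save(self, obj: Fraction) -> Path:
        path = Path(tempfile.mkdtemp()) / "value.txt"
        path.write_text(str(obj))  # e.g. "3/4"
        return path

    def load(self, file: Path) -> Fraction:
        return Fraction(file.read_text())


class SimpleTransformerAdapter:
    """Hypothetical adapter mapping save/load onto literal conversion."""

    def __init__(self, simple: Any) -> None:
        self._simple = simple

    def to_literal(self, python_val: Any) -> FileLiteral:
        return FileLiteral(uri=str(self._simple.save(python_val)))

    def to_python_value(self, literal: FileLiteral) -> Any:
        return self._simple.load(Path(literal.uri))


class ToyTypeEngine:
    """Toy registry keyed by Python type; not flytekit's TypeEngine."""

    _registry: Dict[type, SimpleTransformerAdapter] = {}

    @classmethod
    def register(cls, simple: Any) -> None:
        cls._registry[simple.python_type] = SimpleTransformerAdapter(simple)

    @classmethod
    def get_transformer(cls, python_type: type) -> SimpleTransformerAdapter:
        return cls._registry[python_type]


ToyTypeEngine.register(FractionTransformer())
adapter = ToyTypeEngine.get_transformer(Fraction)
literal = adapter.to_literal(Fraction(3, 4))
round_tripped = adapter.to_python_value(literal)
```

With this shape, the author of a transformer only writes `save` and `load`, while the handling of literals stays inside the adapter.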

Describe alternatives you've considered

Do nothing and keep the TypeTransformer API as the only way to define type transformers.

Propose: Link/Inline OR Additional context

No response

Are you sure this issue hasn't been raised already?

Yes

Have you read the Code of Conduct?

Yes


runllm · 2024-08-23T15:15:38Z

The proposed simplified TypeTransformer API for data backed by FlyteFile or FlyteDirectory aims to make it easier for users to define their own serializers and deserializers without needing to learn the full TypeTransformer API. This approach would streamline the process by focusing on the essential methods for saving and loading data.

According to the TypeTransformer documentation, creating a custom transformer involves implementing methods like get_literal_type, to_literal, and to_python_value. The proposed API simplifies this by only requiring save and load methods, which handle the serialization to and from FlyteFile.

Would you like more detail on how to implement this simplified API or examples of existing TypeTransformer implementations?

Sources:

TypeTransformer documentation

This is an AI-generated response and your feedback is appreciated! Please leave a 👍 if this is helpful and 👎 if it is not.


thomasjpfan added the "enhancement" and "untriaged" labels · Aug 23, 2024

eapolinario added the "needs discussion" label and removed the "untriaged" label · Sep 5, 2024
