Starred repositories
Janus-Series: Unified Multimodal Understanding and Generation Models
Flexible and powerful framework for managing multiple AI agents and handling complex conversations
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Ultra-Lightweight Durable Execution in Python
DuckDB is an analytical in-process SQL database management system
Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
Upserts, Deletes And Incremental Processing on Big Data.
Checks all of the hyperlinks in a Markdown text to determine if they are alive or dead
Provision remote development environments via Terraform
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
A Gateway for connecting application services in different domains, networks, and cloud infrastructures
A Cloud Native Batch System (Project under CNCF)
A flexible and scalable platform for running Kubernetes control plane APIs.
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
Machine Learning Pipelines for Kubeflow
Panel: The powerful data exploration & web app framework for Python
Anonymized versions of six datasets taken from IBM's DataStage™ production systems that can be used for frequent subgraph mining
A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
⎈ Multi pod and container log tailing for Kubernetes -- Friendly fork of https://github.com/wercker/stern
Holistic job manager on Kubernetes
Enabling Kubernetes to make pod placement decisions with platform intelligence.
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.