Releases: flyteorg/flyte
Flyte v1.11.0 milestone release
Flyte v1.11.0 Release Notes
We're excited to announce the release of Flyte v1.11.0! This version brings a host of improvements, bug fixes, and new features designed to enhance your experience with Flyte. From operational enhancements to documentation updates, this release aims to make Flyte more robust, user-friendly, and feature-rich.
Breaking change
As of this version, overriding container image via task node overrides requires flytekit 1.11.0.
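One way to guard against this breaking change is to check the installed flytekit version before relying on image overrides. The following is a minimal sketch using only the Python standard library; the helper names (`parse_release`, `supports_image_override`, `installed_flytekit_supports_override`) are hypothetical and not part of flytekit, and pre-release suffixes such as `b1` are deliberately ignored as a simplification.

```python
from importlib import metadata

# Minimum flytekit version stated in the breaking-change note above.
MIN_FLYTEKIT = (1, 11, 0)


def parse_release(version: str) -> tuple:
    """Turn '1.11.0' or '1.10.7b4' into a tuple of three ints.

    Simplification: anything after the leading digits of each
    component (e.g. a 'b4' pre-release suffix) is ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_image_override(version: str) -> bool:
    """True if the given version string is at least MIN_FLYTEKIT."""
    return parse_release(version) >= MIN_FLYTEKIT


def installed_flytekit_supports_override() -> bool:
    """Check the locally installed flytekit, if any."""
    try:
        return supports_image_override(metadata.version("flytekit"))
    except metadata.PackageNotFoundError:
        return False
```

For example, `supports_image_override("1.10.7")` is false while `supports_image_override("1.11.0")` is true, so a deployment script could fail early with a clear message instead of at registration time.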
Highlights
- Agents hit General Availability (GA): Agents, now in General Availability, are long-running, stateless services that facilitate asynchronous job launches on platforms like Databricks or Snowflake and enable external service calls. They are versatile, supporting implementations in any language through a protobuf interface, enhancing Flyte's flexibility and operational efficiency.
- Improved Caching: Support for loading cached sublists with multiple data types has been introduced, eliminating issues related to cache retrieval across varied data formats.
- Tracing and Observability: The introduction of the OpenTelemetry BlobstoreClientTracer in flyteadmin enhances observability, allowing for better monitoring and troubleshooting.
- Security Enhancements: Added securityContext configuration to Flyte-core charts, strengthening the security posture of Flyte deployments.
- Documentation Overhaul: Continuous improvements and updates have been made to the documentation, fixing broken links and updating content for better clarity and usability.
- Operational Improvements: This release introduces enhancements such as adding a service account for V1 Ray Jobs, caching console assets in a single binary, and conditional mounting of secrets to improve the operational efficiency of Flyte. Additionally, we are removing kustomize from our deployment process to simplify the configuration and management of Flyte instances, making it easier for users to maintain and streamline their deployment workflows.
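To make the agent idea from the highlights more concrete, here is a toy, in-memory sketch of the pattern: a stateless service that launches a job on an external platform and is then polled for its status. Everything here is an assumption for illustration: `FakePlatform`, `ToyAgent`, `run_to_completion`, and the `create`/`get`/`delete` method names are hypothetical and are not Flyte's protobuf interface or the flytekit agent API.

```python
class FakePlatform:
    """Hypothetical stand-in for an external service such as a job API."""

    def __init__(self, polls_until_done: int = 2):
        self._polls = {}  # job id -> number of status checks so far
        self._polls_until_done = polls_until_done

    def submit(self, query: str) -> str:
        job_id = f"job-{len(self._polls)}"
        self._polls[job_id] = 0
        return job_id

    def status(self, job_id: str) -> str:
        self._polls[job_id] += 1
        done = self._polls[job_id] >= self._polls_until_done
        return "SUCCEEDED" if done else "RUNNING"

    def cancel(self, job_id: str) -> None:
        self._polls.pop(job_id, None)


class ToyAgent:
    """Keeps no job state: each call carries the id returned by create()."""

    def __init__(self, platform: FakePlatform):
        self._platform = platform

    def create(self, query: str) -> str:
        return self._platform.submit(query)

    def get(self, job_id: str) -> str:
        return self._platform.status(job_id)

    def delete(self, job_id: str) -> None:
        self._platform.cancel(job_id)


def run_to_completion(agent: ToyAgent, query: str, max_polls: int = 10) -> str:
    """Launch a job, poll until it succeeds, and cancel it on timeout."""
    job_id = agent.create(query)
    for _ in range(max_polls):
        if agent.get(job_id) == "SUCCEEDED":
            return job_id
    agent.delete(job_id)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")
```

Because the agent itself holds no per-job state, the caller can hand the returned job id to any replica of the service, which is what makes this style of service easy to run long-lived and to scale.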
Bug Fixes
- Fixed Literal in Launchplan: Added fixed_literal to the launchplan template, addressing issues with hardcoded values in workflows.
- Corrected Metadata and Resources: Fixes have been applied to correct IsParent metadata in ArrayNode eventing and to address invalid "resources" scope issues in deployment configurations.
- Enhanced Stability and Performance: Numerous bug fixes have been implemented to address stability and performance issues, including fixes for data catalog errors, yaml comment errors in pod template examples, and more.
Documentation and Guides
- Comprehensive Guides: New guides and documentation updates have been added, including a ChatGPT Agent Setup guide and an Airflow migration guide. Improvements in documentation for developing agents have been integrated into the broader enhancements for this release.
- Updated Troubleshooting and Configuration Docs: New troubleshooting guides for spark task execution and updates to deployment configuration documents enhance the knowledge base for Flyte users.
Contributors
We extend our deepest gratitude to all the contributors who made this release possible. Special shoutouts to @neilisaur, @lowc1012, @MortalHappiness, @novahow, and @pryce-turner for making their first contributions!
For a full list of changes, enhancements, and bug fixes, visit our changelog.
Thank you for your continued support of Flyte. We look forward to hearing your feedback on this release!
Flyte v1.11.0-b1 milestone release
Flyte v1.11.0-b1
Second beta release for 1.11.0.
This includes a refresh of flyteconsole.
Flyte v1.11.0-b0 milestone release
Flyte v1.11.0-b0
Beta release to test the new IDL.
Flyte v1.10.7 milestone release
Flyte 1.10.7 Release Notes
We're excited to share the release of Flyte 1.10.7, featuring a broad spectrum of updates, improvements, and bug fixes across the Flyte ecosystem. This release marks a pivotal shift in our development approach, notably with our adoption of buf for protobuf stub generation. This move optimizes our development workflow and discontinues the automatic creation of Java and C++ stubs, making it easier to adapt the generated code for other languages as needed. Additionally, we've upgraded to gRPC-gateway v2, aligning with the latest advancements and recommendations found in the v2 migration guide.
Our sincere gratitude goes to all contributors for their invaluable efforts towards this release.
Core Improvements and Bug Fixes
- Improved error handling for transient secret sync issues, enhancing the robustness of secret management. [PR #4310]
- Introduced Sphinx build for Monodocs, improving documentation generation and integration. [PR #4347]
- Enhanced the Spark plugin by fixing the environment variable ValueFrom for pod templates, allowing for more dynamic configurations. [PR #4532]
- Optimized fastcache behavior to not cache lookups on node skip, reducing unnecessary cache hits. [PR #4524]
- Removed composition errors from branch nodes, streamlining execution paths. [PR #4528]
- Added support for ignoring warnings related to AWS SageMaker imports, improving integration compatibility. [PR #4540]
- Fixed a bug related to setting the service account from PodTemplate, ensuring correct service account usage. [PR #4536]
- Addressed flaky tests in test_monitor, enhancing test reliability. [PR #4537]
- Updated the boilerplate version and contribution guide, facilitating better community contributions. [PR #4541], [PR #4501]
- Improved documentation build processes by manually creating version files and introducing a conda-lock file for consistent environment setup. [PR #4556], [PR #4553]
- Enhanced array node evaluation frequency optimization by detecting subNode phase updates. [PR #4535]
- Introduced support for failure nodes, allowing workflows to handle failures more gracefully. [PR #4308]
- Made various updates to Go versions, plugin integrations, and GitHub workflows to enhance performance and developer experience. [PR #4534], [PR #4582], [PR #4589]
- Addressed several bugs and made improvements in caching, metadata handling, and task execution, further stabilizing the Flyte platform. [PR #4594], [PR #4590], [PR #4607]
- Streamlined development workflow with the transition to buf for generating protobuf stubs, ceasing the automatic generation of Java and C++ stubs.
- Upgraded to grpc-gateway v2, optimizing API performance and compatibility.
Plugin and Integration Enhancements
- Added new features and fixed bugs in the Spark plugin, Ray Autoscaler integration, and other areas, expanding Flyte's capabilities and integration ecosystem. [PR #4363]
- Updated various dependencies and configurations, ensuring compatibility and security. [PR #4571], [PR #4643]
- Improved the handling and documentation of plugin secrets management, making it easier for users to manage sensitive information. [PR #4732]
Documentation and Community
- Updated community meeting cadence and contribution guidelines, fostering a more engaged and welcoming community. [PR #4699]
- Enhanced documentation through various updates, including the introduction of a new architecture image for FlytePlugins and clarification of propeller scaling. [PR #4661], [PR #4741]
Full Changelog
- Fix transient secret sync error handling by @Tom-Newton in #4310
- Monodocs sphinx build by @cosmicBboy in #4347
- [Spark plugin] Fix environment variable ValueFrom for pod templates by @Tom-Newton in #4532
- fastcache should not cache lookup on node skip by @hamersaw in #4524
- Removed composition error from branch node by @hamersaw in #4528
- ignore warnings related to awssagemaker import by @cosmicBboy in #4540
- [BUG] Fix setting of service_account from PodTemplate by @pvditt in #4536
- Fix flaky test_monitor by @pingsutw in #4537
- Update boilerplate version by @flyte-bot in #4541
- remove hardcoded list of tests by @samhita-alla in #4521
- manually create flytekit/_version.py file in docs build by @cosmicBboy in #4556
- introduce conda-lock file for docs by @cosmicBboy in #4553
- Detect subNode phase updates to reduce evaluation frequency of ArrayNode by @hamersaw in #4535
- Add support failure node by @pingsutw in #4308
- Return InvalidArgument for workflow compilation failures in CreateWorkflow by @katrogan in #4566
- Update to go 1.21 by @eapolinario in #4534
- Update contribution guide by @pingsutw in #4501
- Add flyin plugin to monodocs integrations page by @neverett in #4582
- Use updated cronSchedule in CreateLaunchPlanModel by @pmahindrakar-oss in #4564
- Writing zero length inputs by @hamersaw in #4594
- Feature/add pod pending timeout config by @pvditt in #4590
- Run single-binary gh workflows on all PRs by @eapolinario in #4589
- auto-generate toctree from flytesnacks index.md docs by @cosmicBboy in #4587
- add repo tag and commit associated with the build by @cosmicBboy in #4571
- monodocs - gracefully handle case when external repo doesn't contain tags: use current commit by @cosmicBboy in #4598
- convert commit to string by @cosmicBboy in #4599
- Bug/abort map task subtasks by @pvditt in #4506
- Supporting parallelized workers in ArrayNode subNodes by @hamersaw in #4567
- Don't use experimental readthedocs build.commands config by @cosmicBboy in #4606
- Ignore cache variables by @hamersaw in #4618
- Feature/add cleanup non recoverable pod statuses by @pvditt in #4607
- Agent Metadata Servicer by @Future-Outlier in #4511
- Add Flyin propeller config by @eapolinario in #4610
- Correctly computing ArrayNode maximum attempts and system failures by @hamersaw in #4627
- Agent Sync Plugin by @Future-Outlier in #4107
- Add github token in buf gh action by @eapolinario in #4626
- Update flyte-binary values by @davidmirror-ops in #4604
- Fixing cache overwrite metadata update by @hamersaw in #4617
- Fixing 100 kilobyte max error message size by @hamersaw in #4631
- Add Ray Autoscaler to the Flyte-Ray plugin by @Yicheng-Lu-llll in #4363
- Artifact protos and related changes by @wild-endeavor in #4474
- Remove protoc-gen-validate by @eapolinario in #4643
- Readme update 2023 by @davidmirror-ops in #4549
- Fixing ArrayNode integration with backoff controller by @hamersaw in #4640
- Avoid to use the http.DefaultClient by @andresgomezfrr in #4667
- Update dns policy for sandbox buildkit instance to ClusterFirstWithHo… by @jeevb in #4678
- Updating ArrayNode ExternalResourceInfo ID by @hamersaw in #4677
- Feat: Inject user identity as pod label in K8s plugin by @fg91 in https://githu...
Flyte v1.10.7-b4 milestone release
Flyte v1.10.7-b4 Release
Pre-release testing.
Flyte v1.10.7-b3 milestone release
Flyte v1.10.7-b3 Release
Pre-release testing.
Flyte v1.10.7-b2 milestone release
Flyte v1.10.7-b2 Release
Pre-release testing.
Flyte v1.10.7-b1 milestone release
Flyte v1.10.7-b1 Release
Pre-release testing.
Flyte v1.10.7-b0 milestone release
Flyte v1.10.7-b0 Release
Beta release.
Flyte v1.10.6 milestone release
Flyte 1.10.6 Release
Due to a mishap in the move to the monorepo, we ended up generating the git tags from 1.10.1 through 1.10.5. To reduce confusion, we decided to skip those patch versions and go straight to the next available version.
We've shipped a ton of stuff in this patch release; here are some of the highlights.
GPU Accelerators
You'll be able to be more fine-grained in your use of GPU accelerators in your tasks. Here are some examples:
No preference of GPU accelerator to use:
from flytekit import Resources, task

@task(limits=Resources(gpu="1"))
def my_task() -> None:
    ...
Schedule on a specific GPU accelerator:
from flytekit import Resources, task
from flytekit.extras.accelerators import T4

@task(
    limits=Resources(gpu="1"),
    accelerator=T4,
)
def my_task() -> None:
    ...
Schedule on a Multi-instance GPU (MIG) accelerator with no preference of partition size:
from flytekit import Resources, task
from flytekit.extras.accelerators import A100

@task(
    limits=Resources(gpu="1"),
    accelerator=A100,
)
def my_task() -> None:
    ...
Schedule on a Multi-instance GPU (MIG) accelerator with a specific partition size:
from flytekit import Resources, task
from flytekit.extras.accelerators import A100

@task(
    limits=Resources(gpu="1"),
    accelerator=A100.partition_1g_5gb,
)
def my_task() -> None:
    ...
Schedule on an unpartitioned Multi-instance GPU (MIG) accelerator:
from flytekit import Resources, task
from flytekit.extras.accelerators import A100

@task(
    limits=Resources(gpu="1"),
    accelerator=A100.unpartitioned,
)
def my_task() -> None:
    ...
Improved support for Ray logs
#4266 opens the door for RayJob logs to be persisted.
In #4397 we added support for a link to a Ray dashboard to show up in the task card.
Updated grafana dashboards
We updated the official grafana dashboards in #4382.
Support for Azure AD
A new version of our stow fork added support for Azure AD in flyteorg/stow#9.
Full changelog:
- Restructure Flyte releases by @eapolinario in #4304
- Use debian bookworm as single binary base image by @eapolinario in #4311
- Use local version in single-binary by @eapolinario in #4294
- Accessibility for README by @mishmanners in #4322
- Add tests in flytepropeller/pkg/controller/executors from 72.3% to 87.3% coverage by @Future-Outlier in #4276
- fix: remove unused setting in deployment charts by @HeetVekariya in #4252
- Document simplified retry behaviour introduced in #3902 by @fg91 in #4022
- Ray logs persistence by @jeevb in #4266
- Not revisiting task nodes and correctly incrementing parallelism by @hamersaw in #4318
- Fix RunPluginEndToEndTest util by @andresgomezfrr in #4342
- Tune sandbox readiness checks to ensure that sandbox is fully accessi… by @jeevb in #4348
- Chore: Ensure Stalebot doesn't close issues we've not yet triaged. by @brndnblck in #4352
- Do not automatically close stale issues by @eapolinario in #4353
- Fix: Set flyteadmin gRPC port to 80 in ingress if using TLS between load balancer and backend by @fg91 in #3964
- Support Databricks WebAPI 2.1 version and Support existing_cluster_id and new_cluster options to create a Job by @Future-Outlier in #4361
- Fixing caching on maptasks when using partials by @hamersaw in #4344
- Fix read raw limit by @honnix in #4370
- minor fix to eks-starter.yaml by @guyarad in #4337
- Reporting running if the primary container status is not yet reported by @hamersaw in #4339
- completing retries even if minSuccesses are achieved by @hamersaw in #4338
- Add comment to auth scope by @wild-endeavor in #4341
- Update tests by @eapolinario in #4381
- Update order of cluster resources config to work with both uctl and flytectl by @neverett in #4373
- Update tests in single-binary by @eapolinario in #4383
- Passthrough unique node ID in task execution ID for generating log te… by @jeevb in #4380
- Add Sections in the PR Template by @Future-Outlier in #4367
- Update metadata in ArrayNode TaskExecutionEvents by @hamersaw in #4355
- Fixes list formatting in flytepropeller arch docs by @thomasjpfan in #4345
- Update boilerplate end2end tests by @hamersaw in #4393
- Handle all ray job statuses by @EngHabu in #4389
- Relocate sandbox config by @davidmirror-ops in #4385
- Refactor task logs framework by @jeevb in #4396
- Add support for displaying the Ray dashboard when a RayJob is active by @jeevb in #4397
- Disable path filtering for monorepo components by @eapolinario in #4404
- Silence NotFound when get task resource by @honnix in #4388
- adding consoleUrl parameterization based on partition by @lauralindy in #4375
- [Docs] Sensor Agent Doc by @Future-Outlier in #4195
- [flytepropeller] Add Tests in v1alpha.go including array_test.go, branch_test.go, error_test.go, and iface_test.go with 0.13% Coverage Improvement by @Future-Outlier in #4234
- Add more context for ray log template links by @jeevb in #4416
- Add ClusterRole config for Ray by @davidmirror-ops in #4405
- Fix and update scripts for generating grafana dashboards by @Tom-Newton in #4382
- Add artifacts branch to publish to buf on push by @squiishyy in #4450
- Add service monitor for flyte admin and propeller service by @vraiyaninv in #4427
- Fix Kubeflow TF Operator GetTaskPhase Bug by @Future-Outlier in #4469
- Instrument opentelemetry by @hamersaw in #4357
- Delete the .github folder from each subdirectory by @pingsutw in #4480
- Fix the loop variable scheduler issue by @pmahindrakar-oss in #4468
- Databricks Plugin Setup Doc Enhancement by @Future-Outlier in #4445
- Put ticker back in place in propeller gc by @eapolinario in #4490
- Store failed execution in flyteadmin by @iaroslav-ciupin in #4390
- Moving from flyteadmin - Upgrade coreos/go-oidc to v3 to pickup claims parsing fixes by @eapolinario in #4139
- Bump flyteorg/stow to 0.3.8 by @eapolinario in #4312
- Remove 'needs' from generate_flyte_manifest by @eapolinario in #4495
- Update Flyte components by @flyte-bot in #4302
- Modify how flytecopilot version is parsed from values file by @eapolinario in #4496
- Ignore component tags in goreleaser by @eapolinario in #4497
- Fix indentation of shell: task by @eapolinario in #4498
- Implemented simple echo plugin for testing by @hamersaw in #4489
- Correctly handle resource overrides in KF plugins by @jeevb in #4467
- Remove deprecated InjectDecoder by @EngHabu in #4507
- Fix $HOME resolution and webhook namespace by @EngHabu in #4509
- Add note on updating sandbox cluster configuration by @jeevb in #4510
- Add New PR Template by @Future-Outlier in #4512
- [Docs] Databricks Agent Doc by @Future-Outlier in #4008
- Bump version of goreleaser gh action to v5 by @eapolinario in #4519
- Kf operators use GetReplicaFunc (Error Handling) by @Future-Outlier in #4471
New Contributors
- @HeetVekariya made their first contribution in #4252
- @andresgomezfrr made their first contribution in #4342
- @brndnblck made their first contribution in #4352
- @guyarad made their first contribution in #4337
- @neverett made their first contribution in #4373
- @thomasjpfan made their first...