# Core concepts
> This bundle contains all pages in the Core concepts section.
> Source: https://www.union.ai/docs/v2/union/user-guide/core-concepts/

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts ===

# Core concepts

Now that you've completed the [Quickstart](https://www.union.ai/docs/v2/union/user-guide/quickstart/page.md), let's explore Flyte's core concepts through working examples.

By the end of this section, you'll understand:

- **TaskEnvironment**: The container configuration that defines where and how your code runs
- **Tasks**: Python functions that execute remotely in containers
- **Runs and Actions**: How Flyte tracks and manages your executions
- **Apps**: Long-running services for APIs, dashboards, and inference endpoints

Each concept is introduced with a practical example you can run yourself.

## How Flyte works

When you run code with Flyte, here's what happens:

1. You define a **TaskEnvironment** that specifies the container image and resources
2. You decorate Python functions with `@env.task` to create **tasks**
3. When you execute a task, Flyte creates a **run** that tracks the execution
4. Each task execution within a run is an **action**

Let's explore each of these in detail.

## Reliability

How Union.ai keeps your work accountable when running across clusters.

### Leases

How Union.ai tracks work across clusters, why you occasionally see `lease expired`, and how the system protects you from runaway compute.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/task-environment ===

# TaskEnvironment

A `TaskEnvironment` defines the hardware and software environment where your tasks run. Think of it as the container configuration for your code.

## A minimal example

Here's the simplest possible TaskEnvironment:

```python
import flyte

env = flyte.TaskEnvironment(name="my_env")

@env.task
def hello() -> str:
    return "Hello from Flyte!"
```

With just a `name`, you get Flyte's default container image and resource allocation. This is enough for simple tasks that only need Python and the Flyte SDK.

## What TaskEnvironment controls

A TaskEnvironment specifies two things:

**Hardware environment** - The compute resources allocated to each task:
- CPU cores
- Memory
- GPU type and count

**Software environment** - The container image your code runs in:
- Base image (Python version, OS)
- Installed packages and dependencies
- Environment variables

## Configuring resources

Use the `resources` parameter to specify compute resources:

```python
env = flyte.TaskEnvironment(
    name="compute_heavy",
    resources=flyte.Resources(cpu="4", memory="16Gi"),
)
```

For GPU workloads:

```python
env = flyte.TaskEnvironment(
    name="gpu_training",
    resources=flyte.Resources(cpu="8", memory="32Gi", gpu="1"),
    accelerator=flyte.GPUAccelerator.NVIDIA_A10G,
)
```

## Configuring container images

For tasks that need additional Python packages, specify a custom image:

```python
image = flyte.Image.from_debian_base().with_pip_packages("pandas", "scikit-learn")

env = flyte.TaskEnvironment(
    name="ml_env",
    image=image,
)
```

See [Container images](https://www.union.ai/docs/v2/union/user-guide/task-configuration/container-images) for detailed image configuration options.

## Multiple tasks, one environment

All tasks decorated with the same `@env.task` share that environment's configuration:

```python
env = flyte.TaskEnvironment(
    name="data_processing",
    resources=flyte.Resources(cpu="2", memory="8Gi"),
)

@env.task
def load_data(path: str) -> dict:
    # Runs with 2 CPU, 8Gi memory
    ...

@env.task
def transform_data(data: dict) -> dict:
    # Also runs with 2 CPU, 8Gi memory
    ...
```

This is useful when multiple tasks have similar requirements.
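
To make the sharing concrete, here is a minimal stand-in written with only the standard library. `ToyEnvironment` and its fields are hypothetical and are not Flyte's implementation; the sketch only shows how a decorator bound to one environment object can attach the same configuration to every function it wraps.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for an environment; not the Flyte class."""

    name: str
    cpu: str = "1"
    memory: str = "1Gi"
    tasks: dict[str, Callable] = field(default_factory=dict)

    def task(self, fn: Callable) -> Callable:
        # Register the function and tag it with this environment.
        fn.environment = self
        self.tasks[fn.__name__] = fn
        return fn


env = ToyEnvironment(name="data_processing", cpu="2", memory="8Gi")


@env.task
def load_data(path: str) -> dict:
    return {"path": path}


@env.task
def transform_data(data: dict) -> dict:
    return {**data, "transformed": True}


# Both functions point at the same configuration object.
shared = load_data.environment is transform_data.environment
print(shared, sorted(env.tasks))
```

Because both functions reference one object, changing the environment's settings in a single place affects every task registered on it.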

## Multiple environments

When tasks have different requirements, create separate environments:

```python
light_env = flyte.TaskEnvironment(
    name="light",
    resources=flyte.Resources(cpu="1", memory="2Gi"),
)

heavy_env = flyte.TaskEnvironment(
    name="heavy",
    resources=flyte.Resources(cpu="8", memory="32Gi"),
)

@light_env.task
def preprocess(data: str) -> str:
    # Light processing
    ...

@heavy_env.task
def train_model(data: str) -> dict:
    # Resource-intensive training
    ...
```

## Next steps

Now that you understand TaskEnvironments, let's look at how to define [tasks](./tasks) that run inside them.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/tasks ===

# Tasks

A task is a Python function that runs remotely in a container. You create tasks by decorating functions with `@env.task`.

> **📝 Note**
>
> In Flyte 1, tasks and workflows were defined with separate `@task`, `@workflow`, and `@dynamic` decorators. Flyte 2 uses a single `@env.task` decorator, obtained from a `flyte.TaskEnvironment` instance, and everything is a task.

## Defining a task

Here's a simple task:

```python
import flyte

env = flyte.TaskEnvironment(name="my_env")

@env.task
def greet(name: str) -> str:
    return f"Hello, {name}!"
```

The `@env.task` decorator tells Flyte to run this function in a container configured by `env`.

## Type hints are required

Flyte uses type hints to understand your data and serialize it between tasks:

```python
@env.task
def process_numbers(values: list[int]) -> int:
    return sum(values)
```

Supported types include:
- Primitives: `int`, `float`, `str`, `bool`
- Collections: `list`, `dict`, `tuple`
- DataFrames: `pandas.DataFrame`, `polars.DataFrame`
- Files: `flyte.io.File`, `flyte.io.Dir`
- Custom: dataclasses, Pydantic models

See [Data classes and structures](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataclasses-and-structures) for complex types.
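
As a rough illustration of why annotations matter, the sketch below uses the standard library's `typing.get_type_hints` to read a function's declared input and output types at runtime. It shows the general mechanism a framework can rely on; it is not a description of how Flyte serializes data internally.

```python
from typing import get_type_hints


def process_numbers(values: list[int]) -> int:
    return sum(values)


# Annotations are available at runtime as a mapping of name -> type.
hints = get_type_hints(process_numbers)
input_types = {k: v for k, v in hints.items() if k != "return"}
output_type = hints["return"]

print(input_types)
print(output_type)
```

Without the annotations, `get_type_hints` returns an empty mapping, so nothing is known about what goes in or out of the function.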

## Tasks calling tasks

In Flyte 2, tasks can call other tasks directly. The called task runs in its own container:

```python
@env.task
def fetch_data(url: str) -> dict:
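    # Runs in its own container when called from another task
    ...

@env.task
def process_data(url: str) -> str:
    data = fetch_data(url)  # fetch_data runs in a separate container from process_data
    return transform(data)  # transform is a placeholder for your own helper function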
```

This is how you build workflows in Flyte 2. There's no separate `@workflow` decorator - just tasks calling tasks.

## The top-level task

The task you execute directly is the "top-level" or "driver" task. It orchestrates other tasks:

```python
@env.task
def step_one(x: int) -> int:
    return x * 2

@env.task
def step_two(x: int) -> int:
    return x + 10

@env.task
def pipeline(x: int) -> int:
    a = step_one(x)   # Run step_one
    b = step_two(a)   # Run step_two with result
    return b
```

When you run `pipeline`, it becomes the top-level task and orchestrates `step_one` and `step_two`.

## Running tasks locally

For quick testing, you can call a task like a regular function:

```python
# Direct call - runs locally, not in a container
result = greet("World")
print(result)  # "Hello, World!"
```

This bypasses Flyte entirely and is useful for debugging logic. However, local calls don't track data, use remote resources, or benefit from Flyte's features.

## Running tasks remotely

To run a task on your Flyte backend:

```python
import flyte

flyte.init_from_config()
run = flyte.run(greet, name="World")  # Returns a handle to the remote run
print(run.url)  # Link to the run in the UI
```

Or from the command line:

```bash
flyte run my_script.py greet --name World
```

This sends your code to the Flyte backend, runs it in a container, and returns the result.

## Next steps

Now that you can define and run tasks, let's understand how Flyte tracks executions with [runs and actions](./runs-and-actions).

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/runs-and-actions ===

# Runs and actions

When you execute a task on Flyte, the system creates a **run** to track it. Each individual task execution within that run is an **action**. Understanding this hierarchy helps you navigate the UI and debug your workflows.

## What is a run?

A **run** is the execution of a task that you directly initiate, plus all its descendant task executions, considered as a single unit.

When you execute:

```bash
flyte run my_script.py pipeline --x 5
```

Flyte creates a run for `pipeline`. If `pipeline` calls other tasks, those executions are part of the same run.

## What is an action?

An **action** is the execution of a single task, considered independently. A run consists of one or more actions.

Consider this workflow:

```python
@env.task
def step_one(x: int) -> int:
    return x * 2

@env.task
def step_two(x: int) -> int:
    return x + 10

@env.task
def pipeline(x: int) -> int:
    a = step_one(x)
    b = step_two(a)
    return b
```

When you run `pipeline(5)`:

- **1 run** is created for the entire execution
- **3 actions** are created: one for `pipeline`, one for `step_one`, one for `step_two`
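
The counting above can be reproduced with a toy model that uses only the standard library. `ToyRun` and `toy_task` are hypothetical names invented for this sketch; they are not Flyte APIs. The decorator simply logs each call as one action before running the function.

```python
from dataclasses import dataclass, field
from functools import wraps


@dataclass
class ToyRun:
    """Hypothetical record of one run: the ordered list of task executions."""

    actions: list[str] = field(default_factory=list)


current_run = ToyRun()


def toy_task(fn):
    """Stand-in decorator: log each call as one action, then run the function."""

    @wraps(fn)
    def wrapper(*args, **kwargs):
        current_run.actions.append(fn.__name__)
        return fn(*args, **kwargs)

    return wrapper


@toy_task
def step_one(x: int) -> int:
    return x * 2


@toy_task
def step_two(x: int) -> int:
    return x + 10


@toy_task
def pipeline(x: int) -> int:
    a = step_one(x)
    b = step_two(a)
    return b


result = pipeline(5)
print(result)               # 20
print(current_run.actions)  # ['pipeline', 'step_one', 'step_two']
```

One call to `pipeline` yields one run record containing three entries, matching the one-run, three-actions breakdown.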

## Runs vs actions in practice

| Concept | What it represents | In the UI |
|---------|-------------------|-----------|
| **Run** | Complete execution initiated by user | Runs list, top-level view |
| **Action** | Single task execution | Individual task details, logs |

For details on how to run tasks locally and remotely, see [Tasks](./tasks#running-tasks-locally).

## Viewing runs in the UI

After running a task remotely, click the URL in the output to see your run in the UI:

```bash
flyte run my_script.py pipeline --x 5
```

Output:

```bash
abc123xyz
https://my-instance.example.com/v2/runs/project/my-project/domain/development/abc123xyz
Run 'a0' completed successfully.
```

In the UI, you can:

- See the overall run status and duration
- Navigate to individual actions
- View inputs and outputs for each task
- Access logs for debugging
- See the execution graph

## Understanding the execution graph

The UI shows how tasks relate to each other:

```
pipeline (action)
├── step_one (action)
└── step_two (action)
```

Each box is an action. Arrows show data flow between tasks. This visualization helps you understand complex workflows and identify bottlenecks.
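
For intuition about the parent/child structure, the sketch below renders the same tree from a plain mapping. The `children` dictionary and the `render` helper are hypothetical and only illustrate the hierarchy; they are not how the UI builds its graph.

```python
# Hypothetical parent -> children mapping for the run above.
children = {
    "pipeline": ["step_one", "step_two"],
    "step_one": [],
    "step_two": [],
}


def render(node: str, prefix: str = "", is_root: bool = True, is_last: bool = True) -> list[str]:
    """Return the tree rooted at `node` as a list of text lines."""
    if is_root:
        lines = [f"{node} (action)"]
        child_prefix = ""
    else:
        branch = "└── " if is_last else "├── "
        lines = [f"{prefix}{branch}{node} (action)"]
        child_prefix = prefix + ("    " if is_last else "│   ")
    kids = children[node]
    for i, kid in enumerate(kids):
        lines.extend(render(kid, child_prefix, False, i == len(kids) - 1))
    return lines


tree = render("pipeline")
print("\n".join(tree))
```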

## Checking run status

From the command line:

```bash
flyte get run <run-id>
```

From Python:

```python
import flyte

flyte.init_from_config()
run = flyte.run(pipeline, x=5)

# The run object has status information
print(run.status)
```

## Next steps

You now understand tasks and how Flyte tracks their execution. Next, let's learn about [apps](./introducing-apps) - Flyte's approach to long-running services.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/where-data-lives ===

# Where your data lives

A developer's map of what Flyte stores in the control plane database versus the data plane object store, and what "metadata" actually means.

When you run a Flyte task, your data ends up in two stores: a **database** in the control plane and an **object-store bucket** in the data plane.

## The two stores

| | Control plane database | Data plane object store |
|---|---|---|
| **Backing tech** | Postgres (plus a few internal coordination stores) | S3, GCS, or ABS bucket |
| **What's in it** | Every record Flyte uses to *describe* your runs, plus pointers to where each run's inputs and outputs live | Every run's inputs and outputs, and all bulk/offloaded content |
| **Lifetime** | Durable; long-lived history | Durable, but you can apply lifecycle/retention rules |

## Who manages the stores

* The control plane database (being part of the control plane) is always managed by Union.ai, regardless of whether you are using a BYOC or self-managed deployment.

* The data plane bucket lives in your cloud account. In a BYOC deployment, it is managed by Union.ai (as is the entire data plane). In a self-managed deployment, you manage the bucket yourself.

The database is the **source of truth for what executed**. The bucket is **where your runs' actual input and output values live**.

## What goes in the database

The control plane database holds everything Flyte needs to enumerate, schedule, and replay your work. Specifically:

- **Registrations** — every task you've deployed, every trigger you've registered, every project and domain. A task's definition includes its *default* input values, which are stored inline as part of the registration.
- **Execution records** — every run, every action (task / trace / condition) inside that run, attempts, phases, timing, error messages, parent/child relationships.
- **Schedules and triggers** — `Cron`, event triggers, and their revision history.
- **Pointers to runtime inputs and outputs** — the database stores the *URI* of each run's `inputs.pb` / `outputs.pb`, not the values themselves. (One exception: an awaited *condition* / approval action stores the value that satisfies it inline.)
- **Caches** — the cache key → output-URI mapping for `@env.task(cache=...)`.

The values your tasks actually pass at runtime — even a bare `int` — do **not** live in the database. They are written to `inputs.pb` / `outputs.pb` in the bucket, and the database keeps only the pointer. See the next section.

(Internally, Flyte uses several backing databases — Postgres for registrations and run history, separate stores for in-flight action coordination and caches. For developer purposes the only thing that matters is that they're all small-record, structured stores; none of them hold bulk content.)
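
The pointer-versus-value split can be sketched with two dictionaries. Everything here is an assumption made for illustration: the path layout, the `phase` field, the `record_action` helper, and the placeholder byte strings (real `inputs.pb` / `outputs.pb` files are serialized differently).

```python
# Toy stand-ins: a dict for the bucket, a dict for the database.
bucket: dict[str, bytes] = {}
database: dict[str, dict[str, str]] = {}


def record_action(action_id: str, inputs: bytes, outputs: bytes, prefix: str) -> None:
    """Write the values to the bucket and only their URIs to the database."""
    inputs_uri = f"{prefix}/{action_id}/inputs.pb"    # hypothetical layout
    outputs_uri = f"{prefix}/{action_id}/outputs.pb"  # hypothetical layout
    bucket[inputs_uri] = inputs
    bucket[outputs_uri] = outputs
    database[action_id] = {
        "phase": "SUCCEEDED",
        "inputs_uri": inputs_uri,
        "outputs_uri": outputs_uri,
    }


# Even a bare int travels through the bucket, not the database.
record_action("a0", inputs=b"x=5", outputs=b"20", prefix="s3://example-bucket/demo")

row = database["a0"]
print(row["outputs_uri"])
print(bucket[row["outputs_uri"]])
```

The database row contains only short strings, and the value itself is reachable only by following the URI into the bucket.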

## What goes in the bucket

Every run's inputs and outputs are written to the bucket as `inputs.pb` / `outputs.pb`, and the database stores a **pointer** (URI) to them. Within those files, small scalar values are inlined directly while large values are offloaded to separate objects and referenced by URI. The bucket holds:

- **Task inputs**, serialized as `inputs.pb` per run.
- **Task outputs**, serialized as `outputs.pb` per attempt.
- **Offloaded values** — `flyte.io.File`, `flyte.io.Dir`, `flyte.io.DataFrame`, pickled objects, models, anything large.
- **Decks** — the HTML reports your task renders.
- **Trace checkpoints** — used by `@flyte.trace` to resume partial work.
- **Fast-registered code bundles** — what `flyte deploy` and `flyte run --copy-style all` upload so the cluster can run your local Python.
- **Image-build contexts** — when Union.ai builds a container image from an `Image` definition that requires a build context.

The layout under your bucket is `<project>/<domain>/...`, with the bulk of execution artifacts under per-run, per-action subprefixes (`<run-name>/<action>/...` for outputs / Decks / checkpoints) and sibling prefixes for offloaded inputs and SDK uploads (code bundles, image-build contexts). You don't typically need to know the exact paths; you do need to know that **everything above lives behind one configured bucket prefix**.

## What "metadata" means

The word "metadata" appears in several places and means a different thing each time. The two senses that matter for developers:

### 1. "Metadata" as in the control plane database (Flyte's usage)

When Flyte documentation says **"metadata is preserved"** or **"metadata lives in the control plane,"** it means the database records above: registrations (including task default values), run history, and status. It does **not** mean "the contents of the bucket."

This is the sense most relevant to you: the database is durable, and losing the bucket does not lose your execution history. It loses the *values* those history records pointed at.

### 2. "Metadata bucket" (a deployment/ops term you may see)

The Helm chart and some operational guides refer to a **"metadata bucket"** or `metadataContainer`. **This is a legacy name.** The bucket it refers to does *not* hold the database-style metadata above — it holds `inputs.pb`, `outputs.pb`, Decks, checkpoints, code bundles, and offloaded data. In other words, it holds exactly the "bucket" contents listed in the previous section.

If you see "metadata bucket" in an ops context, read it as **"the data plane object-store bucket."** The naming is unfortunate; the contents are what you'd expect from a data bucket.

You can largely ignore other appearances of the word in API surfaces (`TaskMetadata`, `ActionMetadata`, and `metadata_path` on `RunContext`, which is a local scratch directory used only by `from_local()` execution) — those are small property bags or local scratch paths and don't change where your data is stored.

## Per-run customization: `raw_data_path`

By default, offloaded values (`File`, `Dir`, `DataFrame`, checkpoints) land alongside everything else under the deployment's configured bucket prefix. You can route them to a different prefix — including a different bucket entirely — for a single run:

```python
import flyte

flyte.init_from_config()

run = flyte.with_runcontext(
    raw_data_path="s3://my-other-bucket/some/prefix",
).run(my_task, x=1)
```

This is the supported way to send a sensitive run to an isolated bucket, point at a bucket with different lifecycle rules, or otherwise route offloaded data per run. The `inputs.pb` / `outputs.pb` themselves still land in the deployment's bucket; only the *raw* offloaded contents move.

See [Run context](https://www.union.ai/docs/v2/union/user-guide/task-deployment/run-context) for the full set of `with_runcontext` options.
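
The routing rule can be summarized in a few lines of plain Python. This is a sketch of the behavior described in this section, not Flyte's code: `DEPLOYMENT_PREFIX`, the `destination` helper, and the `kind` labels are all hypothetical.

```python
from typing import Optional

DEPLOYMENT_PREFIX = "s3://deployment-bucket/base"  # hypothetical default


def destination(kind: str, raw_data_path: Optional[str] = None) -> str:
    """Pick the prefix for an object, following the rule described above.

    `kind` is "pb" for inputs.pb / outputs.pb, or "offloaded" for File, Dir,
    DataFrame, and checkpoint contents.
    """
    if kind == "offloaded" and raw_data_path is not None:
        return raw_data_path
    return DEPLOYMENT_PREFIX


override = "s3://my-other-bucket/some/prefix"
print(destination("offloaded"))            # default: deployment prefix
print(destination("offloaded", override))  # routed to the override
print(destination("pb", override))         # still the deployment prefix
```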

## What happens if the bucket is purged

If a retention rule deletes objects out of the bucket, the database records that pointed at them are **not** deleted — but their pointers now dangle. Concretely:

- Execution history, status, timing, structure: **still visible** in the UI. They come from the database.
- Input/output **previews, Deck views, artifact payloads**: show "not found" if the underlying bytes were purged.
- **Cache hits** for purged outputs: the cached pointer is dead, the task re-executes.
- **Trace resumption**: not possible if the checkpoint blob is gone.
- **Re-running an old execution**: fails if any input it needs has been purged.

This is the trade-off behind retention policies: you save storage cost at the price of no longer being able to inspect or re-run old executions whose offloaded values have aged out. New executions are unaffected.
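
A toy model of the cache case shows what a dangling pointer looks like. The dictionaries, URIs, and helper functions are hypothetical, and storing the recomputed output under a new URI is an assumption of this sketch rather than documented behavior.

```python
# Toy stand-ins: the cache index (database side) maps cache keys to output
# URIs, and the bucket maps URIs to stored bytes.
bucket = {"s3://example-bucket/run1/a0/outputs.pb": b"20"}
cache_index = {"cache-key-1": "s3://example-bucket/run1/a0/outputs.pb"}
stats = {"executions": 0}


def compute() -> bytes:
    """Placeholder for re-executing the task."""
    stats["executions"] += 1
    return b"20"


def cached_call(key: str) -> bytes:
    uri = cache_index.get(key)
    if uri is not None and uri in bucket:
        return bucket[uri]  # pointer is live: cache hit
    # Pointer is missing or dangling: re-execute and store the output again.
    value = compute()
    new_uri = f"s3://example-bucket/recomputed/{key}/outputs.pb"
    bucket[new_uri] = value
    cache_index[key] = new_uri
    return value


first = cached_call("cache-key-1")             # hit, no execution
bucket.clear()                                 # a retention rule purges the objects
still_recorded = "cache-key-1" in cache_index  # the database record survives
second = cached_call("cache-key-1")            # dangling pointer, task re-executes
print(stats["executions"])
```

The record outlives the bytes it points at, which is why history stays visible while previews and cache hits stop working.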

Lifecycle / retention rules should be scoped to the offloaded-data prefixes, **not** applied bucket-wide — `inputs.pb` and `outputs.pb` are needed for in-flight executions to complete, so purging them mid-run breaks things.

For how retention policies are configured in your deployment, see [BYOC data retention policy](https://www.union.ai/docs/v2/union/deployment/byoc/data-retention-policy) or [Self-managed data retention](https://www.union.ai/docs/v2/union/deployment/selfmanaged/configuration/data-retention).

## The short version

- **Database** = the system of record. Holds registrations (including task default values), run history, schedules, and pointers to each run's inputs/outputs.
- **Bucket** = the object-store bucket. Holds every run's `inputs.pb`/`outputs.pb`, Decks, checkpoints, code bundles, and offloaded `File` / `Dir` / `DataFrame` contents.
- **"Metadata" in docs** usually means database-side records. **"Metadata bucket" in Helm/ops** is legacy naming for the data plane bucket — it does *not* hold database metadata.
- **`flyte.with_runcontext(raw_data_path=...)`** is your knob to send offloaded data elsewhere per run.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/introducing-apps ===

# Apps

Now that you understand tasks, let's learn about apps - Flyte's way of running long-lived services.

## Tasks vs apps

You've already learned about **tasks**: Python functions that run to completion in containers. Tasks are great for data processing, training, and batch operations.

**Apps** are different. An app is a long-running service that stays active and handles requests over time. Apps are ideal for:

- REST APIs and webhooks
- Model inference endpoints
- Interactive dashboards
- Real-time data services

| Aspect | Task | App |
|--------|------|-----|
| Lifecycle | Runs once, then exits | Stays running indefinitely |
| Invocation | Called with inputs, returns outputs | Receives HTTP requests |
| Use case | Batch processing, training | APIs, inference, dashboards |
| Durability | Inputs/outputs stored, can resume | Stateless request handling |

## AppEnvironment

Just as tasks use `TaskEnvironment`, apps use `AppEnvironment` to configure their runtime.

An `AppEnvironment` specifies:

- **Hardware**: CPU, memory, GPU allocation
- **Software**: Container image with dependencies
- **App-specific settings**: Ports, scaling, authentication

Here's a simple example:

```python
import flyte
from flyte.app.extras import FastAPIAppEnvironment

env = FastAPIAppEnvironment(
    name="my-app",
    image=flyte.Image.from_debian_base().with_pip_packages("fastapi", "uvicorn"),
    resources=flyte.Resources(cpu="1", memory="2Gi"),
)
```

## A hello world app

Let's create a minimal FastAPI app to see how this works.

First, create `hello_app.py`:

```python
# /// script
# requires-python = "==3.13"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
#    "uvicorn",
# ]
# ///

"""A simple "Hello World" FastAPI app example for serving."""

from fastapi import FastAPI
import pathlib
import flyte
from flyte.app.extras import FastAPIAppEnvironment

# Define a simple FastAPI application
app = FastAPI(
    title="Hello World API",
    description="A simple FastAPI application",
    version="1.0.0",
)

# Create an AppEnvironment for the FastAPI app
env = FastAPIAppEnvironment(
    name="hello-app",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
)

# Define API endpoints
@app.get("/")
async def root():
    return {"message": "Hello, World!"}

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

# Serving this script will deploy and serve the app on your Union/Flyte instance.
if __name__ == "__main__":
    # Initialize Flyte from a config file.
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)

    # Serve the app remotely.
    app_instance = flyte.serve(env)

    # Print the app URL.
    print(app_instance.url)
    print("App 'hello-app' is now serving.")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/getting-started/serving/hello_app.py*

### Understanding the code

- **`FastAPI()`** creates the web application with its endpoints
- **`FastAPIAppEnvironment`** configures the container and resources
- **`@app.get("/")`** defines an HTTP endpoint that returns a greeting
- **`flyte.serve()`** deploys and starts the app on your Flyte backend

### Serving the app

With your config file in place, serve the app:

```bash
flyte serve hello_app.py env
```

Or run the Python file directly (which calls `flyte.serve()` in the main block):

```bash
python hello_app.py
```

You'll see output like:

```output
https://my-instance.flyte.com/v2/domain/development/project/my-project/apps/hello-app
App 'hello-app' is now serving.
```

Click the link to view your app in the UI, where you can find the app URL. Append `/docs` to the app URL to open FastAPI's interactive API documentation.

## When to use apps vs tasks

Use **tasks** when:
- Processing takes seconds to hours
- You need durability (inputs/outputs tracked)
- Work is triggered by events or schedules
- Results need to be cached or resumed

Use **apps** when:
- Responses must be fast (milliseconds)
- You're serving an API or dashboard
- Users interact in real-time
- You need a persistent endpoint

## Common patterns

**Model serving with FastAPI**: Train a model with a Flyte pipeline, then serve predictions from it. During local development, the app loads the model from a local file. When deployed remotely, Flyte's `Parameter` system automatically resolves the model from the latest training run output. See [FastAPI app](https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/fastapi-app) for the full example.

**Agent UI with Gradio**: Build an interactive UI that kicks off agent runs using `flyte.with_runcontext()`. A single `RUN_MODE` environment variable controls the deployment progression: fully local (rapid iteration), local UI with remote task execution (cluster compute), or fully remote (production). See [Build apps](../build-apps/_index) for details.

## Next steps

You now understand the core building blocks of Flyte:

- **TaskEnvironment** and **AppEnvironment** configure where code runs
- **Tasks** are functions that execute and complete
- **Apps** are long-running services
- **Runs** and **Actions** track executions

Before diving deeper, check out [Key capabilities](./key-capabilities) for an overview of what Flyte can do—from parallelism and caching to LLM serving and error recovery.

Then head to [Basic project](./basic-project) to build a RAG application with an embedding pipeline and a Streamlit app.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/projects-and-domains ===

# Projects and domains

Union.ai organizes work into a hierarchy of **organization**, **projects**, and **domains**.

- **Organization**: Your Union.ai instance, typically representing a company or department. Set up during onboarding and mapped to your endpoint URL (e.g., `my-org.my-company.com`). You do not create or manage organizations directly. The organization is normally determined automatically from your endpoint URL, but you can override it with the `--org` flag on any CLI command (e.g., `flyte --org my-org get project`). This is only relevant if you have a multi-organization installation.
- **Project**: A logical grouping of related workflows, tasks, launch plans, and executions. Projects are the primary unit you create and manage.
- **Domain**: An environment classification within each project. Three fixed domains exist: `development`, `staging`, and `production`. Domains cannot be created or deleted.

Every project contains all three domains, creating **project-domain pairs** like `my-project/development`, `my-project/staging`, and `my-project/production`. Workflows, executions, and data are scoped to a specific project-domain pair.
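
As a small illustration, the pairs can be enumerated as the product of project IDs and the three fixed domains. The project IDs below are hypothetical.

```python
from itertools import product

DOMAINS = ("development", "staging", "production")  # the three fixed domains
projects = ["my-project", "other-project"]          # hypothetical project IDs

# Every project is combined with every domain.
pairs = [f"{project}/{domain}" for project, domain in product(projects, DOMAINS)]
print(pairs)
```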

## How projects and domains are used

When you run or deploy workflows, you target a project and domain:

- **CLI**: Use `--project` and `--domain` flags with `flyte run` or `flyte deploy`, or set defaults in your [configuration file](https://www.union.ai/docs/v2/union/user-guide/run-modes/running-remote).

- **Python SDK**: Specify `project` and `domain` in `flyte.init` or `flyte.init_from_config`.

Projects and domains also determine:

- **Access control**: RBAC policies scope permissions to an organization, project, domain, or project-domain pair. See [User management](https://www.union.ai/docs/v2/union/user-guide/user-management).
- **Data isolation**: Storage and cache are isolated per project-domain pair.
- **Settings**: Default behavior — such as the task queue, resource requests, and environment variables — can be configured at org, domain, or project scope. See [Settings](./settings).

## Managing projects via CLI

### Create a project

```shell
flyte create project --id my-project --name "My Project"
```

The `--id` is a unique identifier used in CLI commands and configuration (immutable once set). The `--name` is a human-readable display name.

You can also add a description and labels:

```shell
flyte create project \
    --id my-project \
    --name "My Project" \
    --description "ML platform workflows" \
    -l team=ml-platform \
    -l env=prod
```

Labels are specified as `-l key=value` and can be repeated.

### List projects

List all active projects:

```shell
flyte get project
```

Get details of a specific project:

```shell
flyte get project my-project
```

List archived projects:

```shell
flyte get project --archived
```

### Update a project

Update the name, description, or labels of a project:

```shell
flyte update project my-project --description "Updated description"
flyte update project my-project --name "New Display Name"
flyte update project my-project -l team=ml -l env=staging
```

> [!NOTE]
> Setting labels replaces all existing labels on the project.
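
The difference between replacing and merging labels is easy to miss, so here is a plain-Python sketch of both. The helper names and label values are hypothetical; only the replacement behavior is what the note above describes.

```python
def set_labels(existing: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Replacement semantics: the result contains only the new labels."""
    return dict(new)


def merge_labels(existing: dict[str, str], new: dict[str, str]) -> dict[str, str]:
    """Merge semantics, shown only for contrast; not what the update does."""
    return {**existing, **new}


current = {"team": "ml-platform", "env": "prod", "owner": "alice"}
update = {"team": "ml", "env": "staging"}

print(set_labels(current, update))    # 'owner' is gone
print(merge_labels(current, update))  # 'owner' would have survived
```

To keep an existing label, pass it again alongside the new ones.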

### Archive a project

Archiving a project hides it from default listings but does not delete its data:

```shell
flyte update project my-project --archive
```

### Unarchive a project

Restore an archived project to active status:

```shell
flyte update project my-project --unarchive
```

## Listing projects programmatically

You can list and retrieve projects from Python using `flyte.remote.Project`:

```python
import flyte

flyte.init_from_config()

# Get a specific project
project = flyte.remote.Project.get(name="my-project", org="my-org")

# List all projects
for project in flyte.remote.Project.listall():
    print(project.to_dict())

# List with filtering and sorting
for project in flyte.remote.Project.listall(sort_by=("created_at", "desc")):
    print(project.to_dict())
```

Both `get()` and `listall()` support async execution via `.aio()`:

```python
project = await flyte.remote.Project.get.aio(name="my-project", org="my-org")
```

> [!NOTE]
> The Python SDK provides read-only access to projects. To create or modify projects, use the `flyte` CLI or the UI.

## Managing projects via the UI

When you log in to your Union.ai instance, you land on the **Projects** page, which lists all projects in your organization. By default, the domain is set to `development`. You can change the active domain using the selector in the top left.

A **Recently viewed** list in the left sidebar provides quick access to projects you have opened recently.

From the project list you can:

* **Open a project**: Select a project from the list to navigate to it.
* **Create a project**: Click **+ New project** in the top right. In the dialog, specify a name and description. The project will be created across all three domains.
* **Archive a project**: Click the three-dot menu on a project's entry and select **Archive project**.

## Domains

Domains provide environment separation within each project. The three domains are:

| Domain | Purpose |
|--------|---------|
| `development` | For iterating on workflows during active development. |
| `staging` | For testing workflows before promoting to production. |
| `production` | For production workloads. |

Domains are predefined and cannot be created, renamed, or deleted.

### Targeting a domain

Set the default domain in your configuration file:

```yaml
task:
  domain: development
```

Or override per command:

```shell
flyte run --domain staging hello.py main
```
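
The precedence between the configured default and a per-command override can be sketched as follows. `resolve_domain` is a hypothetical helper written for this illustration, and the validation step is an assumption based on the three fixed domains.

```python
from typing import Optional

FIXED_DOMAINS = ("development", "staging", "production")


def resolve_domain(config_default: Optional[str], cli_flag: Optional[str] = None) -> str:
    """Return the per-command value if given, else the configured default."""
    domain = cli_flag if cli_flag is not None else config_default
    if domain not in FIXED_DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    return domain


print(resolve_domain("development"))             # uses the configured default
print(resolve_domain("development", "staging"))  # the per-command flag wins
```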

When using `flyte deploy`, the domain determines where the deployed workflows will execute:

```shell
flyte deploy --project my-project --domain production workflows
```

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/key-capabilities ===

# Key capabilities

Now that you understand the core concepts -- `TaskEnvironment`, tasks, runs, and apps -- here's an overview of what Flyte can do. Each capability is covered in detail later in the documentation.

## Environment and resources

Configure how and where your code runs.

- **Multiple environments**: Create separate configurations for different use cases (dev, prod, GPU vs CPU)
  → [Multiple environments](https://www.union.ai/docs/v2/union/user-guide/task-configuration/multiple-environments)

- **Resource specification**: Request specific CPU, memory, GPU, and storage for your tasks
  → [Resources](https://www.union.ai/docs/v2/union/user-guide/task-configuration/resources)

- **Reusable containers**: Eliminate container startup overhead with pooled, warm containers for millisecond-level task scheduling
  → [Reusable containers](https://www.union.ai/docs/v2/union/user-guide/task-configuration/reusable-containers)

## Deployment

Get your code running remotely.

- **Cloud image building**: Build container images remotely without needing local Docker
  → [Container images](https://www.union.ai/docs/v2/union/user-guide/task-configuration/container-images)

- **Code packaging**: Your local code is automatically bundled and deployed to remote execution
  → [Packaging](https://www.union.ai/docs/v2/union/user-guide/task-deployment/packaging)

- **Local testing**: Test tasks locally before deploying with `flyte run --local`
  → [How task run works](https://www.union.ai/docs/v2/union/user-guide/task-deployment/how-task-run-works)

## Data handling

Pass data efficiently between tasks.

- **Files and directories**: Pass large files and directories between tasks using `flyte.io.File` and `flyte.io.Dir`
  → [Files and directories](https://www.union.ai/docs/v2/union/user-guide/task-programming/files-and-directories)

- **DataFrames**: Work with pandas, Polars, and other DataFrame types natively
  → [DataFrames](https://www.union.ai/docs/v2/union/user-guide/task-programming/dataframes)

## Parallelism and composition

Scale out and compose workflows.

- **Fanout parallelism**: Process items in parallel using `flyte.map` or `asyncio.gather`
  → [Fanout](https://www.union.ai/docs/v2/union/user-guide/task-programming/fanout)

- **Remote tasks**: Call previously deployed tasks from within your workflows
  → [Remote tasks](https://www.union.ai/docs/v2/union/user-guide/task-programming/remote-tasks)

## Security and automation

Manage credentials and automate execution.

- **Secrets**: Inject API keys, passwords, and other credentials securely into tasks
  → [Secrets](https://www.union.ai/docs/v2/union/user-guide/task-configuration/secrets)

- **Triggers**: Schedule tasks on a cron schedule or trigger them from external events
  → [Triggers](https://www.union.ai/docs/v2/union/user-guide/task-configuration/triggers)

- **Webhooks**: Build APIs that trigger task execution from external systems
  → [Hybrid graphs](https://www.union.ai/docs/v2/union/user-guide/build-apps/hybrid-graphs)

## Durability and reliability

Handle failures and avoid redundant work.

- **Error handling**: Catch failures and retry with different resources (e.g., more memory)
  → [Error handling](https://www.union.ai/docs/v2/union/user-guide/task-programming/error-handling)

- **Retries and timeouts**: Configure automatic retries and execution time limits
  → [Retries and timeouts](https://www.union.ai/docs/v2/union/user-guide/task-configuration/retries-and-timeouts)

- **Caching**: Add `cache="auto"` to any task and Flyte stores its outputs keyed on task name and inputs. Same inputs means instant results with no recomputation. This speeds up your development loop: skip re-downloading data, avoid replaying earlier steps in agentic chains, or bypass any expensive computation while you iterate.
  → [Caching](https://www.union.ai/docs/v2/union/user-guide/task-configuration/caching)

  ```python
  @env.task(cache="auto")
  async def load_data(data_dir: str = "./data") -> str:
      """Downloads once, then returns instantly on subsequent runs."""
      # ... expensive download ...
      return data_dir
  ```

- **Traces**: Use `@flyte.trace` to get visibility into the internal steps of a task without the overhead of making each step a separate task. Traced functions show up as child nodes under their parent task, each with their own timing, inputs, and outputs. This is particularly useful for AI agents where you want to see which tools were called.
  → [Traces](https://www.union.ai/docs/v2/union/user-guide/task-programming/traces)

  ```python
  @flyte.trace
  async def search(query: str) -> str:
      """Shows up as a child node under the parent task."""
      return await do_search(query)

  @env.task
  async def agent(request: str) -> str:
      results = await search(request)    # Traced
      answer = await summarize(results)   # Also traced if decorated
      return answer
  ```

- **Reports**: Add `report=True` to a task and it can generate an HTML report (charts, tables, images) saved alongside the task output. Combined with caching and persisted inputs/outputs, reports act as lightweight experiment tracking—each run produces a self-contained HTML file you can compare across runs and share with your team.
  → [Reports](https://www.union.ai/docs/v2/union/user-guide/task-programming/reports)

  ```python
  import flyte.report
  from flyte.io import File

  @env.task(report=True)
  async def evaluate(model_file: File, test_data: str) -> str:
      # ... run evaluation ...
      accuracy = 0.0  # placeholder: replace with the metric computed above
      await flyte.report.replace.aio(
          "<h2>Training Report</h2>"
          "<h3>Test Results</h3>"
          f"<p>Accuracy: {accuracy:.4f}</p>"
      )
      await flyte.report.flush.aio()
      return f"Accuracy: {accuracy:.4f}"
  ```
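
To make the caching rule above concrete (outputs "keyed on task name and inputs"), here is a toy in-memory sketch. It is not Flyte's implementation: `cache_key`, `cached_call`, and the JSON-plus-SHA-256 hashing are assumptions chosen for illustration, and a real cache key may include more than these two components.

```python
import hashlib
import json

# Toy in-memory cache, for illustration only.
_cache: dict[str, object] = {}


def cache_key(task_name: str, inputs: dict) -> str:
    """Derive a deterministic key from the task name and its inputs."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(task_name: str, fn, **inputs):
    """Run fn(**inputs) once per distinct (task name, inputs) pair."""
    key = cache_key(task_name, inputs)
    if key not in _cache:
        _cache[key] = fn(**inputs)  # cache miss: compute and store
    return _cache[key]  # cache hit: no recomputation


calls = []


def load_data(data_dir: str = "./data") -> str:
    calls.append(data_dir)  # record each real execution
    return data_dir


cached_call("load_data", load_data, data_dir="./data")
cached_call("load_data", load_data, data_dir="./data")   # served from the cache
cached_call("load_data", load_data, data_dir="./other")  # new inputs: runs again
print(len(calls))  # 2
```

The second call never reaches `load_data`, which is the behavior you rely on when skipping a repeated download during iteration.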

## Apps and serving

Deploy long-running services.

- **FastAPI apps**: Deploy REST APIs and webhooks
  → [FastAPI app](https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/fastapi-app)

- **LLM serving**: Serve large language models with vLLM or SGLang
  → [vLLM app](https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/vllm-app), [SGLang app](https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/sglang-app)

- **Autoscaling**: Scale apps up and down based on traffic, including scale-to-zero
  → [Autoscaling apps](https://www.union.ai/docs/v2/union/user-guide/configure-apps/auto-scaling-apps)

- **Streamlit dashboards**: Deploy interactive data dashboards
  → [Streamlit app](https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/streamlit-app)

## Notebooks

Work interactively.

- **Jupyter support**: Author and run workflows directly from Jupyter notebooks, and fetch workflow metadata (inputs, outputs, logs)
  → [Notebooks](https://www.union.ai/docs/v2/union/user-guide/task-programming/notebooks)

## Next steps

Ready to put it all together? Head to [Basic project](./basic-project) to build an end-to-end RAG pipeline with embeddings and a Streamlit app.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/settings ===

# Settings

Union.ai provides a hierarchical settings system for configuring default behavior at each level of your [org → domain → project hierarchy](./projects-and-domains). Settings defined at a broader scope are inherited by narrower scopes, and any scope can override an inherited value.

Most settings are **defaults only** — they apply when a task or workflow does not specify a value directly. Per-task configuration (such as `Resources` or `TaskEnvironment`) takes precedence over scope-level settings. The exception is `task_resource.max.*` settings, which are enforced as hard limits that cannot be exceeded by per-task configuration.

## Scope hierarchy

Settings are stored at three scope levels:

| Scope | Who manages it | Inherits from |
|-------|---------------|---------------|
| **Org** | Org admins | — |
| **Domain** | Org admins | Org |
| **Project** | Project owners, org admins | Domain, then org |

When a setting has no value at the current scope, the next broader scope is consulted automatically.
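
The lookup order can be sketched in a few lines of plain Python. This is only a model of the rule described above, not the service's code: the `resolve` function and the sample values are hypothetical.

```python
from typing import Optional

# Illustrative data only: each scope holds just the values set locally.
ORG = {"run.default_queue": "default-queue", "security.service_account": "default"}
DOMAIN = {"run.default_queue": "fast-queue"}
PROJECT: dict[str, str] = {}


def resolve(key: str, chain: list[tuple[str, dict[str, str]]]) -> Optional[tuple[str, str]]:
    """Return (value, origin) from the narrowest scope that sets `key`."""
    for origin, values in chain:  # chain is ordered narrowest first
        if key in values:
            return values[key], origin
    return None  # not set at any scope


chain = [("PROJECT", PROJECT), ("DOMAIN", DOMAIN), ("ORG", ORG)]
print(resolve("run.default_queue", chain))         # ('fast-queue', 'DOMAIN')
print(resolve("security.service_account", chain))  # ('default', 'ORG')
```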

## Editing settings

Use `flyte edit settings` to view and modify settings interactively. The command opens your `$EDITOR` with a structured YAML file showing the current state of that scope.

```shell
# Edit org-level settings
flyte edit settings

# Edit domain-level settings (inherits from org)
flyte edit settings --domain production

# Edit project-level settings (inherits from domain, then org)
flyte edit settings --domain production --project ml-pipeline
```

The editor file has three sections:

- **Local overrides** — values explicitly set at this scope.
- **Inherited settings** — values resolved from a parent scope, shown as comments with their origin.
- **Available settings** — all remaining keys, shown as commented placeholders with descriptions.

Example editor file for a domain that has one local override and one inherited setting:

```yaml
### Settings for scope: DOMAIN(production)
## Remove or comment out a line to inherit that setting from the parent scope.
## Set a value to ~unset to explicitly clear it, blocking parent inheritance.

### Local overrides
## Default queue for task runs
run.default_queue: fast-queue

### Inherited settings (uncomment to override at this scope)
## Kubernetes service account for task pods
# security.service_account: default  ## inherited from ORG

### Available settings (uncomment and edit to set at this scope)
## Base path for raw data storage (e.g. s3://my-bucket/prefix)
# storage.raw_data_path: ''
## CPU resource quantity (e.g. "500m", "2")
# task_resource.min.cpu: ''
## Memory resource quantity (e.g. "256Mi", "4Gi")
# task_resource.min.memory: ''
...
```

The three comment prefixes have distinct roles:

| Prefix | Meaning |
|--------|---------|
| `###` | Section header |
| `##` | Field description or metadata |
| `#` | Inactive setting — remove the leading `#` to activate |

To set or change a value, uncomment the relevant line and edit it. To stop overriding a setting and revert to inheriting from the parent scope, comment out or delete the line entirely.
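
As a rough illustration of how the prefixes are read, the sketch below extracts only the active lines from an editor buffer. It is an assumption-laden simplification: `active_settings` is a hypothetical helper, it handles flat `key: value` lines only (no maps or quoting), and the CLI itself presumably uses a full YAML parser.

```python
def active_settings(text: str) -> dict[str, str]:
    """Collect uncommented `key: value` lines; every `#`-prefixed line is skipped."""
    result: dict[str, str] = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank, header (###), description (##), or inactive (#)
        key, sep, value = stripped.partition(":")
        if sep:
            # Drop a trailing `## ...` note left over from an uncommented line.
            result[key.strip()] = value.split("##", 1)[0].strip()
    return result


EDITOR_FILE = """\
### Local overrides
## Default queue for task runs
run.default_queue: fast-queue

### Inherited settings (uncomment to override at this scope)
## Kubernetes service account for task pods
# security.service_account: default  ## inherited from ORG
"""

print(active_settings(EDITOR_FILE))  # {'run.default_queue': 'fast-queue'}
```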

When you save and close the editor, the CLI prints a summary of your changes and asks for confirmation before applying them:

```
Changes to apply:
  + run.default_queue: fast-queue
  ~ task_resource.min.cpu: 500m → 2

Apply these changes? [Y/n]:
```
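
If you want a similar summary in your own tooling, a diff between two override mappings is enough. The sketch below is hypothetical and not the CLI's code: `summarize_changes` mirrors the `+` and `~` markers shown above, while the `-` marker for removed keys is an assumption, since the example output does not show one.

```python
def summarize_changes(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Describe added (+), changed (~), and removed (-) keys between two mappings."""
    lines = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old is None:
            lines.append(f"+ {key}: {new}")
        elif new is None:
            lines.append(f"- {key}")  # assumed marker; not shown in the CLI example
        elif old != new:
            lines.append(f"~ {key}: {old} → {new}")
    return lines


before = {"task_resource.min.cpu": "500m"}
after = {"run.default_queue": "fast-queue", "task_resource.min.cpu": "2"}
print("\n".join(summarize_changes(before, after)))
# + run.default_queue: fast-queue
# ~ task_resource.min.cpu: 500m → 2
```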

> [!NOTE]
> If your YAML fails to parse after saving, the editor reopens with an error header above your content so you can correct the syntax without losing your work. If you decline to reopen, your edits are saved automatically to `~/.flyte/settings-edit-<timestamp>.yaml`.

### Apply settings from a file

For CI pipelines or scripted setup, use `--from-file` to skip the interactive editor entirely:

```shell
flyte edit settings --domain production --from-file settings.yaml
```

The file should be a plain YAML mapping of dot-notation keys to values. Changes are printed and applied immediately without a confirmation prompt.

## Editing settings from Python

The same scopes can be read and written from Python with `flyte.remote.Settings` — useful for scripted setup, audits, or wiring configuration into your own tooling. Fetch a scope with `get_settings_for_edit()`, inspect its values, then write them back with `update_settings()`.

```python
import flyte.remote as remote

# Fetch a scope. With no arguments you get the org scope; pass `domain`,
# or `domain` and `project`, to narrow it.
settings = remote.Settings.get_settings_for_edit(domain="production", project="ml-pipeline")

# Effective settings: every key resolved through inheritance, each tagged
# with the scope it came from (ORG, DOMAIN, or PROJECT).
for s in settings.effective_settings:
    print(s.key, "=", s.value, "from", s.origin)

# Local settings: only the keys explicitly overridden at *this* scope.
for s in settings.local_settings:
    print(s.key, "=", s.value)

# Every settable key, in dot notation:
print(remote.Settings.available_keys())
```

If you want plain dictionaries instead of the origin-annotated objects, `settings.effective_values()` and `settings.local_overrides()` each return a `{key: value}` mapping.

To change settings, pass a mapping of dot-notation keys to `update_settings()`:

```python
settings.update_settings({
    "run.default_queue": "fast-queue",
    "task_resource.min.cpu": "2",
})
```

`update_settings()` **replaces the complete set of local overrides** for the scope the object was fetched for: keys you include are set locally, and any key you omit reverts to inheriting from the parent scope. The call uses optimistic locking against the version returned by `get_settings_for_edit()` — if another writer changed the same scope in between, re-fetch the scope and re-apply. To explicitly clear a value so it blocks parent inheritance (rather than reverting to it), use the `~unset` token in the `flyte edit settings` editor described above.
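
Because omitted keys revert to inheritance, a read-modify-write pattern is a safe way to change one key without dropping the others. The sketch below shows the merge step only: `merged_overrides` is a hypothetical helper, and the commented call pattern at the end is an untested outline built from the methods named on this page.

```python
def merged_overrides(current: dict, changes: dict) -> dict:
    """Return the full override set to send: existing local overrides plus changes."""
    merged = dict(current)  # copy so the caller's mapping is untouched
    merged.update(changes)  # new values win on key conflict
    return merged


# Stand-in for the result of `settings.local_overrides()` on a fetched scope.
current = {"run.default_queue": "fast-queue"}
payload = merged_overrides(current, {"task_resource.min.cpu": "2"})
print(payload)
# {'run.default_queue': 'fast-queue', 'task_resource.min.cpu': '2'}

# With a real scope object, the call pattern would look like this (untested outline):
#   settings = remote.Settings.get_settings_for_edit(domain="production")
#   settings.update_settings(
#       merged_overrides(settings.local_overrides(), {"task_resource.min.cpu": "2"})
#   )
```

To drop a single override on purpose, remove its key from the merged mapping before sending it.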

> [!NOTE]
> Treat `available_keys()` as the source of truth for which keys are settable on your version — the set grows over time, so prefer it over hardcoding key names.

## Available settings

| Key | Type | Description |
|-----|------|-------------|
| `run.default_queue` | string | Default queue for task runs |
| `security.service_account` | string | Kubernetes service account for task pods |
| `storage.raw_data_path` | string | Base path for raw data storage (e.g. `s3://my-bucket/prefix`) |
| `task_resource.min.cpu` | quantity | Minimum CPU request applied to task pods (e.g. `500m`, `2`) |
| `task_resource.min.gpu` | quantity | Minimum GPU request applied to task pods (e.g. `1`) |
| `task_resource.min.memory` | quantity | Minimum memory request applied to task pods (e.g. `256Mi`, `4Gi`) |
| `task_resource.min.storage` | quantity | Minimum ephemeral storage request applied to task pods (e.g. `10Gi`) |
| `task_resource.max.cpu` | quantity | Maximum CPU limit enforced on task pods — cannot be overridden by per-task configuration |
| `task_resource.max.gpu` | quantity | Maximum GPU limit enforced on task pods — cannot be overridden by per-task configuration |
| `task_resource.max.memory` | quantity | Maximum memory limit enforced on task pods — cannot be overridden by per-task configuration |
| `task_resource.max.storage` | quantity | Maximum ephemeral storage limit enforced on task pods — cannot be overridden by per-task configuration |
| `task_resource.mirror_limits_request` | bool | When `true`, resource limits are set equal to requests |
| `labels` | map | Kubernetes labels applied to task pods |
| `annotations` | map | Kubernetes annotations applied to task pods |
| `environment_variables` | map | Environment variables injected into task pods |

Quantity values use Kubernetes resource quantity format. Examples: `500m` (0.5 CPU), `2` (2 CPU), `256Mi` (256 mebibytes), `4Gi` (4 gibibytes).
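
When you need to compare or validate quantities in your own scripts, a small parser is enough for the common cases. This is a simplified sketch: `parse_quantity` is a hypothetical helper covering a subset of the Kubernetes suffixes and ignoring exponent notation.

```python
from decimal import Decimal

# Subset of Kubernetes quantity suffixes: binary, decimal, and milli.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
    "m": Decimal(1) / 1000,
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity such as '500m' or '4Gi' to a plain number (cores or bytes)."""
    # Try longer suffixes first so a two-letter suffix is never misread.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)  # no suffix: a bare number


print(parse_quantity("500m") == Decimal("0.5"))  # True
print(parse_quantity("2"))                       # 2
print(parse_quantity("4Gi"))                     # 4294967296
```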

## Inheritance rules

**Scalar settings** — `run`, `security`, `storage`, and `task_resource` fields: the most specific scope with a value wins. If the project sets `run.default_queue`, that value is used. If not, the domain's value is checked, then the org's.

**Map settings** — `labels`, `annotations`, and `environment_variables`: entries merge across scopes, with parent entries applied first and child entries overriding on key conflict. For example, if the org sets `LOG_LEVEL=info` and the project sets `LOG_LEVEL=debug`, the project's value wins.
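
The map rule amounts to a layered dictionary update. The sketch below models it with a hypothetical `merge_maps` helper; the `REGION` entry is invented for illustration.

```python
def merge_maps(*scopes: dict[str, str]) -> dict[str, str]:
    """Merge map settings; pass scopes broadest first (org, domain, project)."""
    merged: dict[str, str] = {}
    for scope in scopes:
        merged.update(scope)  # later (narrower) scopes override on key conflict
    return merged


org_env = {"LOG_LEVEL": "info", "REGION": "us-east-1"}
domain_env: dict[str, str] = {}
project_env = {"LOG_LEVEL": "debug"}

print(merge_maps(org_env, domain_env, project_env))
# {'LOG_LEVEL': 'debug', 'REGION': 'us-east-1'}
```

Entries that only the org defines (here `REGION`) still reach the project, which is the difference from scalar settings.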

### Explicitly clearing a value

Set a value to `~unset` to stop inheritance at that scope without providing a value of its own. Child scopes can still define their own value. For map settings, `~unset` clears all accumulated entries from parent scopes.

```yaml
# This domain has no service account — child projects are free to set their own.
security.service_account: ~unset

# Clear all environment variables inherited from org at this domain level.
environment_variables: ~unset
```
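
The effect of `~unset` on both kinds of setting can be modeled as follows. This is a sketch of the rules described above under simple assumptions, not the service's resolution code: `resolve_scalar` and `resolve_map` are hypothetical, and each scope is represented as a local value, the `~unset` token, or `None` when nothing is set.

```python
UNSET = "~unset"


def resolve_scalar(*scopes):
    """Scopes are ordered narrowest first; returns the effective value or None."""
    for local in scopes:
        if local == UNSET:
            return None  # inheritance is blocked at this scope
        if local is not None:
            return local
    return None


def resolve_map(*scopes):
    """Scopes are ordered broadest first; each is a dict, UNSET, or None."""
    merged: dict[str, str] = {}
    for local in scopes:
        if local == UNSET:
            merged = {}  # discard everything accumulated from parent scopes
        elif local:
            merged.update(local)
    return merged


# Project sets nothing, domain is ~unset, org has a value: nothing is inherited.
print(resolve_scalar(None, UNSET, "default"))     # None
# A child project can still define its own value.
print(resolve_scalar("ml-sa", UNSET, "default"))  # ml-sa
# Org entries are cleared at the domain; the project's entries still apply.
print(resolve_map({"LOG_LEVEL": "info"}, UNSET, {"TEAM": "ml"}))  # {'TEAM': 'ml'}
```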

## Relationship to task configuration

Most settings provide defaults only. Any value set directly on a task — through `Resources`, `TaskEnvironment`, task decorator arguments, or a per-invocation override — takes precedence over the scope-level setting for that task, regardless of which scope the setting was defined at.

For example, if the org sets `task_resource.min.cpu: "500m"` as a default, a task with `@task(resources=Resources(cpu="2"))` will still use `2` CPUs.

`task_resource.max.*` settings are the exception: they are enforced as hard limits and cannot be overridden by per-task configuration. If a task specifies a resource value above the configured maximum, the maximum takes precedence.
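
Put together, the precedence for a single resource can be sketched like this. The helpers `to_millicores` and `effective_cpu` are hypothetical and cover CPU only; they model the rule stated above rather than reproduce the platform's logic.

```python
from typing import Optional


def to_millicores(cpu: str) -> int:
    """Convert a CPU quantity such as '500m' or '2' to millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(float(cpu) * 1000)


def effective_cpu(task_cpu: Optional[str], default_cpu: Optional[str],
                  max_cpu: Optional[str]) -> Optional[str]:
    """Use the task's value if set, else the scope default, then cap at the maximum."""
    chosen = task_cpu if task_cpu is not None else default_cpu
    if chosen is None:
        return None
    if max_cpu is not None and to_millicores(chosen) > to_millicores(max_cpu):
        return max_cpu  # hard limit wins over per-task configuration
    return chosen


print(effective_cpu("2", "500m", None))   # 2    (task value beats the default)
print(effective_cpu(None, "500m", None))  # 500m (default applies)
print(effective_cpu("8", "500m", "4"))    # 4    (capped by task_resource.max.cpu)
```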

## Permissions

Settings operations require the following actions, scoped to the org, domain, or project being accessed:

| Operation | Required action |
|-----------|----------------|
| View settings (open the editor) | `view_flyte_inventory` |
| Create or update settings | `register_flyte_inventory` |

The **Contributor** built-in role includes both actions. The **Viewer** role includes `view_flyte_inventory` only. See [User management](https://www.union.ai/docs/v2/union/user-guide/user-management) for details on roles and how to assign them.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/basic-project ===

# Basic project: RAG

This example demonstrates a two-stage RAG (Retrieval-Augmented Generation) pattern:
an offline embedding pipeline that processes and stores quotes, followed by an online
serving application that enables semantic search.

## Concepts covered

- `TaskEnvironment` for defining task execution environments
- `Dir` artifacts for passing directories between tasks
- `AppEnvironment` for serving applications
- `Parameter` and `RunOutput` for connecting apps to task outputs
- Semantic search with sentence-transformers and ChromaDB

## Part 1: The embedding pipeline

The embedding pipeline fetches quotes from a public API, creates vector embeddings
using sentence-transformers, and stores them in a ChromaDB database.

### Setting up the environment

The `TaskEnvironment` defines the execution environment for all tasks in the pipeline.
It specifies the container image, required packages, and resource allocations:

```python
# Define the embedding environment
embedding_env = flyte.TaskEnvironment(
    name="quote-embedding",
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "sentence-transformers>=2.2.0",
        "chromadb>=0.4.0",
        "requests>=2.31.0",
    ),
    resources=flyte.Resources(cpu=2, memory="4Gi"),
    cache="auto",
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/embed.py*

The environment uses:
- `Image.from_debian_base()` to create a container with Python 3.12
- `with_pip_packages()` to install sentence-transformers, ChromaDB, and requests
- `Resources` to request 2 CPUs and 4 GiB of memory
- `cache="auto"` to enable automatic caching of task outputs

### Fetching data

The `fetch_quotes` task retrieves quotes from a public API:

```python
@embedding_env.task
async def fetch_quotes() -> list[dict]:
    """
    Fetch quotes from a public quotes API.

    Returns:
        List of quote dictionaries with 'quote' and 'author' fields.
    """
    import requests

    print("Fetching quotes from API...")
    response = requests.get("https://dummyjson.com/quotes?limit=100")
    response.raise_for_status()

    data = response.json()
    quotes = data.get("quotes", [])

    print(f"Fetched {len(quotes)} quotes")
    return quotes
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/embed.py*

This task demonstrates:
- Async task definition with `async def`
- Returning structured data (`list[dict]`) from a task
- Using the `@embedding_env.task` decorator to associate the task with its environment

### Creating embeddings

The `embed_quotes` task creates vector embeddings and stores them in ChromaDB:

```python
@embedding_env.task
async def embed_quotes(quotes: list[dict]) -> Dir:
    """
    Create embeddings for quotes and store them in ChromaDB.

    Args:
        quotes: List of quote dictionaries with 'quote' and 'author' fields.

    Returns:
        Directory containing the ChromaDB database.
    """
    import chromadb
    from sentence_transformers import SentenceTransformer

    print("Loading embedding model...")
    model = SentenceTransformer("all-MiniLM-L6-v2")

    # Create ChromaDB in a temporary directory
    db_dir = tempfile.mkdtemp()
    print(f"Creating ChromaDB at {db_dir}...")

    client = chromadb.PersistentClient(path=db_dir)
    collection = client.create_collection(
        name="quotes",
        metadata={"hnsw:space": "cosine"},
    )

    # Prepare data for insertion
    texts = [q["quote"] for q in quotes]
    ids = [str(q["id"]) for q in quotes]
    metadatas = [{"author": q["author"], "quote": q["quote"]} for q in quotes]

    print(f"Embedding {len(texts)} quotes...")
    embeddings = model.encode(texts, show_progress_bar=True)

    # Add to collection
    collection.add(
        ids=ids,
        embeddings=embeddings.tolist(),
        metadatas=metadatas,
        documents=texts,
    )

    print(f"Stored {len(quotes)} quotes in ChromaDB")
    return await Dir.from_local(db_dir)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/embed.py*

Key points:
- Uses the `all-MiniLM-L6-v2` model from sentence-transformers (runs on CPU)
- Creates a persistent ChromaDB database with cosine similarity
- Returns a `Dir` artifact that captures the entire database directory
- The `await Dir.from_local()` call uploads the directory to artifact storage

### Orchestrating the pipeline

The main pipeline task composes the individual tasks:

```python
@embedding_env.task
async def embedding_pipeline() -> Dir:
    """
    Main pipeline that fetches quotes and creates embeddings.

    Returns:
        Directory containing the ChromaDB database with quote embeddings.
    """
    print("Starting embedding pipeline...")

    # Fetch quotes from API
    quotes = await fetch_quotes()

    # Create embeddings and store in ChromaDB
    db_dir = await embed_quotes(quotes)

    print("Embedding pipeline complete!")
    return db_dir
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/embed.py*

### Running the pipeline

To run the embedding pipeline:

```python
if __name__ == "__main__":
    flyte.init_from_config()
    run = flyte.run(embedding_pipeline)
    print(f"Embedding run URL: {run.url}")
    run.wait()
    print(f"Embedding complete! Database directory: {run.outputs()}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/embed.py*

```bash
uv run embed.py
```

The pipeline will:
1. Fetch 100 quotes from the API
2. Create embeddings using sentence-transformers
3. Store everything in a ChromaDB database
4. Return the database as a `Dir` artifact

## Part 2: The serving application

The serving application provides a Streamlit web interface for searching quotes
using the embeddings created by the pipeline.

### App environment configuration

The `AppEnvironment` defines how the application runs:

```python
# Define the app environment
env = AppEnvironment(
    name="quote-search-app",
    description="Semantic search over quotes using embeddings",
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "streamlit>=1.41.0",
        "sentence-transformers>=2.2.0",
        "chromadb>=0.4.0",
    ),
    args=["streamlit", "run", "app.py", "--server.port", "8080"],
    port=8080,
    resources=flyte.Resources(cpu=2, memory="4Gi"),
    parameters=[
        Parameter(
            name="quotes_db",
            value=RunOutput(task_name="quote-embedding.embedding_pipeline", type="directory"),
            download=True,
            env_var="QUOTES_DB_PATH",
        ),
    ],
    include=["app.py"],
    requires_auth=False,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/serve.py*

Key configuration:
- `args` specifies the command to run the Streamlit app
- `port=8080` exposes the application on port 8080
- `parameters` defines inputs to the app:
  - `RunOutput` connects to the embedding pipeline's output
  - `download=True` downloads the directory to local storage
  - `env_var="QUOTES_DB_PATH"` makes the path available to the app
- `include=["app.py"]` bundles the Streamlit app with the deployment

### The Streamlit application

The app loads the ChromaDB database using the path from the environment variable:

```python
# Load the database
@st.cache_resource
def load_db():
    db_path = os.environ.get("QUOTES_DB_PATH")
    if not db_path:
        st.error("QUOTES_DB_PATH environment variable not set")
        st.stop()

    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_collection("quotes")
    model = SentenceTransformer("all-MiniLM-L6-v2")
    return collection, model

collection, model = load_db()
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/app.py*

The search interface provides a text input and result count slider:

```python
# Search interface
query = st.text_input("Enter your search query:", placeholder="e.g., love, wisdom, success")
top_k = st.slider("Number of results:", min_value=1, max_value=20, value=5)

col1, col2 = st.columns([1, 1])
with col1:
    search_button = st.button("Search", type="primary", use_container_width=True)
with col2:
    random_button = st.button("Random Quote", use_container_width=True)

st.divider()
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/app.py*

When the user searches, the app encodes the query and finds similar quotes:

```python
if search_button and query:
    # Encode query and search
    query_embedding = model.encode([query])[0].tolist()
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=top_k,
    )

    if results["documents"] and results["documents"][0]:
        for i, (doc, metadata, distance) in enumerate(
            zip(results["documents"][0], results["metadatas"][0], results["distances"][0])
        ):
            similarity = 1 - distance  # Convert distance to similarity
            st.markdown(f'**{i+1}.** "{doc}"')
            st.caption(f"— {metadata['author']} | Similarity: {similarity:.2%}")
            st.write("")
    else:
        st.info("No results found.")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/app.py*
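
The `1 - distance` conversion works because the collection was created with `"hnsw:space": "cosine"`, where the reported distance is, to the best of our understanding, one minus the cosine similarity. The following standalone sketch illustrates that relationship with plain Python lists; the helper names and vectors are made up and no ChromaDB calls are involved.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Distance as used here: 0.0 for same direction, 1.0 for orthogonal."""
    return 1.0 - cosine_similarity(a, b)


query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

print(1 - cosine_distance(query, same_direction))  # 1.0
print(1 - cosine_distance(query, orthogonal))      # 0.0
```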

The app also includes a random quote feature:

```python
elif random_button:
    # Get a random quote from the collection
    all_data = collection.get(limit=100)
    if all_data["documents"]:
        idx = random.randint(0, len(all_data["documents"]) - 1)
        quote = all_data["documents"][idx]
        author = all_data["metadatas"][idx]["author"]
        st.markdown(f'**"{quote}"**')
        st.caption(f"— {author}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/app.py*

### Deploying the app

To deploy the quote search application:

```python
if __name__ == "__main__":
    flyte.init_from_config()

    # Deploy the quote search app
    print("Deploying quote search app...")
    deployment = flyte.serve(env)
    print(f"App deployed at: {deployment.url}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/basic-project/serve.py*

```bash
uv run serve.py
```

The app will be deployed and automatically connected to the embedding pipeline's
output through the `RunOutput` parameter.

## Key takeaways

1. **Two-stage RAG pattern**: Separate offline embedding creation from online serving
   for better resource utilization and cost efficiency.

2. **Dir artifacts**: Use `Dir` to pass entire directories (like databases) between
   tasks and to serving applications.

3. **RunOutput**: Connect applications to task outputs declaratively, enabling
   automatic data flow between pipelines and apps.

4. **CPU-friendly embeddings**: The `all-MiniLM-L6-v2` model runs efficiently on CPU,
   making this pattern accessible without GPU resources.

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/core-concepts/leases ===

# Leases

When you run a task on Union.ai, your code doesn't execute on the control plane — it runs on a compute cluster, which might be in your cloud account, in another region, or one of several clusters attached to your account. Union.ai has to hand a piece of work to a cluster, then keep track of whether that work is actually getting done.

It does this with **leases**.

You normally never have to think about leases. But occasionally you'll see a message like:

```
lease expired for action a0 ...
```

This page explains what that means, why it happens, and why — most of the time — it's the system protecting you rather than something going wrong.

## What a lease is

A lease is a time-bound grant of work. Union.ai hands a unit of work (an [action](./runs-and-actions)) to a worker on a cluster and says, in effect: *"You own this for now. Keep telling me you're still on it."*

Think of it like **renting an apartment**:

- The landlord (the Union.ai control plane) owns the property (your work) and hands a tenant (a cluster worker) the keys under a lease.
- The tenant doesn't pay rent once a year — they pay on a regular schedule. In Union.ai, the worker sends a **heartbeat** every few seconds: *"still working on it."*
- As long as the rent keeps coming in, the tenant keeps the apartment. As long as the heartbeats keep coming in, the worker keeps the work.

This heartbeat is the whole point. It's how Union.ai knows — continuously, not just at the start — that your work is alive and progressing.

## Why leases exist

Union.ai hands work to machines it doesn't fully control, and machines fail in messy ways:

- A **cluster goes offline** — a network partition, a dropped connection, a zone outage.
- A **component on the cluster fails** — the worker process crashes or hangs.
- The **control plane itself blips** — a brief restart or deployment.

The hard part isn't any single failure. It's that from the control plane's point of view, **all of these look the same**: the heartbeats stopped. When a worker goes silent, Union.ai cannot tell the difference between:

- The worker **died** (the work needs to be handed to someone else), and
- The worker is **alive but unreachable** and still grinding away on your task.

This is one of the genuinely hard problems in distributed systems. Leases are how Union.ai makes a safe decision without being able to see the truth directly.

## What happens when heartbeats stop

When a worker stops heartbeating, Union.ai does **not** evict it immediately. A missed heartbeat is usually nothing — a momentary network hiccup, a quick deployment, a transient glitch. Just like a landlord wouldn't change the locks the day rent is one hour late, the system gives a **grace period**.

During that grace period the lease is still valid. If heartbeats resume — the network heals, the worker reconnects — the worker simply picks up where it left off. **No work is lost and nothing is re-run.** This is the common case, and it's invisible to you.

But if enough time passes — a window that Union.ai configures — and the worker is *still* silent, the system reaches a point where it can no longer safely assume the work is in good hands. At that point it has to **reap the lease**. This is what you see as `lease expired`.
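
The heartbeat, grace period, and expiry described above can be modeled with a small state check. This is a conceptual sketch only: the `Lease` class, the 5-second heartbeat interval, and the 60-second grace period are assumptions for illustration, not Union.ai's actual values or implementation.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    action_id: str
    last_heartbeat: float            # time of the most recent heartbeat, in seconds
    heartbeat_interval: float = 5.0  # assumed value, for illustration
    grace_period: float = 60.0       # assumed value, for illustration

    def heartbeat(self, now: float) -> None:
        """Record that the worker is still working on the action."""
        self.last_heartbeat = now

    def state(self, now: float) -> str:
        """Classify the lease by how long the worker has been silent."""
        silence = now - self.last_heartbeat
        if silence <= self.heartbeat_interval:
            return "healthy"
        if silence <= self.grace_period:
            return "grace"  # late, but the lease is still valid
        return "expired"    # silent for too long: the lease is reaped


lease = Lease(action_id="a0", last_heartbeat=0.0)
print(lease.state(now=3.0))    # healthy
print(lease.state(now=30.0))   # grace
lease.heartbeat(now=31.0)      # worker reconnects; nothing is re-run
print(lease.state(now=33.0))   # healthy
print(lease.state(now=120.0))  # expired
```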

The system reaps a silent lease for two reasons, and both are about protecting you:

1. **To make forward progress.** If the original worker really is gone, your run would otherwise hang forever. Reaping the lease frees the work to be reassigned to a healthy worker so it can actually finish.
2. **To guarantee exactly one owner.** Union.ai must never have two workers both believing they own the same action — that would corrupt results and waste compute. Reaping the old lease, and refusing to accept any further reports from it, ensures that when the work is reassigned, **exactly one** worker owns it.

The mechanisms behind these are **lease expiration** (the old grant is invalidated) and **lease failover** (the work is handed to a new worker).
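
One common way to enforce "exactly one owner" after failover is to stamp each grant with an increasing epoch and reject reports that carry an older one. The sketch below illustrates that general technique. Whether Union.ai uses epochs specifically is not stated here, so treat `ActionLeases` and its methods as hypothetical.

```python
class ActionLeases:
    """Tracks one owner per action using a monotonically increasing epoch."""

    def __init__(self) -> None:
        self._owners: dict[str, tuple[str, int]] = {}  # action -> (worker, epoch)

    def grant(self, action_id: str, worker: str) -> int:
        """Hand the action to a worker, invalidating any previous grant."""
        _, epoch = self._owners.get(action_id, ("", 0))
        self._owners[action_id] = (worker, epoch + 1)
        return epoch + 1

    def accept_report(self, action_id: str, worker: str, epoch: int) -> bool:
        """Accept a report only from the current owner at the current epoch."""
        return self._owners.get(action_id) == (worker, epoch)


leases = ActionLeases()
first = leases.grant("a0", "worker-1")
second = leases.grant("a0", "worker-2")  # failover after the first lease expired

print(leases.accept_report("a0", "worker-1", first))   # False: stale owner is refused
print(leases.accept_report("a0", "worker-2", second))  # True: exactly one owner
```

If the original worker was alive but unreachable, its late reports are rejected rather than merged, which is what keeps results consistent.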

## Why this is the safe trade-off

Here's the key design decision, and the part most relevant to you as someone who cares about **failures and cost**:

> [!NOTE] Correctness over runaway compute
> Union.ai would rather **occasionally redo a small amount of work** than let a task it can no longer account for keep running unchecked.

Imagine the alternative. A worker becomes unreachable but is secretly still running your job — burning a node, maybe a multi-GPU node — and Union.ai can't see it, can't stop it, and can't trust its output. That's exactly the **runaway compute** and surprise cost that you don't want. By bounding every piece of work with a lease, Union.ai ensures there's no orphaned, unaccounted-for task quietly running up a bill.

So when a lease expires and the work is reassigned, the cost is at worst **redoing one action**. The benefit is a hard guarantee that compute is always accounted for, and results are always correct.

## How often does this actually happen?

Rarely. Union.ai's failure detection and grace periods are tuned so that the everyday churn of distributed systems (momentary network issues, routine deployments, brief glitches) is **absorbed automatically**. The system is designed to auto-heal in all of those cases, and in the overwhelming majority of runs you'll never see a lease expiration at all.

When you *do* see one, it generally means a real, sustained failure occurred — a cluster was unreachable for long enough that continuing to trust the lease would have been unsafe.

## What to do when you see `lease expired`

In most cases, **nothing** — the system has already done the right thing:

- The expired lease was reaped, and the action was retried on a healthy worker. The retry you see in the UI is the failover working as intended.
- If the run completed successfully after the retry, your results are correct and complete.

It's worth a closer look only if you see lease expirations **repeatedly** on the same run or cluster. That pattern points to an underlying environment problem — a chronically unhealthy cluster, persistent network issues, or under-provisioned nodes — rather than the lease mechanism itself. In that case, check the health of the affected cluster or reach out to your platform team.

## The takeaway

A lease expiration is **not a system failure** — it's the system doing its job. It's Union.ai refusing to let work run unaccounted-for, guaranteeing that every action has exactly one owner, and making sure your run still makes progress when a cluster lets it down.

It's the trade-off in your favor: a small chance of redoing work, in exchange for never paying for runaway compute and never trusting a corrupt result.

