Merged
Changes from 3 commits
2 changes: 2 additions & 0 deletions menu/navigation.ts
@@ -11,6 +11,7 @@ import { cockpitMenu } from "../pages/cockpit/menu"
import { containerRegistryMenu } from "../pages/container-registry/menu"
import { cpanelHostingMenu } from "../pages/cpanel-hosting/menu"
import { dataLabMenu } from "../pages/data-lab/menu"
import { dataOrchestratorMenu } from "../pages/data-orchestrator/menu"
import { dataWarehouseMenu } from "../pages/data-warehouse/menu"
import { dediboxMenu } from "../pages/dedibox/menu"
import { dediboxAccountMenu } from "../pages/dedibox-account/menu"
@@ -155,6 +156,7 @@ export default [
{
icon: 'DataAndAnalyticsCategoryIcon',
items: [
dataOrchestratorMenu,
dataWarehouseMenu,
dataLabMenu,
clustersForKafkaMenu,
61 changes: 61 additions & 0 deletions pages/data-orchestrator/concepts.mdx
@@ -0,0 +1,61 @@
---
title: Data Orchestrator - Concepts
description: Learn the fundamental concepts of Scaleway Data Orchestrator.
tags: data-orchestrator
dates:
validation: 2026-04-01
---

## Orchestration

Orchestration is the automated coordination of tasks and workflows that keeps data operations reliable, scalable, and maintainable. In Scaleway Data Orchestrator, it lets you define, schedule, and manage complex data pipelines while handling dependencies, error recovery, and execution order. Instead of manually triggering scripts and monitoring jobs, you get structured, unified, business-aligned workflows.

## Tasks

### Action task

An action task represents the executable unit within a workflow that performs concrete work. An action task can be:
- **Serverless Jobs**: Long-running batch processes that scale automatically without infrastructure management.
- **Serverless Functions**: Lightweight, event-driven code execution for quick transformations or API calls.
- **Spark Jobs**: Distributed data processing tasks for large-scale ETL or analytics using Apache Spark.
- Other compute-intensive or service-specific jobs (e.g., data validation, model inference).

These tasks are orchestrated in sequence or in parallel, forming the backbone of data processing pipelines.
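Conceptually, a sequence of action tasks behaves like function composition: each task's output feeds the next. A minimal plain-Python sketch (the function names are illustrative, not part of the Data Orchestrator API):

```python
# Illustrative only: three action tasks chained sequentially.
# In a real pipeline each step could be a Serverless Job,
# a Serverless Function, or a Spark Job.
def extract():
    return [1, 2, 3]               # e.g. a Serverless Job pulling raw data

def transform(rows):
    return [r * 10 for r in rows]  # e.g. a Spark Job reshaping the data

def load(rows):
    return f"loaded {len(rows)} rows"  # e.g. a Function writing results

def run_pipeline():
    # The orchestrator guarantees this ordering and handles retries;
    # here we simply chain the calls by hand.
    return load(transform(extract()))
```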

### Logic task

A logic task controls the flow and decision-making within a workflow, enabling dynamic behavior beyond simple linear execution. A logic task can be:
- **Switch**: Directs the flow based on runtime conditions (e.g., file size, data quality).
- **Fork**: Splits execution into parallel branches to process data concurrently.
- **Try/catch**: Wraps tasks in an error-handling block to manage failures and enable retries or fallback logic.

These tasks allow users to embed business logic directly into pipelines, making them resilient and adaptable.
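The three logic-task patterns map onto familiar programming constructs. A sketch in plain Python (names and signatures are hypothetical, not the product API):

```python
from concurrent.futures import ThreadPoolExecutor

def switch(file_size):
    # Switch: route the flow based on a runtime condition.
    return "spark_job" if file_size > 1_000_000 else "serverless_function"

def fork(branches, data):
    # Fork: run independent branches concurrently, collect all results.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(branch, data) for branch in branches]
        return [f.result() for f in futures]

def try_catch(task, fallback, data):
    # Try/catch: fall back when the primary task fails.
    try:
        return task(data)
    except Exception:
        return fallback(data)
```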

## Trigger

A trigger is the event that initiates a workflow execution. A trigger can be:
- **Manual**: User starts the run via the Scaleway Console or CLI (ideal for testing).
- **Schedule**: Automatic execution based on time (e.g., daily at 8:00 AM), set with a built-in scheduler.
- **Event**: Triggered by external signals (e.g., new file in object storage, message in a queue), enabling reactive, real-time data processing.
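For the schedule trigger, the scheduler's core job is computing the next firing time. A simplified sketch of a daily 8:00 AM schedule (assumed semantics, not the built-in scheduler's actual implementation):

```python
from datetime import datetime, time, timedelta

def next_run(now, at=time(8, 0)):
    # Next occurrence of `at`: later today if still ahead, else tomorrow.
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```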

## Views

### Code view

Every workflow can be represented as code, showing its tasks and their dependencies.

### Graph view

Every workflow can be visualized as a Directed Acyclic Graph (DAG), showing the tasks and their dependencies.
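The DAG structure is what lets the orchestrator derive a valid execution order: every task runs only after its dependencies complete. Illustrated with Python's standard `graphlib` (the task names are made up):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
dag = {
    "extract":   set(),
    "validate":  {"extract"},
    "transform": {"validate"},
    "report":    {"transform"},
    "archive":   {"validate"},  # independent of "transform": may run in parallel
}

# static_order() yields tasks so that dependencies always come first.
order = list(TopologicalSorter(dag).static_order())
```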

## Workflow

A workflow is a structured sequence of action tasks and logic tasks that defines an end-to-end data process.

### Workflow definition

The declarative blueprint of a workflow, typically described in code (e.g., YAML or Python) or designed visually. It specifies tasks, dependencies, conditions, and execution parameters. This definition is version-controlled, reusable, and portable across environments.
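As a sketch, such a blueprint can be modeled as plain data and validated before any run (field names here are hypothetical, not the product's actual schema):

```python
# Hypothetical declarative blueprint for a daily ETL workflow.
workflow = {
    "name": "daily-sales-etl",
    "trigger": {"type": "schedule", "cron": "0 8 * * *"},
    "tasks": [
        {"id": "extract",   "type": "serverless-job",      "depends_on": []},
        {"id": "transform", "type": "spark-job",           "depends_on": ["extract"]},
        {"id": "load",      "type": "serverless-function", "depends_on": ["transform"]},
    ],
}

def is_valid(defn):
    # Every dependency must reference a task declared in the same workflow.
    ids = {t["id"] for t in defn["tasks"]}
    return all(set(t["depends_on"]) <= ids for t in defn["tasks"])
```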

### Workflow execution / run

The runtime instance of a workflow definition. Each execution (or run) tracks the state, logs, and results of every task, providing full observability and auditability. Runs can succeed, fail, or be paused, with detailed insights for debugging.
35 changes: 35 additions & 0 deletions pages/data-orchestrator/index.mdx
@@ -0,0 +1,35 @@
---
title: Data Orchestrator Documentation
description: Comprehensive documentation on Scaleway Data Orchestrator.
---

<Message type="note">
Data Orchestrator is currently in Private Beta.
</Message>

<ProductHeader
productName="Data Orchestrator"
productLogo="data-orchestrator"
description="Scaleway Data Orchestrator helps you automate, schedule, and manage data workflows across systems and environments."
url="/data-orchestrator/quickstart/"
label="Data Orchestrator Quickstart"
/>

## Getting Started

<Grid>
<SummaryCard
title="Concepts"
icon="info"
description="Core concepts that give you a better understanding of Scaleway Data Orchestrator."
label="View Concepts"
url="/data-orchestrator/concepts/"
/>
</Grid>

## Changelog

<ChangelogList
productName="data-orchestrator"
numberOfChanges={0}
/>
14 changes: 14 additions & 0 deletions pages/data-orchestrator/menu.ts
@@ -0,0 +1,14 @@
export const dataOrchestratorMenu = {
items: [
{
label: 'Overview',
slug: '../data-orchestrator',
},
{
label: 'Concepts',
slug: 'concepts',
},
],
label: 'Data Orchestrator',
slug: 'data-orchestrator',
}