diff --git a/docs/encyclopedia/data-conversion/codec-server.mdx b/docs/encyclopedia/data-conversion/codec-server.mdx index 7aa1b41b06..25b8c51cca 100644 --- a/docs/encyclopedia/data-conversion/codec-server.mdx +++ b/docs/encyclopedia/data-conversion/codec-server.mdx @@ -2,19 +2,17 @@ id: codec-server title: Codec Server sidebar_label: Codec Server -description: A Codec Server is an HTTP server that provides remote encoding and decoding for Temporal Payloads. +description: + A Codec Server is an HTTP server that provides remote encoding and decoding for Temporal Payloads, enabling the Web UI + and CLI to display decoded data without exposing encryption keys to the Temporal Service. slug: /codec-server toc_max_heading_level: 4 keywords: - encryption - - explanation - - keys + - codec-server - payloads - - secrets - data-converters - - codec-server tags: - - codec-server - Concepts - Encryption - Data Converters @@ -23,43 +21,182 @@ tags: import { CaptionedImage } from '@site/src/components'; -This page discusses [Codec Server](#codec-server). +A Codec Server is an HTTP/HTTPS server that you host and operate. It runs your [Payload Codec](/payload-codec) logic to +encode and decode [Payloads](/dataconversion#payload) on behalf of the Temporal CLI and Web UI. The Codec Server is +independent of the Temporal Service. Encryption keys and codec logic remain in your environment. + +For setup instructions, see [Codec Server setup](/production-deployment/data-encryption#codec-server-setup). + +## Why use a Codec Server + +When you apply a custom [Payload Codec](/payload-codec) for encryption or compression, data stored in the Temporal +Service is encoded. The Temporal Service never has access to your encryption keys, so it cannot decode this data. +Without a Codec Server, the Web UI and CLI display raw encoded payloads. -## What is a Codec Server? 
{#codec-server} +A Codec Server solves this by giving the Web UI and CLI a way to decode payloads on demand, without exposing keys to the +Temporal Service. Common reasons to run a Codec Server include: -A Codec Server is an HTTP/HTTPS server that uses a [custom Payload Codec](/production-deployment/data-encryption) to decode your data remotely through endpoints. +- **Debugging Workflows.** View decoded Workflow inputs, outputs, and Event History in the Web UI instead of reading + base64-encoded or encrypted blobs. +- **Operating from the CLI.** Use commands like `temporal workflow show` and `temporal workflow execute` with readable + data, even when payloads are encrypted at rest. +- **Encoding inputs from the UI and CLI.** When you start or signal a Workflow from the Web UI or CLI, the Codec Server + can encode the input before it reaches the Temporal Service, so the Temporal Service never sees plaintext (the input + still travels from your browser or CLI to the Codec Server, which is why HTTPS matters in any non-loopback + deployment). +- **Compliance and access control.** Because the Codec Server runs in your environment, you control who can decode + payloads and under what conditions. You can layer authorization on top of the decode endpoint to restrict access per + user or per Namespace. -{/* This should not have changed with tctl-to-temporal */} +## How a Codec Server works + +A Codec Server follows the Temporal +[Codec Server Protocol](https://github.com/temporalio/samples-go/tree/main/codec-server#codec-server-protocol). It +exposes two HTTP POST endpoints: + +- **`/encode`** accepts plaintext payloads and returns encoded payloads. Used for sending payloads. +- **`/decode`** accepts encoded payloads and returns decoded payloads. Used for retrieving payloads. + +Both endpoints receive and respond with a JSON body containing a `payloads` array of [Payload](/dataconversion#payload) +objects. 
The Codec Server passes each payload through your [Payload Codec](/payload-codec), which applies the same +encoding or decoding logic that your Workers use. -A Codec Server follows the Temporal [Codec Server Protocol](https://github.com/temporalio/samples-go/tree/main/codec-server#codec-server-protocol). -It implements two endpoints: +When the Web UI or CLI needs to display decoded data, it sends the encoded payloads to your Codec Server's `/decode` +endpoint. The Codec Server decodes the payloads and returns them to the client. The Temporal Service never sees the +decoded data. -- `/encode` -- `/decode` +The `/encode` endpoint works in the other direction. When you start a Workflow or send a Signal from the Web UI or CLI, +the input is sent to the Codec Server's `/encode` endpoint first, so data reaches the Temporal Service in its encoded +form. -Each endpoint receives and responds with a JSON body that has a `payloads` property with an array of [Payloads](/dataconversion#payload). -The endpoints run the Payloads through a [Payload Codec](/payload-codec) before returning them. +Your Codec Server should use the same Payload Codec implementation as your Workers to ensure consistent encoding and +decoding. 
-Most SDKs provide example Codec Server implementation samples, listed here: +## Codec Server with External Storage {#external-storage} -- [Go](https://github.com/temporalio/samples-go/tree/main/codec-server) -- [Java](https://github.com/temporalio/sdk-java/tree/master/temporal-remote-data-encoder) -- [.NET](https://github.com/temporalio/samples-dotnet/tree/main/src/Encryption) -- [Python](https://github.com/temporalio/samples-python/blob/main/encryption/codec_server.py) -- [TypeScript](https://github.com/temporalio/samples-typescript/blob/main/encryption/src/codec-server.ts) +When your Workers and Clients use [External Storage](/external-storage), your storage drivers replace some payloads in +the Event History with small references that point to data in an external store like Amazon S3. The Temporal Service and +the Web UI only see these references, not the actual payload data. Setups where a proxy runs additional codecs that +encode payloads after the Data Converter has returned on the Worker complicate this further. Your Codec Server must +download and decode in the correct order for you to view the Workflow data in the Web UI or CLI. + +To support External Storage, create a handler using `NewPayloadHTTPHandler` with `PayloadHTTPHandlerOptions`. The options +accept your storage drivers, your pre-storage codecs (the Payload Codecs configured in your Worker's Data Converter), +and any post-storage codecs (codecs applied by a proxy after external storage). The handler applies them in the correct +order across all endpoints automatically. When you configure the handler with storage drivers, the existing endpoints +become storage-aware and a new `/download` endpoint becomes available: + +:::caution + +`NewPayloadHTTPHandler` runs the full encode-store-encode and decode-retrieve-decode pipeline. Do not use it as a target +for a remote Data Converter or remote codec on your Workers. 
For remote codecs, use `NewPayloadCodecHTTPHandler` +separately. If you need both, set up `NewPayloadHTTPHandler` for the Web UI and CLI alongside +`NewPayloadCodecHTTPHandler` for your Workers, and configure both with the same codecs. + +::: + +- **`/download`** retrieves the actual payload data from external storage and decodes it through the Payload Codec. This + endpoint is used internally by `/decode` when it encounters storage references, but you can also call it directly + to retrieve the decoded payload. The Temporal Web UI uses this endpoint when you click to view the full payload for a + storage reference. +- **`/decode`** still decodes encoded payloads, but also handles storage references. By default, `/decode` uses the + download logic internally to retrieve and decode any storage references in the request alongside regular payloads. + With the `?preserveStorageRefs=true` query parameter, `/decode` skips retrieval and returns storage references as-is. +- **`/encode`** applies the Payload Codec, then uploads payloads that exceed the size threshold to external storage and + replaces them with reference tokens. + + -#### Usage +The following example walks through how all three endpoints work together: -When you apply custom encoding with encryption or compression on your Workflow data, it is stored in the encrypted/compressed format on the Temporal Server. For details on what data is encoded, see [Securing your data](/production-deployment/data-encryption). +1. A user starts a Workflow from the CLI with a plaintext input. The CLI sends the input to the Codec Server's `/encode` + endpoint. +2. The Codec Server encodes the payload through the Payload Codec. The encoded payload exceeds the storage threshold, + so the Codec Server uploads it to external storage and returns a small reference token. +3. The CLI sends the reference token to the Temporal Service, which stores it in the Event History. +4. Later, a user views the Workflow in the Web UI. 
The Web UI retrieves the Event History from the Temporal Service and + sends the payloads to the Codec Server's `/decode` endpoint with the `?preserveStorageRefs=true` query parameter. +5. The Codec Server decodes any non-reference payloads through the Payload Codec, but returns storage references as-is. + The Web UI displays the reference metadata, indicating the payload is stored externally. +6. The user clicks to view the full payload. The Web UI sends the storage reference to the `/download` endpoint. +7. The Codec Server retrieves the encoded payload from external storage, decodes it through the Payload Codec, and + returns the plaintext result to the Web UI. -To see decoded data when using the Temporal CLI or Web UI to perform some operations on a Workflow Execution, configure the Codec Server endpoint in the Web UI and the Temporal CLI. -When you configure the Codec Server endpoints, the Temporal CLI and Web UI send the encoded data to the Codec Server, and display the decoded data received from the Codec Server. +## Codec Server vs. Payload Codec -For details on creating your Codec Server, see [Codec Server Setup](/production-deployment/data-encryption#codec-server-setup). +A Codec Server runs a [Payload Codec](/payload-codec) internally, so the two are directly connected. The difference is +where the codec logic runs and who calls it. + +| | Payload Codec | Codec Server | +| --------------------------------- | --------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | +| **Purpose** | Encodes and decodes Payloads. Applies encryption, compression, or other byte-level transformations. | Hosts a Payload Codec as an HTTP service so the Web UI and CLI can encode and decode Payloads remotely. | +| **Runs where** | In-process, inside your Workers and Clients. Also runs inside the Codec Server. 
| As a standalone HTTP service in your environment, with a Payload Codec inside it. | +| **Called by** | The Temporal SDK, automatically on every serialization and deserialization. | The Web UI and CLI, over HTTP, when a user views or submits Payload data. | +| **Has access to encryption keys** | Yes. Keys are available in the Worker or Client process. | Yes. Must be configured with the same keys the Payload Codec uses. | + +You implement the transformation logic once in a Payload Codec, then host that logic in a Codec Server so the Web UI and +CLI can use it remotely. + +## Securing a Codec Server + +Because a Codec Server can decode sensitive data, treat it with the same trust as a Worker. Anyone who can call it has +effective decrypt access. Use HTTPS for any deployment that is not strictly loopback (`localhost`). + +### Network-level restrictions + +Restrict network access to the Codec Server. The Web UI can communicate with a Codec Server that is only accessible on +`localhost`, so running the Codec Server locally is a viable security pattern. For team access, place the Codec Server +behind a VPN. + +### Authentication + +When the Codec Server is accessible beyond `localhost`, authenticate requests to verify the identity of the caller. The +Web UI supports two approaches: + +**Include cross-origin credentials (recommended).** Enable **Include cross-origin credentials** in the Web UI Codec +Server settings. The browser sends cookies scoped to the Codec Server's domain with each request. Your Codec Server must +have its own authentication mechanism (its own login page and session cookies), so the user must have independently +authenticated with the Codec Server. This is the recommended approach because the Codec Server maintains its own auth +boundary, separate from the Temporal UI. + +**Pass access token.** Enable **Pass access token** in the Web UI Codec Server settings. 
The Web UI includes the same +JSON Web Token (JWT) the user used to log in to the Temporal UI in the `Authorization` header of each request. Your Codec +Server validates the token signature against the OpenID Connect (OIDC) provider's JSON Web Key Set (JWKS) endpoint. On +Temporal Cloud, verify against the +[Temporal Cloud JWKS endpoint](https://login.tmprl.cloud/.well-known/jwks.json). On a self-hosted Temporal Service, the +token comes from whatever auth provider you have [configured for the Web UI](/references/web-ui-configuration#auth). +This approach requires less setup but reuses the same token across the Temporal UI and the Codec Server. + +### Namespace-level authorization + +Authentication identifies the caller, but does not confirm the caller is authorized to decode payloads for a specific +Namespace. Each request from the Web UI includes an `X-Namespace` header identifying the Namespace. To enforce +Namespace-level access control, your Codec Server must also check whether the authenticated user has +permissions for the requested Namespace. This applies regardless of which authentication approach you use. + +### Key management + +You may also need [key management infrastructure](/key-management) to share encryption keys between your Workers and the +Codec Server. 
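The Namespace-level authorization check described above might look roughly like the following. This is a sketch under stated assumptions: the `ALLOWED_NAMESPACES` table, the example users, and the `authorize` helper are hypothetical, and `user` stands for whatever identity your authentication layer (session cookie or validated JWT) has already established. Only the `X-Namespace` header comes from the text above.

```python
from typing import Mapping, Optional

# Hypothetical access table: authenticated user -> Namespaces they may decode.
ALLOWED_NAMESPACES = {
    "alice@example.com": {"payments", "orders"},
    "bob@example.com": {"orders"},
}


def authorize(user: Optional[str], headers: Mapping[str, str]) -> int:
    """Return the HTTP status a Codec Server could answer with."""
    if user is None:
        # Authentication failed or was never attempted.
        return 401
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    namespace = lowered.get("x-namespace")
    if not namespace:
        # Without a Namespace there is nothing to authorize against.
        return 400
    if namespace not in ALLOWED_NAMESPACES.get(user, set()):
        # Known caller, but not permitted to decode for this Namespace.
        return 403
    return 200
```

Running this check before any payload reaches the codec means an unauthorized request never touches key material. In practice the table would more likely come from your identity provider or a policy service than from a hard-coded dictionary.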
+ +## SDK Codec Server samples + +Most Temporal SDKs provide example Codec Server implementations: + +- [Go](https://github.com/temporalio/samples-go/tree/main/codec-server) +- [Java](https://github.com/temporalio/sdk-java/tree/master/temporal-remote-data-encoder) +- [Python](https://github.com/temporalio/samples-python/blob/main/encryption/codec_server.py) +- [TypeScript](https://github.com/temporalio/samples-typescript/blob/main/encryption/src/codec-server.ts) +- [.NET](https://github.com/temporalio/samples-dotnet/blob/main/src/Encryption/CodecServer/Program.cs) diff --git a/docs/production-deployment/data-encryption.mdx b/docs/production-deployment/data-encryption.mdx index 133aa1c59d..9545c93f94 100644 --- a/docs/production-deployment/data-encryption.mdx +++ b/docs/production-deployment/data-encryption.mdx @@ -1,6 +1,6 @@ --- id: data-encryption -title: Codec Server - Temporal Platform feature guide +title: Codecs and Encryption sidebar_label: Codecs and Encryption description: Encrypt data in Temporal Server to secure Workflow, Activity, and Worker information. Use custom Payload Codecs for encryption/decryption, set up Codec Servers for remote decoding, and ensure secure access. slug: /production-deployment/data-encryption @@ -18,10 +18,11 @@ tags: import { CaptionedImage } from '@site/src/components'; -Temporal Server stores and persists the data handled in your Workflow Execution. -Encrypting this data ensures that any sensitive application data is secure when handled by the Temporal Server. +The Temporal Service persists data from your Workflow Executions, including inputs, outputs, and results. To protect +sensitive data, use a [Payload Codec](/payload-codec) to encrypt payloads before they reach the Temporal Service. With +encryption enabled, data exists unencrypted only on the Client and the Worker process, on hosts that you control. 
-For example, if you have sensitive information passed in the following objects that are persisted in the Workflow Execution Event History, use encryption to secure it: +The following data is persisted in the Event History and can be encrypted: - Inputs and outputs/results in your [Workflow](/workflow-execution), [Activity](/activity-execution), and [Child Workflow](/child-workflows) - [Signal](/sending-messages#sending-signals) inputs @@ -30,37 +31,19 @@ For example, if you have sensitive information passed in the following objects t - [Query](/sending-messages#sending-queries) inputs and results - Results of [Local Activities](/local-activity) and [Side Effects](/workflow-execution/event#side-effect) - [Application errors and failures](/references/failures). - Failure messages and call stacks are not encoded as codec-capable Payloads by default; you must explicitly enable encoding these common attributes on failures. - For more details, see [Failure Converter](/failure-converter). + Failure messages and call stacks are not encoded as codec-capable Payloads by default; you must explicitly enable + encoding these common attributes on failures. For more details, see [Failure Converter](/failure-converter). -Using encryption ensures that your sensitive data exists unencrypted only on the Client and the Worker Process that is executing the Workflows and Activities, on hosts that you control. +To view encrypted data in the Web UI and CLI, set up a [Codec Server](/codec-server). The following sections cover how +to set up a Codec Server and configure the Web UI and CLI to use it. -By default, your data is serialized to a [Payload](/dataconversion#payload) by a [Data Converter](/dataconversion). -To encrypt your Payload, configure your custom encryption logic with a [Payload Codec](/payload-codec) and set it with a [custom Data Converter](/default-custom-data-converters#custom-data-converter). 
+For encryption implementation examples, see the following samples: -A Payload Codec does byte-to-byte conversion to transform your Payload (for example, by implementing compression and/or encryption and decryption) and is an optional step that happens between the Client and the [Payload Converter](/payload-converter): - - - -You can run your Payload Codec with a [Codec Server](/codec-server) and use the Codec Server endpoints in the Web UI and CLI to decode your encrypted Payload locally. -For details on how to set up a Codec Server, see [Codec Server setup](#codec-server-setup). - -However, if you plan to set up [remote data encoding](/remote-data-encoding) for your data, ensure that you consider all security implications of running encryption remotely before implementing it. - -When implementing a custom codec, it is recommended to perform your compression or encryption on the entire input Payload and store the result in the data field of a new Payload with a different encoding metadata field. -This ensures that the input Payload's metadata is preserved. -When the encoded Payload is sent to be decoded, you can verify the metadata field before applying the decryption. -If your Payload is not encoded, it is recommended to pass the unencoded data to the decode function instead of failing the conversion. 
- -Examples for implementing encryption: - -- [Go sample](https://github.com/temporalio/samples-go/tree/main/encryption) -- [Java sample](https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/encryptedpayloads) -- [Python sample](https://github.com/temporalio/samples-python/tree/main/encryption) -- [TypeScript sample](https://github.com/temporalio/samples-typescript/tree/main/encryption) -- [.NET sample](https://github.com/temporalio/samples-dotnet/tree/main/src/Encryption) +- [Go](https://github.com/temporalio/samples-go/tree/main/encryption) +- [Java](https://github.com/temporalio/samples-java/tree/main/core/src/main/java/io/temporal/samples/encryptedpayloads) +- [Python](https://github.com/temporalio/samples-python/tree/main/encryption) +- [TypeScript](https://github.com/temporalio/samples-typescript/tree/main/encryption) +- [.NET](https://github.com/temporalio/samples-dotnet/tree/main/src/Encryption) ## Codec Server setup {#codec-server-setup} @@ -68,13 +51,10 @@ Use a Codec Server to programmatically decode your encoded [payloads](/dataconve A Codec Server is an HTTP server that uses your custom Codec logic to decode your data remotely. The Codec Server is independent of the Temporal Service and decodes your encrypted payloads through predefined endpoints. You create, operate, and manage access to your Codec Server in your own environment. -The Temporal CLI and the Web UI in turn provide built-in hooks to call the Codec Server to decode encrypted payloads on demand. - -The Codec Server is independent of the Temporal Server and decodes your encrypted payloads through endpoints. -When you configure a Codec Server endpoint in the Temporal Web UI or CLI, the Web UI and CLI use the remote endpoint to receive decoded payloads from the Codec Server. +When you configure a Codec Server endpoint in the Web UI or CLI, the Web UI and CLI use the remote endpoint to send and receive payloads from the Codec Server. 
See [API contract requirements](#api-contract-specifications). -Decoded payloads can then be displayed in the Workflow Execution Event History on the Web UI. Note that when you use a Codec Server, the decoded payloads are decoded and returned on the client side only; payloads on the Temporal Server (whether on Temporal Cloud or a self-hosted Temporal Service) remain encrypted. +Decoded payloads can then be displayed in the Workflow Execution Event History on the Web UI. When you use a Codec Server, the decoded payloads are decoded and returned on the client side only. Payloads on the Temporal Service (whether on Temporal Cloud or self-hosted) remain encrypted. Because you create, operate, and manage access to your Codec Server in your controlled environment, ensure that you consider the following: @@ -91,7 +71,13 @@ When you create your Codec Server to handle requests from the Web UI, the follow #### Endpoints -The Web UI and CLI send a POST to a `/decode` endpoint. In your Codec Server, create a `/decode` path and pass the incoming payload to the decode method in your Payload Codec. +The Web UI and CLI send POST requests to the following endpoints on your Codec Server: + +- `/decode` passes incoming payloads to the decode method in your Payload Codec. +- `/encode` passes incoming payloads to the encode method in your Payload Codec. +- `/download` retrieves and decodes payloads from [External Storage](/external-storage). This endpoint is only needed if + your Workers use External Storage. See [Codec Server with External Storage](/codec-server#external-storage) for + details. For examples on how to create your Codec Server, see the following Codec Server implementation samples: @@ -346,14 +332,12 @@ temporal workflow show \ --codec-auth 'auth-header' ``` -### Working with Large Payloads - -Codec Servers can be used for more than encryption and decryption of sensitive data. 
-Codec Server behavior is left up to implementers -- they can also call external services or perform other tasks, as long as they hook in at the encoding and decoding stages of a Workflow payload. +### Working with large payloads -By default, Temporal limits payload size to 4MB. -If this limitation is problematic for your use case, you could implement a codec that persists your payloads to an object store outside of workflow histories. -An example implementation is available from [DataDog](https://github.com/DataDog/temporal-large-payload-codec). +If your payloads exceed the Temporal Service's size limits, use [External Storage](/external-storage) to offload large +payloads to an external store like Amazon S3. When External Storage is configured, your Codec Server can also retrieve +and decode these payloads for viewing in the Web UI and CLI. See +[Codec Server with External Storage](/codec-server#external-storage) for details. ### Temporal Nexus diff --git a/static/diagrams/codec-server-dark.svg b/static/diagrams/codec-server-dark.svg new file mode 100644 index 0000000000..3573fd754d --- /dev/null +++ b/static/diagrams/codec-server-dark.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/static/diagrams/codec-server-with-external-storage-dark.svg b/static/diagrams/codec-server-with-external-storage-dark.svg new file mode 100644 index 0000000000..e7140fbeeb --- /dev/null +++ b/static/diagrams/codec-server-with-external-storage-dark.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/static/diagrams/codec-server-with-external-storage.svg b/static/diagrams/codec-server-with-external-storage.svg new file mode 100644 index 0000000000..726707c33a --- /dev/null +++ b/static/diagrams/codec-server-with-external-storage.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/static/diagrams/codec-server.svg b/static/diagrams/codec-server.svg new file mode 100644 index 0000000000..f943bf11d6 --- /dev/null +++ b/static/diagrams/codec-server.svg @@ -0,0 
+1 @@ + \ No newline at end of file