2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -3,6 +3,8 @@
# More specific patterns at the bottom override general patterns above.

.claude/ @DataDog/apm-common-components-core
AGENTS.md @DataDog/apm-common-components-core
CLAUDE.md @DataDog/apm-common-components-core
.clang-format @DataDog/libdatadog
.codecov.yml @DataDog/apm-common-components-core
.cargo/* @DataDog/libdatadog-core
155 changes: 155 additions & 0 deletions AGENTS.md
@@ -0,0 +1,155 @@
# Libdatadog - shared repository for Datadog Rust

**libdatadog** is a Rust workspace of shared libraries and utilities for Datadog's instrumentation tooling (continuous profilers, crash tracking, APM tracing). It exposes C/C++ FFI bindings consumed by Datadog SDKs in other languages.

## Development Workflow

### Toolchains

- **Rust**: MSRV `1.84.1` (set in workspace `Cargo.toml`); use stable for build/clippy.
- **Nightly rustfmt**: `nightly-2026-02-08` — `rustfmt.toml` uses nightly-only features.
- **cargo-nextest**: `0.9.96` (required for running tests). Install with `cargo install --locked 'cargo-nextest@0.9.96'`.
- **cbindgen**: `0.29` (for FFI header generation).
- **System tools**: `cmake` and `protoc`.
Comment on lines +9 to +13
Contributor

this should all probably get inferred from metadata

Contributor

Agreed, otherwise this is going to be drifting hard. Also, for most of them the version isn't that relevant (e.g. cargo nextest), though it is for some others.


### Validating after changes

Iterate fastest with `cargo check -p <crate>` while editing; the full validation steps below are what should be green before declaring work done.

1. **Compile** the touched crates, or the whole workspace only when making repo-wide changes:
```bash
cargo check -p <crate> # fast iteration on a single crate
cargo build --workspace --exclude builder # full build
```
2. **Format and lint** — always run on every crate that was touched, before finishing:
Contributor

another plug for pre-commit hook :)

Contributor

I don't think it's a bad thing if some instructions are in both the pre-commit hook and AGENTS.md, given that not everyone will enable pre-commit hooks. For example, they might not work properly out of the box for some people, or some might find them too slow (i.e. me when touching one line in datadog-agent and it taking 10 mins to run 🥲)

```bash
cargo +nightly-2026-02-08 fmt --all -- --check
cargo +stable clippy --workspace --all-targets --all-features -- -D warnings -A clippy::manual_is_multiple_of
Contributor

@yannham, May 11, 2026

Where is this allow coming from? If there's a clippy config file, better to link to it rather than hardcoding allowlisted lints.

Contributor Author

From the CI. The reason for it is that satisfying this lint requires Rust 1.85, which is above the MSRV.
I could add a lint skip, with a comment to remove it when we update the MSRV.

```
3. **Run tests** with nextest plus doc tests:
```bash
cargo nextest run --workspace
cargo nextest run --workspace --all-features --exclude builder --exclude test_spawn_from_lib
cargo test --doc
```
Run a single test by substring: `cargo nextest run -p <crate-name> <test-name>`.
4. **If FFI crates were touched**, build and run the C/C++ FFI examples:
```bash
cargo ffi-test
```
5. **If `tracing_integration_tests::` tests fail**, they require Docker. Prompt the user to start Docker and retry; to skip them locally use:
Contributor

this is a great example of filling knowledge gaps that are not easily deduced

```bash
cargo nextest run -E '!test(tracing_integration_tests::)'
```
6. **If `Cargo.lock` was touched**, regenerate the third-party license CSV so `cargo deny` and the CI guard stay green:
```bash
./scripts/update_license_3rdparty.sh
cargo deny check
```

### Per-crate test notes

- **crashtracker**: needs `--features libdd-crashtracker/generate-unit-test-files` for unit tests.
- **http-client**: has two mutually-exclusive backend features (`reqwest-backend` is default, `hyper-backend` is the alternative). Both must be exercised when this crate is touched:
```bash
# Default (reqwest) backend — covered by the workspace test run
cargo nextest run -p libdd-http-client
# Hyper backend
cargo nextest run -p libdd-http-client --no-default-features --features hyper-backend,https
```
- **test_spawn_from_lib**: `cargo nextest run --package test_spawn_from_lib --features prefer-dynamic`.

P2: Add RUSTFLAGS to the spawn-from-lib test command

For test_spawn_from_lib on Unix/Linux, enabling the prefer-dynamic Cargo feature is not enough: the crate's own Cargo.toml notes these tests require prefer-dynamic, and the CI step in .github/workflows/test.yml sets RUSTFLAGS="-C prefer-dynamic" when running this package. Agents following this documented command will run a different validation path from CI and can get misleading failures when the trampoline test binary is not built dynamically.



## Architecture
Contributor

Lately we've been adding, removing, and splitting crates, so I guess this section will become outdated in no time.


The workspace has ~50 crates organized into functional domains:

### Core Infrastructure
- **libdd-common** / **libdd-common-ffi** — HTTP/HTTPS connectors (rustls + ring or aws-lc-rs), container detection, tag validation, rate limiting, Unix/Windows platform helpers
- **libdd-alloc** — custom memory allocators for specialized allocation patterns (profiling, signal-safe contexts)
- **libdd-tinybytes** — `bytes::Bytes`-like type supporting zero-copy cloning and slicing
- **libdd-log** / **libdd-log-ffi** — bridge from Rust's `tracing` infrastructure for use by other languages
- **libdd-telemetry** / **libdd-telemetry-ffi** — telemetry client implementing Datadog's telemetry collection specification
- **libdd-shared-runtime** / **libdd-shared-runtime-ffi** — shared Tokio runtime with fork-safe worker management
- **libdd-capabilities** / **libdd-capabilities-impl** — portable capability traits and native implementations for cross-platform libdatadog
- **libdd-http-client** — HTTP client abstraction with `reqwest-backend` (default) and `hyper-backend` features
- **libdd-dogstatsd-client** — DogStatsD metrics client
Contributor

it will infer this information, overall I suspect most of this section is redundant, but I haven't used this AGENTS.md yet, so I'm not 100% sure

Contributor

I agree. I'm not sure it's useful or worth the trouble to put this here. Domain-specific knowledge or Datadog-specific constraints that can't be easily inferred, like the test section above, seem more valuable and less prone to drift.

- **spawn_worker** — subprocess/worker spawning utilities with platform-specific (Unix/Windows) trampoline mechanisms

### Tracing / APM
- **libdd-data-pipeline** / **libdd-data-pipeline-ffi** — trace exporter; sends data to Trace Agent via msgpack
- **libdd-trace-protobuf** — protobuf definitions for traces
- **libdd-trace-utils** — span processing, MessagePack encoding/decoding, payload handling, HTTP transport with retry
- **libdd-trace-stats** — computes stats from Datadog traces
- **libdd-trace-normalization** — port of the Datadog trace-agent's trace normalization logic
- **libdd-trace-obfuscation** — trace obfuscator implementing Datadog's data security filtering rules
- **libdd-tracer-flare** — collects and transmits tracer diagnostic flares triggered via remote configuration
- **libdd-ddsketch** / **libdd-ddsketch-ffi** — DDSketch quantile estimation

### Profiling
- **libdd-profiling** / **libdd-profiling-ffi** — pprof-format continuous profiling; exports to Datadog via reqwest + tokio
- **libdd-profiling-protobuf** — pprof protobuf definitions (prost-based)
- **libdd-otel-thread-ctx** — OTel thread-level context publisher for profiling (OTEP #4947)
- **datadog-profiling-replayer** — tool that replays a pprof file using libdatadog commands

### Crash Tracking
- **libdd-crashtracker** / **libdd-crashtracker-ffi** — in-process crash detection and reporting; uses blazesym for symbolization on Unix; Windows collector via `collector_windows` feature; also exposes C++ bindings via `cxx` crate
- **symbolizer-ffi** — standalone C/FFI bindings for blazesym (not in workspace members)

### IPC & Sidecar
- **datadog-ipc** / **datadog-ipc-macros** — inter-process communication framework with memory-mapped channel support
- **datadog-sidecar** / **datadog-sidecar-ffi** / **datadog-sidecar-macros** — sidecar process supporting trace collection, profiling, crashtracking, remote config, and live debugging

### Configuration & Remote
- **libdd-library-config** / **libdd-library-config-ffi** — instrumentation library configuration
- **datadog-remote-config** — remote configuration management for dynamic instrumentation and feature toggles
- **datadog-live-debugger** / **datadog-live-debugger-ffi** — live debugger for dynamic inspection

### Feature Flags
- **datadog-ffe** / **datadog-ffe-ffi** — Feature Flags Experiment; includes Python/pyo3 bindings, hence `--all-features` requires Python

### Build, Tooling & Tests
- **builder** — generates release artifacts (C libraries, headers, pkg-config). Run with:
```bash
cargo run --bin release -- --out output-folder
```
Feature flags control what's built: `crashtracker`, `profiling`, `telemetry`, `data-pipeline`, `symbolizer`, `library-config`, `log`, `ddsketch`, `ffe`
- **build-common** — shared cbindgen helpers used by FFI crate `build.rs` scripts (not in workspace members)
- **tools** — dev binaries: header dedup, FFI test runner, JUnit attribute injection
- **tools/cc_utils** — lightweight C compiler utilities for build scripts (kept dependency-free, no `libdd-common`)
- **tools/sidecar_mockgen** — sidecar mock code generator
- **bin_tests** — binary integration test harness (crashtracker, profiling, crash tracking)
- **tests/spawn_from_lib** (package `test_spawn_from_lib`) — tests `spawn_worker` trampoline behavior; requires `prefer-dynamic` feature

## Key Conventions

### Reliability & integrability
libdatadog is integrated into many runtimes and languages via FFI, and runs in Datadog customers' environments. Code should be as reliable and integrable as possible:
- Avoid `unwrap`/`panic!` outside of tests; bubble errors up instead.
- Bubble errors up to the library caller with detail — prefer structured error enums (e.g. `thiserror`) over opaque strings.
Comment on lines +127 to +128
Contributor

@pawelchcki, May 7, 2026

L127-128 should probably be merged together

- Stay free of global effects unless a feature requires them: no spawning threads, no globals, no reading environment variables behind the caller's back.
- Care about performance, especially memory allocations on hot paths.
Contributor

Should we add something about being extra-careful regarding UB, given FFI and/or unsafe code?
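
The reliability conventions above (no `unwrap`/`panic!`, structured errors, care at the FFI boundary) can be sketched roughly as follows. This is only an illustration, not libdatadog code: `TagError`, `parse_tag`, `Status`, `MAX_TAG_LEN`, and `example_validate_tag` are hypothetical names invented for the example. The `Display` and `Error` impls are written by hand to keep the sketch dependency-free; a `thiserror` derive would generate the equivalent.

```rust
use std::error::Error;
use std::fmt;
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::slice;
use std::str;

/// Hypothetical structured error: each variant carries detail, so callers
/// can match on the cause instead of parsing an opaque string.
#[derive(Debug, PartialEq, Eq)]
pub enum TagError {
    Empty,
    MissingSeparator { input: String },
    TooLong { len: usize, max: usize },
}

impl fmt::Display for TagError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            TagError::Empty => write!(f, "tag is empty"),
            TagError::MissingSeparator { input } => {
                write!(f, "tag `{}` has no `:` separator", input)
            }
            TagError::TooLong { len, max } => {
                write!(f, "tag is {} bytes, limit is {}", len, max)
            }
        }
    }
}

impl Error for TagError {}

/// Illustrative limit, chosen arbitrarily for this sketch.
pub const MAX_TAG_LEN: usize = 200;

/// Parses `key:value`, returning an error to the caller instead of panicking.
pub fn parse_tag<'a>(input: &'a str) -> Result<(&'a str, &'a str), TagError> {
    if input.is_empty() {
        return Err(TagError::Empty);
    }
    if input.len() > MAX_TAG_LEN {
        return Err(TagError::TooLong { len: input.len(), max: MAX_TAG_LEN });
    }
    input
        .split_once(':')
        .ok_or_else(|| TagError::MissingSeparator { input: input.to_owned() })
}

/// Hypothetical C-compatible status code returned across the FFI boundary.
#[repr(C)]
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Status {
    Ok = 0,
    InvalidInput = 1,
    Panicked = 2,
}

/// Hypothetical FFI entry point (a real export would also need `no_mangle`).
/// It validates its pointer argument and never lets a panic unwind into C.
///
/// # Safety
/// `ptr` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn example_validate_tag(ptr: *const u8, len: usize) -> Status {
    if ptr.is_null() {
        return Status::InvalidInput;
    }
    let outcome = catch_unwind(AssertUnwindSafe(|| {
        // SAFETY: `ptr` is non-null and the caller guarantees `len` readable bytes.
        let bytes = unsafe { slice::from_raw_parts(ptr, len) };
        match str::from_utf8(bytes) {
            Ok(text) => parse_tag(text).is_ok(),
            Err(_) => false,
        }
    }));
    match outcome {
        Ok(true) => Status::Ok,
        Ok(false) => Status::InvalidInput,
        Err(_) => Status::Panicked,
    }
}
```

A Rust caller would match on `TagError` variants directly, while a C caller only ever sees a `Status` value. Keeping the `unsafe` block small and documenting its precondition is one way to address the undefined-behavior concern raised in the comment above.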


### Cryptography
- Non-FIPS builds: ring as TLS crypto provider
Contributor

I don't know about AI, but as a human who comes to this codebase, I would find this section inscrutable because the acronyms aren't explained. Maybe we should quickly explain (in one sentence) what FIPS is and why someone working on libdatadog should care?

- FIPS builds: aws-lc-rs via `fips` feature flag
- Windows FIPS requires env var: `AWS_LC_FIPS_SYS_NO_ASM=1`
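
The three bullets above describe a compile-time choice plus one environment precondition. A minimal sketch of that decision logic, assuming the choice is driven by a Cargo feature named `fips` as stated above; `CryptoProvider`, `select_provider`, `build_provider`, and `windows_fips_env_ok` are hypothetical helpers written for illustration, and the actual wiring in the workspace may differ:

```rust
/// Hypothetical enum naming the two TLS crypto providers listed above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CryptoProvider {
    Ring,
    AwsLcRs,
}

/// Pure decision function: FIPS builds use aws-lc-rs, all others use ring.
pub fn select_provider(fips: bool) -> CryptoProvider {
    if fips {
        CryptoProvider::AwsLcRs
    } else {
        CryptoProvider::Ring
    }
}

/// Provider for the current build, resolved at compile time from the
/// `fips` Cargo feature (evaluates to `Ring` when the feature is off).
pub fn build_provider() -> CryptoProvider {
    select_provider(cfg!(feature = "fips"))
}

/// Preflight check a build script could perform (an assumption, not a
/// documented libdatadog step): a Windows FIPS build needs
/// `AWS_LC_FIPS_SYS_NO_ASM=1`; every other combination has no requirement.
pub fn windows_fips_env_ok(is_windows: bool, fips: bool, no_asm: Option<&str>) -> bool {
    !(is_windows && fips) || no_asm == Some("1")
}
```

In a build script, the last argument would typically come from `std::env::var("AWS_LC_FIPS_SYS_NO_ASM").ok()`. Passing it in explicitly keeps the check free of hidden environment reads.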

### Commit Messages
PR titles and commits must follow **Conventional Commits**: `<type>[scope]: <description>`
Common types: `feat`, `fix`, `docs`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`
Breaking changes: append `!` — e.g. `feat!: remove deprecated API`
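
As a rough illustration of the title format above, here is a small checker. `ParsedTitle` and `parse_title` are hypothetical names, and this is a simplified sketch rather than the validation CI actually runs. It assumes the optional scope is written in parentheses, as in the Conventional Commits specification, and it rejects any type outside the "common types" list, which is stricter than the text requires.

```rust
/// The common types listed above. Rejecting anything else is an assumption
/// of this sketch, since the list is described as "common", not exhaustive.
const COMMON_TYPES: [&str; 9] = [
    "feat", "fix", "docs", "refactor", "perf", "test", "build", "ci", "chore",
];

/// Hypothetical parsed form of a PR title or commit subject line.
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedTitle<'a> {
    pub kind: &'a str,
    pub scope: Option<&'a str>,
    pub breaking: bool,
    pub description: &'a str,
}

/// Parses `<type>[(scope)][!]: <description>`; returns `None` if malformed.
pub fn parse_title<'a>(title: &'a str) -> Option<ParsedTitle<'a>> {
    // The header and the description are separated by the first ": ".
    let (header, description) = title.split_once(": ")?;
    if description.trim().is_empty() {
        return None;
    }
    // A trailing `!` on the header marks a breaking change.
    let (header, breaking) = match header.strip_suffix('!') {
        Some(rest) => (rest, true),
        None => (header, false),
    };
    // An optional scope follows the type in parentheses.
    let (kind, scope) = match header.split_once('(') {
        Some((kind, rest)) => {
            let scope = rest.strip_suffix(')')?;
            if scope.is_empty() {
                return None;
            }
            (kind, Some(scope))
        }
        None => (header, None),
    };
    if !COMMON_TYPES.contains(&kind) {
        return None;
    }
    Some(ParsedTitle { kind, scope, breaking, description })
}
```

For instance, `parse_title("feat!: remove deprecated API")` yields a title with `kind == "feat"` and `breaking == true`, while a title such as `Update readme` is rejected because it has no `type: ` prefix.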

### Licenses
All source files must have Apache 2.0 license headers (except `symbolizer-ffi`). The third-party license CSV (`LICENSE-3rdparty.csv`) is validated in CI. To regenerate:
```bash
./scripts/update_license_3rdparty.sh
Contributor

this probably would work better as pre-commit hook :)

```

### Release profiles
- `dev`: full debug info
- `release`: size-optimized (`opt-level = "s"`), LTO, single codegen unit
- `bench`: `opt-level = 3`
Comment on lines +148 to +151
Contributor

This is readable from the (rather small) Cargo.toml and subject to drift. Not sure it's worth repeating here.


## Dev Containers

Two devcontainer configurations are provided (Ubuntu and Alpine). They pre-install all required dependencies including cmake, protoc, cbindgen, and Go. See `.devcontainer/`.
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -0,0 +1 @@
@AGENTS.md