feat(diagnostics): Support diagnostic operations with automated rules #1453

Open
Josh-Matsuoka wants to merge 1 commit into cryostatio:main from Josh-Matsuoka:diagnostics-automated-rules

Conversation

Contributor

@Josh-Matsuoka Josh-Matsuoka commented Apr 8, 2026

Welcome to Cryostat! 👋

Before contributing, make sure you have:

  • Read the contributing guidelines
  • Linked a relevant issue which this PR resolves
  • Linked any other relevant issues, PRs, or documentation, if any
  • Resolved all conflicts, if any
  • Rebased your branch PR on top of the latest upstream main branch
  • Attached at least one of the following labels to the PR: [chore, ci, docs, feat, fix, test]
  • Signed all commits using a GPG signature

To recreate commits with a GPG signature: git fetch upstream && git rebase --force --gpg-sign upstream/main


Fixes: #1253

Depends on: cryostatio/cryostat-web#2181

Description of the change:

Adds support to the automated rules framework for triggering thread and heap dumps on targets matched by a Rule. Two new fields are added to the Rule object to track this, and the RuleExecutor sends a long-running API request for thread/heap dumps when they are enabled.
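As a rough illustration of the change described above, here is a minimal, self-contained sketch. Only the two flag names (`heapDump`, `threadDump`) come from this PR; the `Rule` stand-in, the `plannedOperations` helper, and the operation labels are hypothetical and do not reflect the actual RuleExecutor code.

```java
import java.util.ArrayList;
import java.util.List;

public class RuleDiagnosticsSketch {

    // Hypothetical stand-in for the Rule entity; only the two boolean flags mirror the PR.
    static class Rule {
        final String name;
        final boolean heapDump;
        final boolean threadDump;

        Rule(String name, boolean heapDump, boolean threadDump) {
            this.name = name;
            this.heapDump = heapDump;
            this.threadDump = threadDump;
        }
    }

    // Hypothetical helper: lists the diagnostic operations that would be
    // requested for a target matched by the given rule.
    static List<String> plannedOperations(Rule rule) {
        List<String> operations = new ArrayList<>();
        if (rule.threadDump) {
            operations.add("thread-dump");
        }
        if (rule.heapDump) {
            operations.add("heap-dump");
        }
        return operations;
    }

    public static void main(String[] args) {
        Rule rule = new Rule("example", true, true);
        // In the real implementation each entry would correspond to a long-running API request.
        System.out.println(rule.name + " -> " + plannedOperations(rule));
    }
}
```

A rule with both flags left at `false` yields an empty list, so existing rules would behave as before under this sketch.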

How to manually test:

  1. Pull and build this PR as well as cryostatio/cryostat-web#2181 (feat(diagnostics): Support diagnostic operations with automated rules)
  2. bash smoketest.bash -Ot quarkus-cryostat-agent
  3. Create an automated rule with the match expression true, enable thread and heap dumps in the form and submit
  4. Check the storage for thread and heap dumps; three thread dumps and one heap dump should be present.

Member

@andrewazores andrewazores left a comment

I think we need to have a design discussion about this feature before forging ahead with it - there are some technical and architectural things to consider first.


public boolean heapDump;

public boolean threadDump;

I'm not sure if these flags are sufficient to capture all of the functionality we might want to expose/implement. For example, rules currently have both an archivalPeriodSeconds and a preservedArchives, which rule execution uses to determine when to run the recording archival job associated with the rule, and how many prior copies of archived recordings from the same active source recording should be retained. It looks like this implementation simply enables rules to also perform a thread/heap dump alongside the recording archival execution on the archivalPeriodSeconds schedule, but without any equivalent handling of preservedArchives.

But then this raises an important feature design question: should preservedArchives apply equally and symmetrically to all three data types that can now be captured by a rule? Or should there be three different fields like preservedJfrArchives, preservedThreadDumps, preservedHeapDumps? Or, should a rule only be valid if it configures JFR archives OR thread dumps OR heap dumps, so if the user wants to have periodic capture of each it should be three different rule definitions? If we start down that path, this also raises the question of whether these should then remain as one Rule entity type or three?
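To make the "three different fields" option concrete, here is a minimal sketch of per-type retention. Everything in it is hypothetical (the `ArtifactType` enum, the `toPrune` helper, the sample counts and file names); it only illustrates what independent preserved counts per data type would imply at pruning time, not how Cryostat currently prunes archives.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class RetentionSketch {

    // Hypothetical: the three data types a rule could capture.
    enum ArtifactType { JFR_ARCHIVE, THREAD_DUMP, HEAP_DUMP }

    // Given stored artifacts ordered oldest first, returns the ones to delete
    // so that at most `preserved` of the newest entries remain.
    static <T> List<T> toPrune(List<T> oldestFirst, int preserved) {
        int excess = Math.max(0, oldestFirst.size() - Math.max(0, preserved));
        return new ArrayList<>(oldestFirst.subList(0, excess));
    }

    public static void main(String[] args) {
        // Hypothetical per-type limits, i.e. preservedJfrArchives,
        // preservedThreadDumps, preservedHeapDumps.
        Map<ArtifactType, Integer> preserved = new EnumMap<>(ArtifactType.class);
        preserved.put(ArtifactType.JFR_ARCHIVE, 3);
        preserved.put(ArtifactType.THREAD_DUMP, 2);
        preserved.put(ArtifactType.HEAP_DUMP, 1);

        Map<ArtifactType, List<String>> stored = new EnumMap<>(ArtifactType.class);
        stored.put(ArtifactType.JFR_ARCHIVE, List.of("a1.jfr", "a2.jfr"));
        stored.put(ArtifactType.THREAD_DUMP, List.of("t1", "t2", "t3"));
        stored.put(ArtifactType.HEAP_DUMP, List.of("h1", "h2", "h3"));

        for (ArtifactType type : ArtifactType.values()) {
            System.out.println(type + " prune: " + toPrune(stored.get(type), preserved.get(type)));
        }
    }
}
```

With a single shared preservedArchives value, the same helper would be called with one count for all three types, which is simpler to configure but means a large retention count suited to thread dumps would also apply to much larger heap dumps.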

preservedArchives integer not null,
matchExpression bigint unique,
threadDump boolean not null,
heapDump boolean not null,
Member

@andrewazores andrewazores Apr 8, 2026

This is the wrong approach - if a Cryostat instance has already been installed and is being upgraded, the previously-run migration scripts will not re-execute. So patching up a migration file for an already-released Cryostat version is going to create some really messy headaches for fresh installs vs upgraded installs, which will now have divergent database schemas.

In fact, I think Flyway will even raise an error on upgrade and fail in this case. I'm pretty sure it does some migration script checksumming and will catch that this has been changed.

Any modifications to the database schema for a Cryostat vX.Y release feature should only be done in a net new VX.Y.0__cryostat.sql migration script corresponding to that release version, so that all schema updates for that release are done exactly once at upgrade time.
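The checksum concern can be illustrated with a simplified sketch. This is not Flyway's actual implementation: it assumes only that the migration tool records a checksum of each applied script and compares it against the file on disk at upgrade time. The CRC32 choice, the helper names, and the SQL strings (including the table name `Rule`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class MigrationChecksumSketch {

    // Simplified stand-in for a migration tool's script checksum.
    static long checksum(String script) {
        CRC32 crc = new CRC32();
        crc.update(script.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // True when the script on disk still matches what was recorded when it was applied.
    static boolean matchesRecorded(long recordedChecksum, String scriptOnDisk) {
        return recordedChecksum == checksum(scriptOnDisk);
    }

    public static void main(String[] args) {
        // Hypothetical contents of an already-released migration script.
        String released = "create table Rule (preservedArchives integer not null);";
        long recorded = checksum(released);

        // Editing the released script in place changes its checksum.
        String edited = released + "\nalter table Rule add column threadDump boolean not null;";
        System.out.println("unchanged script valid: " + matchesRecorded(recorded, released));
        System.out.println("edited script valid: " + matchesRecorded(recorded, edited));

        // Putting the new columns in a separate, new versioned script leaves
        // the released script (and its recorded checksum) untouched.
        String newMigration = "alter table Rule add column threadDump boolean not null default false;";
        System.out.println("new script checksum: " + checksum(newMigration));
    }
}
```

Under these assumptions, an upgraded install would fail validation on the edited script, while a fresh install would never notice, which is the divergence described above.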

Labels

feat New feature or request safe-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Request] Automated Rules for Thread/Heap Dumps, async-profiler, etc.?

2 participants