feat(diagnostics): Support diagnostic operations with automated rules#1453
feat(diagnostics): Support diagnostic operations with automated rules#1453Josh-Matsuoka wants to merge 1 commit intocryostatio:mainfrom
Conversation
andrewazores
left a comment
There was a problem hiding this comment.
I think we need to have a design discussion about this feature before forging ahead with it - there are some technical and architectural things to consider first.
|
|
||
| public boolean heapDump; | ||
|
|
||
| public boolean threadDump; |
There was a problem hiding this comment.
I'm not sure if these flags are sufficient to capture all of the functionality we might want to expose/implement. For example, rules currently have both an archivalPeriodSeconds and a preservedArchives, which rule execution use to determine when to execute the recording archival job associated with the rule, and how many prior copies of archived recordings from the same active source recording should be retained. It looks like this implementation simply enables rules to also perform a thread/heap dump alongside the recording archival execution on the archivalPeriodSeconds schedule, but without any equivalent handling of preservedArchives.
But then this raises an important feature design question: should preservedArchives apply equally and symmetrically to all three data types that can now be captured by a rule? Or should there be three different fields like preservedJfrArchives, preservedThreadDumps, preservedHeapDumps? Or, should a rule only be valid if it configures JFR archives OR thread dumps OR heap dumps, so if the user wants to have periodic capture of each it should be three different rule definitions? If we start down that path, this also raises the question of whether these should then remain as one Rule entity type or three?
| preservedArchives integer not null, | ||
| matchExpression bigint unique, | ||
| threadDump boolean not null, | ||
| heapDump boolean not null, |
There was a problem hiding this comment.
This is a wrong approach - if a Cryostat instance has already been installed and is being upgraded, the previously-run migration scripts will not re-execute. So patching up a migration file for an already-released Cryostat version is going to create some really messy headaches for fresh installs vs upgraded installs where they will now have divergent database scheme.
In fact, I think Flyway will even raise an error on upgrade and fail in this case. I'm pretty sure it does some migration script checksumming and will catch that this has been changed.
Any modifications to the database schema for a Cryostat vX.Y release feature should only be done in a net new VX.Y.0__cryostat.sql migration script corresponding to that release version, so that all schema updates for that release are done exactly once at upgrade time.
Welcome to Cryostat! 👋
Before contributing, make sure you have:
mainbranch[chore, ci, docs, feat, fix, test]To recreate commits with GPG signature
git fetch upstream && git rebase --force --gpg-sign upstream/mainFixes: #1253
Depends on: cryostatio/cryostat-web#2181
Description of the change:
Adds support to the automated rules framework for triggering thread and heap dumps on targets matched by the Rule. 2 new fields are added to the Rule Object to track this and the RuleExectuor sends off a long running API request for thread/heap dumps when enabled.
How to manually test: