OSIDB-4738: Add context to async process in history.#1213
Conversation
| last_event_user = self._get_last_event_context_user() | ||
| actual_event_user = self._get_actual_user() | ||
|
|
||
| logger.info( |
There was a problem hiding this comment.
Im not sure if it should be info, error or debug.
ab6bc7a to
c3fc47c
Compare
8683b18 to
369072a
Compare
| def _get_actual_user(self): | ||
| ctx = getattr( | ||
| pghistory.runtime._tracker, "value", None | ||
| ) # active context for this thread, if any | ||
| actual_user = (ctx.metadata.get("user") if ctx else None) or "unknown" | ||
| return actual_user | ||
|
|
There was a problem hiding this comment.
the user is already logged on the access logs (check for /var/log/(prod|stage|uat)-access.log on splunk)
There was a problem hiding this comment.
I'm getting the user so people can tell from the error why their instance is outdated without having to go to the history or the logs.
There was a problem hiding this comment.
I'm not sure I follow the logic of adding the user here. If the message is for the person making the change, would _get_actual_user not just return that person?
There was a problem hiding this comment.
yes, is true. This was a leftover from the first change where I took the last change user from history to explain the user who or what changed the model.
@costaconrado now that the history is indexed, is it safe to take the last changed from history?
if not it true that it doesn't really make sense.
369072a to
4cd3ae7
Compare
|
Thanks for review @costaconrado 😄 I deleted to part you told me about so it should be no longer a problem. Could you give it a second look? |
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdded pghistory context management to Celery-driven sync flows for history attribution and updated TrackingMixin to compute and include the actual event user (from per-thread pghistory runtime metadata) in outdated-instance DataInconsistencyException messages. Changes
Sequence DiagramsequenceDiagram
actor Celery as Celery Task
participant SyncMgr as Sync Manager
participant PGHist as pghistory_context
participant Model as Model (Flaw/Tracker)
participant DB as Database
Celery->>SyncMgr: sync_task(task)
SyncMgr->>PGHist: enter context(action, user, celery_task_id)
PGHist->>PGHist: set pghistory.context(...)
SyncMgr->>Model: get / modify instance
Model->>DB: save() (within pghistory context)
DB->>DB: persist record + pghistory metadata
SyncMgr->>PGHist: exit context
PGHist->>PGHist: clear pghistory.context
SyncMgr-->>Celery: task complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
osidb/mixins.py (1)
75-82:⚠️ Potential issue | 🟠 MajorKeep user identifiers out of the conflict exception text.
This pulls the actor from
pghistorymetadata and injects it into aDataInconsistencyException.osidb/exception_handlers.pythen logsstr(exc)and returns it verbatim, whileosidb/middleware.pyshows thatusercan be an email/name. That turns a normal 409 into an application-log/API echo of a user identifier. The actor is already available in history metadata; it should not be serialized into the exception message.🔧 Suggested change
- actual_event_user = self._get_actual_user() - message = ( - f"Save operation based on an outdated model instance by " - f"{actual_event_user} with updated_dt {self.updated_dt} " - f"differs from the DB {db_self.updated_dt} check in history for details." + f"Save operation based on an outdated model instance with " + f"updated_dt {self.updated_dt} differs from the DB " + f"{db_self.updated_dt}; check history for details." ) raise DataInconsistencyException(f"{message} You need to refresh.") - - def _get_actual_user(self): - ctx = getattr( - pghistory.runtime._tracker, "value", None - ) # active context for this thread, if any - actual_user = (ctx.metadata.get("user") if ctx else None) or "unknown" - return actual_userAlso applies to: 91-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@osidb/mixins.py` around lines 75 - 82, The exception message currently includes the actor identifier retrieved by _get_actual_user() (actual_event_user) and is raised as a DataInconsistencyException; remove the actor from the serialized exception text and replace it with a generic message (e.g. "Save operation based on an outdated model instance; refresh and retry") while keeping the actor available only in pghistory metadata for internal logs; update both the block that builds message using self.updated_dt vs db_self.updated_dt and the similar block around lines 91-96 to stop interpolating actual_event_user into the exception string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@osidb/sync_manager.py`:
- Around line 1025-1035: The code calls
JiraTrackerDownloadManager.link_tracker_with_affects(tracker_id) twice (once
inside the pghistory_context block when tracker exists and again afterward),
causing duplicate writes and audit entries; remove the redundant call outside
the pghistory_context and ensure link_tracker_with_affects(tracker_id) is
invoked only once inside the pghistory_context where tracker.save() runs so the
relation changes are recorded under the intended pghistory_context (locate
usages of JiraTrackerDownloadManager.link_tracker_with_affects and the
surrounding pghistory_context/celery_task_id block to make this change).
---
Duplicate comments:
In `@osidb/mixins.py`:
- Around line 75-82: The exception message currently includes the actor
identifier retrieved by _get_actual_user() (actual_event_user) and is raised as
a DataInconsistencyException; remove the actor from the serialized exception
text and replace it with a generic message (e.g. "Save operation based on an
outdated model instance; refresh and retry") while keeping the actor available
only in pghistory metadata for internal logs; update both the block that builds
message using self.updated_dt vs db_self.updated_dt and the similar block around
lines 91-96 to stop interpolating actual_event_user into the exception string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: ae6982c3-730e-4bf1-915d-21bad8981824
📒 Files selected for processing (2)
osidb/mixins.pyosidb/sync_manager.py
Jincxz
left a comment
There was a problem hiding this comment.
LGTM, just some comments.
This PR is for adding logging correct? Is / was there one for handling the sync issue mentioned in OSIDB-4738?
| def _get_actual_user(self): | ||
| ctx = getattr( | ||
| pghistory.runtime._tracker, "value", None | ||
| ) # active context for this thread, if any | ||
| actual_user = (ctx.metadata.get("user") if ctx else None) or "unknown" | ||
| return actual_user | ||
|
|
There was a problem hiding this comment.
I'm not sure I follow the logic of adding the user here. If the message is for the person making the change, would _get_actual_user not just return that person?
There was a problem hiding this comment.
🧹 Nitpick comments (2)
osidb/sync_manager.py (2)
20-34: Type hint mismatch:celery_task_idcan beNone.The parameter is typed as
strbut Line 30 explicitly handles falsy values, and all call sites passgetattr(getattr(task, "request", None), "id", None)which can returnNone.Suggested fix
`@contextmanager` def pghistory_context( action: str, - celery_task_id: str, + celery_task_id: Optional[str], *, source: str = "celery", user: str = "celery_task", **extra_context: Any, ) -> Iterator[None]:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@osidb/sync_manager.py` around lines 20 - 34, The pghistory_context signature types celery_task_id as str but callers can pass None; update the function to accept Optional[str] (i.e., change celery_task_id: str to celery_task_id: Optional[str]) and add Optional to the typing imports so the runtime None check (if celery_task_id:) remains valid; leave the body and pghistory.context usage unchanged.
1004-1007: Inconsistent implementation: no pghistory context wrapping.Unlike
BZTrackerDownloadManager.link_tracker_with_affects(lines 582-588), this method does not accept ataskparameter and does not wrap its atomic block inpghistory_context. This creates inconsistent history tracking between Bugzilla and Jira tracker syncs.Consider either:
- Having the caller wrap this call in
pghistory_context(requires fixing lines 1025-1031), or- Adding the
taskparameter and internal wrapping to match the BZ implementation🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@osidb/sync_manager.py` around lines 1004 - 1007, This method updates tracker.affects inside a transaction.atomic() without using pghistory_context, causing inconsistent pghistory entries compared to BZTrackerDownloadManager.link_tracker_with_affects; add a task parameter to this method and wrap the transaction.atomic() block in pghistory_context(task=task) (mirroring BZTrackerDownloadManager.link_tracker_with_affects) so that tracker.affects.clear(), tracker.affects.add(*affects) and tracker.save(...) are executed inside the pghistory_context; alternatively, if you prefer the caller to control context, update the caller code path (the code around lines 1025-1031) to wrap this call in pghistory_context — but prefer the former for parity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@osidb/sync_manager.py`:
- Around line 20-34: The pghistory_context signature types celery_task_id as str
but callers can pass None; update the function to accept Optional[str] (i.e.,
change celery_task_id: str to celery_task_id: Optional[str]) and add Optional to
the typing imports so the runtime None check (if celery_task_id:) remains valid;
leave the body and pghistory.context usage unchanged.
- Around line 1004-1007: This method updates tracker.affects inside a
transaction.atomic() without using pghistory_context, causing inconsistent
pghistory entries compared to
BZTrackerDownloadManager.link_tracker_with_affects; add a task parameter to this
method and wrap the transaction.atomic() block in pghistory_context(task=task)
(mirroring BZTrackerDownloadManager.link_tracker_with_affects) so that
tracker.affects.clear(), tracker.affects.add(*affects) and tracker.save(...) are
executed inside the pghistory_context; alternatively, if you prefer the caller
to control context, update the caller code path (the code around lines
1025-1031) to wrap this call in pghistory_context — but prefer the former for
parity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 0cd0cff1-a627-4c1e-a46f-af153b6119af
📒 Files selected for processing (1)
osidb/sync_manager.py
|
@Jincxz to better answer your question: This issue is for several reasons that touch at the point of multi instance.
This will give a better insight for the history on the async processes. |
Jincxz
left a comment
There was a problem hiding this comment.
LGTM, I would squash some of these commits.
a512512 to
b93fb33
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@osidb/sync_manager.py`:
- Around line 1024-1032: The Jira flow loses pghistory attribution because
JiraTrackerDownloadManager.link_tracker_with_affects does not accept the task
nor wrap the affects clear/add/save in pghistory_context; update
JiraTrackerDownloadManager.link_tracker_with_affects to accept a task parameter,
wrap the tracker.affects.clear()/add()/save() block with
pghistory_context(action="link_tracker_with_affects",
celery_task_id=getattr(getattr(task, "request", None), "id", None)), and then
modify the caller in sync_task to pass the current task through when calling
JiraTrackerDownloadManager.link_tracker_with_affects(tracker_id, task) so
history entries include source/action/celery_task_id.
- Around line 20-33: The pghistory_context signature declares celery_task_id as
str but callers pass None; update the parameter type to Optional[str] (and
import Optional if needed) so it becomes celery_task_id: Optional[str], keeping
the existing conditional handling (if celery_task_id:) and no other logic
changes; reference the pghistory_context function to locate the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 019dc08c-dd3e-4470-92da-009343a9d77e
📒 Files selected for processing (2)
osidb/mixins.pyosidb/sync_manager.py
🚧 Files skipped from review as they are similar to previous changes (1)
- osidb/mixins.py
| @contextmanager | ||
| def pghistory_context( | ||
| action: str, | ||
| celery_task_id: str, | ||
| *, | ||
| source: str = "celery", | ||
| user: str = "celery_task", | ||
| **extra_context: Any, | ||
| ) -> Iterator[None]: | ||
| ctx = {"source": source, "user": user, "action": action, **extra_context} | ||
| if celery_task_id: | ||
| ctx["celery_task_id"] = celery_task_id | ||
| with pghistory.context(**ctx): | ||
| yield |
There was a problem hiding this comment.
celery_task_id should be typed as Optional[str].
Every call site passes getattr(getattr(task, "request", None), "id", None), which can legitimately return None, and the body already handles that case with if celery_task_id:. Declaring the parameter as str misrepresents the contract and will trip type checkers/IDE warnings on every caller.
Proposed fix
`@contextmanager`
def pghistory_context(
action: str,
- celery_task_id: str,
+ celery_task_id: Optional[str],
*,
source: str = "celery",
user: str = "celery_task",
**extra_context: Any,
) -> Iterator[None]:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@osidb/sync_manager.py` around lines 20 - 33, The pghistory_context signature
declares celery_task_id as str but callers pass None; update the parameter type
to Optional[str] (and import Optional if needed) so it becomes celery_task_id:
Optional[str], keeping the existing conditional handling (if celery_task_id:)
and no other logic changes; reference the pghistory_context function to locate
the change.
| if tracker: | ||
| tracker.save() | ||
| with pghistory_context( | ||
| action="jira_tracker_download", | ||
| celery_task_id=getattr(getattr(task, "request", None), "id", None), | ||
| ): | ||
| tracker.save() | ||
|
|
||
| # Link this Jira tracker to OSIDB Affects as part of the sync process. | ||
| result = JiraTrackerDownloadManager.link_tracker_with_affects(tracker_id) |
There was a problem hiding this comment.
Jira tracker path loses pghistory attribution for the affects re-link — inconsistent with the BZ flow.
For the BZ flow, BZTrackerDownloadManager.link_tracker_with_affects itself opens a pghistory_context(action="link_tracker_with_affects", celery_task_id=...) around the clear()/add()/save() (lines 581‑588), and sync_task forwards task into it (lines 605‑607). For the Jira flow:
JiraTrackerDownloadManager.link_tracker_with_affects(lines 1004‑1007) performs the sametracker.affects.clear()/add()/save()with nopghistory_contextwrapping at all.- Here,
link_tracker_with_affects(tracker_id)is called outside thepghistory_contextblock at line 1032 and is not passedtask.
Net effect: every Jira-driven tracker re-link writes history entries without source=celery / action / celery_task_id, which is exactly the attribution gap this PR is meant to close. Please mirror the BZ approach — accept task in JiraTrackerDownloadManager.link_tracker_with_affects and wrap its atomic block in pghistory_context, then forward task from sync_task.
Proposed fix
`@staticmethod`
- def link_tracker_with_affects(tracker_id):
+ def link_tracker_with_affects(tracker_id, task=None):
# Code adapted from collectors.jiraffe.convertors.JiraTrackerConvertor.affects
...
with transaction.atomic():
- tracker.affects.clear()
- tracker.affects.add(*affects)
- tracker.save(raise_validation_error=False, auto_timestamps=False)
+ with pghistory_context(
+ action="link_tracker_with_affects",
+ celery_task_id=getattr(getattr(task, "request", None), "id", None),
+ ):
+ tracker.affects.clear()
+ tracker.affects.add(*affects)
+ tracker.save(raise_validation_error=False, auto_timestamps=False)And in sync_task:
- # Link this Jira tracker to OSIDB Affects as part of the sync process.
- result = JiraTrackerDownloadManager.link_tracker_with_affects(tracker_id)
+ # Link this Jira tracker to OSIDB Affects as part of the sync process.
+ result = JiraTrackerDownloadManager.link_tracker_with_affects(
+ tracker_id, task
+ )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@osidb/sync_manager.py` around lines 1024 - 1032, The Jira flow loses
pghistory attribution because
JiraTrackerDownloadManager.link_tracker_with_affects does not accept the task
nor wrap the affects clear/add/save in pghistory_context; update
JiraTrackerDownloadManager.link_tracker_with_affects to accept a task parameter,
wrap the tracker.affects.clear()/add()/save() block with
pghistory_context(action="link_tracker_with_affects",
celery_task_id=getattr(getattr(task, "request", None), "id", None)), and then
modify the caller in sync_task to pass the current task through when calling
JiraTrackerDownloadManager.link_tracker_with_affects(tracker_id, task) so
history entries include source/action/celery_task_id.
There was a problem hiding this comment.
It was left like this so only when an actual change in the tracker happens it will use the context.
The linking is happening always so it will only create noise.
There was a problem hiding this comment.
@roduran-dev, understood — the deliberate choice to limit pghistory_context to the conditional tracker.save() block (where an actual tracker change occurs) and exclude the always-running link_tracker_with_affects call makes sense. Annotating routine/unconditional operations would reduce the signal-to-noise ratio of the history context. I'll withdraw the suggestion.
📝 Learnings were identified and not saved (knowledge base disabled). Enable
This PR adds a bit more context for the async tasks in async manager. The idea is to have a better view on events conflicts.
It also adds more contexts into the history for the sync_manager making easier to understand if a celery process did a change.
Example of how context looks (First lines are empty because Flaw was created with custom code)
Closes OSIDB-4738