DOC: Clarify jacobian parameter and overall docstring for Model.logp and similar#8185
DOC: Clarify jacobian parameter and overall docstring for Model.logp and similar#8185avm19 wants to merge 3 commits intopymc-devs:mainfrom
Conversation
| *************** | ||
|
|
||
| .. currentmodule:: pymc.distributions.transforms | ||
| .. module:: pymc.distributions.transforms |
There was a problem hiding this comment.
Just to clarify. :module:: must be specified at least once (and probably only once) for Sphinx to create an "index entry". Without it, references :currentmodule::... and :mod:... would lead to nowhere. It so happens that module:: pymc.distributions.transforms was absent, even though currentmodule was present, and Sphinx does not check this.
There was a problem hiding this comment.
Using the module directive here is probably fine, I am not sure how relevant it is though. Is pymc.distributions.transforms as a module something relevant to users we'll want to reference? If so we need to switch, otherwise using currentmodule is perfectly fine but using module won't hurt either.
Full context:
.. currentmodule:: is basically syntactic sugar, its only role is defining the module name to prepend all the autodoc/autosummary entries in that file. That is, we can use circular or ordered in the autosummary directive below instead of needing to specify pymc.distributions.transforms.circular...
.. module:: does the same but also generates a target that can be referenced from other places using the :mod:`pymc.distributions.transforms` role; .. currentmodule:: does not reference the module directive with the same name nor needs it to exist to work properly though.
The module directive can also take a :synopsis: to manually add the module docstring. In our case however, using this optional argument wouldn't make much sense, it would be much better to use .. automodule:: instead which defines a module directive pulling the docstring directly from the module in question (like we do for classes or functions).
Sphinx does not check this because module/automodule should indeed be used at most once, but it isn't really necessary to use it at least once, that depends on how the package maintainers decide to define and document the public API.
There was a problem hiding this comment.
Using the module directive here is probably fine, I am not sure how relevant it is though.
Without it references did not work, which is why I made the change.
P.S. I don't remember if Sphinx threw an error or just no hyperlink without the module. I don't know much about this, but I guess you were not using automodule because this module is accompanied by a lot of text, which was placed in transform.rst (better have it there, than a huge module-level docstring). Some other entries, e.g. shape_utils.rst, also contain text in addition to Sphinx directives.
Is
pymc.distributions.transformsas a module something relevant to users we'll want to reference?
Yes. The module page explains how PyMC deals with transformations, and I reference it as a good source for further details in a method's docstring.
| Random variables or potential terms whose contribution to logp is to be included. | ||
| If None, use all basic (free or observed) variables and potentials defined in the model. | ||
| jacobian : bool, default True | ||
| If True, add Jacobian contributions associated with automatic variable transformations, |
There was a problem hiding this comment.
Include jacobian correction term for transformed variables ?
There was a problem hiding this comment.
I don't think we should explain why, specially because the why is context specific. Maybe you don't want it because you want to do optimization on the constrained space, using unconstrained variables.
There was a problem hiding this comment.
It is the goal of this PR to give a concise and clear description of what jacobian=True does, so this bit is central. I am trying to share my perspective of an outsider to make the docstrings friendlier to users who are not used to PyMC parlance.
I don't think we should explain why
Here I am explaining not so much "why", as "what". The phrase "Jacobian correction term" might sound precise and unquestionable to people working on PyMC or specialising in Bayesian modelling, but to someone coming from a slightly different field it is very ambiguous and uninformative. Which Jacobian? Why correction? There are so many Jacobians, it is not clear which one it is about. Any gradient, such as Model.dlogp, is also a Jacobian (neglect the differential geometry subtleties of co- and contra-variance), and one would naturally think that the Jacobian refers to dlogp, because d2logp is called a Hessian in the same file. Further, what is meant by "correction"? Who made the error that needs to be corrected? Or is it like the prediction-correction optimisation methods used in engineering? They use first and second derivatives, so sounds about right... Frankly, I never saw the change-of-variables Jacobian determinant being referred to as "correction", so it is the last connection I would make.
I am trying to explain what jacobian does in a way that does not allow misinterpretations. I am not try trying to convince the reader why they should use it.
I thought about using "which" here:
jacobian : bool, default True
If True, add Jacobian contributions associated with automatic variable transformations,
so thatwhich make the resultisthe true density of transformed random variables.
See :py:mod:pymc.distributions.transformsfor details.
but then it is not clear if "which" refers to "transformations" (incorrect) or to "contributions" (correct). This is why I chose "so that" as a conjunction -- it refers to "add", i.e. the purpose of the parameter jacobian.
There was a problem hiding this comment.
They may not be automatic, the user can override behavior with
transform=xwhen defining a variable.
They are still automatic in the sense that you don't have to change variables in the model definition. Maybe it would be correct to say that all these transformations are automatic, which includes both the default transformations and the user-specified transformations? Here is a relevant quote:
pymc/docs/source/api/distributions/transforms.rst
Lines 82 to 85 in 8a1896b
true thing still, doesn't sound quite right.
Oh, I misunderstood your previous message, I thought you were referring to somewhat redundant "If True ...". Let me think if there are ways to re-phrase "true density".
There was a problem hiding this comment.
I'm not sure about the true thing still, doesn't sound quite right
You are right, "true density" doesn't sound right, neither does "proper", and I haven't found anything better. How about:
jacobian : bool, default True
If True, add Jacobian contributions associated with automatic variable transformations,
so that the result is the log-probability density of transformed random variables.
See :py:mod:pymc.distributions.transformsfor details.
That is, if I convinced you that user-specified and default transforms are all "automatic". Otherwise, remove the word "automatic".
There was a problem hiding this comment.
Agreed that "true" is not right, it makes a subjective judgement where none exists. You can see an example here of a case where applying it leads to the "false" logp. It applies a change of variables correction implied by the model's transformation, if one exists. I would write a docstring that says something like that:
jacobian: bool, default True
If True, the change-of-variable correction implied by each random variable's transformation, if applicable. This correction accounts for the distortion of probability mass induced by the application of a non-linear function to a random variable. See :py:mod:pymc.distributions.transforms for details.
There was a problem hiding this comment.
You can see an example here of a case where applying it leads to the "false" logp.
After a cursory reading, I tend to think that the confusion in there was between true" was a bad choice (it may also suggest ground truth vs approximation). But I think "density of transformed random variables" is a good part, which pushes the reader in the right direction.
Maybe we should also add that with jacobian=False the result is "density of untransformed variables"? (I am like 95% sure, but I can check the maths and get a toy example)
There was a problem hiding this comment.
(I tend to think of transformed/untransformed as unconstrained/constrained, but that's giving an interpretation of why the transform was applied)
Maybe we should also add that with jacobian=False the result is "density of untransformed variables"? (I am like 95% sure, but I can check the maths and get a toy example)
I would maybe try to emphasize this function always takes a point in transformed space and "untransforms" it to evaluate the logprob function of each random variable in it's "natural"/original space. What's optional is whether we add the terms corresponding to this transformation.
This is not a specific suggestion, just sharing how I think mechanistically about it.
| jacobian: bool = True, | ||
| ) -> Variable: | ||
| """Gradient of the models log-probability w.r.t. ``vars``. | ||
| """Gradient of the joint log-probability density of the model. |
There was a problem hiding this comment.
density is a strong word, model may be discrete or a mix
There was a problem hiding this comment.
Technically, a probability mass function is a probability density (i.e. a Radon-Nikodym derivative) with respect to the counting measure, in contrast to a "regular" PDF of an absolutely continuous variable, which is wrt the standard Lebesgue-Borel measure; and so the mixed case should also be a density wrt a certain product measure.
I agree that people might find this confusing, and some may even assume that discrete variables are marginalised out or not allowed at all in this context. On the other hand, I used the word "density" because Jacobians arise only in (absolutely) continuous variables; if it were not a density, there would be no Jacobians.
There was a problem hiding this comment.
We do constant use the term log-probability though. Yes jacobian corrections don't exist for discrete variables, I don't think that's going to be what makes it confusing to understand jacobian kwarg here
There was a problem hiding this comment.
Actually, the word "density" was there before me in compile_logp, compile_dlogp, compile_d2logp only. I only put it everywhere else for consistency.
Line 613 in 8a1896b
Should I make it "log-probability" (hyphenated, noun) and remove "density" from everywhere (except jacobian parameter where "density" is relevant)?
Description
Improve docstrings for
Model.logp()and related methods (dlogp(),d2logp(),compile_*()) and clarify what parameterjacobiandoes.Related Issue
Checklist
Type of change