DOC: Clarify jacobian parameter and overall docstring for Model.logp and similar #8185
```diff
@@ -582,20 +582,37 @@ def compile_logp(
         sum: bool = True,
         **compile_kwargs,
     ) -> PointFunc:
-        """Compiled log probability density function.
-
-        The function expects as input a dictionary with the same structure as self.initial_point()
+        """Compiled joint log-probability density of the model or joint log-probability contributions.

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
-        sum : bool
-            Whether to sum all logp terms or return elemwise logp for each variable.
-            Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Random variables or potential terms whose contribution to logp is to be included.
+            If None, use all basic (free or observed) variables and potentials defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
```
Member
Needs some work.

Author
Anything specific?

Member
I don't think we should explain why, especially because the why is context specific. Maybe you don't want it because you want to do optimization on the constrained space, using unconstrained variables.

Author
It is the goal of this PR to give a concise and clear description of what the parameter does. Here I am explaining not so much "why" as "what". I thought about using "which" here, but then it is not clear whether "which" refers to "transformations" (incorrect) or to "contributions" (correct). This is why I chose "so that" as a conjunction: it refers to "add", i.e. the purpose of the parameter.

Author
They are still automatic in the sense that you don't have to change variables in the model definition. Maybe it would be correct to say that all these transformations are automatic, which includes both the default transformations and the user-specified transformations? Here is a relevant quote: pymc/docs/source/api/distributions/transforms.rst, lines 82 to 85 in 8a1896b.

Member
Oh, I misunderstood your previous message, I thought you were referring to the somewhat redundant "…".

Author
You are right, "…". That is, if I convinced you that user-specified and default transforms are all "automatic". Otherwise, remove the word "automatic".

Member
Agreed that "true" is not right, it makes a subjective judgement where none exists. You can see an example here of a case where applying it leads to the "false" logp. It applies a change-of-variables correction implied by the model's transformation, if one exists. I would write a docstring that says something like that.

Author
After a cursory reading, I tend to think that the confusion in there was between "…". Maybe we should also add that with "…".

Member
(I tend to think of transformed/untransformed as unconstrained/constrained, but that's giving an interpretation of why the transform was applied.)

I would maybe try to emphasize that this function always takes a point in transformed space and "untransforms" it to evaluate the logprob function of each random variable in its "natural"/original space. What's optional is whether we add the terms corresponding to this transformation. This is not a specific suggestion, just sharing how I think mechanistically about it.
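The change-of-variables mechanics under discussion can be illustrated outside PyMC. The sketch below is plain NumPy, not PyMC's API; the function names and the choice of HalfNormal with a log transform are illustrative. It shows that the log-Jacobian term for `y = log(x)` is simply `+y`, and that only the corrected version integrates to 1 over the transformed space:

```python
import numpy as np

def logp_x(x):
    # log-density of HalfNormal(sigma=1): sqrt(2/pi) * exp(-x**2 / 2) for x >= 0
    return 0.5 * np.log(2.0 / np.pi) - 0.5 * x**2

def logp_y(y, jacobian=True):
    # y = log(x), so p_y(y) = p_x(exp(y)) * |d exp(y) / dy|;
    # the log-Jacobian term for the log transform is simply +y
    lp = logp_x(np.exp(y))
    return lp + y if jacobian else lp

# Riemann-sum check over the transformed space
ys = np.linspace(-30.0, 5.0, 400_001)
dy = ys[1] - ys[0]
with_jac = np.exp(logp_y(ys, jacobian=True)).sum() * dy
without_jac = np.exp(logp_y(ys, jacobian=False)).sum() * dy

print(with_jac)     # close to 1.0: a proper density for y = log(x)
print(without_jac)  # much larger than 1.0: not a density in y
```

This is the sense in which `jacobian=True` yields the "true density of transformed random variables": without the correction, the result is the density of `x` merely evaluated at `exp(y)`, which is not a normalized density in `y`.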
```diff
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.
+        sum : bool, default True
+            If True, return the sum of the relevant logp terms as a single Variable.
+            If False, return a list of logp terms corresponding to `vars`.
+        **compile_kwargs : dict
+            Extra arguments passed to :meth:`self.compile_fn() <Model.compile_fn>`.
+
+        Returns
+        -------
+        PointFunc
+            The function expects as input a dictionary with the same structure as
+            :meth:`self.initial_point() <Model.initial_point>`.
+
+        See Also
+        --------
+        :py:meth:`logp` :
+            log-probability density as a Variable (in a symbolic form).
+        :py:meth:`compile_dlogp` :
+            gradient of log-probability density as a compiled function.
+        :py:meth:`compile_d2logp` :
+            Hessian of log-probability density as a compiled function.
         """
         compile_kwargs.setdefault("on_unused_input", "ignore")
         return self.compile_fn(
```
```diff
@@ -610,18 +627,34 @@ def compile_dlogp(
         jacobian: bool = True,
         **compile_kwargs,
     ) -> PointFunc:
-        """Compiled log probability density gradient function.
-
-        The function expects as input a dictionary with the same structure as self.initial_point()
+        """Compiled gradient of the joint log-probability density of the model.

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Compute the gradient with respect to values of these variables.
+            If None, use all continuous free (unobserved) variables defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.
+        **compile_kwargs : dict
+            Extra arguments passed to :meth:`self.compile_fn() <Model.compile_fn>`.
+
+        Returns
+        -------
+        PointFunc
+            The function expects as input a dictionary with the same structure as
+            :meth:`self.initial_point() <Model.initial_point>`.
+
+        See Also
+        --------
+        :py:meth:`dlogp` :
+            gradient of log-probability density as a Variable (in a symbolic form).
+        :py:meth:`compile_logp` :
+            log-probability density as a compiled function.
+        :py:meth:`compile_d2logp` :
+            Hessian of log-probability density as a compiled function.
         """
         compile_kwargs.setdefault("on_unused_input", "ignore")
         return self.compile_fn(
```
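What the compiled gradient computes can be checked on paper for a single variable. The following is a minimal sketch in plain NumPy, not PyMC code, assuming one Normal(mu, sigma) term: the gradient of the log-density with respect to the value is `-(x - mu) / sigma**2`, which a central finite difference confirms.

```python
import numpy as np

mu, sigma = 1.5, 2.0

def logp(x):
    # log-density of Normal(mu, sigma) as a function of the value x
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

def dlogp(x):
    # analytic gradient of logp with respect to x
    return -(x - mu) / sigma**2

# central finite difference agrees with the analytic gradient
x0, h = 0.7, 1e-6
finite_diff = (logp(x0 + h) - logp(x0 - h)) / (2 * h)
assert np.isclose(finite_diff, dlogp(x0), atol=1e-6)
```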
```diff
@@ -637,17 +670,36 @@ def compile_d2logp(
         negate_output=True,
         **compile_kwargs,
     ) -> PointFunc:
-        """Compiled log probability density hessian function.
-
-        The function expects as input a dictionary with the same structure as self.initial_point()
+        """Compiled Hessian of the joint log-probability density of the model.

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Compute the gradient with respect to values of these variables.
+            If None, use all continuous free (unobserved) variables defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.
+        negate_output : bool, default True
+            If True, change the sign of the output and return the opposite of the Hessian.
+        **compile_kwargs : dict
+            Extra arguments passed to :meth:`self.compile_fn() <Model.compile_fn>`.
+
+        Returns
+        -------
+        PointFunc
+            The function expects as input a dictionary with the same structure as
+            :meth:`self.initial_point() <Model.initial_point>`.
+
+        See Also
+        --------
+        :py:meth:`d2logp` :
+            Hessian of log-probability density as a Variable (in a symbolic form).
+        :py:meth:`compile_logp` :
+            log-probability density as a compiled function.
+        :py:meth:`compile_dlogp` :
+            gradient of log-probability density as a compiled function.
         """
         compile_kwargs.setdefault("on_unused_input", "ignore")
         return self.compile_fn(
```
```diff
@@ -662,22 +714,46 @@ def logp(
         jacobian: bool = True,
         sum: bool = True,
     ) -> Variable | list[Variable]:
-        """Elemwise log-probability of the model.
+        """Joint log-probability density of the model or joint log-probability contributions.

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
-        sum : bool
-            Whether to sum all logp terms or return elemwise logp for each variable.
-            Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Random variables or potential terms whose contribution to logp is to be included.
+            If None, use all basic (free or observed) variables and potentials defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.
+        sum : bool, default True
+            If True, return the sum of the relevant logp terms as a single Variable.
+            If False, return a list of logp terms corresponding to `vars`.

         Returns
         -------
-        Logp graph(s)
+        Variable or list of Variable
+
+        See Also
+        --------
+        :py:meth:`compile_logp` :
+            log-probability density as a compiled function.
+        :py:meth:`dlogp` :
+            gradient of log-probability density as a Variable (in a symbolic form).
+        :py:meth:`d2logp` :
+            Hessian of log-probability density as a Variable (in a symbolic form).
+        :py:meth:`logp_dlogp_function` :
+            compile logp and its gradient as a single function.
+        :py:attr:`varlogp` :
+            convenience property for logp of all free (unobserved) RVs.
+        :py:attr:`varlogp_nojac` :
+            convenience property for logp of all free (unobserved) RVs without transformation
+            corrections.
+        :py:attr:`observedlogp` :
+            convenience property for logp of all observed RVs.
+        :py:attr:`potentiallogp` :
+            convenience property for all additional logp terms (potentials).
+        :py:attr:`point_logps` :
+            convenience property for numerical evaluation of local logps at a point.
         """
         varlist: list[TensorVariable]
         if vars is None:
```
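The `sum` flag can be pictured with a hand-rolled analogue. The toy model and helper below are illustrative plain NumPy, not the PyMC API: the joint logp is the sum of per-variable contributions, and `sum=False` corresponds to returning those contributions as separate terms.

```python
import numpy as np

def normal_logpdf(x, mu=0.0, sigma=1.0):
    # elementwise log-density of Normal(mu, sigma)
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2

# toy "model": mu ~ Normal(0, 5) with three observations y_i ~ Normal(mu, 1)
mu_val = 0.7
y_obs = np.array([1.0, -0.2, 0.5])

# analogue of sum=False: one logp term per model variable
terms = [normal_logpdf(mu_val, sigma=5.0), normal_logpdf(y_obs, mu=mu_val).sum()]

# analogue of sum=True: a single scalar joint logp
joint = sum(terms)
```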
```diff
@@ -742,19 +818,30 @@ def dlogp(
         vars: Variable | Sequence[Variable] | None = None,
         jacobian: bool = True,
     ) -> Variable:
-        """Gradient of the models log-probability w.r.t. ``vars``.
+        """Gradient of the joint log-probability density of the model.
```
Member
"Density" is a strong word; the model may be discrete or a mix.

Author
Technically, a probability mass function is a probability density (i.e. a Radon-Nikodym derivative) with respect to the counting measure, in contrast to a "regular" PDF of an absolutely continuous variable, which is with respect to the standard Lebesgue-Borel measure; and so the mixed case is also a density with respect to a certain product measure. I agree that people might find this confusing, and some may even assume that discrete variables are marginalised out or not allowed at all in this context. On the other hand, I used the word "density" because Jacobians arise only for (absolutely) continuous variables; if it were not a density, there would be no Jacobians.

Member
We do consistently use the term log-probability, though. Yes, Jacobian corrections don't exist for discrete variables, but I don't think that's what makes the jacobian kwarg confusing to understand here.

Author
Actually, the word "density" was there before me, in line 613 in 8a1896b. Should I make it "log-probability" (hyphenated, noun) and remove "density" from everywhere (except …)?
```diff

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Compute the gradient with respect to values of these variables.
+            If None, use all continuous free (unobserved) variables defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.

         Returns
         -------
-        dlogp graph
+        Variable
+
+        See Also
+        --------
+        :py:meth:`compile_dlogp` :
+            gradient of log-probability density as a compiled function.
+        :py:meth:`logp` :
+            log-probability density as a Variable (in a symbolic form).
+        :py:meth:`d2logp` :
+            Hessian of log-probability density as a Variable (in a symbolic form).
         """
         if vars is None:
             value_vars = self.continuous_value_vars
```
```diff
@@ -782,19 +869,32 @@ def d2logp(
         jacobian: bool = True,
         negate_output=True,
     ) -> Variable:
-        """Hessian of the models log-probability w.r.t. ``vars``.
+        """Hessian of the joint log-probability density of the model.

         Parameters
         ----------
-        vars : list of random variables or potential terms, optional
-            Compute the gradient with respect to those variables. If None, use all
-            free and observed random variables, as well as potential terms in model.
-        jacobian : bool
-            Whether to include jacobian terms in logprob graph. Defaults to True.
+        vars : Variable, sequence of Variable or None, default None
+            Compute the gradient with respect to values of these variables.
+            If None, use all continuous free (unobserved) variables defined in the model.
+        jacobian : bool, default True
+            If True, add Jacobian contributions associated with automatic variable transformations,
+            so that the result is the true density of transformed random variables.
+            See :py:mod:`pymc.distributions.transforms` for details.
+        negate_output : bool, default True
+            If True, change the sign of the output and return the opposite of the Hessian.

         Returns
         -------
-        d²logp graph
+        Variable
+
+        See Also
+        --------
+        :py:meth:`compile_d2logp` :
+            Hessian of log-probability density as a compiled function.
+        :py:meth:`logp` :
+            log-probability density as a Variable (in a symbolic form).
+        :py:meth:`dlogp` :
+            gradient of log-probability density as a Variable (in a symbolic form).
         """
         if vars is None:
             value_vars = self.continuous_value_vars
```
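The `negate_output` flag can be illustrated for a single Normal variable. This is a standalone NumPy sketch, not PyMC code: the Hessian of the Normal log-density is the constant `-1/sigma**2`, so negating the output yields the positive precision `1/sigma**2`.

```python
import numpy as np

sigma = 2.0

def logp(x):
    # log-density of Normal(0, sigma) as a function of the value x
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (x / sigma) ** 2

# second derivative by a central finite difference
x0, h = 0.3, 1e-4
hess = (logp(x0 + h) - 2.0 * logp(x0) + logp(x0 - h)) / h**2

negated = -hess  # analogue of negate_output=True
assert np.isclose(hess, -1.0 / sigma**2, atol=1e-5)  # Hessian of logp is negative
assert negated > 0.0  # negated output is the positive precision 1/sigma**2
```

Returning the negated Hessian is convenient for consumers that expect a positive (semi)definite quantity, e.g. a Laplace approximation around the mode; with `negate_output=False` the Hessian itself would be returned.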
cc @OriolAbril

Just to clarify: `module::` must be specified at least once (and probably only once) for Sphinx to create an "index entry". Without it, `:currentmodule:: …` and `:mod:` … references would lead to nowhere. It so happens that `module:: pymc.distributions.transforms` was absent, even though `currentmodule` was present, and Sphinx does not check this.

Using the `module` directive here is probably fine, I am not sure how relevant it is though. Is `pymc.distributions.transforms` as a module something relevant to users that we'll want to reference? If so we need to switch; otherwise using `currentmodule` is perfectly fine, but using `module` won't hurt either.

Full context: `.. currentmodule::` is basically syntactic sugar; its only role is defining the module name to prepend to the autodoc/autosummary entries in that file. That is, we can use `circular` or `ordered` in the autosummary directive below instead of needing to specify `pymc.distributions.transforms.circular`. `.. module::` does the same but also generates a target that can be referenced from other places using the `:mod:`pymc.distributions.transforms`` role; `.. currentmodule::` does not reference the `module` directive with the same name, nor does it need it to exist to work properly.

The `module` directive can also take a `:synopsis:` option to manually add the module docstring. In our case, however, using this optional argument wouldn't make much sense; it would be much better to use `.. automodule::` instead, which defines a `module` directive pulling the docstring directly from the module in question (like we do for classes or functions).

Sphinx does not check this because `module`/`automodule` should indeed be used at most once, but it isn't really necessary to use it at least once; that depends on how the package maintainers decide to define and document the public API.

Without it, references did not work, which is why I made the change. P.S. I don't remember whether Sphinx threw an error or there was just no hyperlink without the `module`. I don't know much about this, but I guess you were not using `automodule` because this module is accompanied by a lot of text, which was placed in `transforms.rst` (better to have it there than in a huge module-level docstring). Some other entries, e.g. `shape_utils.rst`, also contain text in addition to Sphinx directives.

Yes. The module page explains how PyMC deals with transformations, and I reference it as a good source for further details in a method's docstring.