Skip to content
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
d07e964
Optimize inverse index mapping in groups.py (Fixes Issue #3387)
aygarwal Feb 26, 2026
463a1ec
Add changelog entry and authors entry for PR #5252
aygarwal Feb 26, 2026
7724ba3
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Feb 26, 2026
d62eecd
Added thresholding to using cython function
aygarwal Mar 2, 2026
fa8ad6c
Number of unique values threshold for choosing cython function
aygarwal Mar 2, 2026
9ae78cb
Optimised pythonic calculation of inverse index mapping
aygarwal Mar 2, 2026
fd97d19
Removed unique value number threshold for benchmarking
aygarwal Mar 2, 2026
f56d246
Trying 10 for cython threshold
aygarwal Mar 2, 2026
fcb4931
Removed thresholding altogether for benchmarking
aygarwal Mar 2, 2026
e22709e
Corrected AUTHORS file
aygarwal Mar 11, 2026
97f112c
Corrected CHANGELOG
aygarwal Mar 11, 2026
50ac687
Corrected CHANGELOG
aygarwal Mar 13, 2026
eede57e
Merge branch 'develop' into issue3387-inverse-map-optimization
orbeckst Mar 13, 2026
e1df729
Added test for cython function correctness
aygarwal Mar 24, 2026
111da56
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Mar 24, 2026
fc3f507
Added standard numpy docstring to inverse_int_index
aygarwal Mar 25, 2026
c1c39a3
Added two time benchmarks for asunique with sorted=True/False
aygarwal Mar 27, 2026
e697b56
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Mar 27, 2026
fd67b0f
Merge branch 'develop' into issue3387-inverse-map-optimization
orbeckst Apr 9, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion package/AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -323,4 +323,4 @@ Logo

The MDAnalysis 'Atom' logo was designed by Christian Beckstein; it is
Copyright (c) 2011 Christian Beckstein and made available under a
Creative Commons Attribution-NoDerivs 3.0 Unported License.
Creative Commons Attribution-NoDerivs 3.0 Unported License.
Comment thread
orbeckst marked this conversation as resolved.
5 changes: 4 additions & 1 deletion package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@ Fixes
DSSP by porting upstream PyDSSP 0.9.1 fix (Issue #4913)

Enhancements
* Improved performance of inverse index mapping in AtomGroup using an optimized
Cython implementation in lib._cutils.inverse_int_index()
(Issue #3387, PR #5252)
* MOL2Parser now reads unit cell dimensions from @<TRIPOS>CRYSIN records (Issue #3341)
* Reduces duplication of code in _apply() function (Issue #5247, PR #5294)
* Added new top-level `MDAnalysis.fetch` module (PR #4943)
Expand Down Expand Up @@ -3622,4 +3625,4 @@ Testsuite
licenses

11/12/07 naveen
* prepared for release outside lab
* prepared for release outside lab
Comment thread
orbeckst marked this conversation as resolved.
6 changes: 2 additions & 4 deletions package/MDAnalysis/core/groups.py
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,7 @@
from ..exceptions import NoDataError
from . import topologyobjects
from ._get_readers import get_writer_for, get_converter_for
from ..lib._cutil import inverse_int_index


def _unpickle(u, ix):
Expand Down Expand Up @@ -912,10 +913,7 @@ def _asunique(self, group, sorted=False, set_mask=False):

indices = unique_int_1d_unsorted(self.ix)
if set_mask:
mask = np.zeros_like(self.ix)
for i, x in enumerate(indices):
values = np.where(self.ix == x)[0]
mask[values] = i
mask = inverse_int_index(self.ix, indices)
self._unique_restore_mask = mask

issorted = int_array_is_sorted(indices)
Expand Down
40 changes: 38 additions & 2 deletions package/MDAnalysis/lib/_cutil.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ from cython.operator cimport dereference as deref

cnp.import_array()

__all__ = ['unique_int_1d', 'make_whole', 'find_fragments',
__all__ = ['unique_int_1d', 'inverse_int_index', 'make_whole', 'find_fragments',
'_sarrus_det_single', '_sarrus_det_multiple']

cdef extern from "calc_distances.h":
Expand Down Expand Up @@ -91,6 +91,42 @@ def unique_int_1d(cnp.intp_t[:] values):

return np.array(result)

@cython.boundscheck(False)
@cython.wraparound(False)
def inverse_int_index(cnp.intp_t[:] values,
cnp.intp_t[:] unique_vals):
"""
Construct inverse index map such that:

unique_vals[mask] == values
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mask is not defined — do you need to say "Construct inverse index mask such that ..."? Properly explain what this code does.

Add a simple example in the docs below (look at https://userguide.mdanalysis.org/stable/contributing_code.html#working-with-the-code-documentation for how to structure doc strings using the numpy docstring standard)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added a proper docstring for this function which includes the description of the function, the parameters, the return value and an example.


Parameters
----------
values : numpy.ndarray
1D array of integers.
unique_vals : numpy.ndarray
1D array of unique integers (unsorted).

Returns
-------
numpy.ndarray
Integer mask mapping values -> index in unique_vals.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a

.. versionadded:: 2.11.0

Make sure that there are TWO blank lines separating this line from the rest.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a Notes section in the docstring the .. versionadded:: 2.11.0 with the appropriate two blank lines before and after it.

"""

cdef Py_ssize_t n = values.shape[0]
cdef Py_ssize_t m = unique_vals.shape[0]
cdef Py_ssize_t i

cdef dict lookup = {}
cdef cnp.intp_t[:] mask = np.empty(n, dtype=np.intp)

for i in range(m):
lookup[unique_vals[i]] = i

for i in range(n):
mask[i] = lookup[values[i]]

return np.array(mask)

@cython.boundscheck(False)
def _in2d(cnp.intp_t[:, :] arr1, cnp.intp_t[:, :] arr2):
Expand Down Expand Up @@ -515,4 +551,4 @@ def find_fragments(atoms, bondlist):
# Add fragment to output
frags.append(np.asarray(this_frag))

return frags
return frags
Comment thread
orbeckst marked this conversation as resolved.
Loading