Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
d07e964
Optimize inverse index mapping in groups.py (Fixes Issue #3387)
aygarwal Feb 26, 2026
463a1ec
Add changelog entry and authors entry for PR #5252
aygarwal Feb 26, 2026
7724ba3
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Feb 26, 2026
d62eecd
Added thresholding to using cython function
aygarwal Mar 2, 2026
fa8ad6c
Number of unique values threshold for choosing cython function
aygarwal Mar 2, 2026
9ae78cb
Optimised pythonic calculation of inverse index mapping
aygarwal Mar 2, 2026
fd97d19
Removed unique value number threshold for benchmarking
aygarwal Mar 2, 2026
f56d246
Trying 10 for cython threshold
aygarwal Mar 2, 2026
fcb4931
Removed thresholding altogether for benchmarking
aygarwal Mar 2, 2026
e22709e
Corrected AUTHORS file
aygarwal Mar 11, 2026
97f112c
Corrected CHANGELOG
aygarwal Mar 11, 2026
50ac687
Corrected CHANGELOG
aygarwal Mar 13, 2026
eede57e
Merge branch 'develop' into issue3387-inverse-map-optimization
orbeckst Mar 13, 2026
e1df729
Added test for cython function correctness
aygarwal Mar 24, 2026
111da56
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Mar 24, 2026
fc3f507
Added standard numpy docstring to inverse_int_index
aygarwal Mar 25, 2026
c1c39a3
Added two time benchmarks for asunique with sorted=True/False
aygarwal Mar 27, 2026
e697b56
Merge branch 'develop' into issue3387-inverse-map-optimization
aygarwal Mar 27, 2026
fd67b0f
Merge branch 'develop' into issue3387-inverse-map-optimization
orbeckst Apr 9, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions package/AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -271,6 +271,7 @@ Chronological list of authors
- Mohammad Ayaan
- Khushi Phougat
- Kushagar Garg
- Ayush Agarwal
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add yourself at the end (although I believe you already contributed another PR so you might just have to ensure that you're not showing up twice).

- Jeremy M. G. Leung

External code
Expand Down
3 changes: 3 additions & 0 deletions package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,9 @@ Enhancements
* Adds support for parsing `.tpr` files produced by GROMACS 2026.0
* Enables parallelization for analysis.diffusionmap.DistanceMatrix
(Issue #4679, PR #4745)
* Improve performance of inverse index mapping in AtomGroup by
replacing O(n^2) logic with optimized Cython implementation.
(Issue #3387, PR #5252)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add at the top of each section — that's just how we've done things historically. One of the little things you learn in PRs...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention the new function lib._cutils.inverse_int_index() by name


Changes
* The msd.py inside analysis is changed, and ProgressBar is implemented inside
Expand Down
6 changes: 2 additions & 4 deletions package/MDAnalysis/core/groups.py
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,7 @@
from ..exceptions import NoDataError
from . import topologyobjects
from ._get_readers import get_writer_for, get_converter_for
from ..lib._cutil import inverse_int_index


def _unpickle(u, ix):
Expand Down Expand Up @@ -912,10 +913,7 @@ def _asunique(self, group, sorted=False, set_mask=False):

indices = unique_int_1d_unsorted(self.ix)
if set_mask:
mask = np.zeros_like(self.ix)
for i, x in enumerate(indices):
values = np.where(self.ix == x)[0]
mask[values] = i
mask = inverse_int_index(self.ix, indices)
self._unique_restore_mask = mask

issorted = int_array_is_sorted(indices)
Expand Down
40 changes: 38 additions & 2 deletions package/MDAnalysis/lib/_cutil.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ from cython.operator cimport dereference as deref

cnp.import_array()

__all__ = ['unique_int_1d', 'make_whole', 'find_fragments',
__all__ = ['unique_int_1d', 'inverse_int_index', 'make_whole', 'find_fragments',
'_sarrus_det_single', '_sarrus_det_multiple']

cdef extern from "calc_distances.h":
Expand Down Expand Up @@ -91,6 +91,42 @@ def unique_int_1d(cnp.intp_t[:] values):

return np.array(result)

@cython.boundscheck(False)
@cython.wraparound(False)
def inverse_int_index(cnp.intp_t[:] values,
cnp.intp_t[:] unique_vals):
"""
Construct inverse index map such that:

unique_vals[mask] == values
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mask is not defined — do you need to say "Construct inverse index mask such that ..."? Properly explain what this code does.

Add a simple example in the docs below (look at https://userguide.mdanalysis.org/stable/contributing_code.html#working-with-the-code-documentation for how to structure doc strings using the numpy docstring standard)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now added a proper docstring for this function which includes the description of the function, the parameters, the return value and an example.


Parameters
----------
values : numpy.ndarray
1D array of integers.
unique_vals : numpy.ndarray
1D array of unique integers (unsorted).

Returns
-------
numpy.ndarray
Integer mask mapping values -> index in unique_vals.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a

.. versionadded:: 2.11.0

Make sure that there are TWO blank lines separating this line from the rest.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a Notes section in the docstring the .. versionadded:: 2.11.0 with the appropriate two blank lines before and after it.

"""

cdef Py_ssize_t n = values.shape[0]
cdef Py_ssize_t m = unique_vals.shape[0]
cdef Py_ssize_t i

cdef dict lookup = {}
cdef cnp.intp_t[:] mask = np.empty(n, dtype=np.intp)

for i in range(m):
lookup[unique_vals[i]] = i

for i in range(n):
mask[i] = lookup[values[i]]

return np.array(mask)

@cython.boundscheck(False)
def _in2d(cnp.intp_t[:, :] arr1, cnp.intp_t[:, :] arr2):
Expand Down Expand Up @@ -515,4 +551,4 @@ def find_fragments(atoms, bondlist):
# Add fragment to output
frags.append(np.asarray(this_frag))

return frags
return frags
Comment thread
orbeckst marked this conversation as resolved.
Loading