
MAPED data merging#169

Open
cophus wants to merge 7 commits into electronmicroscopy:nanobeam from cophus:nanobeam

Conversation


cophus (Contributor) commented Feb 2, 2026

What does this PR do?

This pull request adds support for multi-angle precession electron diffraction (MAPED) data processing, described in this paper: https://arxiv.org/abs/2506.11327

Typical usage is something like:

```python
ds = []
for file in files:
    ds.append(
        em.core.io.read_4dstem(
            path + file,
            file_type = 'arina',
        )
    )
maped = em.diffraction.MAPED.from_datasets(ds)
maped.preprocess();
maped.diffraction_origin()
maped.diffraction_align()
maped.real_space_align()
dataset_merged = maped.merge_datasets()
dataset_merged.save(
    file_output,
    mode = 'o',
)
```

Example workflow:

[Example workflow figures]

Notebook example applied to this data from @smribet https://drive.google.com/drive/folders/1EtNWeWZSO8TZ7Qibak2SVxQiAC9IsCCb?usp=sharing
maped01.ipynb

Instructions for reviewers

Please check this code for accuracy and test it on other MAPED datasets. The diffraction-space origin finding will likely need some work - it uses a "real space biased cross correlation", which is the best solution I've come up with so far (a pytest is added for that function).
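For reviewers unfamiliar with the idea, a shift estimate biased toward small shifts might look roughly like this. This is a minimal sketch under my own assumptions (function name, Gaussian bias form, and mean subtraction are illustrative); the PR's actual `weighted_cross_correlation_shift` works differently.

```python
import numpy as np

def biased_cross_correlation_shift(im_ref, im, bias_sigma=8.0):
    """Estimate the integer (row, col) shift of im relative to im_ref,
    biasing the correlation peak search toward small shifts."""
    H, W = im_ref.shape
    # subtract means so the correlation peak stands out from the background
    a = im_ref - im_ref.mean()
    b = im - im.mean()
    # FFT-based cross-correlation; the peak location encodes the shift
    cc = np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))
    # signed shift coordinate of every correlation pixel
    rows = np.fft.fftfreq(H, d=1.0 / H)[:, None]
    cols = np.fft.fftfreq(W, d=1.0 / W)[None, :]
    # Gaussian weight favoring shifts near zero ("real space bias")
    bias = np.exp(-(rows**2 + cols**2) / (2.0 * bias_sigma**2))
    dr, dc = np.unravel_index(np.argmax(cc * bias), cc.shape)
    # the correlation peaks at minus the applied shift
    return int(-rows[dr, 0]), int(-cols[0, dc])
```

The bias term simply down-weights correlation peaks far from zero shift, which helps when periodic structure produces multiple equivalent peaks.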

TODO

  • Convert to torch
  • (optional) speed up torch version?
  • Test on other MAPED datasets
  • Improve automated alignment for both real space and diffraction space.

@cophus cophus marked this pull request as draft February 2, 2026 01:31
@cophus cophus marked this pull request as ready for review February 2, 2026 01:42


```python
def shift_images(
    images,
```

Collaborator commented:

Should probably be moved to core/imaging_utils.py

cophus (author) replied:

Good point - I put it here for quick iteration, haven't moved it yet.

@ehrhardtkm commented:

When looking at experimental data with dead pixels (and not preprocessed data like above), be sure to filter them before maped.diffraction_origin().

Can fix with something like this (define the mask before displaying it):

```python
mask = ds[0].dp_mean.array > 1e4
em.visualization.show_2d(
    mask,
    vmax = 1.0,
)

for d in ds:
    d.median_filter_masked_pixels(mask)

Attached is a notebook for a working example. Just replace the file names and path to run. maped_test.ipynb
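If a helper like `median_filter_masked_pixels` is not available in your pipeline, the same dead-pixel repair can be sketched in plain NumPy. This is illustrative only (my own function name and 3x3 windowing); quantem's implementation may differ in window size and edge handling.

```python
import numpy as np

def median_fill_masked(image, mask, size=3):
    """Replace pixels where mask is True with the median of the
    unmasked values in a size x size neighborhood."""
    H, W = image.shape
    r = size // 2
    out = image.copy()
    for i, j in zip(*np.nonzero(mask)):
        # clip the window to the image bounds
        r0, r1 = max(0, i - r), min(H, i + r + 1)
        c0, c1 = max(0, j - r), min(W, j + r + 1)
        window = image[r0:r1, c0:c1]
        valid = window[~mask[r0:r1, c0:c1]]
        if valid.size:
            out[i, j] = np.median(valid)
    return out
```

Looping only over masked pixels keeps this fast when dead pixels are sparse, which is the usual case for detector defects.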

henrygbell commented Feb 3, 2026

I tried this code with my Si membrane MAPED data.

Notebook attached here: [notebook](https://github.com/user-attachments/files/25058505/HB_pr169_maped_test_nb.ipynb)
Data uploaded here if you are curious: [data](https://drive.google.com/drive/folders/1CPl6Z0rKKleaQnG2Nkhj3ix8bZmh35nL?usp=drive_link)

This dataset has a lot of misalignment between the diffraction patterns within a single tilt, which this code does not deal with. I think we should add this functionality to the MAPED code base or the 4DSTEM dataset class (if it is not already there).

[Figure: misaligned diffraction patterns within a single tilt]

Even with these misaligned DPs, the real-space alignment worked well; I just needed around 15 alignment iterations.

[Figure after maped.real_space_align()]

cophus (author) commented Feb 5, 2026

Thank you both for the testing! @henrygbell I should have clarified: MAPED stores the global diffraction shift; it doesn't apply shifts to the list of MAPED.datasets. So you need to look at the shifts afterwards:

```python
import numpy as np
from quantem.core.visualization import show_2d

rc = np.array(
    [
        [80, 80],
        [100, 80],
        [80, 100],
        [100, 100],
    ],
    dtype=int,
)

shifts_rc = -np.rint(maped.real_space_shifts).astype(int)
tiles = []
titles = []

for r, c in rc:
    row = []
    for i in range(5):
        dr, dc = shifts_rc[i]
        row.append(ds[i].array[r + dr, c + dc])
    row.append(dataset_merged.array[r, c])
    tiles.append(row)
    titles.append([f"ds[{i}] @ ({r+shifts_rc[i,0]},{c+shifts_rc[i,1]})" for i in range(5)] + [f"merged @ ({r},{c})"])

show_2d(
    tiles,
    title=titles,
    norm={
        "lower_quantile": 0.3,
        "upper_quantile": 0.999,
        "power": 0.5,
    },
    cmap="turbo_black",
    axsize=(2.5, 2.5),
);
```

So I think it's working as expected (though not handling the de-scan yet).
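On the de-scan: one common approach is to fit a linear plane to the measured diffraction origins as a function of probe position and subtract it. A minimal sketch under my own naming assumptions (this is not the PR's implementation, which does not yet handle de-scan):

```python
import numpy as np

def fit_descan_plane(origins_rc, positions_rc):
    """Least-squares plane fit of diffraction origins vs probe position.

    origins_rc : (n, 2) measured diffraction origins (row, col)
    positions_rc : (n, 2) probe positions (row, col)
    Returns the fitted plane evaluated at each position; subtracting it
    from origins_rc removes the linear de-scan component.
    """
    # design matrix [r, c, 1] for a plane fit per output coordinate
    design = np.column_stack([positions_rc, np.ones(len(positions_rc))])
    coef, *_ = np.linalg.lstsq(design, origins_rc, rcond=None)
    return design @ coef
```

Subtracting the fitted plane from the measured origins leaves only the residual (non-linear) origin shifts, which is usually what the alignment step should operate on.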

henrygbell left a review:

I tested this code on my Si MAPED dataset; it works very well and required zero tuning.

I do think some minor changes can be made to improve the code, see my comments below.

I think the next steps I can take are porting it to torch, to speed up the merge_datasets method for large datasets, and correcting for de-scan.

Stores
------
self.diffraction_origins : np.ndarray
Array of shape (n, 2) with integer (row, col) origins.


Suggested change
Array of shape (n, 2) with integer (row, col) origins.
Array of shape (n, 2) with integer (row, col) origins; n = len(datasets).

Stores
------
self.scales : np.ndarray
Per-dataset scaling factors (n,).


Suggested change
Per-dataset scaling factors (n,).
Per-dataset scaling factors (n,); n = len(datasets).

shift_rc, G_shift = weighted_cross_correlation_shift(
    im_ref=G_ref,
    im=G,
    weight_real=im_weight * 0.0 + 1.0,


weight_real=im_weight * 0.0 + 1.0 is an all-ones array, so im_weight is not actually used here. Is that on purpose?

else:
    dp_arr = np.asarray(dp.array if hasattr(dp, "array") else dp)

arr = np.asarray(d.array)


This is a double conversion; I think one of them is not needed.


The same applies on lines 109 and 112.

else:
    dp_arr = np.asarray(dp.array if hasattr(dp, "array") else dp)

arr = np.asarray(d.array)


Suggested change
arr = np.asarray(d.array)
arr = np.asarray(d)


H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(


I think we should clamp the Tukey alphas, as is done in other parts of the code.

raise RuntimeError("Run diffraction_origin() first so self.diffraction_origins exists.")

H, W = np.asarray(self.dp_mean[0]).shape


Suggested change
alpha = min(1.0, 2.0 * float(edge_blend) / float(H))


H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(


Suggested change
w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(
w = tukey(H, alpha=alpha)[:, None] * tukey(

H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(
W, alpha=2.0 * float(edge_blend) / float(W)


Suggested change
W, alpha=2.0 * float(edge_blend) / float(W)
W, alpha=alpha,

return dataset_merged


def shift_images(


I think the naming of this function is a bit misleading, because it shifts a stack of images as well as blending them. It's obvious if you glance at the docstring, but I would still rename it to something like shift_blend_images so there's no confusion. Another way to improve it and make it more general for use elsewhere would be to add return_stack and/or return_blend flags.
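To make the suggestion concrete, the refactor might look something like this. The signature, flag name, and nearest-pixel periodic roll are my own illustration, not the PR code:

```python
import numpy as np

def shift_blend_images(images, shifts_rc, weights=None, return_stack=False):
    """Shift each image in a stack by its (row, col) shift and return
    the weighted blend; optionally also return the shifted stack."""
    images = np.asarray(images, dtype=float)
    n = images.shape[0]
    if weights is None:
        weights = np.ones(n)
    weights = np.asarray(weights, dtype=float)
    # nearest-pixel shifts with periodic wrap, for simplicity
    stack = np.stack([
        np.roll(im, tuple(np.rint(s).astype(int)), axis=(0, 1))
        for im, s in zip(images, shifts_rc)
    ])
    # weighted average over the stack axis
    blend = np.tensordot(weights, stack, axes=1) / weights.sum()
    return (blend, stack) if return_stack else blend
```

Callers who only want the blend pay no extra cost, while the return_stack flag exposes the intermediate shifted stack for reuse elsewhere.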
