
MAPED data merging#169

Open
cophus wants to merge 7 commits into electronmicroscopy:nanobeam from cophus:nanobeam

Conversation


cophus (Contributor) commented Feb 2, 2026

What does this PR do?

This pull request adds support for multi-angle precession electron diffraction (MAPED) data processing, described in this paper: https://arxiv.org/abs/2506.11327

Typical usage is something like:

```python
ds = []
for file in files:
    ds.append(
        em.core.io.read_4dstem(
            path + file,
            file_type = 'arina',
        )
    )
maped = em.diffraction.MAPED.from_datasets(ds)
maped.preprocess();
maped.diffraction_origin()
maped.diffraction_align()
maped.real_space_align()
dataset_merged = maped.merge_datasets()
dataset_merged.save(
    file_output,
    mode = 'o',
)
```

Example workflow:

[Example workflow figures]

Notebook example applied to this data from @smribet https://drive.google.com/drive/folders/1EtNWeWZSO8TZ7Qibak2SVxQiAC9IsCCb?usp=sharing
maped01.ipynb

Instructions for reviewers

Please check this code for accuracy and test it on other MAPED datasets. The diffraction-space origin finding will likely need some work - it uses a "real space biased cross correlation", which is the best solution I've come up with so far (a pytest is added for that function).
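For reviewers unfamiliar with the idea, a shift estimate biased toward small shifts might look roughly like this. This is a minimal sketch under my own assumptions (function name, Gaussian bias form, and mean subtraction are illustrative); the PR's actual `weighted_cross_correlation_shift` works differently.

```python
import numpy as np

def biased_cross_correlation_shift(im_ref, im, bias_sigma=8.0):
    """Estimate the integer (row, col) shift of im relative to im_ref,
    biasing the correlation peak search toward small shifts."""
    H, W = im_ref.shape
    # subtract means so the correlation peak stands out from the background
    a = im_ref - im_ref.mean()
    b = im - im.mean()
    # FFT-based cross-correlation; the peak location encodes the shift
    cc = np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))
    # signed shift coordinate of every correlation pixel
    rows = np.fft.fftfreq(H, d=1.0 / H)[:, None]
    cols = np.fft.fftfreq(W, d=1.0 / W)[None, :]
    # Gaussian weight favoring shifts near zero ("real space bias")
    bias = np.exp(-(rows**2 + cols**2) / (2.0 * bias_sigma**2))
    dr, dc = np.unravel_index(np.argmax(cc * bias), cc.shape)
    # the correlation peaks at minus the applied shift
    return int(-rows[dr, 0]), int(-cols[0, dc])
```

The bias term simply down-weights correlation peaks far from zero shift, which helps when periodic structure produces multiple equivalent peaks.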

TODO

  • Convert to torch
  • (optional) speed up torch version?
  • Test on other MAPED datasets
  • Improve automated alignment for both real space and diffraction space.

@cophus cophus marked this pull request as draft February 2, 2026 01:31
@cophus cophus marked this pull request as ready for review February 2, 2026 01:42


```python
def shift_images(
    images,
```

Collaborator commented:

Should probably be moved to core/imaging_utils.py

cophus (author) replied:

Good point - I put it here for quick iteration, haven't moved it yet.

@ehrhardtkm commented:

When looking at experimental data with dead pixels (and not preprocessed data like above), be sure to filter them before maped.diffraction_origin().

Can fix with something like this (define the mask before displaying it):

```python
mask = ds[0].dp_mean.array > 1e4
em.visualization.show_2d(
    mask,
    vmax = 1.0,
)

for d in ds:
    d.median_filter_masked_pixels(mask)

Attached is a notebook for a working example. Just replace the file names and path to run. maped_test.ipynb
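If a helper like `median_filter_masked_pixels` is not available in your pipeline, the same dead-pixel repair can be sketched in plain NumPy. This is illustrative only (my own function name and 3x3 windowing); quantem's implementation may differ in window size and edge handling.

```python
import numpy as np

def median_fill_masked(image, mask, size=3):
    """Replace pixels where mask is True with the median of the
    unmasked values in a size x size neighborhood."""
    H, W = image.shape
    r = size // 2
    out = image.copy()
    for i, j in zip(*np.nonzero(mask)):
        # clip the window to the image bounds
        r0, r1 = max(0, i - r), min(H, i + r + 1)
        c0, c1 = max(0, j - r), min(W, j + r + 1)
        window = image[r0:r1, c0:c1]
        valid = window[~mask[r0:r1, c0:c1]]
        if valid.size:
            out[i, j] = np.median(valid)
    return out
```

Looping only over masked pixels keeps this fast when dead pixels are sparse, which is the usual case for detector defects.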

henrygbell commented Feb 3, 2026

I tried this code with my Si membrane MAPED data.

Notebook attached here: [notebook](https://github.com/user-attachments/files/25058505/HB_pr169_maped_test_nb.ipynb)
Data uploaded here if you are curious: [data](https://drive.google.com/drive/folders/1CPl6Z0rKKleaQnG2Nkhj3ix8bZmh35nL?usp=drive_link)

This dataset has a lot of misalignment between the diffraction patterns within a single tilt, which this code does not deal with. I think we should add this functionality to the MAPED code base or the 4DSTEM dataset class (if it is not already there).

[Figure: misaligned diffraction patterns within a single tilt]

Even with these misaligned DPs, the real-space alignment worked well; I just needed around 15 alignment iterations.

[Figure after maped.real_space_align()]

cophus (author) commented Feb 5, 2026

Thank you both for the testing! @henrygbell I should have clarified: MAPED stores the global diffraction shift; it doesn't apply shifts to the list of MAPED.datasets. So you need to look at the shifts afterwards:

```python
import numpy as np
from quantem.core.visualization import show_2d

rc = np.array(
    [
        [80, 80],
        [100, 80],
        [80, 100],
        [100, 100],
    ],
    dtype=int,
)

shifts_rc = -np.rint(maped.real_space_shifts).astype(int)
tiles = []
titles = []

for r, c in rc:
    row = []
    for i in range(5):
        dr, dc = shifts_rc[i]
        row.append(ds[i].array[r + dr, c + dc])
    row.append(dataset_merged.array[r, c])
    tiles.append(row)
    titles.append([f"ds[{i}] @ ({r+shifts_rc[i,0]},{c+shifts_rc[i,1]})" for i in range(5)] + [f"merged @ ({r},{c})"])

show_2d(
    tiles,
    title=titles,
    norm={
        "lower_quantile": 0.3,
        "upper_quantile": 0.999,
        "power": 0.5,
    },
    cmap="turbo_black",
    axsize=(2.5, 2.5),
);
```

So I think it's working as expected (though not handling the de-scan yet).
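On the de-scan: one common approach is to fit a linear plane to the measured diffraction origins as a function of probe position and subtract it. A minimal sketch under my own naming assumptions (this is not the PR's implementation, which does not yet handle de-scan):

```python
import numpy as np

def fit_descan_plane(origins_rc, positions_rc):
    """Least-squares plane fit of diffraction origins vs probe position.

    origins_rc : (n, 2) measured diffraction origins (row, col)
    positions_rc : (n, 2) probe positions (row, col)
    Returns the fitted plane evaluated at each position; subtracting it
    from origins_rc removes the linear de-scan component.
    """
    # design matrix [r, c, 1] for a plane fit per output coordinate
    design = np.column_stack([positions_rc, np.ones(len(positions_rc))])
    coef, *_ = np.linalg.lstsq(design, origins_rc, rcond=None)
    return design @ coef
```

Subtracting the fitted plane from the measured origins leaves only the residual (non-linear) origin shifts, which is usually what the alignment step should operate on.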

henrygbell left a review:

I tested this code on my Si MAPED dataset; it works very well and required zero tuning.

I do think some minor changes can be made to improve the code, see my comments below.

I think the next steps I can take are porting it to torch, to speed up the merge_datasets method for large datasets, and correcting for de-scan.

Stores
------
self.diffraction_origins : np.ndarray
Array of shape (n, 2) with integer (row, col) origins.


Suggested change
Array of shape (n, 2) with integer (row, col) origins.
Array of shape (n, 2) with integer (row, col) origins; n = len(datasets).

Stores
------
self.scales : np.ndarray
Per-dataset scaling factors (n,).


Suggested change
Per-dataset scaling factors (n,).
Per-dataset scaling factors (n,); n = len(datasets).

shift_rc, G_shift = weighted_cross_correlation_shift(
    im_ref=G_ref,
    im=G,
    weight_real=im_weight * 0.0 + 1.0,


weight_real=im_weight * 0.0 + 1.0 is an all-ones array, so im_weight is not actually used here. Is that on purpose?

else:
    dp_arr = np.asarray(dp.array if hasattr(dp, "array") else dp)

arr = np.asarray(d.array)


This is a double conversion; I think one of them is not needed.


The same applies on lines 109 and 112.

else:
    dp_arr = np.asarray(dp.array if hasattr(dp, "array") else dp)

arr = np.asarray(d.array)


Suggested change
arr = np.asarray(d.array)
arr = np.asarray(d)


H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(


I think we should clamp the Tukey alphas, as is done in other parts of the code.

raise RuntimeError("Run diffraction_origin() first so self.diffraction_origins exists.")

H, W = np.asarray(self.dp_mean[0]).shape


Suggested change
alpha = min(1.0, 2.0 * float(edge_blend) / float(H))


H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(


Suggested change
w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(
w = tukey(H, alpha=alpha)[:, None] * tukey(

H, W = np.asarray(self.dp_mean[0]).shape

w = tukey(H, alpha=2.0 * float(edge_blend) / float(H))[:, None] * tukey(
W, alpha=2.0 * float(edge_blend) / float(W)


Suggested change
W, alpha=2.0 * float(edge_blend) / float(W)
W, alpha=alpha,

return dataset_merged


def shift_images(


I think the naming of this function is a bit misleading, because it shifts a stack of images as well as blending them. It's obvious if you glance at the docstring, but I would still rename it to something like shift_blend_images so there's no confusion. Another way to improve it and make it more general for use elsewhere would be to add return_stack and/or return_blend flags.
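To make the suggestion concrete, the refactor might look something like this. The signature, flag name, and nearest-pixel periodic roll are my own illustration, not the PR code:

```python
import numpy as np

def shift_blend_images(images, shifts_rc, weights=None, return_stack=False):
    """Shift each image in a stack by its (row, col) shift and return
    the weighted blend; optionally also return the shifted stack."""
    images = np.asarray(images, dtype=float)
    n = images.shape[0]
    if weights is None:
        weights = np.ones(n)
    weights = np.asarray(weights, dtype=float)
    # nearest-pixel shifts with periodic wrap, for simplicity
    stack = np.stack([
        np.roll(im, tuple(np.rint(s).astype(int)), axis=(0, 1))
        for im, s in zip(images, shifts_rc)
    ])
    # weighted average over the stack axis
    blend = np.tensordot(weights, stack, axes=1) / weights.sum()
    return (blend, stack) if return_stack else blend
```

Callers who only want the blend pay no extra cost, while the return_stack flag exposes the intermediate shifted stack for reuse elsewhere.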
