Fix ValueError by nonuniform grid initialization (src/GriTS.py) #208
Open
Mennaa-Ayman wants to merge 3 commits into
…r zeroes Co-authored-by: Copilot <copilot@github.com>
When evaluating model predictions using GriTS, the model's output is often inconsistent and may not perfectly tile the ground truth grid. This leads to holes in the table structure (grid coordinates with no assigned cells).
Currently, `cells_to_grid` and `cells_to_relspan_grid` initialize the `cell_grid` as a 2D matrix of single floats (0.0). When a cell is found, the code overwrites that 0.0 with a list (bbox/relspan) or a string (text). If a cell is missing, which is common in model predictions, the 0.0 remains.
The result: `np.array()` fails with a ValueError because it cannot cast a mixture of floats and lists/strings into a fixed-dimension numerical array.
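The failure can be reproduced outside GriTS with a minimal sketch like the one below. The 2x2 grid, the bbox values, and the `try_stack` helper are hypothetical, and `dtype=float` is passed explicitly (an assumption, not necessarily what GriTS does) so that the ragged grid raises on older NumPy releases as well as newer ones.

```python
import numpy as np

num_rows, num_columns = 2, 2

# Old-style initialization: every slot starts as the scalar 0.0.
cell_grid = np.zeros((num_rows, num_columns)).tolist()

# Three cells are found and overwrite their slots with 4-element bboxes.
# The slot at (1, 1) is a hole, so it keeps its scalar 0.0.
cell_grid[0][0] = [0, 0, 10, 10]
cell_grid[0][1] = [10, 0, 20, 10]
cell_grid[1][0] = [0, 10, 10, 20]


def try_stack(grid):
    """Return the stacked array's shape, or the error text if stacking fails."""
    try:
        return np.array(grid, dtype=float).shape
    except ValueError as exc:
        return f"ValueError: {exc}"


# Mixed scalars and lists: the grid is ragged and cannot be stacked.
result_with_hole = try_stack(cell_grid)

# Filling the hole with a same-shaped placeholder makes the grid regular.
cell_grid[1][1] = [0, 0, 0, 0]
result_filled = try_stack(cell_grid)

print(result_with_hole)
print(result_filled)
```

In this sketch the only difference between the failing and the succeeding call is whether the hole holds a scalar or a four-element list.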
Changes
1. Uniform Grid Initialization
Replaced `np.zeros(...).tolist()` with nested list comprehensions to ensure every cell in the grid starts with a null value that matches the expected data type.
Bbox/relspan: Now initialized with a four-element list of zeros.
    cell_grid = [[[0, 0, 0, 0] for _ in range(num_columns)] for _ in range(num_rows)]
Text: Now initialized with empty strings.
    cell_grid = [["" for _ in range(num_columns)] for _ in range(num_rows)]
2. PyMuPDF Compatibility
Added list casting when creating Rect objects to ensure compatibility with modern PyMuPDF versions.
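As an illustration of the uniform grid initialization in change 1, the two comprehensions can be wrapped in one small helper. This is only a sketch: `make_grid` and its `make_placeholder` argument are hypothetical names and are not part of the PR or of GriTS.

```python
def make_grid(num_rows, num_columns, make_placeholder):
    """Build a num_rows x num_columns grid whose slots each hold a fresh placeholder.

    make_placeholder is called once per slot, so list placeholders are
    independent objects rather than aliases of one shared list.
    """
    return [[make_placeholder() for _ in range(num_columns)]
            for _ in range(num_rows)]


# Bbox / relspan grids: every slot starts as a 4-element list of zeros.
bbox_grid = make_grid(2, 3, lambda: [0, 0, 0, 0])

# Text grids: every slot starts as an empty string.
text_grid = make_grid(2, 3, str)

# Writing into one slot leaves the other slots untouched.
bbox_grid[0][0][0] = 5

print(bbox_grid[0][0], bbox_grid[0][1])
print(text_grid)
```

Calling the placeholder factory once per slot matters for the list case: a shortcut such as `[[0, 0, 0, 0]] * num_columns` would make every slot in a row refer to the same list.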