This repository was archived by the owner on Nov 16, 2020. It is now read-only.

Invalid PAGE XML caused by PrintSpace with negative PageCoords #45

@stweil

The NZZ PAGE XML file was created by Transkribus and contains data that is reported as invalid:

<Page imageFilename="0111_nzz_18901222_0_0_a1_p1_1.tif" imageWidth="3839" imageHeight="5551">
    <PrintSpace>
        <Coords points="4,-27 3842,-27 3842,5524 4,5524"/>
    </PrintSpace>

ocr-validate.py reports that the negative values in these coordinates are invalid.
The PRImA page viewer refuses to load PAGE XML files with such data.

See also issue #38 with other PAGE XML related problems.
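
To locate such entries before running a validator, a small check along the following lines might help. It is only a sketch: the function name find_negative_coords is hypothetical, it is not part of ocr-validate.py or any PAGE tooling, and it assumes the points attribute holds whitespace-separated integer x,y pairs as in the fragment above. The sample closes the Page element so that the fragment parses on its own.

```python
import re
import xml.etree.ElementTree as ET


def find_negative_coords(xml_text):
    """Return (parent_tag, points) pairs for Coords elements with a negative value.

    Hypothetical helper for illustration; not taken from any existing tool.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for parent in root.iter():
        for child in parent:
            # Drop any XML namespace so the check does not depend on the schema version.
            if child.tag.rsplit('}', 1)[-1] != 'Coords':
                continue
            points = child.get('points', '')
            # Assumption: coordinates are integers, optionally with a leading minus sign.
            values = [int(v) for v in re.findall(r'-?\d+', points)]
            if any(v < 0 for v in values):
                hits.append((parent.tag.rsplit('}', 1)[-1], points))
    return hits


# The fragment from this issue, with the Page element closed so it is well-formed.
SAMPLE = '''<Page imageFilename="0111_nzz_18901222_0_0_a1_p1_1.tif" imageWidth="3839" imageHeight="5551">
    <PrintSpace>
        <Coords points="4,-27 3842,-27 3842,5524 4,5524"/>
    </PrintSpace>
</Page>'''

print(find_negative_coords(SAMPLE))
```

For the fragment above this reports the PrintSpace element. The sketch only looks for negative values; it does not check the x value 3842 against the declared imageWidth of 3839.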
