In Greek, the ano teleia (a raised mid dot) acts as a sentence boundary. We successfully added the semicolon (which serves as the Greek question mark) as a sentence breaker in the load config, but we are not sure whether that setting can hold Unicode code points the way the word breakers category can.
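The character is U+00B7 "MIDDLE DOT" (UTF-8 bytes C2 B7). Unicode also defines a dedicated U+0387 "GREEK ANO TELEIA", which is canonically equivalent to U+00B7.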
I vaguely remember that sentence breakers interact in some way with the "word breakers" setting elsewhere in the config, but I don't remember the details.
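To double-check which code points and bytes are involved before putting them in the config, here is a small Python sketch using only the standard library. The helper names `describe` and `nfc` are made up for illustration, and nothing here touches the actual load config; it only shows how the mid dot, the dedicated Greek ano teleia, and the Greek question mark relate under Unicode normalization.

```python
import unicodedata

MIDDLE_DOT = "\u00b7"           # the character mentioned in the post
GREEK_ANO_TELEIA = "\u0387"     # dedicated Greek code point
GREEK_QUESTION_MARK = "\u037e"  # looks like a semicolon


def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        ch.encode("utf-8").hex().upper(),
    )


def nfc(ch):
    """Apply NFC normalization, which folds canonical equivalents together."""
    return unicodedata.normalize("NFC", ch)


if __name__ == "__main__":
    for ch in (MIDDLE_DOT, GREEK_ANO_TELEIA, GREEK_QUESTION_MARK):
        print(describe(ch), "-> NFC:", describe(nfc(ch))[0])
```

If the text is normalized (NFC) before it reaches the sentence breaker, U+0387 arrives as U+00B7 and U+037E arrives as an ordinary semicolon, so it may be enough to list only U+00B7 and `;`. If the text is not normalized, both forms of each character would probably need to be listed. Whether the config normalizes its input is an assumption to verify.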