Open
Description
Hunspell validates its input files in a rather non-transparent and user-unfriendly way. It silently interprets various suspicious constructs in the input files in some deterministic way without reporting anything. For example:
- the SFX/PFX directives can be indented by whitespace characters and are still recognized
- some of the other directives, such as NEEDAFFIX, are ignored if they are indented
- the words counter in the first line of a .dic file is ignored and words are loaded regardless of the actual counter value
- the indented words in a .dic file are ignored
- the stems in a .dic file may have invalid non-existing flags listed in the flags affix field. Hunspell seems to just ignore that particular invalid flag but process the rest of the data.
- ... and many other peculiarities like this
This situation is not ideal. Hunaftool needs to precisely emulate Hunspell's behaviour so that it interprets the input data in the same way. On the other hand, all suspicious or ambiguous constructs in the .aff and .dic input files need to be reported to the user. A user who is a dictionary maintainer can then be encouraged to resolve these ambiguities in the input files.
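As a rough illustration of the "emulate but report" idea, here is a minimal Python sketch of a linter that flags the constructs listed above. It is not Hunaftool's code: the names `Diagnostic`, `lint_aff` and `lint_dic` are hypothetical, the behaviours it encodes are only the ones observed in this issue, and it assumes the default single-character flag format and ignores escaped slashes in stems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnostic:
    """One warning for the user (hypothetical type, for illustration only)."""
    file: str     # "aff" or "dic"
    line: int     # 1-based line number
    message: str


def lint_aff(lines):
    """Report indented directives in the lines of an .aff file."""
    diags = []
    for no, raw in enumerate(lines, start=1):
        stripped = raw.strip()
        if not stripped or stripped.startswith("#") or raw == raw.lstrip():
            continue  # blank line, comment, or not indented
        directive = stripped.split()[0]
        if directive in ("SFX", "PFX"):
            note = "still recognized, as observed in this issue"
        else:
            note = "may be silently ignored, as observed in this issue"
        diags.append(Diagnostic("aff", no, f"indented {directive} directive ({note})"))
    return diags


def lint_dic(lines, known_flags):
    """Report suspicious constructs in the lines of a .dic file.

    known_flags is the set of flags declared in the .aff file.
    Assumption: flags are single characters (the default flag format).
    """
    diags = []
    if not lines:
        return diags
    loaded = 0  # entries that would actually be loaded
    for no, raw in enumerate(lines[1:], start=2):
        if not raw.strip():
            continue
        if raw != raw.lstrip():
            diags.append(Diagnostic("dic", no, "indented entry (ignored, as observed in this issue)"))
            continue
        loaded += 1
        stem, _, rest = raw.partition("/")
        # The flags field ends at the first whitespace after the slash.
        flags = "" if not rest or rest[0].isspace() else rest.split()[0]
        for flag in flags:
            if flag not in known_flags:
                diags.append(Diagnostic("dic", no, f"unknown flag {flag!r} on stem {stem!r}"))
    declared = lines[0].strip()
    if not declared.isdigit():
        diags.append(Diagnostic("dic", 1, "first line is not a word count"))
    elif int(declared) != loaded:
        diags.append(Diagnostic("dic", 1, f"word count {declared} does not match {loaded} loaded entries"))
    return diags


if __name__ == "__main__":
    aff = ["SFX A Y 1", "  SFX A 0 s .", "  NEEDAFFIX x"]
    dic = ["5", "cat/A", "  dog/A", "bird/AZ", "fish"]
    for d in lint_aff(aff) + lint_dic(dic, known_flags={"A"}):
        print(f"{d.file}:{d.line}: {d.message}")
```

The linter only collects warnings and never changes how the data is interpreted, so the loading logic can keep emulating Hunspell exactly while the user still gets a report.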