Machine learning - spell checker that uses a language model
I am looking for a spell checker that can use a language model.
I know there are a lot of spell checkers, such as Hunspell, but as far as I can see it doesn't take context into account; it is only a token-based spell checker.
For example,
I lick eating banana
Here, at the token level, there are no misspellings at all: every word is spelled correctly, but the sentence has no meaning. A "smart" spell checker would recognize that "lick" is a correctly written word, but that the author may have meant "like", which gives the sentence meaning.
I have a bunch of correctly written sentences in a specific domain, and I want to train a "smart" spell checker on them so that it learns a language model and recognizes this kind of misspelling, i.e. it recognizes that even though "lick" is written correctly, the author meant "like".
I don't see that Hunspell has such a feature. Can you suggest another spell checker that does?
See "The Design of a Proofreading Software Service" by Raphael Mudge. It describes both the data sources (Wikipedia, blogs, etc.) and the algorithm (basically comparing probabilities) of this approach. The source of the system, After the Deadline, is available, but it's not actively maintained anymore.
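To give a rough idea of what "comparing probabilities" can look like, here is a minimal sketch in Python. It is not After the Deadline's code and not necessarily the algorithm from the paper. It assumes a plain bigram model with add-one smoothing, trained on your own domain sentences, and it treats every vocabulary word within Levenshtein distance 2 of a token as a possible intended word. All function names (`train_bigram_model`, `bigram_prob`, `edit_distance`, `suggest`) and the toy corpus are made up for illustration.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over lowercased, whitespace-split sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams):
    """P(word | prev) with add-one smoothing (an assumption of this sketch)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(unigrams))


def edit_distance(a, b):
    """Levenshtein distance via the usual dynamic-programming recurrence."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        row = [i]
        for j, cb in enumerate(b, 1):
            row.append(min(prev_row[j] + 1,                # deletion
                           row[j - 1] + 1,                 # insertion
                           prev_row[j - 1] + (ca != cb)))  # substitution
        prev_row = row
    return prev_row[-1]


def suggest(sentence, unigrams, bigrams, max_distance=2):
    """Return (position, original, suggestion) for tokens that fit the context
    worse than some similar-looking vocabulary word."""
    vocab = set(unigrams) - {"<s>", "</s>"}
    tokens = sentence.lower().split()
    padded = ["<s>"] + tokens + ["</s>"]
    suggestions = []
    for i, word in enumerate(tokens):
        prev, nxt = padded[i], padded[i + 2]
        candidates = {word} | {v for v in vocab
                               if edit_distance(word, v) <= max_distance}

        def score(cand):
            # How well the candidate fits between its left and right neighbours.
            return (bigram_prob(prev, cand, unigrams, bigrams)
                    * bigram_prob(cand, nxt, unigrams, bigrams))

        # On ties, keep the word the author actually wrote.
        best = max(sorted(candidates), key=lambda c: (score(c), c == word))
        if best != word:
            suggestions.append((i, word, best))
    return suggestions


# Toy "domain corpus"; in practice this would be your own sentences.
sentences = [
    "i like eating banana",
    "i like eating apples",
    "we like eating banana",
    "cats lick their paws",
]
unigrams, bigrams = train_bigram_model(sentences)
print(suggest("i lick eating banana", unigrams, bigrams))
# [(1, 'lick', 'like')]
```

Note that "lick" is in the training vocabulary here, so a token-based checker would accept it; it is only flagged because "like" is far more probable between "i" and "eating". With real data you would probably want a larger n-gram order, better smoothing, and a minimum margin between the two scores before suggesting a change, otherwise rare but correct words will get "corrected" too.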