A Context-Sensitive Approach to Find Optimum Language Model for Automatic Bangla Spelling Correction
Abstract: Automated spelling correction is an important feature of typing that greatly aids both literate and semi-literate people when they use a keyboard or similar devices. It also helps students significantly in the learning process by supplying the proper words during word processing. Much work has been conducted for the English language, but for Bangla the work is still not adequate, and all work done so far in Bangla is context-free. Bangla is one of the most widely spoken languages (3.05% of the world population) and is ranked seventh among the languages of the world. In this paper, we propose a context-sensitive approach to automated spelling correction in Bangla. We combine edit distance with a stochastic, i.e., N-gram, language model, using six N-gram models in total. A novel approach is deployed to find the optimum language model in terms of performance. In addition, to obtain better performance, a large Bangla corpus of different word types is used. We have achieved a satisfactory and promising accuracy of 87.58%.
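The combination of edit distance and an N-gram model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bigram order, the add-one smoothing, the distance threshold `max_dist`, the function names, and the toy English corpus (standing in for a Bangla corpus) are all assumptions made for illustration. Candidates are dictionary words within a small edit distance of the misspelled word, and the preceding word decides which candidate is chosen.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def train_bigram(sentences):
    """Count unigrams and bigrams; '<s>' marks the start of a sentence."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_prob(prev_word, word, unigrams, bigrams):
    """P(word | prev_word) with add-one smoothing (an assumed choice)."""
    vocab_size = len(unigrams)
    return (bigrams[(prev_word, word)] + 1) / (unigrams[prev_word] + vocab_size)

def correct(prev_word, typo, unigrams, bigrams, max_dist=2):
    """Pick the in-vocabulary candidate that best fits the left context."""
    candidates = [w for w in unigrams
                  if w != "<s>" and edit_distance(typo, w) <= max_dist]
    if not candidates:
        return typo  # nothing close enough; leave the word unchanged
    # Rank by bigram probability, breaking ties by smaller edit distance.
    return max(candidates,
               key=lambda w: (bigram_prob(prev_word, w, unigrams, bigrams),
                              -edit_distance(typo, w)))

# Hypothetical toy corpus used only to exercise the sketch.
TOY_CORPUS = [
    "i read a book",
    "i read a book today",
    "she will cook rice",
    "he took a bus",
]
UNIGRAMS, BIGRAMS = train_bigram(TOY_CORPUS)

print(correct("a", "bok", UNIGRAMS, BIGRAMS))
print(correct("will", "bok", UNIGRAMS, BIGRAMS))
```

In this toy setting the same non-word "bok" is resolved differently depending on the word before it, which is the behavior a context-free corrector based on edit distance alone cannot provide.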
Keywords: Spelling correction; non-word error; N-gram; edit distance; magnifying search; accuracy
Muhammad Ifte Khairul Islam, Md. Tarek Habib, Md. Sadekur Rahman, Md. Riazur Rahman, Farruk Ahmed
International Journal of Advanced Computer Science and Applications
