This study aims to develop a state-of-the-art word segmenter for Pashto, together with a proofing tool that identifies and corrects the position of spaces in noisy text.
Sep 1, 2023 · The first word segmenter for the Pashto language. · Using CRF for Pashto word segmentation. · Correction of whitespace in noisy Pashto text.
Semantic Scholar extracted view of "Correction of whitespace and word segmentation in noisy Pashto text using CRF" by Ijazul Haq et al.
Oct 22, 2024 · On Aug 1, 2023, Ijazul Haq and others published "Correction of Whitespace and Word Segmentation in Noisy Pashto Text using CRF" ...
H. Ijazul, Q. Weidong, G. Jie, and T. Peng, "Correction of whitespace and word segmentation in noisy Pashto text using CRF," Speech Communication, vol. 153, p.
Annotated text corpus for the low-resource language Pashto. Pashto text ...
Oct 22, 2024 · This initial version of the toolkit offers models for spelling correction, word segmentation, part-of-speech tagging, and offensive language ...
Oct 18, 2023 · The inconsistent use of whitespace introduces noise into the text, and therefore an arbitrary Pashto text is noisier than English or other ...
Ijazul Haq, Weidong Qiu, Jie Guo, et al. Correction of whitespace and word segmentation in noisy Pashto text using CRF. Speech Communication, 2023, 153 ...
Correction of whitespace and word segmentation in noisy Pashto text using CRF ... word segmentation and named entity recognition, is proposed in this paper ...
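The snippets above describe whitespace correction and word segmentation as a CRF task, but do not show how the problem is set up. A common way to frame segmentation for a sequence labeler is to strip the (unreliable) spaces and tag each character as the start of a word or the inside of one. The sketch below illustrates that framing only; the `B`/`I` label scheme, the one-character context window, and all function names are assumptions made for illustration, not details taken from the paper, and Latin placeholder text stands in for Pashto.

```python
from typing import Dict, List, Tuple


def to_char_labels(segmented: str) -> Tuple[List[str], List[str]]:
    """Drop all whitespace and label each character.

    "B" marks the first character of a word, "I" any later character.
    The label scheme is an assumption, not the paper's.
    """
    chars: List[str] = []
    labels: List[str] = []
    for word in segmented.split():
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append("B" if i == 0 else "I")
    return chars, labels


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Features for position i: the character and its immediate neighbours.

    A window of one character on each side is an arbitrary choice.
    """
    return {
        "char": chars[i],
        "prev": chars[i - 1] if i > 0 else "<BOS>",
        "next": chars[i + 1] if i < len(chars) - 1 else "<EOS>",
    }


def from_char_labels(chars: List[str], labels: List[str]) -> str:
    """Rebuild text, inserting one space before every non-initial "B"."""
    out: List[str] = []
    for ch, label in zip(chars, labels):
        if label == "B" and out:
            out.append(" ")
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # Placeholder input with an inconsistent (doubled) space.
    chars, labels = to_char_labels("ab  cd e")
    print(from_char_labels(chars, labels))
```

In this framing, a trained labeler would predict the `B`/`I` sequence from the per-character feature dictionaries, and `from_char_labels` would turn its predictions back into text with normalized spacing.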