- Kocaeli Journal of Science and Engineering
- Vol: 4 Issue: 2
Named Entity Recognition in Turkish Bank Documents
Authors : Osman Kabasakal, Alev Mutlu
Pages : 86-92
Doi : 10.34088/kojose.871873
Publication Date : 2021-11-30
Article Type : Research
Abstract : Named Entity Recognition (NER) is the process of automatically recognizing entity names such as person, organization, and date in a document. In this study, we focus on bank documents written in Turkish and propose a Conditional Random Fields (CRF) model to extract named entities. The contribution of this study is twofold: (i) we propose domain-specific features to extract entity names such as law, regulation, and reference, which frequently appear in bank documents; and (ii) we contribute to NER research on Turkish documents, which is not as mature as NER research on languages such as English and German. Experimental results based on 10-fold cross-validation conducted on 551 real-life, anonymized bank documents show that the proposed CRF-NER model achieves a micro-average F1 score of 0.962. More specifically, the F1 score is 0.979 for law names, 0.850 for regulation names, and 0.850 for article numbers.
Keywords : Bank Document, Conditional Random Fields, Named Entity Recognition, Natural Language Processing, Turkish Documents
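The abstract reports a micro-average F1 score. As a minimal sketch of how such a score is commonly computed, the snippet below pools true positives, false positives, and false negatives over all entity types before computing precision and recall. The function name `micro_f1`, the label names, and the counts are hypothetical illustrations, not values or code from the study.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-label (TP, FP, FN) counts.

    Counts are summed across labels first, so frequent entity types
    weigh more heavily than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts per entity type: (TP, FP, FN).
example_counts = {
    "LAW": (8, 1, 1),
    "REGULATION": (2, 1, 1),
}
score = micro_f1(example_counts)
```

Because the counts are pooled, a high score on a frequent entity type can mask a lower score on a rare one, which is why per-entity F1 scores are usually reported alongside the micro average.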