Abdul-Jabbar, Safa and George, Loay (2017) A Comparative Study for String Metrics and the Feasibility of Joining them as Combined Text Similarity Measures. ARO-The Scientific Journal of Koya University, 5 (2). pp. 6-18. ISSN 24109355
Archive: ARO.10180-VOL5.No2.2017.ISSUE09-PP6-18.pdf (Published Version, 859kB). Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
This paper introduces an optimized Damerau–Levenshtein and dice-coefficients using enumeration operations (ODADNEN) algorithm that provides a fast string similarity measure while maintaining the accuracy of the results. Searching for specific words within a large text is a hard task that takes a lot of time and effort, and the string similarity measure plays a critical role in many searching problems. In this paper, different experiments were conducted to handle some spelling mistakes, and an enhanced algorithm for string similarity assessment was proposed. This algorithm combines a set of well-known algorithms with some improvements (e.g., the dice-coefficient was modified to deal with numbers instead of characters under certain conditions). These algorithms were adopted after a number of experimental tests were conducted to check their suitability. The ODADNEN algorithm was tested using real data, and its performance was compared with that of the original similarity measure. The results indicated that the most convincing measure is the proposed hybrid measure, which uses the Damerau–Levenshtein and dice-distance based on the n-grams of each word; it also requires less processing time than the standard algorithms. Furthermore, it provides efficient results for assessing the similarity between two words without the need to restrict the word length.
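For orientation, the two standard measures named in the abstract can be sketched as follows. This is a minimal illustration of the textbook algorithms only, not the paper's ODADNEN method: it assumes the restricted (optimal string alignment) variant of Damerau–Levenshtein and character bigrams for the Dice coefficient, and the function names `osa_distance`, `bigrams`, and `dice_coefficient` are hypothetical.

```python
from collections import Counter


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters. Assumption: this is the textbook variant,
    not the optimized version proposed in the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ie" <-> "ei"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def bigrams(word: str) -> Counter:
    """Multiset of character bigrams (n-grams with n = 2, an assumed choice)."""
    return Counter(word[i:i + 2] for i in range(len(word) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient over character bigrams: 2|X ∩ Y| / (|X| + |Y|)."""
    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both words are shorter than two characters; fall back to equality.
        return 1.0 if a == b else 0.0
    overlap = sum((x & y).values())  # multiset intersection
    return 2.0 * overlap / total


if __name__ == "__main__":
    print(osa_distance("recieve", "receive"))
    print(dice_coefficient("night", "nacht"))
```

The distance treats a swapped pair of adjacent letters, a common spelling mistake, as a single edit, while the Dice coefficient compares words by their shared bigrams and needs no alignment; neither function places a limit on word length.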
| Item Type: | Article | 
|---|---|
| Uncontrolled Keywords: | Word classification, Word clustering, String distance, String matching operation, String similarity metric | 
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software | 
| Divisions: | ARO-The Scientific Journal of Koya University > VOL 5, NO 2 (2017) | 
| Depositing User: | Dr Salah Ismaeel Yahya | 
| Date Deposited: | 23 Oct 2017 20:17 | 
| Last Modified: | 17 Apr 2018 08:01 | 
| URI: | http://eprints.koyauniversity.org/id/eprint/111 | 