Bridging the Gap

Badawi, Soran S. (2024) Bridging the Gap. ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY, 12 (1). pp. 100-107. ISSN 2410-9355

Text (Research Article)
ARO.11519.VOL12.NO1.2024.ISSUE22-PP100-107.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.

Official URL: http://dx.doi.org/10.14500/aro.11519

Abstract

Effective organization and retrieval of news content rely heavily on accurate news classification. While extensive research has been conducted on high-resource languages such as English and Chinese, research on under-resourced languages such as Kurdish is severely lacking. To address this gap, we introduce a hybrid approach called RFO-CNN. The proposed method combines an improved version of the red fox optimization (RFO) algorithm with a convolutional neural network (CNN), using RFO to fine-tune the CNN’s parameters. The model’s efficacy was tested on two widely used Kurdish news datasets, KNDH and KDC-4007, both of which contain news articles classified into various categories. We compared the performance of RFO-CNN with other state-of-the-art deep learning models, such as bidirectional long short-term memory networks and bidirectional encoder representations from transformers (BERT), as well as classical machine learning approaches such as multinomial naive Bayes, support vector machine, and K-nearest neighbors. We trained and tested the models under four data-split scenarios: 60:40, 70:30, 80:20, and 90:10. Our experimental results show that RFO-CNN outperforms the benchmark BERT model and the other machine learning models in accuracy and F1-score across all four scenarios.
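
As a rough illustration of the evaluation protocol the abstract describes, the sketch below runs the four split scenarios and reports accuracy and F1-score. It is not the authors' code. It assumes the ratios are train:test percentages and that F1 is macro-averaged, neither of which the abstract states. The helper names (`split`, `macro_f1`, `majority_baseline`) and the toy data are hypothetical, and a trivial majority-class predictor stands in for RFO-CNN and the baseline models.

```python
import random
from collections import Counter

# Split ratios named in the abstract, read here as train:test (an assumption).
SCENARIOS = [(60, 40), (70, 30), (80, 20), (90, 10)]


def split(items, train_pct, seed=0):
    """Shuffle a copy of items and cut it at train_pct percent."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_pct // 100
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_baseline(train_labels):
    """Placeholder 'model': always predicts the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _text: most_common


if __name__ == "__main__":
    # Toy, made-up data: (document placeholder, category) pairs.
    data = [(f"doc{i}", "sport" if i % 3 else "economy") for i in range(100)]
    for train_pct, test_pct in SCENARIOS:
        train, test = split(data, train_pct)
        model = majority_baseline([label for _, label in train])
        y_true = [label for _, label in test]
        y_pred = [model(text) for text, _ in test]
        print(f"{train_pct}:{test_pct}", len(train), len(test),
              round(accuracy(y_true, y_pred), 3),
              round(macro_f1(y_true, y_pred), 3))
```

Reporting macro-averaged F1 alongside accuracy is one common choice for multiclass news data, since a majority-class predictor can score well on accuracy while its macro F1 stays low.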

Item Type: Article
Uncontrolled Keywords: News Classification, Kurdish Language, Red fox optimization-Convolutional neural network, Bidirectional long short-term memory, Bidirectional encoder representations from transformers
Subjects: Q Science > QA Mathematics > QA76 Computer software
Divisions: ARO-The Scientific Journal of Koya University > VOL 12, NO 1 (2024)
Depositing User: Dr Salah Ismaeel Yahya
Date Deposited: 02 Sep 2024 06:58
Last Modified: 02 Sep 2024 06:58
URI: http://eprints.koyauniversity.org/id/eprint/475
