Khaled Abdalgader

Sohar University Oman

1 chapter authored

Chapters authored

By Khaled Abdalgader

Conventional lexical-clustering algorithms treat text fragments as a mixed collection of words, with the semantic similarity between them calculated from how many times particular words occur within the compared fragments. Whereas this technique is appropriate for clustering large textual collections, it performs poorly when clustering short texts such as sentences, because two sentences may be linguistically similar despite having no words in common. This chapter presents a new version of the original k-means method for sentence-level text clustering that relies on related synonyms to construct rich semantic vectors. These vectors represent a sentence using linguistic information drawn from a lexical database, which is used to determine the actual sense of a word based on the context in which it occurs. Therefore, while the traditional k-means method relies on calculating the distance between patterns, the proposed version operates by calculating the semantic similarity between sentences. This allows it to capture more of the semantic or linguistic information present in the clustered sentences. Experimental results illustrate that the proposed clustering algorithm performs favorably against other well-known clustering algorithms on several standard datasets.

Part of the book: Recent Applications in Data Clustering
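
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation. It assumes a tiny hand-written synonym table (the hypothetical SYNONYMS dictionary) in place of a real lexical database, skips word-sense disambiguation entirely, and swaps in cosine similarity over synonym-expanded binary vectors as the semantic similarity measure. The function names, the example sentences, and the deterministic farthest-first seeding are likewise assumptions made for illustration.

```python
import math

# Hypothetical toy synonym table; a real system would query a lexical database.
SYNONYMS = {
    "car": {"automobile"}, "automobile": {"car"},
    "fast": {"quick"}, "quick": {"fast"},
    "dog": {"puppy"}, "puppy": {"dog"},
    "loud": {"noisy"}, "noisy": {"loud"},
}

# Illustrative sentences: within each pair, no words are shared.
SENTENCES = ["car moves fast", "quick automobile", "loud dog barks", "noisy puppy"]


def expand(sentence):
    """Return the sentence's words together with their listed synonyms."""
    words = set(sentence.lower().split())
    expanded = set(words)
    for word in words:
        expanded |= SYNONYMS.get(word, set())
    return expanded


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_sentences(sentences, k, iterations=10):
    """k-means-style clustering that assigns by similarity instead of distance."""
    expanded = [expand(s) for s in sentences]
    vocab = sorted(set().union(*expanded))
    vectors = [[1.0 if term in e else 0.0 for term in vocab] for e in expanded]

    # Deterministic seeding: start from the first sentence, then repeatedly
    # add the sentence least similar to the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each sentence joins its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    for sentence, label in zip(SENTENCES, cluster_sentences(SENTENCES, 2)):
        print(label, sentence)
```

In this sketch, "car moves fast" and "quick automobile" share no surface words, yet their synonym-expanded vectors overlap, so the similarity-based assignment step can place them in the same cluster. A word-count representation alone would give the pair a similarity of zero.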

Related collaborators

Reda R. Gharieb

Hadeel Aljobouri

Hussain A. Jaber

Ilyas Çankaya

Milan Vukicevic

Vladimir Urosevic

F. Marta L. Di Lascio

Uğurhan Kutbay

Ana Kovacevic

Firas Kaddachi