Hybridization of Fuzzy Clustering and Hierarchical Method for Link Discovery

Enseih Davoodi Jam, Mohammadali Nematbakhsh, Mojgan Askarizade

Abstract


Clustering is an active research topic in data mining, and many methods have been proposed in the literature. Most of these methods are based on numerical attributes, although several recent proposals develop clustering methods that support mixed attributes. There are three basic groups of clustering methods: partitional methods, hierarchical methods and density-based methods. This paper proposes a hybrid clustering algorithm that combines the advantages of hierarchical clustering and fuzzy clustering techniques and considers mixed attributes. The proposed algorithm improves the fuzzy algorithm by making it less dependent on initial parameters, such as randomly chosen initial cluster centers. Our approach is organized in two phases: first, the data are divided into two clusters; then the worst cluster is determined and split. The number of clusters is not known in advance, but the algorithm can determine this parameter from the complexity of the cluster structure. We demonstrate the effectiveness of the clustering approach by evaluating it on datasets of linked data, applying the proposed algorithm to three different datasets. Experimental results show that the proposed algorithm is suitable for link discovery between datasets of linked data, because clustering can decrease the number of comparisons required before link discovery.
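The two-phase scheme described above (divide the data into two clusters, then repeatedly find and split the worst cluster) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: attributes are purely numeric rather than mixed, the "worst" cluster is taken to be the one with the largest within-cluster sum of squared distances, splitting stops at a hypothetical `max_scatter` threshold, and initial centers are picked deterministically. All function and parameter names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    n = len(points)
    return tuple(sum(p[j] for p in points) / n for j in range(len(points[0])))


def scatter(points):
    """Within-cluster sum of squared distances (assumed 'badness' score)."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def fuzzy_two_means(points, m=2.0, iters=50):
    """Fuzzy c-means with c = 2; returns one membership pair per point."""
    # Assumed deterministic initialisation: the first point and the point
    # farthest from it, instead of randomly chosen centers.
    first = points[0]
    centers = [first, max(points, key=lambda p: sq_dist(p, first))]
    dim = len(first)
    u = []
    for _ in range(iters):
        # Membership update: u_k is proportional to D_k ** (-1 / (m - 1)),
        # where D_k is the squared distance to center k.
        u = []
        for p in points:
            inv = [max(sq_dist(p, c), 1e-12) ** (-1.0 / (m - 1.0))
                   for c in centers]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Center update: mean of the points weighted by u ** m.
        centers = []
        for k in range(2):
            w = [ui[k] ** m for ui in u]
            centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / sum(w)
                for j in range(dim)))
    return u


def divisive_fuzzy_clustering(points, max_scatter=1.0, max_clusters=10):
    """Start from one cluster and keep bisecting the worst one."""
    clusters = [list(points)]
    while len(clusters) < max_clusters:
        # Assumed criterion: the worst cluster has the largest scatter.
        i = max(range(len(clusters)), key=lambda k: scatter(clusters[k]))
        worst = clusters[i]
        if len(worst) < 2 or scatter(worst) <= max_scatter:
            break  # every cluster is compact enough (assumed stopping rule)
        u = fuzzy_two_means(worst)
        # Harden the fuzzy memberships to obtain the two child clusters.
        left = [p for p, ui in zip(worst, u) if ui[0] >= ui[1]]
        right = [p for p, ui in zip(worst, u) if ui[0] < ui[1]]
        if not left or not right:
            break  # the split did not separate anything
        clusters[i:i + 1] = [left, right]
    return clusters


points = [(0, 0), (0, 1), (1, 0), (8, 0), (8, 1), (9, 0),
          (30, 0), (30, 1), (31, 0)]
clusters = divisive_fuzzy_clustering(points, max_scatter=2.0)
```

In a link discovery setting, candidate comparisons would then be restricted to records that fall in the same cluster, which is how clustering reduces the number of comparisons. Handling mixed attributes would require replacing `sq_dist` and the center update with a measure that also covers categorical values.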

Keywords


Hierarchical method, fuzzy clustering, similarity measure, linked data


(C) 2010-2017 EduSoft