- Celal Bayar Üniversitesi Fen Bilimleri Dergisi
- Vol: 13 Issue: 4
Web Proxy Log Data Mining System for Clustering Users and Search Keywords
Authors : Turgay Bilgin, Mustafa Aytekin
Pages : 873-881
Doi:10.18466/cbayarfbe.330088
Publication Date : 2017-12-29
Article Type : Research
Abstract : In this study, Internet users were clustered according to the search keywords they type into the search bars of search engines. Our proposed software, UQCS (User Queries Clustering System), was developed to demonstrate the validity of this approach. UQCS works together with Strehl's relationship-based clustering toolkit and segments users based on the keywords they use to search the web. Internet proxy server logs were parsed, query strings were extracted from the search engine URLs, and the resulting IP-Term matrix was converted into a similarity matrix using the Euclidean, Jaccard, cosine, and Pearson correlation measures. K-means and the graph-based OPOSSUM algorithm were used to cluster the similarity matrices. The results were visualized with the CLUSION visualization toolkit.
Keywords : Data mining, Document clustering, Graph clustering, Web mining
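The preprocessing steps described in the abstract (parsing proxy logs, extracting query terms from search engine URLs, building an IP-Term matrix, and deriving a similarity matrix) can be illustrated with a minimal Python sketch. This is not the UQCS implementation: the two-field log format, the `q` query parameter, the sample data, and all function names are assumptions made for illustration. Only the cosine and Jaccard measures are shown, and the clustering and visualization stages (K-means, OPOSSUM, CLUSION) are left out.

```python
import math
from collections import Counter, defaultdict
from urllib.parse import urlsplit, parse_qs

# Assumption: each log line is "<client-ip> <url>". Real proxy logs
# carry more fields and would need a format-specific parser.
SAMPLE_LOG = [
    "10.0.0.1 http://www.google.com/search?q=data+mining+tools",
    "10.0.0.2 http://www.google.com/search?q=graph+mining",
    "10.0.0.1 http://www.bing.com/search?q=web+mining",
    "10.0.0.3 http://www.google.com/search?q=football+scores",
    "10.0.0.2 http://example.com/index.html",
]


def extract_terms(url, param="q"):
    """Return lowercase search terms from a URL's query string.

    Assumption: the search engine passes the query in `param`.
    """
    query = parse_qs(urlsplit(url).query)
    terms = []
    for value in query.get(param, []):
        terms.extend(value.lower().split())
    return terms


def build_ip_term_counts(lines):
    """Build a sparse IP-Term matrix as {ip: Counter({term: count})}."""
    counts = defaultdict(Counter)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed format
        ip, url = parts
        counts[ip].update(extract_terms(url))
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    """Jaccard similarity between the term sets of two users."""
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0


def similarity_matrix(counts, measure):
    """Return (sorted IP list, square user-by-user similarity matrix)."""
    ips = sorted(counts)
    matrix = [[measure(counts[i], counts[j]) for j in ips] for i in ips]
    return ips, matrix


if __name__ == "__main__":
    counts = build_ip_term_counts(SAMPLE_LOG)
    ips, matrix = similarity_matrix(counts, cosine_similarity)
    for ip, row in zip(ips, matrix):
        print(ip, [round(x, 3) for x in row])
```

A sparse dictionary of counters is used here instead of a dense matrix because most users search for only a small fraction of the overall vocabulary. A user-by-user matrix of this kind is the sort of input a similarity-based clustering step would then consume.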