A statistical approach for clustering in streaming data

Niloofar Mozafari, Sattar Hashemi, Ali Hamzeh


Data streams have recently been studied extensively because they arise in many applications, such as sensor networks, web click streams, and network flows. The vast majority of research in data stream mining is devoted to supervised learning, whereas in real-world practice the labels of data are rarely available to learning algorithms. Hence clustering, as the most important form of unsupervised learning, has become a focus of many researchers in the data stream community. Clustering paradigms place similar objects together and separate dissimilar ones into different clusters. In this paper, we propose a statistical framework for data stream clustering, abbreviated as StatisStreamClust, that uses two components to find clusters in a data stream. The first component is designed to detect concept change, where the underlying distribution of the data changes over time. When the first component detects a concept change, the second component is triggered to update the whole clustering model.

StatisStreamClust brings several benefits to data stream clustering: it is insensitive to the number of clusters and dimensions, it has reasonable complexity while still delivering desirable performance, and it does not require the window size to be determined a priori. To explore the advantages of our approach, we conducted a large number of experiments with different settings and specifications. The obtained results are very promising.
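
The two-component design described in the abstract (detect a concept change, then update the clustering model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, since the abstract does not specify the actual method: the two-sample z-test on batch means stands in for the statistical change test, the gap-based 1-D grouping stands in for the clustering step, and the stream is assumed to arrive in fixed batches of at least two points, a simplification that the framework itself is said not to need. The names mean_shift_detected, recluster, and process_stream are hypothetical.

```python
import math
import statistics


def mean_shift_detected(reference, batch, z_threshold=3.0):
    """Stand-in change test (assumption): two-sample z-test on the means."""
    se = math.sqrt(statistics.variance(reference) / len(reference)
                   + statistics.variance(batch) / len(batch))
    diff = abs(statistics.mean(batch) - statistics.mean(reference))
    if se == 0.0:
        # Both samples are constant; any difference in means is a change.
        return diff > 0.0
    return diff / se > z_threshold


def recluster(points, gap=1.0):
    """Stand-in clustering (assumption): split sorted 1-D points at large gaps."""
    ordered = sorted(points)
    clusters = [[ordered[0]]]
    for x in ordered[1:]:
        if x - clusters[-1][-1] > gap:
            clusters.append([x])      # gap too large: start a new cluster
        else:
            clusters[-1].append(x)    # close enough: extend the current cluster
    return [statistics.mean(c) for c in clusters]


def process_stream(batches, z_threshold=3.0):
    """Yield (batch_index, changed, centers) for each incoming batch."""
    reference = None
    centers = None
    for i, batch in enumerate(batches):
        # Component 1: has the distribution changed since the reference batch?
        changed = reference is None or mean_shift_detected(
            reference, batch, z_threshold)
        if changed:
            # Component 2: rebuild the clustering model on the new data.
            reference = list(batch)
            centers = recluster(batch)
        yield i, changed, centers


if __name__ == "__main__":
    stream = [
        [0.0, 0.1, 0.2, 0.1, 0.0, 0.2],
        [0.1, 0.0, 0.2, 0.2, 0.1, 0.0],
        [5.0, 5.1, 5.2, 5.1, 5.0, 5.2],
    ]
    for index, changed, centers in process_stream(stream):
        print(index, changed, centers)
```

In this sketch the model is rebuilt only when the test fires, so batches drawn from an unchanged distribution cost a single test rather than a full reclustering.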



DOI: https://doi.org/10.5430/air.v3n1p38



Artificial Intelligence Research

ISSN 1927-6974 (Print)   ISSN 1927-6982 (Online)

Copyright © Sciedu Press 