DG2CEP: A Density-Grid Stream Clustering Platform.

Hyuga is an implementation of the DG2CEP on-line stream-processing algorithm for the ESPER CEP engine.


The concentration (cluster) of mobile entities in a certain region, e.g., a mass street protest, a rock concert, or a traffic jam, is information that can benefit several distributed applications. Nevertheless, cluster detection in on-line scenarios is a challenging task, primarily because it requires efficient and complex algorithms to handle the high volume of position data in a timely manner.

To address this issue, we propose DG2CEP, an on-line algorithm inspired by data mining algorithms and based on Complex Event Processing stream-oriented concepts for on-line detection of such clusters. Our experiments indicate that DG2CEP detects cluster formation and dispersion rapidly, within a few seconds. In addition, the time required to detect such clusters scales linearly with the number of nodes. Finally, regarding accuracy, several experiments show that the clusters detected by DG2CEP are highly similar to those produced by the classic DBSCAN density-based clustering algorithm.

The main idea of DG2CEP is to reduce the cost of the clustering process by first mapping the position data to CEP context partitions, and then clustering the partitions rather than the nodes (using a DBSCAN-like expansion). However, a context partition takes part in this process only if at least minPts nodes are mapped to it (as with DBSCAN core points). Further, since context partitions are adjacent and follow a grid-like scheme, their clustering expansion is trivial: it only needs to visit the adjacent cells. The overall processing flow is illustrated below:

[Figure: DG2CEP processing flow]
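
The grid idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Hyuga's ESPER-based implementation: the helper names `cell_of` and `cluster_cells`, the `rows`/`cols` grid dimensions, and the choice of the eight surrounding cells as "adjacent" are hypothetical, and positions are assumed to lie inside the monitored domain.

```python
from collections import Counter, deque


def cell_of(lat, lng, min_lat, max_lat, min_lng, max_lng, rows, cols):
    """Map a position to a (row, col) grid cell.

    rows and cols are assumed parameters for this sketch; the position is
    assumed to lie within the monitored latitude/longitude ranges.
    """
    row = min(int((lat - min_lat) / (max_lat - min_lat) * rows), rows - 1)
    col = min(int((lng - min_lng) / (max_lng - min_lng) * cols), cols - 1)
    return row, col


def cluster_cells(cells, min_pts):
    """Group dense cells (at least min_pts nodes) that touch each other.

    cells is an iterable with one (row, col) entry per node. The expansion
    is a breadth-first search over the 8 neighbouring cells, which mirrors
    the DBSCAN-like expansion only in spirit.
    """
    counts = Counter(cells)
    dense = {cell for cell, n in counts.items() if n >= min_pts}
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), set()
        while queue:
            r, c = queue.popleft()
            cluster.add((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbour = (r + dr, c + dc)
                    if neighbour in dense and neighbour not in seen:
                        seen.add(neighbour)
                        queue.append(neighbour)
        clusters.append(cluster)
    return clusters
```

Because the expansion only visits dense cells, its cost depends on the number of occupied cells rather than on the number of pairwise distances between nodes, which is the saving the paragraph above describes.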




Hyuga reads its parameters from a configuration file in the Java properties format. Some parameters, such as eps and minPts, come directly from DBSCAN and define the clustering semantics, while the latitude and longitude ranges delimit the monitored domain. The time window win is the period, in seconds, during which the positions of the mobile nodes are considered. Finally, developers can also configure the communication middleware (for distributed deployment): the DDS implementation, the input and output topics, and the EPAs to deploy.
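
To make the role of win concrete, here is a minimal Python sketch of a time window over position events. It is not how Hyuga or ESPER implement windows; the class name `TimeWindow`, its methods, and the event layout are hypothetical, and timestamps (in seconds) are assumed to arrive in non-decreasing order.

```python
from collections import deque


class TimeWindow:
    """Keep only the position events received in the last win_seconds."""

    def __init__(self, win_seconds):
        self.win_seconds = win_seconds
        self.events = deque()  # (timestamp, node_id, lat, lng), oldest first

    def add(self, timestamp, node_id, lat, lng):
        """Append a new event and drop the ones that fell out of the window."""
        self.events.append((timestamp, node_id, lat, lng))
        self._evict(timestamp)

    def _evict(self, now):
        # An event exactly win_seconds old is still kept.
        while self.events and now - self.events[0][0] > self.win_seconds:
            self.events.popleft()

    def latest_positions(self, now):
        """Return the most recent in-window position of each node."""
        self._evict(now)
        latest = {}
        for _, node_id, lat, lng in self.events:
            latest[node_id] = (lat, lng)  # later events overwrite earlier ones
        return latest
```

With win set to 30, a node that stops reporting its position disappears from `latest_positions` once its last event is more than 30 seconds old, so it no longer contributes to any cell count.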

For example, the following configuration file sets minPts to 50 and eps, which defines the grid, to 10. The context partition grid covers the monitored region delimited by the latitude interval (39.817173, 40.004673) and the longitude interval (116.244621, 116.558418). Events within a time window (win) of 30 seconds are considered. The configuration uses the OpenSplice DDS implementation, subscribes to LocationUpdateEvent, and outputs CellClusterEvents. Finally, it deploys all EPAs on this machine.

### DBSCAN variables###############
eps    = 10
minPts = 50

# Latitude
minlat = 39.817173
maxlat = 40.004673

# Longitude
minlng = 116.244621
maxlng = 116.558418

# Window Period (in sec)
win = 30

### DDS parameter #################
dds         = OpenSplice
distributed = true
publish     = Hyuga
subscribe   = LocationUpdateEvent

### Deploy ###
deploy = ALL
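
As a rough illustration of how such a file could be read and checked, the Python sketch below parses the subset of the properties syntax used above (`key = value` lines and `#` comments). The function names `load_properties` and `parse_config` and the shape of the returned dictionary are assumptions made for this example; Hyuga itself is a Java project and presumably relies on the standard Java properties loader, which also supports features this sketch ignores, such as `:` separators and line continuations.

```python
def load_properties(text):
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def parse_config(text):
    """Convert the raw properties into typed values and sanity-check ranges."""
    p = load_properties(text)
    cfg = {
        "eps": float(p["eps"]),
        "minPts": int(p["minPts"]),
        "lat_range": (float(p["minlat"]), float(p["maxlat"])),
        "lng_range": (float(p["minlng"]), float(p["maxlng"])),
        "win": int(p["win"]),
        "dds": p["dds"],
        "distributed": p["distributed"].lower() == "true",
        "publish": p["publish"],
        "subscribe": p["subscribe"],
        "deploy": p["deploy"],
    }
    for name in ("lat_range", "lng_range"):
        low, high = cfg[name]
        if low >= high:
            raise ValueError(f"{name}: minimum must be below maximum")
    return cfg


# The sample mirrors the configuration file shown above.
SAMPLE = """\
### DBSCAN variables###############
eps    = 10
minPts = 50
minlat = 39.817173
maxlat = 40.004673
minlng = 116.244621
maxlng = 116.558418
win = 30
dds         = OpenSplice
distributed = true
publish     = Hyuga
subscribe   = LocationUpdateEvent
deploy = ALL
"""
```

Calling `parse_config(SAMPLE)` returns a dictionary with numeric eps, minPts, and win values and with the latitude and longitude ranges as tuples; a swapped minimum and maximum raises a `ValueError` instead of silently producing an empty grid.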


Hyuga is built and run with Apache Maven. To build and run the system:

$ mvn package
$ mvn run



Hyuga: A Density-Grid Stream Clustering Platform.

Copyright (C) 2014 PUC-Rio/Laboratory for Advanced Collaboration

Hyuga is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

Hyuga is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with Hyuga. If not, see <http://www.gnu.org/licenses/>.

Last edited Apr 15, 2016 at 7:06 AM by mroriz, version 12