With the increasing popularity of mobile devices and location-based services, a massive amount of geo-textual data (e.g., geo-tagged tweets) is generated every day. Compared with traditional spatial data, geo-textual data is greatly enriched by its textual dimension. Conversely, the spatial dimension adds a semantically rich new aspect to textual data. This large volume, together with its rich semantics, calls for tools for data exploration. First, many applications need to retrieve a region for exploration that satisfies user-specified conditions (e.g., the size and shape of the region) while maximizing some objective (e.g., the relevance of the objects in the region to the query keywords). Second, it is useful to mine and explore the topics of the geo-textual data within a (specified or retrieved) region, and perhaps within a particular timespan.
Figure 1 shows the system architecture of our technology. The Query Processor module handles region queries, while the Topic Miner module handles topic exploration queries. Our technology adopts the browser-based client-server model. A user submits a region query through the web browser, and the query is sent to the Query Processor module. The Query Processor module accesses the Indexing module and finds a rectangular region that meets the user's requirements. It then passes the region to the Topic Miner module, which accesses the Indexing module to extract the topics in the region. In addition, users can submit different time intervals and regions directly to the Topic Miner module to extract popular topics.
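The flow described above (region query, then topic extraction within the returned region and a time interval) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the system's actual implementation: the names `GeoDoc`, `find_region` and `mine_topics` are hypothetical, relevance is assumed to be a simple keyword-match count, the region search is brute force rather than index-based, and "topics" are approximated by the most frequent terms rather than by a real topic model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class GeoDoc:
    x: float          # horizontal coordinate (e.g., longitude)
    y: float          # vertical coordinate (e.g., latitude)
    t: int            # timestamp in arbitrary integer units
    words: list[str]  # tokenized text of the document


def in_rect(doc, rect):
    """True if the document lies inside the closed rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return x0 <= doc.x <= x1 and y0 <= doc.y <= y1


def relevance(doc, keywords):
    # Assumed scoring: number of distinct query keywords the document mentions.
    return len(set(doc.words) & set(keywords))


def find_region(docs, keywords, width, height):
    """Brute-force stand-in for the Query Processor.

    Tries every rectangle of the given size whose left edge is at some
    document's x and whose bottom edge is at some document's y, and keeps
    the one with the highest total relevance.
    """
    best_rect, best_score = None, -1
    for x0 in sorted({d.x for d in docs}):
        for y0 in sorted({d.y for d in docs}):
            rect = (x0, y0, x0 + width, y0 + height)
            score = sum(relevance(d, keywords) for d in docs if in_rect(d, rect))
            if score > best_score:
                best_rect, best_score = rect, score
    return best_rect, best_score


def mine_topics(docs, rect, t_start, t_end, k=3):
    """Stand-in for the Topic Miner: top-k terms by document frequency
    among documents inside the region and the time interval."""
    counts = Counter()
    for d in docs:
        if in_rect(d, rect) and t_start <= d.t <= t_end:
            for w in dict.fromkeys(d.words):  # count each word once per document
                counts[w] += 1
    return [w for w, _ in counts.most_common(k)]


docs = [
    GeoDoc(0.0, 0.0, 1, ["coffee", "museum"]),
    GeoDoc(0.5, 0.5, 2, ["coffee", "art"]),
    GeoDoc(0.8, 0.2, 3, ["coffee", "museum", "art"]),
    GeoDoc(5.0, 5.0, 1, ["garden"]),
    GeoDoc(5.2, 5.1, 2, ["shopping", "coffee"]),
]

rect, score = find_region(docs, ["coffee", "museum"], width=1.0, height=1.0)
print(rect, score)                           # (0.0, 0.0, 1.0, 1.0) 5
print(mine_topics(docs, rect, 1, 3, k=1))    # ['coffee']
```

The brute-force search costs time cubic in the number of documents, which is exactly what a spatial index would be expected to avoid; it is used here only to keep the example self-contained.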
a) Travel planning apps (e.g., Google Trips, Tripadvisor, mafengwo.cn): search for regions that match key concepts (e.g., garden, shopping). For example, users may wish to visit places they have never been to, but are constrained by limited time and budget.
b) Event planners (e.g., Cvent): for example, a user may wish to find a place in a big city to hold a particular event, in a region that satisfies specific criteria (e.g., coffee, museum).
c) Business data analytics apps: for example, a user may wish to set up a stationery chain store in a region near schools or other educational institutions. In addition, the topic and point of interest (POI) information within the region is valuable input for data analytics apps.
Big spatial data has attracted much attention recently, and many research papers have been published on mining such informative data. Beyond academic research, companies (e.g., Facebook, Uber, Alibaba, Tencent) are increasingly engaged in spatial data and location-based services. Most of them have established R&D departments working on urban computing, business location selection, etc.
Although many Online Analytical Processing (OLAP) systems have been proposed for analysing various types of data, no such system for spatial databases is available on the market. Existing systems are not optimized for spatial objects. In addition, they do not support rich query criteria such as submodular aggregate functions. The market also lacks technologies for mining topics from the documents within an arbitrarily given region and timespan.
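To make the term "submodular aggregate function" concrete, the sketch below uses keyword coverage, i.e., the number of distinct query keywords mentioned by at least one object in a region. The source does not state which submodular functions are supported, so this choice and the name `coverage` are assumptions made for illustration. Coverage has diminishing returns: adding the same object to a larger set never yields a bigger gain than adding it to a subset of that set. A plain sum of per-object scores does not behave this way, since each object contributes the same amount regardless of what is already in the region.

```python
def coverage(objects, keywords):
    """Number of distinct query keywords mentioned by at least one object.

    `objects` is a list of word lists; `keywords` is a set of query terms.
    """
    covered = set()
    for words in objects:
        covered |= set(words) & set(keywords)
    return len(covered)


keywords = {"coffee", "museum", "garden"}
small = [["coffee"]]                          # a subset of `large`
large = [["coffee"], ["museum", "coffee"]]
new_obj = ["museum", "garden"]                # object added to both sets

gain_small = coverage(small + [new_obj], keywords) - coverage(small, keywords)
gain_large = coverage(large + [new_obj], keywords) - coverage(large, keywords)

print(gain_small, gain_large)  # 2 1
```

Here the new object adds two keywords to the smaller set but only one to the larger set, because "museum" is already covered there.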
End users can enjoy a responsive system for searching regions that satisfy different kinds of criteria. They can also easily explore events that took place in a region at different times, and analyse social events, topics, human behaviors, etc.