Navigating Your Graph Through Space and Time
Our connected world is intrinsically bound to time, and in many domains the geospatial aspect plays an equally crucial role. In this article, we introduce insightful visualizations for relational data that also has a geographic and a temporal dimension. We show how these three dimensions can be assembled into one pivotable data visualization, giving you a powerful stepping stone for your own specific business domain. The source code of an accompanying demo application is available on GitHub.
We are in a golden era of data-driven business opportunities: nearly every aspect of an enterprise is affected by data analysis and optimization. All industries, from retail and healthcare to banking and aerospace, are embracing machine learning (ML) and artificial intelligence (AI) in some way. This explosion of machine intelligence does not, however, remove the need for human intelligence and for insights gained from meticulous (re)search and expertise. In fact, AI should be seen as an augmentation of human intelligence that leverages our unique creative skills rather than as a replacement for us. Traditional business intelligence (BI), for example, is nowadays amplified by ML and cloud-powered AI, but it still requires dashboard design and a broad understanding of the business. Making intelligent decisions requires tools that pivot all the dimensions of the data and, at the same time, highlight the relationships between features.
The abundance of data (a.k.a. the new oil) has also revealed how events and data features are interconnected; nothing stands on its own. Whether you look at fraud detection, internet of things (IoT) automation, cybersecurity or law enforcement (to name a few), events are often linked causally (in time), statistically, or spatially. Human expertise adds to this by injecting business context, thus emphasizing aspects of the data which are not contained in the raw figures. Fighting the rogue offshore industry (money laundering), for instance, is as much an analytical challenge as a technological one (software and forensic tools).
Due to the amount of data and its relational aspects, graph solutions have flourished in recent years, both with respect to storage and presentation. Graph stores with millions of nodes are common and used across all industries. A rich universe of graph implementations and algorithms has branched into semantic networks, property graphs, and distributed graph processing (Spark), among others. At the same time, these backend solutions do not (or not yet) offer a unified view on this linked data: graph traversals focus on the relational aspects but are oblivious to temporal correlations, and singling out a node (or a cluster of nodes) is based on centrality measures which overlook the spatial data contained inside the nodes. As such, tools to visually navigate heterogeneous data and to comprehend it intuitively remain very much necessary, not only because cross-dimensional aspects are not expressed in backend solutions but also, as highlighted above, because human expertise is mandatory to fully grasp what is going on. One can object that some databases intrinsically deal with time series (e.g. the popular InfluxDB) and that many relational systems have spatial data types (and associated geo-APIs), but they do not present the data coherently, much less visualize it.
With the commercial programming library yFiles, we at yWorks are uniquely positioned to approach the visualization of multi-faceted data. Graphs are in our DNA, and our yFiles framework can render just about any type of graph you can envision. The visualization of relational, geographic and temporal data, in particular, is something you can easily assemble in diverse ways.
While there are solutions on the market which make it all possible (from backend to frontend and the analytics in between), we give you a rich framework on top of which you can create amazing interactive applications. Unlike the all-in-one solutions, you can brand, customize and integrate things in any way you like.
In the following, we introduce a demo application that focuses on the presentation of data with the aforementioned dimensions: geographic, temporal and relational. The business context of the application has been kept as generic as possible, so you can use your own data and adapt the demo to your needs without much effort. With slightly different datasets, one could just as well explore many other cases: anti-terrorism, fraud detection, research collaboration networks (including citation networks), outlining historical events and even genealogical research and heir location research.
The underlying data is a graph (network) where each node is tagged with geographical and/or date/time information. It’s not mandatory to have this information in each node, but the more it’s present across the data, the more hidden relations can be discovered, of course. The app consists of three panels:
the main panel, presenting the graph in three ways: overlaid on a map, centered around a particular node (the centric view), or as a horizontal hierarchy which can also emphasize the lineage of the data,
the property panel, presenting the data of the selected node, including some basic graph analysis like directly connected nodes (the neighborhood of the selected node),
the timeline panel, presenting the events within a time window that spans the earliest and the latest date in the dataset.
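To make the data model behind these panels more tangible, here is a minimal TypeScript sketch. It is not taken from the demo's source code: the type names (GeoTimeNode, Edge, GeoTimeGraph) and the helper functions (neighborhood, timeExtent) are hypothetical, and we assume dates are stored as ISO 8601 strings. It shows nodes with optional location and date tags, the neighborhood lookup a property panel could display, and the earliest and latest date a timeline could span.

```typescript
// Hypothetical data model; names are illustrative, not the demo's actual types.
interface GeoTimeNode {
  id: string;
  label: string;
  location?: { lat: number; lng: number }; // optional geographic tag
  date?: string; // optional ISO 8601 date, e.g. "2019-03-14" (assumed format)
}

interface Edge {
  source: string;
  target: string;
}

interface GeoTimeGraph {
  nodes: GeoTimeNode[];
  edges: Edge[];
}

// Directly connected nodes of a given node, ignoring edge direction.
function neighborhood(graph: GeoTimeGraph, nodeId: string): GeoTimeNode[] {
  const neighborIds = new Set<string>();
  for (const edge of graph.edges) {
    if (edge.source === nodeId && edge.target !== nodeId) neighborIds.add(edge.target);
    if (edge.target === nodeId && edge.source !== nodeId) neighborIds.add(edge.source);
  }
  return graph.nodes.filter((node) => neighborIds.has(node.id));
}

// Earliest and latest date in the dataset; undated or unparsable nodes are skipped.
function timeExtent(graph: GeoTimeGraph): { start: Date; end: Date } | undefined {
  const times: number[] = [];
  for (const node of graph.nodes) {
    if (node.date === undefined) continue;
    const t = new Date(node.date).getTime();
    if (!Number.isNaN(t)) times.push(t);
  }
  if (times.length === 0) return undefined;
  return { start: new Date(Math.min(...times)), end: new Date(Math.max(...times)) };
}

// Made-up sample data for illustration only.
const sample: GeoTimeGraph = {
  nodes: [
    { id: "a", label: "A", location: { lat: 48.52, lng: 9.05 }, date: "2019-03-14" },
    { id: "b", label: "B", date: "2020-01-01" },
    { id: "c", label: "C", location: { lat: 40.71, lng: -74.0 } },
  ],
  edges: [
    { source: "a", target: "b" },
    { source: "c", target: "a" },
  ],
};

console.log(neighborhood(sample, "a").map((node) => node.id));
console.log(timeExtent(sample));
```

Keeping location and date optional mirrors the fact that not every node needs to carry this information; views that depend on a missing tag can simply skip the node.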
The geographic view emphasizes geographic affinity as well as global topology and breadth of the data across the globe. The following initial questions are typical in spatial analysis and can be supported by the application:
If the amount of data is too large, one can either resort to backend preprocessing or use semantic zoom as demonstrated in this sample map application.
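One simple way to approach such a reduction is to aggregate nearby nodes into clusters whose granularity depends on the zoom level. The sketch below is an assumption on our part and not necessarily what the referenced sample application does: it buckets points into a latitude/longitude grid whose cell size halves with every zoom level. The names GeoPoint, Cluster, cellSize and clusterByGrid are hypothetical.

```typescript
// Hypothetical types for a grid-based aggregation sketch.
interface GeoPoint {
  id: string;
  lat: number;
  lng: number;
}

interface Cluster {
  lat: number; // mean latitude of the members
  lng: number; // mean longitude of the members
  memberIds: string[];
}

// Assumed scheme: the cell size (in degrees) halves with every zoom level.
function cellSize(zoom: number): number {
  return 360 / Math.pow(2, zoom);
}

// Groups points that fall into the same grid cell at the given zoom level.
function clusterByGrid(points: GeoPoint[], zoom: number): Cluster[] {
  const size = cellSize(zoom);
  const cells = new Map<string, GeoPoint[]>();
  for (const point of points) {
    const key = `${Math.floor(point.lng / size)}:${Math.floor(point.lat / size)}`;
    const bucket = cells.get(key);
    if (bucket) {
      bucket.push(point);
    } else {
      cells.set(key, [point]);
    }
  }
  const clusters: Cluster[] = [];
  cells.forEach((members) => {
    clusters.push({
      lat: members.reduce((sum, m) => sum + m.lat, 0) / members.length,
      lng: members.reduce((sum, m) => sum + m.lng, 0) / members.length,
      memberIds: members.map((m) => m.id),
    });
  });
  return clusters;
}

// Made-up sample points for illustration only.
const cities: GeoPoint[] = [
  { id: "berlin", lat: 52.52, lng: 13.4 },
  { id: "potsdam", lat: 52.39, lng: 13.06 },
  { id: "tokyo", lat: 35.68, lng: 139.69 },
];

console.log(clusterByGrid(cities, 2).length); // coarse zoom: nearby cities merge
console.log(clusterByGrid(cities, 10).length); // fine zoom: cities are separate
```

A fixed grid is easy to compute but can split close neighbors that happen to lie on either side of a cell border; distance-based clustering avoids this at a higher computational cost.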
The smooth transition between the geographic mapping of the nodes and the centric layout reveals how the two different views map neighboring nodes around a particular item. This is an example where animation is more than a gimmick and conveys information by guiding the eyes in a particular way.
More components can be added to the main panel, depending on the integration of this view into a larger application, for example:
Rich infoboxes (pop-ups) can provide information that does not fit the visual constraints of the nodes: paragraphs of text, images or even sound and video. Of course, if you wish to show larger blobs of information, you can add these as well.
The property panel is where you can display the data coming directly from the nodes, but you can also integrate global or related information here. The information can come from backend services (security details, realtime weather…) or from the graph topology (say, centrality measures). The panel can furthermore serve as an alternative way to drill down into the data by hyperlinking properties to graph filtering, graph traversals, and the like. Deciding what to display in the property panel for your business context is a creative process involving multiple concerns: design, functional behavior, data, and end-user comprehension. What is true in all cases is that creative solutions should not be limited by technical constraints: at yWorks, we firmly believe that yFiles is rich and flexible enough to support the largest range of diagramming solutions around.
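As an illustration of a topology-derived property, the following sketch computes normalized degree centrality, one of the simplest centrality measures. It is plain TypeScript and deliberately does not use the yFiles API; the Edge type and the degreeCentrality function are hypothetical names, and the graph is assumed to be simple (no parallel edges) and undirected.

```typescript
// Hypothetical edge type; direction is ignored in this sketch.
interface Edge {
  source: string;
  target: string;
}

// Normalized degree centrality: the degree of a node divided by (n - 1),
// where n is the number of nodes. Assumes a simple, undirected graph.
function degreeCentrality(nodeIds: string[], edges: Edge[]): Map<string, number> {
  const degree = new Map<string, number>();
  nodeIds.forEach((id) => degree.set(id, 0));

  for (const edge of edges) {
    if (edge.source === edge.target) continue; // ignore self-loops
    if (!degree.has(edge.source) || !degree.has(edge.target)) continue; // skip unknown nodes
    degree.set(edge.source, (degree.get(edge.source) ?? 0) + 1);
    degree.set(edge.target, (degree.get(edge.target) ?? 0) + 1);
  }

  // Guard against division by zero for graphs with a single node.
  const maxDegree = Math.max(nodeIds.length - 1, 1);
  const centrality = new Map<string, number>();
  degree.forEach((value, id) => centrality.set(id, value / maxDegree));
  return centrality;
}

// Made-up sample graph for illustration only.
const centralityResult = degreeCentrality(
  ["a", "b", "c", "d"],
  [
    { source: "a", target: "b" },
    { source: "a", target: "c" },
    { source: "a", target: "d" },
    { source: "b", target: "c" },
  ]
);

centralityResult.forEach((value, id) => console.log(id, value.toFixed(2)));
```

A value like this could be shown next to the node's own properties, or used to rank the nodes listed in the neighborhood section of the panel.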
Last but not least, the timeline is where you can see how the data is spread across time. The upper part of the timeline corresponds to the time-window selection in the lower part. The time window filters the data in the main panel and hence allows you to see how the data evolves over time. An alternative timeline is part of the fraud detection example where you can, in addition, play the evolution automatically. One could implement this feature in this app as well, or use the Leaflet TimeDimension component or another variant for visualizing the evolution. The way the data dots are stacked to indicate a temporal incidence can also be altered in many ways: one could use a bar chart, an area chart and so on. The time window in conjunction with the filtering functionality really embodies the essence of a dynamic network: relations often change over time, and a static graph on its own cannot represent this.
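The filtering idea can be sketched in a few lines of TypeScript. This is an illustration under our own assumptions rather than the demo's implementation: the names TimedNode, Edge, TimeWindow, filterByTimeWindow and shiftWindow are hypothetical, dates are assumed to be ISO 8601 strings, and whether undated nodes stay visible is left as a parameter.

```typescript
// Hypothetical types for the filtering sketch.
interface TimedNode {
  id: string;
  date?: string; // optional ISO 8601 date (assumed format)
}

interface Edge {
  source: string;
  target: string;
}

interface TimeWindow {
  start: Date;
  end: Date;
}

// Keeps the nodes dated inside the window (inclusive) and the edges whose
// endpoints are both still visible. Nodes with unparsable dates are hidden.
function filterByTimeWindow(
  nodes: TimedNode[],
  edges: Edge[],
  timeWindow: TimeWindow,
  keepUndated = false
): { nodes: TimedNode[]; edges: Edge[] } {
  const visibleIds = new Set<string>();
  const visibleNodes = nodes.filter((node) => {
    let visible = keepUndated;
    if (node.date !== undefined) {
      const t = new Date(node.date).getTime();
      visible = t >= timeWindow.start.getTime() && t <= timeWindow.end.getTime();
    }
    if (visible) visibleIds.add(node.id);
    return visible;
  });
  const visibleEdges = edges.filter(
    (edge) => visibleIds.has(edge.source) && visibleIds.has(edge.target)
  );
  return { nodes: visibleNodes, edges: visibleEdges };
}

// Moves the window by a fixed step, e.g. once per animation tick during playback.
function shiftWindow(timeWindow: TimeWindow, stepMs: number): TimeWindow {
  return {
    start: new Date(timeWindow.start.getTime() + stepMs),
    end: new Date(timeWindow.end.getTime() + stepMs),
  };
}

// Made-up sample data for illustration only.
const timedNodes: TimedNode[] = [
  { id: "a", date: "2019-01-10" },
  { id: "b", date: "2019-02-20" },
  { id: "c", date: "2019-06-01" },
  { id: "d" },
];
const timedEdges: Edge[] = [
  { source: "a", target: "b" },
  { source: "b", target: "c" },
  { source: "a", target: "d" },
];
const firstWindow: TimeWindow = {
  start: new Date("2019-01-01"),
  end: new Date("2019-03-01"),
};

console.log(filterByTimeWindow(timedNodes, timedEdges, firstWindow));
```

Calling shiftWindow repeatedly and re-running the filter after each step is one way to approximate the automatic playback mentioned above.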
The source code of the demo application is available on GitHub. See the included readme file for usage instructions and implementation notes.