yFiles in Scientific Publications

Computational Biology

yFiles is a powerful tool for visualizing real-life data such as health surveys (epidemiological surveys, clinical trials, etc.), bioinformatics data, databases, climate data, social networks, and fraud-detection data.

The following visualizations are built on data collected to determine whether close-proximity interactions among patients and/or healthcare workers in long-term care facilities facilitate the transmission of Staphylococcus aureus. The research was performed by Thomas Obadia et al. from the University Pierre et Marie Curie, Paris, and was published in PLoS Computational Biology [1]. The interactions between persons were measured with wireless electronic devices in a hospital with 329 patients and 261 healthcare workers over 4 months. The visualizations show one day of this period.

The visualizations use the following yFiles library features, among others:

  • Circular layout algorithms (single-circle and multi-circle layout), which are widely used for large data sets because they show a large portion of the information at once.
  • Edge bundling, which reduces the visual clutter caused by the large number of edges and bundles together edges that come from or go to the same cluster.
  • A k-means clustering algorithm that clusters the nodes of the graph based on specific features.
  • Different visual styles for nodes, so that two features (node type and either spa-type or ward) are visualized simultaneously.
  • Different visual styles for edges, with gradient colours and varying widths.
Multi-circle visualization showing proximity relations of patients and healthcare workers of different hospital wards.

Each node corresponds to either a patient or a healthcare worker and is accompanied by a coloured rectangle that represents (i) the detected spa-type in the multi-circle layout and (ii) the ward in which the person is hospitalized or works in the single-circle layout. Since many different spa-types were detected, they are coloured by frequency: (i) grey for non-colonized persons or spa-types with only 1 occurrence, (ii) green for spa-types with a frequency between 2 and 5, (iii) blue for spa-types with a frequency between 5 and 10, and (iv) red for all higher frequencies.
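The frequency-based colouring can be sketched in a few lines of Python. This is only an illustration of the rule described above, not yFiles code: the function names `spa_colour` and `colour_persons` and the sample data are hypothetical. The description lists a frequency of 5 in both the green and the blue range, so the sketch assumes that 5 counts as green and 10 as blue.

```python
from collections import Counter
from typing import Optional


def spa_colour(frequency: int) -> str:
    """Map the number of occurrences of a spa-type to a colour name.

    Assumption: a frequency of exactly 5 is green and exactly 10 is blue;
    the description does not say which class owns these boundaries.
    """
    if frequency <= 1:
        return "grey"   # non-colonized (0) or a spa-type seen only once
    if frequency <= 5:
        return "green"
    if frequency <= 10:
        return "blue"
    return "red"


def colour_persons(spa_by_person: dict[str, Optional[str]]) -> dict[str, str]:
    """Colour each person by how often their spa-type occurs.

    A value of None marks a non-colonized person.
    """
    counts = Counter(s for s in spa_by_person.values() if s is not None)
    return {
        person: spa_colour(counts[spa] if spa is not None else 0)
        for person, spa in spa_by_person.items()
    }


# Made-up sample: spa-type "A" occurs twice, "B" once, p3 is non-colonized.
sample = {"p1": "A", "p2": "A", "p3": None, "p4": "B"}
colours = colour_persons(sample)
```

Counting the frequencies first and mapping them to colours afterwards keeps the thresholds in one place, so they are easy to adjust if the boundaries are meant differently.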

Single-circle visualization showing proximity relations of patients and healthcare workers.

The edges represent interactions between two persons. The width of an edge represents the duration of the close-proximity interaction, so "fatter" edges represent longer interactions. The edges are drawn with gradient colours such that the colour at each endpoint matches the frequency colour of the spa-type detected for that person.
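As a rough illustration of this edge styling, the following Python sketch scales a contact duration into a stroke width and pairs it with the two endpoint colours. It is not yFiles code. The linear scaling, the width range of 1 to 8, and the names `edge_width` and `edge_style` are assumptions made for the example.

```python
def edge_width(duration: float, max_duration: float,
               min_width: float = 1.0, max_width: float = 8.0) -> float:
    """Scale a contact duration linearly into a stroke width.

    Assumption: linear scaling, clamped to [min_width, max_width].
    """
    if max_duration <= 0:
        return min_width
    fraction = min(max(duration / max_duration, 0.0), 1.0)
    return min_width + fraction * (max_width - min_width)


def edge_style(duration: float, source_colour: str, target_colour: str,
               max_duration: float) -> dict:
    """Describe an edge: its width plus a gradient between the endpoint colours."""
    return {
        "width": edge_width(duration, max_duration),
        "gradient": (source_colour, target_colour),
    }


# A contact lasting half of the longest observed duration,
# between a "green" person and a "red" person.
style = edge_style(50, "green", "red", max_duration=100)
```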

The multi-circle visualization consists of 6 clusters: the 5 hospital wards distinguished in the data set, which are drawn as the larger circles, and a smaller cluster of healthcare workers on night shifts. In the single-circle layout, nodes that belong to the same ward are drawn next to each other, with some minor exceptions.
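The geometric idea behind a multi-circle arrangement can be sketched as follows. This is a simplified illustration and not the layout algorithm that yFiles implements: each cluster gets its own circle, the circle centres are placed on a larger outer circle, and the circle radius grows with the cluster size. The function names, the radius formula, and the sample clusters are assumptions.

```python
import math


def circle_positions(nodes, centre=(0.0, 0.0), radius=1.0):
    """Place nodes evenly on a circle, in the given order."""
    n = len(nodes)
    cx, cy = centre
    return {
        node: (cx + radius * math.cos(2 * math.pi * i / n),
               cy + radius * math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }


def multi_circle_positions(clusters, outer_radius=10.0):
    """Give each cluster its own circle; the centres sit on an outer circle.

    Assumption: the circle radius is 0.5 * sqrt(cluster size), so a small
    cluster (such as a night-shift group) gets a smaller circle.
    """
    centres = circle_positions(list(clusters), radius=outer_radius)
    positions = {}
    for name, members in clusters.items():
        radius = 0.5 * math.sqrt(len(members))
        positions.update(circle_positions(members, centres[name], radius))
    return positions


# Hypothetical clusters: one ward with four persons and a one-person night shift.
clusters = {"ward1": ["a", "b", "c", "d"], "night": ["e"]}
pos = multi_circle_positions(clusters)
```

With a single circle, the same `circle_positions` helper can be called once on the full node list, provided the list is already ordered so that members of the same ward are adjacent.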

In both visualizations, the ordering of the nodes along the circles is determined by a k-means clustering algorithm, which places nodes that interact more with each other closer together.
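How a clustering result can drive the node ordering is sketched below in plain Python. The text does not state which features are clustered or how yFiles derives the order, so everything here is an assumption: each person is described by a hypothetical feature vector (for example, minutes of contact with two groups of people), a basic Lloyd's k-means assigns cluster labels, and nodes are then sorted by label so that members of a cluster sit next to each other on the circle.

```python
import math


def kmeans(points, k, iterations=20):
    """Basic Lloyd's k-means; returns one cluster index per point.

    Assumption: the first k points serve as initial centroids, which keeps
    the sketch deterministic but is not a recommended seeding strategy.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def circular_order(nodes, features, k):
    """Order nodes so that members of the same k-means cluster are adjacent."""
    labels = kmeans(features, k)
    # sorted() is stable, so the input order is kept within each cluster.
    return [n for _, n in sorted(zip(labels, nodes), key=lambda pair: pair[0])]


# Hypothetical features: (minutes of contact with group X, with group Y).
nodes = ["p1", "p2", "p3", "p4"]
features = [(50, 2), (3, 40), (45, 5), (1, 60)]
order = circular_order(nodes, features, k=2)
```

In this made-up sample, p1 and p3 mostly interact with group X and p2 and p4 with group Y, so the ordering places p1 next to p3 and p2 next to p4.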

From the visualizations, one can draw the following conclusions:

  • The interactions among persons play a significant role in the transmission of Staphylococcus aureus. This can be derived from the edge colours: most edges have the same colour at both endpoints (excluding the grey ones, which may represent non-colonized persons). This suggests that both persons may be infected, possibly with the same spa-type.
  • Most of the interactions occur in the same ward, since there exist only a few edges between nodes of different clusters.
  • The network is also dense within each cluster, which is expected in long-term care facilities.
  • The interactions in the same ward have longer duration, since the intra-cluster edges are "fatter" than the inter-cluster ones.
  • Wards play a significant role for the detected spa-type. This can be derived by comparing the edge colours of different clusters.
  • The multi-circular layout enables the quick identification of wards and the interactions within the same ward. The single-circle layout makes the identification of edges between different clusters clearer.
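
The observations about intra-ward and inter-ward contacts are made visually here, but they can also be checked numerically on the underlying contact list. The following Python sketch counts both kinds of contact and averages their durations. The function name `ward_contact_summary`, the input format, and the sample values are hypothetical and are not taken from the study's data.

```python
from statistics import mean


def ward_contact_summary(edges, ward_of):
    """Split contacts into intra-ward and inter-ward and summarise each group.

    edges:   iterable of (person_a, person_b, duration) tuples
    ward_of: mapping from person to ward
    """
    groups = {"intra": [], "inter": []}
    for a, b, duration in edges:
        key = "intra" if ward_of[a] == ward_of[b] else "inter"
        groups[key].append(duration)
    return {
        key: {
            "count": len(durations),
            "mean_duration": mean(durations) if durations else 0.0,
        }
        for key, durations in groups.items()
    }


# Made-up sample: p1 and p2 share ward A, p3 is in ward B.
ward_of = {"p1": "A", "p2": "A", "p3": "B"}
edges = [("p1", "p2", 30), ("p1", "p3", 5), ("p2", "p3", 15)]
summary = ward_contact_summary(edges, ward_of)
```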

[1] Obadia T, Silhol R, Opatowski L, Temime L, Legrand J, et al. (2015) Detailed Contact Data and the Dissemination of Staphylococcus aureus in Hospitals. PLoS Comput Biol 11(3): e1004170. doi: 10.1371/journal.pcbi.1004170