Same Stats, Different Graphs

Data analysts commonly use statistics to summarize large datasets. While it is often sufficient to explore only the summary statistics of a dataset (e.g., min/mean/max), Anscombe's Quartet demonstrates how such statistics can be misleading. Graph mining has a similar problem in that graph statistics (e.g., density, connectivity, clustering coefficient) may not capture the critical properties of a given graph.

To find graphs that agree on a number of graph statistics and yet are structurally different, we use ground-truth data on small non-isomorphic graphs. For larger graphs, we use graph generators together with filters on the statistics.
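The idea for small graphs can be illustrated with a minimal sketch. It does not use the ground-truth data mentioned above; instead it assumes a brute-force enumeration of all graphs on 5 nodes, which is only feasible for very small sizes. The function names (`all_graphs`, `canonical`, `stats`, `group_by_stats`) and the choice of three statistics (density, global clustering coefficient, connectedness) are illustrative assumptions, not the authors' pipeline.

```python
from collections import defaultdict
from itertools import combinations, permutations


def all_graphs(n):
    """Yield every labelled simple graph on n nodes as a frozenset of edges."""
    pairs = list(combinations(range(n), 2))
    for mask in range(1 << len(pairs)):
        yield frozenset(p for i, p in enumerate(pairs) if mask >> i & 1)


def canonical(edges, n):
    """Brute-force canonical form: the smallest edge tuple over all relabelings.

    Two graphs are isomorphic exactly when their canonical forms are equal.
    """
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return best


def stats(edges, n):
    """Return (density, global clustering coefficient, is_connected)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    density = 2 * len(edges) / (n * (n - 1))
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    # Connected triples: pairs of neighbours around each centre node.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    gcc = 3 * triangles / triples if triples else 0.0
    # Connectivity via depth-first search from node 0.
    seen, stack = {0}, [0]
    while stack:
        for w in adj[stack.pop()] - seen:
            seen.add(w)
            stack.append(w)
    return round(density, 3), round(gcc, 3), len(seen) == n


def group_by_stats(n):
    """Group the non-isomorphic graphs on n nodes by their statistics."""
    classes = {canonical(g, n) for g in all_graphs(n)}
    groups = defaultdict(list)
    for edges in classes:
        groups[stats(edges, n)].append(edges)
    return groups


groups = group_by_stats(5)
# Keep only statistic tuples shared by more than one non-isomorphic graph.
collisions = {key: graphs for key, graphs in groups.items() if len(graphs) > 1}
for key, graphs in sorted(collisions.items()):
    print(key, len(graphs))
```

Even with only three statistics and five nodes, several groups contain more than one graph. For example, the three trees on five nodes (path, star, and fork) share the same density, clustering coefficient, and connectedness.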

In fact, we can fix different combinations of five statistics and still get multiple distinct graphs. We visualize this with figures that show the variability of one statistic across 10 slots, covering the ranges [0.0, 0.1], [0.1, 0.2], ..., [0.9, 1.0]. In each slot we show a graph (if one exists) drawn with a spring layout.
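The slot construction can be sketched as follows, assuming each candidate graph has already been reduced to a record of precomputed statistics. The record keys, the `fill_slots` and `slot_index` helpers, and the toy values are hypothetical and only illustrate the filtering and binning step; they are not measured results. A value that falls exactly on a shared boundary (e.g., 0.1) is assigned to the upper slot here, which is an arbitrary choice.

```python
def slot_index(value, n_slots=10):
    """Map a value in [0, 1] to one of n_slots equal-width slots, else None."""
    if not 0.0 <= value <= 1.0:
        return None
    # The top boundary 1.0 belongs to the last slot.
    return min(int(value * n_slots), n_slots - 1)


def fill_slots(records, fixed_ranges, varied, n_slots=10):
    """Pick one record per slot of the varied statistic.

    records      -- list of dicts mapping statistic names to values
    fixed_ranges -- dict: statistic name -> (low, high) open interval to match
    varied       -- name of the statistic whose variability is shown
    """
    slots = [None] * n_slots
    for rec in records:
        in_range = all(lo < rec[name] < hi for name, (lo, hi) in fixed_ranges.items())
        if not in_range:
            continue
        i = slot_index(rec[varied], n_slots)
        if i is not None and slots[i] is None:
            slots[i] = rec  # keep the first graph found for this slot
    return slots


# Toy records with made-up values, for illustration only.
records = [
    {"id": "g1", "APL": 1.45, "den": 0.55, "r": 0.05},
    {"id": "g2", "APL": 1.45, "den": 0.55, "r": 0.31},
    {"id": "g3", "APL": 1.60, "den": 0.55, "r": 0.72},  # APL outside the fixed range
    {"id": "g4", "APL": 1.44, "den": 0.53, "r": 0.38},  # same slot as g2
]
fixed = {"APL": (1.42, 1.47), "den": (0.52, 0.57)}
slots = fill_slots(records, fixed, varied="r")
```

Slots left as `None` correspond to ranges for which no matching graph was found.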

*Fig. 1*: Variability of assortativity when fixing |V| = 9, APL ∈ (1.42, 1.47), den ∈ (0.52, 0.57), GCC ∈ (0.5, 0.6), Rt ∈ (0.15, 0.25).

*Fig. 3*: Variability of Ce when fixing |V| = 9, SCC ∈ (0.75, 0.85), ACC ∈ (0.75, 0.8), r ∈ (-0.3, -0.2), Rt ∈ (0.35, 0.45).

