Average shortest-path length in word-adjacency networks: Chinese versus English
By: Jakub Dec, Michał Dolina, Stanisław Drożdż and more
Complex networks provide powerful tools for analyzing and understanding the intricate structures present in various systems, including natural language. Here, we analyze the topology of growing word-adjacency networks constructed from Chinese and English literary works written in different periods. Unconventionally, instead of considering dictionary words only, we also include punctuation marks as if they were ordinary words. Our approach is based on two arguments: (1) punctuation carries genuine information related to emotional state, allows for logical grouping of content, provides a pause in reading, and facilitates understanding by avoiding ambiguity, and (2) our previous work has shown that punctuation marks behave like words in a Zipfian analysis and, if considered together with regular words, can improve authorship attribution in stylometric studies. We focus on the functional dependence of the average shortest-path length $L(N)$ on the network size $N$ for different epochs and individual novels in their original language, as well as for translations of selected novels into the other language. We approximate the empirical results with a growing network model and obtain satisfactory agreement between the two. We also observe that $L(N)$ behaves asymptotically similarly for both languages if punctuation marks are included, but becomes sizably larger for Chinese if punctuation marks are neglected.
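To make the measured quantity concrete, the following is a minimal sketch of how $L(N)$ could be computed for a growing word-adjacency network, with punctuation marks optionally treated as ordinary tokens. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`tokenize`, `build_adjacency`, `average_shortest_path_length`, `path_length_growth`) are hypothetical, the network is assumed to be undirected and unweighted, and the regex tokenizer is only adequate for space-delimited text such as English. Chinese text would first need a word segmenter, which is not shown here.

```python
import re
from collections import deque


def tokenize(text, keep_punctuation=True):
    """Split text into lowercase tokens; optionally keep punctuation marks as tokens."""
    # Assumption: words are runs of word characters, each punctuation mark is one token.
    pattern = r"\w+|[^\w\s]" if keep_punctuation else r"\w+"
    return [t.lower() for t in re.findall(pattern, text)]


def build_adjacency(tokens):
    """Build an undirected, unweighted graph linking each pair of consecutive tokens."""
    graph = {t: set() for t in tokens}
    for a, b in zip(tokens, tokens[1:]):
        if a != b:  # skip self-loops; they do not affect shortest paths
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_shortest_path_length(graph):
    """Mean breadth-first-search distance over all pairs of distinct reachable nodes."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def path_length_growth(text, step, keep_punctuation=True):
    """Return (N, L) pairs as the text prefix grows by `step` tokens at a time."""
    tokens = tokenize(text, keep_punctuation)
    results = []
    for end in range(step, len(tokens) + 1, step):
        graph = build_adjacency(tokens[:end])
        results.append((len(graph), average_shortest_path_length(graph)))
    return results
```

Calling `path_length_growth(text, step=1000)` and then `path_length_growth(text, step=1000, keep_punctuation=False)` on the same text gives two $L(N)$ curves that can be compared, which mirrors the with-punctuation versus without-punctuation contrast described in the abstract. The all-pairs breadth-first search is quadratic in the number of nodes, so this sketch is only practical for modest vocabularies.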
Similar Papers
Network connectivity analysis via shortest paths
Physics and Society
Shows how well information travels through networks.
Network connectivity analysis via shortest paths
Physics and Society
Maps how fast information travels in networks.
Higher-order shortest paths in hypergraphs
Physics and Society
Measures how important group connections are for speed.