Statistical modeling and inferences on directed networks
Date
2024
Abstract
Network data have received great attention for the insights they offer into node interactions and the underlying network dynamics. This dissertation contributes new modeling tools and inference procedures to the field of network analysis, incorporating the dependence structure inherent in network data. Our first direction centers on modeling directed edges with count measurements, an area that has received limited attention in the literature. Most existing methods either assume that the count edges are derived from continuous random variables or model the edge dependence with parametric distributions. In this dissertation, we develop a latent multiplicative Poisson model for directed networks with count edges. Our approach directly models the edge dependence of count data through the pairwise dependence of latent errors, which are assumed to be weakly exchangeable. This assumption not only covers a variety of common network effects, but also leads to a concise representation of the error covariance. In addition, identification and inference of the mean structure, as well as the regression coefficients, depend on the errors only through their covariance, which gives our model substantial flexibility. We propose a pseudo-likelihood-based estimator for the regression coefficients that enjoys consistency and asymptotic normality. We evaluate our method in extensive numerical studies that corroborate the theory, and we apply our model to a food-sharing network dataset to reveal network effects that are corroborated in the literature. In the second project, we study inference procedures for network dependence structures. While much research has targeted network-covariate associations and community detection, the inference of important network effects such as reciprocity and sender-receiver effects has been largely overlooked.
Testing network effects in network data or weighted directed networks is challenging because of the intricate potential edge dependence. Most existing methods are model-based and carry strong assumptions that restrict their applicability. In contrast, we present a novel, fully nonparametric framework that requires only minimal regularity assumptions. While inspired by recent developments in the U-statistic literature, our work significantly broadens their scope. Specifically, we identify and carefully address the indeterminate degeneracy inherent in network effect estimators, a challenge that the aforementioned tools do not handle. We establish a Berry-Esseen-type bound for the accuracy of type-I error rate control, and a novel analysis shows the minimax optimality of our test's power. Simulations highlight the superiority of our method in computation speed, accuracy, and numerical robustness relative to benchmarks. To showcase the practicality of our methods, we apply them to two real-world relationship networks: a faculty hiring network and an international trade network. Finally, this dissertation introduces modeling strategies and corresponding methods for discerning the core-periphery (CP) structure in weighted directed networks. We adopt the signal-plus-noise model and categorize uniform relational patterns as non-informative, which lets us define the sender and receiver peripheries. Furthermore, instead of confining the core component to a specific structure, we consider it complementary to either the sender or receiver peripheries. Based on our definitions of the sender and receiver peripheries, we propose spectral algorithms to identify the CP structure in weighted directed networks. Our algorithms come with statistical guarantees, ensuring the identification of sender and receiver peripheries with overwhelming probability. Additionally, our methods scale effectively to large directed networks.
We evaluate the proposed methods in extensive simulation studies and apply them to a faculty hiring network dataset, revealing insights into informative and non-informative sender/receiver behaviors.
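
As a rough illustration of the first project's setting, the sketch below simulates a directed count network whose edges follow a Poisson distribution with a log-linear mean multiplied by mean-one latent errors, and then fits the regression coefficient by maximizing a working Poisson likelihood that ignores the edge dependence. Everything here is an assumption made for illustration and is not taken from the dissertation: the gamma-distributed sender, receiver, and dyad factors, the single Gaussian covariate, the absence of an intercept, and the function names `simulate_count_network` and `poisson_pseudo_mle`. The dissertation's actual model and estimator may differ.

```python
import numpy as np


def simulate_count_network(n, beta, rng):
    """Simulate directed count edges with mean-one multiplicative latent errors.

    Illustrative assumption: the latent error is a product of independent
    sender, receiver, and dyad-level gamma factors, each with mean 1.
    """
    x = rng.normal(size=(n, n))                      # hypothetical dyadic covariate
    a = rng.gamma(shape=4.0, scale=0.25, size=n)     # sender factor, mean 1
    b = rng.gamma(shape=4.0, scale=0.25, size=n)     # receiver factor, mean 1
    d = rng.gamma(shape=4.0, scale=0.25, size=(n, n))
    d = np.triu(d, 1)
    d = d + d.T                                      # shared by (i, j) and (j, i)
    e = a[:, None] * b[None, :] * d                  # latent error, mean 1 off-diagonal
    y = rng.poisson(np.exp(beta * x) * e)            # count edges
    off_diag = ~np.eye(n, dtype=bool)                # drop self-loops
    return x[off_diag], y[off_diag]


def poisson_pseudo_mle(x, y, n_iter=50, tol=1e-10):
    """Newton iterations for a working Poisson likelihood with mean exp(beta * x)."""
    beta = 0.0
    for _ in range(n_iter):
        mu = np.exp(beta * x)
        score = np.sum(x * (y - mu))                 # derivative of the log-likelihood
        info = np.sum(x * x * mu)                    # negative second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate_count_network(n=200, beta=0.5, rng=rng)
    print("estimated coefficient:", poisson_pseudo_mle(x, y))
```

Because the latent errors have mean one, the conditional mean of each edge given the covariate is still exp(beta * x), which is why a working likelihood that ignores the dependence can still target the regression coefficient in this toy setup. Valid standard errors would need to account for the error covariance, which this sketch does not attempt.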
Subject
network inference
social networks
network modeling
dependent edges