
Machine Learning for Interpretability
Friday, Aug. 25, 2023




Peking University
Title: Structurally Grouped Approximate Factor Models
Abstract: This paper explores group structure in large-dimensional approximate factor models, where the common factors have homogeneous effects on individuals that fall into the same group. Starting from initial principal component estimates, we identify the unknown group structure by combining an agglomerative hierarchical clustering algorithm with an information criterion. The loadings and factors are then re-estimated conditional on the identified groups. Under some regularity conditions, we establish the consistency of the membership estimator as well as that of the group number estimator obtained from the information criterion. The new estimators under the group structure are shown to achieve efficiency gains compared to those obtained without this information. Numerical simulations and empirical applications demonstrate the good finite-sample performance of our proposed approach when a group structure is present.
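The two-step procedure the abstract describes — initial principal component estimates, then agglomerative clustering of the loadings to recover groups, then re-estimation conditional on the groups — can be sketched as follows. This is an illustrative toy version, not the authors' code; all variable names and the simple group-mean re-estimation step are assumptions.

```python
# Illustrative sketch of: (1) PCA estimates of factors/loadings,
# (2) agglomerative clustering of loading rows to identify groups,
# (3) re-estimation of loadings conditional on the identified groups.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(0)
T, N, r = 200, 60, 2                      # time periods, series, factors

# Simulate two groups with homogeneous loadings within each group
true_groups = np.repeat([0, 1], N // 2)
group_loadings = np.array([[2.0, 0.0], [0.0, 2.0]])
Lam = group_loadings[true_groups] + 0.1 * rng.standard_normal((N, r))
F = rng.standard_normal((T, r))
X = F @ Lam.T + 0.5 * rng.standard_normal((T, N))

# Step 1: initial principal component estimates
U, s, Vt = np.linalg.svd(X, full_matrices=False)
F_hat = np.sqrt(T) * U[:, :r]             # estimated factors (T x r)
Lam_hat = X.T @ F_hat / T                 # estimated loadings (N x r)

# Step 2: agglomerative hierarchical clustering on the loading rows
# (in the paper the number of groups comes from an information criterion;
#  here it is fixed at 2 for illustration)
labels = AgglomerativeClustering(n_clusters=2).fit_predict(Lam_hat)

# Step 3: re-estimate loadings conditional on the identified groups,
# here by simple within-group averaging (a crude pooled estimate)
Lam_grouped = Lam_hat.copy()
for g in range(2):
    Lam_grouped[labels == g] = Lam_hat[labels == g].mean(axis=0)
```

With well-separated group loadings, the clustering step recovers the membership up to a relabeling of the groups.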
Shanghai University of Finance and Economics
Title: Efficient Learning of Nonparametric Directed Acyclic Graph with Statistical Guarantee
Abstract: Directed acyclic graph (DAG) models are widely used to represent causal relations among collected nodes. This paper proposes an efficient and consistent method to learn a DAG with a general causal dependence structure, in sharp contrast to most existing methods, which assume linear causal relations. To facilitate DAG learning, the proposed method leverages the concept of a topological layer and connects nonparametric DAG learning with kernel ridge regression in a smooth reproducing kernel Hilbert space (RKHS) and gradient learning: the topological layers of a nonparametric DAG can be exactly reconstructed via kernel-based estimation, and the parent-child relations can be obtained directly by computing the estimated gradient function. The developed algorithm is computationally efficient in that it solves a convex optimization problem with an analytic solution, and the gradient functions can be computed directly by using the derivative reproducing property of the smooth RKHS. The asymptotic properties of the proposed method are established in terms of exact DAG recovery without requiring any explicit model specification. Its superior performance is also supported by a variety of simulated examples and a real-life data example.
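The two ingredients the abstract combines — an analytic kernel ridge regression solution and gradients obtained from the kernel's derivative — can be illustrated in a minimal sketch. This is not the paper's algorithm, only a demonstration of the mechanism: for a Gaussian kernel, the fitted function's partial derivatives are available in closed form, and a variable with a large average partial derivative is a candidate parent of the response node. All names and the parent-screening rule here are illustrative assumptions.

```python
# Illustrative sketch: kernel ridge regression in a Gaussian RKHS has an
# analytic solution, and gradients of the fitted function follow from the
# kernel's derivative; large average |partial derivative| flags a parent.
import numpy as np

def rbf(X, Z, h):
    """Gaussian kernel matrix between rows of X and rows of Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * h ** 2))

rng = np.random.default_rng(1)
n = 300
x1 = rng.standard_normal(n)               # true parent of y
x2 = rng.standard_normal(n)               # irrelevant variable
y = np.sin(x1) + 0.1 * rng.standard_normal(n)

X = np.column_stack([x1, x2])
h, lam = 1.0, 1e-2
K = rbf(X, X, h)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)   # analytic KRR solution

# Gradient of f(x) = sum_j alpha_j k(x, X_j) at the sample points:
# for the Gaussian kernel, d/dx k(x, X_j) = k(x, X_j) * (X_j - x) / h^2
diffs = (X[None, :, :] - X[:, None, :]) / h ** 2      # shape (n, n, d)
grad = np.einsum('ij,ijd->id', K * alpha[None, :], diffs)

score = np.abs(grad).mean(0)              # avg |partial derivative| per input
```

Since y depends on x1 but not x2, the averaged gradient magnitude for x1 dominates, which is the signal a gradient-based parent search exploits.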
East China Normal University
Title: K-Nearest-Neighbor Local Sampling Based Conditional Independence Testing
Abstract: Conditional independence (CI) testing is a fundamental task in statistics and machine learning, but its effectiveness is hindered by the challenges posed by high-dimensional conditioning variables and limited data samples. This article introduces a novel testing approach to address these challenges and enhance control of the type I error while achieving high power under alternative hypotheses. The proposed approach incorporates a computationally efficient classifier-based conditional mutual information (CMI) estimator, capable of capturing intricate dependence structures among variables. To approximate a distribution encoding the null hypothesis, a k-nearest-neighbor local sampling strategy is employed. An important advantage of this approach is its ability to operate without assumptions about distribution forms or feature dependencies. Furthermore, it eliminates the need to derive asymptotic null distributions for the estimated CMI and avoids dataset splitting, making it particularly suitable for small datasets. The method presented in this article demonstrates asymptotic control of the type I error and consistency against all alternative hypotheses. Extensive analyses using both synthetic and real data highlight the computational efficiency of the proposed test. Moreover, it outperforms existing state-of-the-art methods in terms of type I and II errors, even in scenarios with high-dimensional conditioning sets. Additionally, the proposed approach exhibits robustness in the presence of heavy-tailed data.
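The k-nearest-neighbor local sampling idea can be sketched in a simplified form: to approximate the null X ⊥ Y | Z, each Y_i is replaced by the Y value of a randomly chosen k-nearest neighbor of Z_i, which breaks any direct X–Y link while roughly preserving P(Y | Z). Here a plain correlation statistic stands in for the article's classifier-based CMI estimator, and all names are illustrative assumptions.

```python
# Simplified sketch of k-nearest-neighbor local sampling for CI testing:
# null samples are built by locally permuting Y among Z-neighbors, then the
# observed statistic is compared against the null statistics.
import numpy as np
from scipy.spatial import cKDTree

def knn_null_sample(Y, Z, k, rng):
    """Replace each Y_i by the Y of a random k-nearest neighbor of Z_i."""
    tree = cKDTree(Z)
    _, idx = tree.query(Z, k=k + 1)            # column 0 is the point itself
    pick = idx[np.arange(len(Z)), rng.integers(1, k + 1, len(Z))]
    return Y[pick]

rng = np.random.default_rng(2)
n, k, B = 500, 5, 200
Z = rng.standard_normal((n, 1))
X = Z[:, 0] + 0.3 * rng.standard_normal(n)
Y = Z[:, 0] + 0.3 * rng.standard_normal(n)     # X and Y independent given Z

# Stand-in statistic (the article uses a classifier-based CMI estimator)
stat = abs(np.corrcoef(X, Y)[0, 1])
null = [abs(np.corrcoef(X, knn_null_sample(Y, Z, k, rng))[0, 1])
        for _ in range(B)]
pval = (1 + sum(s >= stat for s in null)) / (B + 1)
```

Because the local sampling preserves the dependence of Y on Z, the null statistics stay comparable to the observed one when conditional independence holds, so the test does not spuriously reject.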
Xiamen University
Title: Neural Networks for Partially Linear Quantile Regression
Abstract: Deep learning has enjoyed tremendous success in a variety of applications but its application to quantile regression remains scarce. A major advantage of the deep learning approach is its flexibility to model complex data in a more parsimonious way than nonparametric smoothing methods. However, while deep learning brought breakthroughs in prediction, it is not well suited for statistical inference due to its black-box nature. In this paper, we leverage the advantages of deep learning and apply it to quantile regression, where the goal is to produce interpretable results and perform statistical inference. We achieve this by adopting a semiparametric approach based on the partially linear quantile regression model, where covariates of primary interest for statistical inference are modelled linearly and all other covariates are modelled nonparametrically by means of a deep neural network. In addition to the new methodology, we provide theoretical justification for the proposed model by establishing the root-n consistency and asymptotic normality of the parametric coefficient estimator and the minimax optimal convergence rate of the neural nonparametric function estimator. Across numerical studies, the proposed model empirically produces superior estimates and more accurate predictions than various alternative approaches.
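The model structure described in the abstract — an interpretable linear coefficient plus a neural network for the remaining covariates, trained under the quantile (check) loss — can be sketched in a minimal numpy example. This is a toy single-hidden-layer version trained by subgradient descent, not the paper's deep architecture or fitting procedure; all names and hyperparameters are illustrative assumptions.

```python
# Toy partially linear quantile regression: quantile of Y given (X, W) is
# modelled as X*beta + g(W), with beta the interpretable linear coefficient
# and g a tiny one-hidden-layer network, trained on the check (pinball) loss.
import numpy as np

rng = np.random.default_rng(3)
n, tau = 1000, 0.5                             # sample size, quantile level
X = rng.standard_normal(n)                     # covariate of primary interest
Wn = rng.uniform(-2, 2, n)                     # nonparametric covariate
Y = 1.5 * X + np.sin(Wn) + rng.standard_normal(n)  # true median: 1.5*X + sin(W)

H, lr = 16, 0.05                               # hidden units, learning rate
W1 = 0.5 * rng.standard_normal((1, H)); b1 = np.zeros(H)
W2 = 0.5 * rng.standard_normal(H);      b2 = 0.0
beta = 0.0

def forward(w):
    hid = np.tanh(w[:, None] * W1 + b1)        # hidden layer, shape (n, H)
    return hid, hid @ W2 + b2                  # g(w)

def check_loss(u):
    return np.mean(u * (tau - (u < 0)))        # pinball / check loss

loss0 = check_loss(Y - forward(Wn)[1])         # loss before training

for _ in range(5000):
    hid, g = forward(Wn)
    resid = Y - (beta * X + g)
    s = tau - (resid < 0)                      # check-loss subgradient factor
    beta += lr * np.mean(s * X)                # update linear coefficient
    W2   += lr * hid.T @ s / n
    b2   += lr * np.mean(s)
    dh = (s[:, None] * W2) * (1 - hid ** 2)    # backprop through tanh
    W1   += lr * (Wn[:, None] * dh).mean(0, keepdims=True)
    b1   += lr * dh.mean(0)

loss1 = check_loss(Y - (beta * X + forward(Wn)[1]))
```

After training, `beta` approaches the true linear coefficient 1.5 while the network absorbs the nonlinear sin(W) part, illustrating why the linear component remains available for interpretation and inference.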