Complex Data Analysis

Saturday, Aug. 26, 2023

Session: Complex Data Analysis

Time: 1:30 p.m. — 3:00 p.m.

Location: Room 203, Wenshi Building, Putuo Campus, East China Normal University

Session Chair: Shurong Zheng, Northeast Normal University

Xiwen Qin

Changchun University of Technology

Title: Research on Time Series Data Classification Based on Variational Mode Decomposition and Deep Forest

Abstract: This paper constructs a new adaptive decomposition and deep forest classification method for nonlinear, non-stationary time series data. First, the regularized variational mode decomposition method is optimized with an improved Harris Hawk algorithm. Second, an adaptive elastic net variable selection method is implemented using the minimum common redundancy maximum relevance criterion. Finally, a two-stage weighted deep forest classification algorithm is constructed. The feasibility of the proposed method is verified on simulated data, and the method is applied to epileptic electroencephalogram (EEG) signal recognition and rolling bearing fault diagnosis. Its effectiveness is demonstrated through comparative analysis and statistical tests. The proposed method provides a theoretical foundation for the analysis of complex nonlinear, non-stationary time series data.

Jingru Zhang

Fudan University

Title: Graph-Based Two-Sample Tests for Multivariate Repeated Measurements of Histogram Objects

Abstract: Repeated observations have become increasingly common in biomedical research and longitudinal studies. For instance, wearable sensor devices are deployed to continuously track physiological and biological signals from each individual over multiple days. It remains of great interest to appropriately evaluate how the daily distribution of biosignals might differ across disease groups and demographics. These data can be formulated as multivariate complex object data, such as probability densities, histograms, and observations on a tree. Traditional statistical methods often fail to apply, because such data are sampled from an arbitrary non-Euclidean metric space. In this talk we propose novel, nonparametric, graph-based two-sample tests for object data with the same structure of repeated measures. We treat the repeatedly measured object data as multivariate object data, which requires the same number of repeated observations per individual but eliminates any assumptions on the errors of the repeated observations. A set of test statistics is proposed to capture various possible alternatives. We derive their asymptotic null distributions under the permutation null. As shown through simulation studies, these tests exhibit substantial power improvements over existing methods while controlling the type I error in finite samples. The proposed tests are demonstrated to provide additional insights on the location and the inter- and intra-individual variability of the daily physical activity distributions in a sample of studies for mood disorders.
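
For readers unfamiliar with graph-based two-sample tests, the general idea can be sketched as follows. This is an illustration only, not the speaker's method: it uses the classic edge-count statistic on a minimum spanning tree with a permutation p-value, and it assumes (hypothetically) that each individual is a list of histograms on shared unit-width bins, compared by the 1-Wasserstein distance and combined across repeated measurements by a root sum of squares. All function names are made up for this sketch.

```python
import random

def wasserstein1(h1, h2):
    """1-Wasserstein distance between two histograms on shared unit-width bins."""
    s1, s2 = sum(h1), sum(h2)
    c1 = c2 = total = 0.0
    for a, b in zip(h1, h2):
        c1 += a / s1          # running CDF of the first histogram
        c2 += b / s2          # running CDF of the second histogram
        total += abs(c1 - c2)
    return total

def subject_distance(x, y):
    """Assumed distance between two individuals, each a list of histograms
    (one per repeated measurement, same count for everyone)."""
    return sum(wasserstein1(a, b) ** 2 for a, b in zip(x, y)) ** 0.5

def mst_edges(dist):
    """Prim's algorithm on a dense distance matrix; returns tree edges (i, j)."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges

def between_group_edges(edges, labels):
    """Number of tree edges joining individuals from different groups."""
    return sum(labels[i] != labels[j] for i, j in edges)

def edge_count_test(subjects, labels, n_perm=999, seed=0):
    """Permutation test: unusually few between-group edges suggest the
    two groups come from different distributions."""
    n = len(subjects)
    dist = [[subject_distance(subjects[i], subjects[j]) for j in range(n)]
            for i in range(n)]
    edges = mst_edges(dist)
    observed = between_group_edges(edges, labels)
    rng = random.Random(seed)
    perm = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(perm)  # relabel individuals, keep the graph fixed
        if between_group_edges(edges, perm) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because the graph is built once from pairwise distances, the same procedure applies to any object type for which a distance is available; only `subject_distance` would change.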

Xiaoyi Wang

Beijing Normal University at Zhuhai

Title: Block-Diagonal Test for High-Dimensional Covariance Matrices

Abstract: Testing the structure of a high-dimensional covariance matrix plays an important role in financial stock analyses, genetic series analyses, and many other fields. Testing whether the covariance matrix is block-diagonal in the high-dimensional setting is the main focus of this paper. To tackle this problem, several existing studies have proposed test procedures that are powerful under normality assumptions, two-diagonal block assumptions, or sub-block dimensionality assumptions. To relax these conditions, this paper proposes a test framework based on U-statistics and establishes the asymptotic distributions of those U-statistics under the null and alternative hypotheses. Moreover, another test approach is developed for alternatives with different sparsity levels. Finally, a simulation study and a real data analysis are conducted to show the performance of the proposed test procedures.

Ming Li

Northeast Normal University

Title: High-Dimensional Scale-Invariant Discriminant Analysis

Abstract: We propose a scale-invariant linear discriminant analysis classifier for high-dimensional data with dense signals. The method is valid both when the data dimension is smaller than the sample size and when it is greater. Based on recent advances on the sample correlation matrix in random matrix theory, we derive the asymptotic limit of the error rate, which characterizes the influence of the data dimension and the tuning parameter. The major advantage of the proposed classifier is that it is scale invariant and therefore applicable whatever the variances of the features. Several numerical studies are conducted, and the proposed classifier performs favorably in comparison with some existing methods.
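
To make the notion of scale invariance concrete, here is a minimal sketch of one way such a classifier could look. The abstract does not give the exact form, so everything below is an assumption: features are standardized by their pooled standard deviations, the pooled sample correlation matrix R is regularized as R + lam * I (with `lam` standing in for the tuning parameter), and a point is assigned to class 1 when its score is positive. The names `solve`, `fit`, and `score` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit(X1, X2, lam=0.5):
    """Fit the assumed rule from two training samples (lists of feature rows)."""
    p, n1, n2 = len(X1[0]), len(X1), len(X2)
    m1 = [sum(r[k] for r in X1) / n1 for k in range(p)]
    m2 = [sum(r[k] for r in X2) / n2 for k in range(p)]
    # Center each class at its own mean, then pool.
    C = [[r[k] - m1[k] for k in range(p)] for r in X1]
    C += [[r[k] - m2[k] for k in range(p)] for r in X2]
    dof = n1 + n2 - 2
    S = [[sum(r[a] * r[b] for r in C) / dof for b in range(p)] for a in range(p)]
    sd = [S[k][k] ** 0.5 for k in range(p)]
    # Regularized correlation matrix: invertible even when p exceeds n1 + n2.
    R = [[S[a][b] / (sd[a] * sd[b]) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    delta = [(m1[k] - m2[k]) / sd[k] for k in range(p)]
    w = solve(R, delta)
    mid = [(m1[k] + m2[k]) / 2 for k in range(p)]
    return w, mid, sd

def score(x, model):
    """Positive score: assign to class 1; negative: class 2."""
    w, mid, sd = model
    return sum(w[k] * (x[k] - mid[k]) / sd[k] for k in range(len(x)))
```

Multiplying any feature by a positive constant rescales its mean and standard deviation by the same constant and leaves the correlation matrix unchanged, so the score of a correspondingly rescaled point is the same. That is the sense in which a rule of this type does not depend on the feature variances.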