Software

Factor Augmented Sparse Throughput Deep ReLU Neural Networks

The software implements factor augmented sparse throughput deep ReLU neural networks, which select important variables in the neural networks with or without factor structure in high-dimensional inputs. It encompasses both nonparametric sparse regression and sparse linear models with or without model structure. It also includes the nonparametric factor regression model and the principal component regression model as special cases.

Soft Codes in Python

Fan, J. and Gu, Y. (2022).
Factor Augmented Sparse Throughput Deep ReLU Neural Networks for High Dimensional Regression.
Manuscript.

Iteratively Projected SVD for Tensor Factor Analysis

The software computes low-rank tensor decomposition with auxiliary covariates. It iteratively projects tensor data onto the linear space spanned by the basis functions of covariates and applies SVD on matricized tensors over each mode.
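The projection-then-SVD idea described above can be sketched in a few lines of NumPy. This is a simplified illustration in the spirit of higher-order orthogonal iteration, not the authors' implementation: the function names (`unfold`, `mode_product`, `ip_svd_sketch`), the argument `bases` (one matrix of covariate basis functions per mode), and the fixed number of iterations are all assumptions made for this example.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along `mode` (rows index that mode)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along `mode`."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def top_left_singular_vectors(M, r):
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def ip_svd_sketch(Y, bases, ranks, n_iter=10):
    """Hypothetical helper: project each mode onto its basis space, then iterate SVDs."""
    # Project every mode onto the column space of its covariate basis matrix.
    Yp = Y
    for m, Phi in enumerate(bases):
        P = Phi @ np.linalg.pinv(Phi)          # orthogonal projection matrix
        Yp = mode_product(Yp, P, m)
    # Spectral initialisation from each matricized projected tensor.
    A = [top_left_singular_vectors(unfold(Yp, m), r) for m, r in enumerate(ranks)]
    # Refine each mode using the current estimates of the other modes.
    for _ in range(n_iter):
        for m, r in enumerate(ranks):
            Z = Yp
            for j in range(len(ranks)):
                if j != m:
                    Z = mode_product(Z, A[j].T, j)
            A[m] = top_left_singular_vectors(unfold(Z, m), r)
    return A

# Simulated example: loadings lie in the span of the (assumed) basis matrices.
rng = np.random.default_rng(0)
dims, ranks, n_basis = (20, 15, 10), (2, 2, 2), 4
bases = [rng.standard_normal((p, n_basis)) for p in dims]
A_true = [np.linalg.qr(Phi @ rng.standard_normal((n_basis, r)))[0]
          for Phi, r in zip(bases, ranks)]
Y = 10.0 * rng.standard_normal(ranks)          # core tensor
for m, Am in enumerate(A_true):
    Y = mode_product(Y, Am, m)
Y = Y + 0.01 * rng.standard_normal(dims)       # small additive noise
A_hat = ip_svd_sketch(Y, bases, ranks)
```

The estimated loading matrices are identified only up to rotation, so they are best compared with the truth through the projection matrices `A_hat[m] @ A_hat[m].T`.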

python code

Semiparametric Tensor Factor Analysis by Iteratively Projected SVD

Estimating the number of factors by adjusted eigenvalues thresholding

Under some conditions, the number of factors equals the number of eigenvalues of the population correlation matrix that exceed 1.

In practice, the number of factors is estimated by the number of bias-corrected eigenvalues of the sample correlation matrix exceeding 1 + C sqrt(p/(n-1)), where p is the dimensionality, n is the sample size, and C is a tuning parameter. The default is C = 1.
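The thresholding rule can be illustrated with a short Python sketch. It is deliberately simplified: it counts the raw eigenvalues of the sample correlation matrix and skips the bias-correction step of the actual method, so it should be read as an illustration of the threshold 1 + C sqrt(p/(n-1)) only. The function name `count_factors_by_threshold` and the simulated data are assumptions made for this example.

```python
import numpy as np

def count_factors_by_threshold(X, C=1.0):
    """Count eigenvalues of the sample correlation matrix above 1 + C*sqrt(p/(n-1)).

    Simplification (assumption): raw sample eigenvalues are used here,
    not the bias-corrected eigenvalues of the published method.
    """
    n, p = X.shape
    corr = np.corrcoef(X, rowvar=False)        # p x p sample correlation matrix
    eigvals = np.linalg.eigvalsh(corr)         # real eigenvalues, ascending
    threshold = 1.0 + C * np.sqrt(p / (n - 1))
    return int(np.sum(eigvals > threshold))

# Simulated data with three strong factors.
rng = np.random.default_rng(0)
n, p, k = 500, 50, 3
F = rng.standard_normal((n, k))
B = rng.choice([-1.0, 1.0], size=(p, k)) * rng.uniform(1.0, 2.0, size=(p, k))
X = F @ B.T + rng.standard_normal((n, p))
k_hat = count_factors_by_threshold(X)          # default C = 1
```

With strong factors the raw and bias-corrected eigenvalues lead to the same count; the correction matters mainly when factor eigenvalues sit close to the threshold.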

R-code, matlab-code, python

Fan, J., Guo, J., and Zheng, S. (2022).
Estimating number of factors by adjusted eigenvalues thresholding.
Journal of the American Statistical Association, 117, 852-861.

FarmTest: Factor Adjusted Robust Multiple Testing

Performs robust multiple testing for means in the presence of known and unknown latent factors. It implements a robust procedure that estimates distribution parameters using Huber's loss function and accounts for strong dependence among coordinates via an approximate factor model.
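As a rough illustration of the Huber-loss idea behind the robust estimation step, the following Python sketch computes a Huber-type mean by iteratively reweighted averaging. The function name `huber_mean` and the fixed robustification parameter `tau` are assumptions made for this example; they are not part of the FarmTest package, which may choose its tuning parameters differently.

```python
import statistics

def huber_mean(x, tau, n_iter=100, tol=1e-10):
    """Minimise the sum of Huber losses over the location parameter mu.

    Observations within tau of the current estimate get weight 1;
    observations farther away are down-weighted by tau / |x - mu|.
    """
    mu = statistics.median(x)                  # robust starting value
    for _ in range(n_iter):
        weights = [1.0 if abs(v - mu) <= tau else tau / abs(v - mu) for v in x]
        new_mu = sum(w * v for w, v in zip(weights, x)) / sum(weights)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu

data = [0.9, 1.1, 1.0, 0.8, 1.2, 50.0]         # one gross outlier
robust = huber_mean(data, tau=1.0)
plain = sum(data) / len(data)
```

Here the outlier 50.0 pulls the plain average above 9, while the Huber estimate stays near the bulk of the data.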

Main functions:

farm.test(X,...): one-sample multiple tests;
farm.test(X,Y,...): two-sample multiple tests.

Reference:

Bose, K., Fan, J., Ke, Y., Pan, X. and Zhou, W.-X. (2020).
FarmTest: An R Package for Factor-Adjusted Robust Multiple Testing.
The R Journal, 12, 388-401.
Fan, J., Ke, Y., Sun, Q. and Zhou, W.-X. (2019).
FarmTest: Factor-adjusted robust multiple testing with false discovery control.
Journal of the American Statistical Association, 114, 1880-1893.
Zhou, W.-X., Bose, K., Fan, J. and Liu, H. (2018).
A new perspective on robust M-estimation: Finite sample theory and applications to dependence-adjusted multiple testing.
The Annals of Statistics, 46, 1904-1931.
Manuscript

FarmSelect: Factor Adjusted Robust Model Selection

Implements a consistent model selection strategy for high dimensional sparse regression when the covariate dependence can be reduced through factor models. By separating the latent factors from idiosyncratic components, the problem is transformed from model selection with highly correlated covariates to that with weakly correlated variables.

Usage: farm.res(X, K.factors = NULL, robust = FALSE)
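The decorrelation step can be sketched in Python as follows. This is a minimal illustration that estimates the factor part by plain PCA and returns the idiosyncratic residuals; the function names `factor_adjust` and `mean_abs_offdiag_corr`, the known number of factors `k`, and the simulated data are assumptions made for this example and do not reproduce the R function above.

```python
import numpy as np

def factor_adjust(X, k):
    """Split X into a rank-k factor part and idiosyncratic residuals via PCA."""
    Xc = X - X.mean(axis=0)                    # centre each covariate
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F_hat = U[:, :k] * s[:k]                   # estimated factors (n x k), up to rotation
    B_hat = Vt[:k].T                           # estimated loadings (p x k)
    resid = Xc - F_hat @ B_hat.T               # idiosyncratic components
    return F_hat, B_hat, resid

def mean_abs_offdiag_corr(M):
    """Average absolute pairwise correlation between the columns of M."""
    c = np.corrcoef(M, rowvar=False)
    off = c[~np.eye(c.shape[0], dtype=bool)]
    return float(np.mean(np.abs(off)))

# Simulated covariates driven by two latent factors.
rng = np.random.default_rng(1)
n, p, k = 300, 40, 2
F = rng.standard_normal((n, k))
B = rng.choice([-1.0, 1.0], size=(p, k)) * rng.uniform(1.0, 2.0, size=(p, k))
X = F @ B.T + rng.standard_normal((n, p))

F_hat, B_hat, resid = factor_adjust(X, k)
before = mean_abs_offdiag_corr(X)              # highly correlated covariates
after = mean_abs_offdiag_corr(resid)           # weakly correlated residuals
```

Model selection would then be run on the estimated factors together with the residuals instead of the original covariates.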

Reference:

Fan, J., Ke, Y., and Wang, K. (2017).
Decorrelation of Covariates for High Dimensional Sparse Regression.
Manuscript.

Matlab codes for Adaptive Huber estimation

These are the MATLAB codes used for the simulations and real data analysis in the paper below. They compute robust mean regression in a high-dimensional feature space with variable selection.

Reference:

Fan, J., Li, Q., and Wang, Y. (2017).
Estimation of high-dimensional mean regression in absence of symmetry and light-tail assumptions. Journal of the Royal Statistical Society B, 79, 247-265.

pfa: an R package for "Estimates False Discovery Proportion Under Arbitrary Covariance Dependence"

by Jianqing Fan, Tracy Ke, Sydney Li and Lucy Xia

This package contains functions for performing multiple testing and estimating the false discovery proportion (FDP) under dependence.

Main functions: pfa.test(X,...): one-sample multiple tests;
pfa.test(X,Y,...): two-sample multiple tests.
pfa.gwas(X,Y,...): multiple testing in genome-wide association studies (GWAS).

See Manual

Reference:

Fan, J., Feng, Y., and Song, R. (2011).
Nonparametric independence screening in sparse ultra-high dimensional additive models.
Journal of the American Statistical Association, 106, 544-557.

POET: an R package for estimating large covariance matrices by thresholding principal orthogonal complements.

by Fan, J., Liao, Y., and Mincheva, M. (2012)

Main function: POET performs PCA, estimates the factor loadings and realized factors, estimates the sparse residual covariance matrix by adaptive thresholding, and combines these into an estimate of the covariance matrix; see Manual.
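The steps listed above can be sketched in Python as follows. This is a simplified illustration, not the package's algorithm: it uses soft thresholding with a threshold proportional to sqrt(log p / n) on the correlation scale instead of the package's adaptive thresholding rule, and the function name `poet_sketch`, the constant `C`, and the simulated data are assumptions made for this example.

```python
import numpy as np

def poet_sketch(X, k, C=0.5):
    """Low-rank (top-k principal components) plus thresholded residual covariance."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)                       # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)              # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]               # indices of the k largest
    low_rank = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T
    R = S - low_rank                                  # principal orthogonal complement
    scale = np.sqrt(np.outer(np.diag(R), np.diag(R)))
    tau = C * np.sqrt(np.log(p) / n) * scale          # entry-wise threshold (assumed form)
    R_thr = np.sign(R) * np.maximum(np.abs(R) - tau, 0.0)   # soft thresholding
    np.fill_diagonal(R_thr, np.diag(R))               # leave the diagonal untouched
    return low_rank + R_thr, low_rank, R_thr

# Simulated data from a two-factor model.
rng = np.random.default_rng(2)
n, p, k = 400, 30, 2
F = rng.standard_normal((n, k))
B = rng.standard_normal((p, k))
X = F @ B.T + rng.standard_normal((n, p))

Sigma_hat, low_rank, R_thr = poet_sketch(X, k)
```

The returned estimate keeps the sample variances on the diagonal and sets small off-diagonal residual covariances exactly to zero.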

Reference:

Fan, J., Liao, Y. and Mincheva, M. (2013).
Large Covariance Estimation by Thresholding Principal Orthogonal Complements. (with discussion)
Journal of the Royal Statistical Society B, 75, 603-680.

SIS: an R package for (Iterative) Sure Independence Screening for generalized linear models and Cox's proportional hazards models.

by Fan, J., Feng, Y., Samworth, R. J. and Wu, Y. (2010)

Main function: SIS performs variable selection using iterative two-scale methods (large-scale screening followed by moderate-scale selection). It automatically calls the functions GLMvanISISscad and its variant GLMvarISISscad for generalized linear models, and the functions COXvanISISscad and its variant COXvarISISscad for Cox's proportional hazards models. Many other functions are available and can be called directly or by SIS using non-default options. Examples are scadglm (a one-step method) and fullscadglm, and scadcox (a one-step method) and fullscadcox; see Manual.
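The large-scale screening step can be illustrated with a short Python sketch for the linear model, where covariates are ranked by absolute marginal correlation with the response. It omits the iterative refinement and the SCAD-penalized selection step, and does not cover generalized linear or Cox models. The function name `marginal_screen`, the default screening size floor(n / log n), and the simulated data are assumptions made for this example; they are not the R package's interface.

```python
import numpy as np

def marginal_screen(X, y, d=None):
    """Keep the d covariates with the largest absolute marginal correlation with y."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))                 # assumed default screening size
    d = min(d, p)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise covariates
    ys = (y - y.mean()) / y.std()              # standardise response
    utility = np.abs(Xs.T @ ys) / n            # absolute marginal correlations
    keep = np.argsort(utility)[::-1][:d]       # indices of the d largest
    return np.sort(keep), utility

# Simulated ultra-high dimensional example: p is much larger than n.
rng = np.random.default_rng(3)
n, p = 200, 1000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -3.0, 3.0]                    # only the first three covariates matter
y = X @ beta + rng.standard_normal(n)

selected, utility = marginal_screen(X, y)
```

A moderate-scale selection method, such as a penalized regression, would then be applied to the columns in `selected` only.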

Related papers: procedures can be computed by the package