Fall Semester, 2023
MW 3:00pm-4:20pm
General Information
Instructor: Jianqing Fan, Frederick L. Moore '18 Professor of Finance.
Office: 205 Sherred Hall
Phone: 258-7924
Email: [email protected]
Office Hours: Monday 1:40pm-2:30pm, Wednesday 10:30am-11:30am, or by appointment.
Teaching Assistant: Dr. Soham Jana
Office: 214 Sherred Hall
Email: [email protected]
Office Hours: Tuesday 10:30am-11:30am, Thursday 10:30am-11:30am, or by appointment.
Textbooks
 Fan, J., Li, R., Zhang, C.-H., and Zou, H. (2020). Statistical Foundations of Data Science. CRC Press. (Chapters 9-11)

 Chen, Y., Chen, Y., Fan, J., and Ma, C. (2021). Spectral Methods for Data Science: A Statistical Perspective. Foundations and Trends in Machine Learning.
Reference Books
 Hastie, T., Tibshirani, R., and Wainwright, M. (2015). Statistical Learning with Sparsity. CRC Press, New York.
 Wainwright, M. J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press.
Syllabus
This course covers several topics in statistical machine learning theory, methods, and algorithms for data science. Topics include (1) Spectral methods for data science. (2) Matrix perturbation theory and concentration inequalities. (3) Robust covariance regularization and graphical models. (4) Factor models and their applications. (5) Matrix completion. (6) Graphical clustering and community detection. (7) Item ranking. (8) Deep neural networks. (9) Uncertainty quantification in high dimensions. Students are expected to participate in paper surveys and presentations.
The course material will cover the following topics.
 Introduction to Spectral Methods
 Community Detection
 Topic Modeling
 Matrix Completion
 Item Ranking
 Factor Model and Covariance Regularization
 Factor-Adjusted Regularized Model Selection
 Matrix Perturbation Theory
 Matrix Norms
 Distances and Angles of Eigenspaces
 Eigenspace Perturbation Theory
 Singular Subspace Perturbation Theory
 Perturbation for Probability Transition Matrix
 Covariance Learning and Factor Models
 Principal Component Analysis
 Covariance Learning and Factor Models
 Covariance Estimation with Observable Factors
 Asymptotic Properties of PCA Based Estimators
 Factor-Adjusted Robust Multiple Testing
 Factor Augmented Regression Methods for Prediction
 Applications of l_2 Perturbation Theory
 Matrix Tail Bounds
 Community Detection
 Low-rank Matrix Completion
 Ranking from Pairwise Comparisons
 PCA and Factor Models
 Applications of l_{2, \infty} Perturbation Theory
 Motivations: Exact Recovery
 Leave-one-out Analysis: An Illustrative Example
 l_{\infty} Eigenvector Perturbation Theory (rank-1)
 Exact Recovery in Community Detection
 l_{2, \infty} Eigenspace Perturbation Theory (rank-r)
 Entrywise error in Matrix Completion
 Recent Developments on Statistical Machine Learning
 FASTDNN for Big Data Modeling
 Inferences for HeteroPCA and Matrix Completion
 Ranking Inferences Based on Top Choices of Multiway Comparisons
 Universally Trainable Optimal Prediction Intervals Aggregation
 Inferences on Mixing Probabilities and Ranking in Mixed-Membership Models
 Spectral Ranking Inferences based on General Multiway Comparisons
 Other Papers of Students' Choice
Attendance
Class attendance is required and essential. The course material comes mainly from the lecture notes. Many conceptual issues and aspects of statistical thinking are taught only in class. They will appear in the midterm and final exams.
Schedules and Tentative Grading Policy
Assignment  Schedule 

Participation (30%)  Throughout the semester 
Presentation (70%)  Before the end of reading period 
or Term paper (70%)  Before the end of reading period 