Date Awarded


Document Type


Degree Name

Doctor of Philosophy (Ph.D.)


Computer Science


Zhenming Liu

Committee Member

Gang Zhou

Committee Member

Bin Ren

Committee Member

Andreas Stathopoulos

Committee Member

Mihai Cucuringu


This thesis develops several forecasting models for simultaneously predicting the prices of d assets traded in financial markets, a fundamental problem in the emerging area of "FinTech". The models are designed to address three critical challenges:

C1. High-dimensional interactions between assets. Assets can interact (e.g., Amazon's disclosure of a revenue change in its cloud services could indicate that the revenues of other cloud providers will also change). The number of possible interactions is quadratic in d and is often much larger than the number of observations.

C2. Non-linearity of the hypothesis class. Linear models are usually insufficient to characterize the relationship between the labels (responses) and the available information (features).

C3. Data scarcity for each asset. The amount of data associated with an individual asset can be small. For example, a typical daily forecasting model based on technical factors uses three years (approx. 750 trading days) of data. We collect one data point per day, so only about 750 observations are available for each asset.

We develop the following works to address these challenges.

Adaptive reduced rank regression (addressing C1). We examine a linear regression model y = Mx + ϵ that aims to directly capture the interactions between all features from all assets and all the responses, by estimating d×ω(d) entries in M using O(d) observations. In this setting, existing low-rank regularization techniques such as reduced rank regression or nuclear-norm-based regularization fail to work. Adaptive Reduced Rank Regression (Adaptive-RRR) is a new provable algorithm for estimating M under a mild assumption on the spectrum of the covariance matrix of x.

On embedding stocks (addressing C1 & C2). We next propose a semi-parametric model called the "additive influence model" that decomposes the inference problem into two orthogonal subroutines. One subroutine learns the high-dimensional interactions between entities, and we solve it with techniques developed for Adaptive-RRR. The other subroutine learns the non-linear signals, and we solve it with practical algorithms such as deep learning and ensemble learning.

Equity2Vec: Interaction beyond return correlations (addressing C2 & C3). We aim to develop a specialized neural net model for each asset (e.g., train g_i(·) for asset i), but there is insufficient data to properly train g_i using data from asset i alone (because of C3). Our idea is to shrink the g_i(·)'s toward one or more centroids to reduce model (sample) complexity. Specifically, we train a neural net model g_i(x, W, W_i), where W is shared across all entities, W_i is entity-specific and is learned through embedding, and g_i(x) = g_i(x, W, W_i). When entities i and j are close, W_i and W_j are close. Consequently, g_i and g_j will be similar when entity i and entity j are similar.

The proposed algorithms and models are verified via extensive experiments on real-world equity datasets. Our forecasting models can also be applied in a wide range of other areas, such as identifying biomarkers, understanding risks associated with various diseases, image recognition, and link prediction.
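
The parameter-sharing idea behind g_i(x) = g_i(x, W, W_i) can be illustrated with a minimal NumPy sketch. This is not the thesis's Equity2Vec architecture: the layer sizes, the random (untrained) weights, and the choice to feed W_i into the network by concatenating it with the features are all assumptions made purely for illustration, and every name below (W1, W2, E, g) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen only for this illustration.
n_assets, n_features, embed_dim, hidden = 4, 5, 3, 8

# Shared parameters W: the same weights are used for every asset.
W1 = rng.normal(scale=0.1, size=(n_features + embed_dim, hidden))
W2 = rng.normal(scale=0.1, size=hidden)

# Entity-specific parameters W_i: one embedding row per asset.
E = rng.normal(scale=0.1, size=(n_assets, embed_dim))
E[1] = E[0] + 1e-3  # make assets 0 and 1 "close" in embedding space


def g(i, x):
    """Forward pass for asset i using shared weights and embedding E[i]."""
    z = np.concatenate([x, E[i]])  # assumed way of injecting W_i
    h = np.tanh(z @ W1)            # shared hidden layer
    return float(h @ W2)           # shared output layer


x = rng.normal(size=n_features)
preds = [g(i, x) for i in range(n_assets)]

# Assets with nearby embeddings produce nearby forecasts for the same x.
gap_close = abs(preds[0] - preds[1])
print(preds, gap_close)
```

Because only the small embedding rows in E are asset-specific, each asset adds embed_dim parameters rather than a full network, which is the sense in which sharing W reduces the per-asset data requirement in this sketch.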




© The Author