Research Objective
To address the computational cost and slow convergence of training deep probabilistic graphical models with stochastic gradient methods by proposing a new, largely tuning-free algorithm based on novel majorization bounds derived from the Schatten-∞ (spectral) norm.
Research Findings
The proposed Stochastic Spectral Descent (SSD) algorithm not only provides up to an order-of-magnitude speed-up over competing approaches but also achieves state-of-the-art performance for similarly sized models thanks to improved optimization. Its effectiveness is demonstrated empirically on both directed and undirected graphical models.
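The defining step of spectral descent replaces the elementwise gradient step with a step in the direction of the sharp operator induced by the Schatten-∞ norm, computed from an SVD of the gradient. The sketch below illustrates this idea under stated assumptions; the function and parameter names (`spectral_descent_step`, `lipschitz`) are illustrative, not from the paper, and a practical implementation would use an approximate or randomized SVD on stochastic gradient estimates.

```python
import numpy as np

def spectral_descent_step(W, grad, lipschitz):
    """One (idealized) spectral descent update on a weight matrix W.

    For the Schatten-infinity (spectral) norm, the sharp operator of the
    gradient G = U S V^T is  G# = ||G||_{S1} * U V^T,  i.e. the nuclear
    norm of G times its matrix "sign". The update is W - G# / lipschitz,
    where lipschitz is the majorization constant from the spectral-norm
    bound (an assumption here; the paper derives model-specific bounds).
    """
    U, s, Vt = np.linalg.svd(grad, full_matrices=False)
    sharp = s.sum() * (U @ Vt)  # nuclear norm times the sign matrix
    return W - sharp / lipschitz
```

Note the contrast with ordinary gradient descent: all singular directions of the gradient are stepped with equal magnitude, which is what yields the improved dependence on matrix geometry.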
Limitations
The computational bottleneck remains gradient estimation, which relies on Markov Chain Monte Carlo (MCMC) methods. The algorithm's performance is sensitive to the inexactness of these gradient estimates, and the derived convergence conditions may be overly conservative, leading to slow convergence in some cases.