Unofficial updates of statistical machine learning papers on arXiv

The Modern Mathematics of Deep Learning. (arXiv:2105.04026v1 [cs.LG]) ift.tt/3tFDH4u
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models ift.tt/4bR3EZt
A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning. (arXiv:2109.02355v1 [stat.ML]) ift.tt/38OCr7f
Stop using the elbow criterion for k-means and how to choose the number of clusters instead. (arXiv:2212.12189v1 [stat.ML]) ift.tt/TJaIE2X
Graph Neural Networks: A Review of Methods and Applications. (arXiv:1812.08434v5 [cs.LG] UPDATED) ift.tt/2Tq9zNo
Applying statistical learning theory to deep learning. (arXiv:2311.15404v1 [cs.LG]) ift.tt/58MeHgo
Causality for Machine Learning. (arXiv:1911.10500v1 [cs.LG]) ift.tt/2KRSyVO
Autoencoders in Function Space ift.tt/HDQslSx
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models. (arXiv:2401.07187v1 [stat.ML]) ift.tt/VHwOmbC
Revisiting Graph Neural Networks: All We Have is Low-Pass Filters. (arXiv:1905.09550v1 [stat.ML]) bit.ly/2JyBy8l
Neural Fourier Transform: A General Approach to Equivariant Representation Learning. (arXiv:2305.18484v1 [stat.ML]) ift.tt/BJXMoLw
Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning ift.tt/smbaZYl
From Statistical to Causal Learning. (arXiv:2204.00607v1 [cs.AI]) ift.tt/j12L3eT
On Neural Differential Equations. (arXiv:2202.02435v1 [cs.LG]) ift.tt/ehrXWpv
Making SGD Parameter-Free. (arXiv:2205.02160v1 [math.OC]) ift.tt/b2Xm6T1
Do Bayesian Neural Networks Need To Be Fully Stochastic? (arXiv:2211.06291v1 [cs.LG]) ift.tt/vLQG0XF
Can Machines Learn the True Probabilities? ift.tt/ETqwk4h
Bayesian Nonparametrics: An Alternative to Deep Learning ift.tt/nMp7V3o
The statistical thermodynamics of generative diffusion models. (arXiv:2310.17467v1 [stat.ML]) ift.tt/dUlBzcn
Understanding Encoder-Decoder Structures in Machine Learning Using Information Measures ift.tt/Fh2NZT6
Gradient Descent Finds Global Minima of Deep Neural Networks. (arXiv:1811.03804v1 [cs.LG]) ift.tt/2qGzUGf
Uncertainty Quantification for Deep Learning ift.tt/XeSC3jw
Latent Space Energy-based Neural ODEs ift.tt/ycxJVQz
More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory. (arXiv:2311.14646v1 [cs.LG]) ift.tt/SqMcEBV
Your diffusion model secretly knows the dimension of the data manifold. (arXiv:2212.12611v4 [cs.LG] UPDATED) ift.tt/vKfrBPt
Any Deep ReLU Network is Shallow. (arXiv:2306.11827v1 [cs.LG]) ift.tt/HtvMuSO
The statistical thermodynamics of generative diffusion models: Phase transitions, symmetry breaking and critical instability ift.tt/McRFLQW
Neural Networks can Learn Representations with Gradient Descent. (arXiv:2206.15144v1 [cs.LG]) ift.tt/j9oGHXa
Online Learning of Neural Networks ift.tt/CRwg9Eq
Understanding LLMs Requires More Than Statistical Generalization ift.tt/x3aMZDj
Manifold learning in Wasserstein space ift.tt/QZxTqEw
Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI ift.tt/HdsyZ3U
Manifold learning in Wasserstein space ift.tt/N7OurfM
Information Geometry of Variational Bayes ift.tt/OU1Xptz
Optimal Transport Tools (OTT): A JAX Toolbox for all things Wasserstein. (arXiv:2201.12324v1 [cs.LG]) ift.tt/Q3UYoDL58
Towards a theory of out-of-distribution learning ift.tt/QmIM1Ts
New algorithms for sampling and diffusion models ift.tt/OSQpd7m
Towards understanding how attention mechanism works in deep learning ift.tt/n43pAEQ
Wav-KAN: Wavelet Kolmogorov-Arnold Networks ift.tt/D6CtgvO
Can a Transformer Represent a Kalman Filter? (arXiv:2312.06937v1 [cs.LG]) ift.tt/UwpkBGn
Kernel Density Estimators in Large Dimensions ift.tt/5vmRkLO
Elimination of All Bad Local Minima in Deep Learning. (arXiv:1901.00279v1 [cs.LG]) bit.ly/2Qig6TR