Stat.ML Papers · Jul 9, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

9 Jul 2024

A Theory of Machine Learning ift.tt/iOCAsJx

A Theory of Machine Learning

We critically review three major theories of machine learning and provide a new theory according to which machines learn a function when the machines successfully compute it. We show that this...

arxiv.org

113

586

69,289

Stat.ML Papers · May 11, 2021 · 7:47 AM UTC

Stat.ML Papers @StatMLPapers

11 May 2021

The Modern Mathematics of Deep Learning. (arXiv:2105.04026v1 [cs.LG]) ift.tt/3tFDH4u

111

509

Stat.ML Papers · May 8, 2025 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

8 May 2025

Machine Learning: a Lecture Note ift.tt/i3FbUp1

Machine Learning: a Lecture Note

This lecture note is intended to prepare early-year master's and PhD students in data science or a related discipline with foundational ideas in machine learning. It starts with basic ideas in...

arxiv.org

506

30,605

Stat.ML Papers · Mar 19, 2022 · 7:43 AM UTC

Stat.ML Papers @StatMLPapers

19 Mar 2022

The Mathematics of Artificial Intelligence. (arXiv:2203.08890v1 [cs.LG]) ift.tt/qw7YBvl

The Mathematics of Artificial Intelligence

We currently witness the spectacular success of artificial intelligence in both science and public life. However, the development of a rigorous mathematical foundation is still at an early stage....

arxiv.org

107

482

Stat.ML Papers · Sep 5, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

5 Sep 2024

Introduction to Machine Learning ift.tt/2RpGDC3

Introduction to Machine Learning

This book introduces the mathematical foundations and techniques that lead to the development and analysis of many of the algorithms that are used in machine learning. It starts with an...

arxiv.org

436

32,532

Stat.ML Papers · Feb 27, 2025 · 5:03 AM UTC

Stat.ML Papers @StatMLPapers

27 Feb 2025

Mathematical Introduction to Deep Learning: Methods, Implementations, and Theory ift.tt/IWBcq8Y

Mathematical Introduction to Deep Learning: Methods,...

This book aims to provide an introduction to the topic of deep learning algorithms. We review essential components of deep learning algorithms in full mathematical detail including different...

arxiv.org

441

23,236

Stat.ML Papers · Jul 8, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

8 Jul 2024

A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models ift.tt/4bR3EZt

377

26,492

Stat.ML Papers · Sep 7, 2021 · 1:43 AM UTC

Stat.ML Papers @StatMLPapers

7 Sep 2021

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning. (arXiv:2109.02355v1 [stat.ML]) ift.tt/38OCr7f

358

Stat.ML Papers · Jan 28, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

28 Jan 2025

Matrix Calculus (for Machine Learning and Beyond) ift.tt/RIDzpw6

Matrix Calculus (for Machine Learning and Beyond)

This course, intended for undergraduates familiar with elementary calculus and linear algebra, introduces the extension of differential calculus to functions on more general vector spaces, such as...

arxiv.org

349

19,605

Stat.ML Papers · Aug 27, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

27 Aug 2024

Bayesian neural networks via MCMC: a Python-based tutorial ift.tt/YgAsvUO

Bayesian neural networks via MCMC: a Python-based tutorial

Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte-Carlo...

arxiv.org

346

33,596

Stat.ML Papers · Jun 14, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

14 Jun 2024

Step-by-Step Diffusion: An Elementary Tutorial ift.tt/dT1YcSb

Step-by-Step Diffusion: An Elementary Tutorial

We present an accessible first course on diffusion models and flow matching for machine learning, aimed at a technical audience with no diffusion experience. We try to simplify the mathematical...

arxiv.org

341

26,608

Stat.ML Papers · Jul 26, 2024 · 4:09 AM UTC

Stat.ML Papers @StatMLPapers

26 Jul 2024

Statistical optimal transport ift.tt/z4oy1XU

Statistical optimal transport

We present an introduction to the field of statistical optimal transport, based on lectures given at École d'Été de Probabilités de Saint-Flour XLIX.

arxiv.org

323

33,036

Stat.ML Papers · Dec 26, 2022 · 2:43 AM UTC

Stat.ML Papers @StatMLPapers

26 Dec 2022

Stop using the elbow criterion for k-means and how to choose the number of clusters instead. (arXiv:2212.12189v1 [stat.ML]) ift.tt/TJaIE2X

314

62,678

Stat.ML Papers · Oct 4, 2024 · 4:09 AM UTC

Stat.ML Papers @StatMLPapers

4 Oct 2024

Large Language Models as Markov Chains ift.tt/uPbQTkA

Large Language Models as Markov Chains

Large language models (LLMs) are remarkably efficient across a wide range of natural language processing tasks and well beyond them. However, a comprehensive theoretical analysis of the LLMs'...

arxiv.org

297

27,966

Stat.ML Papers · Apr 12, 2021 · 1:47 AM UTC

Stat.ML Papers @StatMLPapers

12 Apr 2021

Graph Neural Networks: A Review of Methods and Applications. (arXiv:1812.08434v5 [cs.LG] UPDATED) ift.tt/2Tq9zNo

264

Stat.ML Papers · Nov 28, 2023 · 3:14 AM UTC

Stat.ML Papers @StatMLPapers

28 Nov 2023

Applying statistical learning theory to deep learning. (arXiv:2311.15404v1 [cs.LG]) ift.tt/58MeHgo

273

32,654

Stat.ML Papers · May 1, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

1 May 2024

KAN: Kolmogorov-Arnold Networks ift.tt/20dT3if

KAN: Kolmogorov-Arnold Networks

Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation...

arxiv.org

268

25,149

Stat.ML Papers · Nov 26, 2019 · 2:34 AM UTC

Stat.ML Papers @StatMLPapers

26 Nov 2019

Causality for Machine Learning. (arXiv:1911.10500v1 [cs.LG]) ift.tt/2KRSyVO

265

Stat.ML Papers · Aug 5, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

5 Aug 2024

Autoencoders in Function Space ift.tt/HDQslSx

270

22,154

Stat.ML Papers · Aug 21, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

21 Aug 2024

Information-Theoretic Foundations for Machine Learning ift.tt/B76pTXf

Information-Theoretic Foundations for Machine Learning

The progress of machine learning over the past decade is undeniable. In retrospect, it is both remarkable and unsettling that this progress was achievable with little to no rigorous theory to...

arxiv.org

265

15,778

Stat.ML Papers · Jul 18, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

18 Jul 2024

Information-Theoretic Foundations for Machine Learning ift.tt/u0yIVB8

Information-Theoretic Foundations for Machine Learning

The progress of machine learning over the past decade is undeniable. In retrospect, it is both remarkable and unsettling that this progress was achievable with little to no rigorous theory to...

arxiv.org

257

16,225

Stat.ML Papers · Aug 5, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

5 Aug 2024

Transformers are Universal In-context Learners ift.tt/tZ7Lzsl

Transformers are Universal In-context Learners

Transformers are deep architectures that define "in-context mappings" which enable predicting new tokens based on a given set of tokens (such as a prompt in NLP applications or a set of patches...

arxiv.org

248

21,939

Stat.ML Papers · Feb 14, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

14 Feb 2024

On Limitations of the Transformer Architecture ift.tt/bR6gyfD

On Limitations of the Transformer Architecture

What are the root causes of hallucinations in large language models (LLMs)? We use Communication Complexity to prove that the Transformer layer is incapable of composing functions (e.g., identify...

arxiv.org

247

18,382

Stat.ML Papers · Mar 27, 2024 · 4:08 AM UTC

Stat.ML Papers @StatMLPapers

27 Mar 2024

Applying statistical learning theory to deep learning ift.tt/icILRPD

Applying statistical learning theory to deep learning

Although statistical learning theory provides a robust framework to understand supervised learning, many theoretical aspects of deep learning remain unclear, in particular how different...

arxiv.org

233

20,477

Stat.ML Papers · Jan 17, 2024 · 5:14 AM UTC

Stat.ML Papers @StatMLPapers

17 Jan 2024

A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models. (arXiv:2401.07187v1 [stat.ML]) ift.tt/VHwOmbC

233

17,973

Stat.ML Papers · May 15, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

15 May 2025

Introduction to Machine Learning ift.tt/K60gFra

Introduction to Machine Learning

This book introduces the mathematical foundations and techniques that lead to the development and analysis of many of the algorithms that are used in machine learning. It starts with an...

arxiv.org

227

10,385

Stat.ML Papers · Apr 3, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

3 Apr 2024

Bayesian neural networks via MCMC: a Python-based tutorial ift.tt/mTYRgDS

Bayesian neural networks via MCMC: a Python-based tutorial

Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte-Carlo...

arxiv.org

228

18,419

Stat.ML Papers · May 24, 2019 · 12:42 AM UTC

Stat.ML Papers @StatMLPapers

24 May 2019

Revisiting Graph Neural Networks: All We Have is Low-Pass Filters. (arXiv:1905.09550v1 [stat.ML]) bit.ly/2JyBy8l

228

Stat.ML Papers · Jul 19, 2024 · 1:58 AM UTC

Stat.ML Papers @StatMLPapers

19 Jul 2024

Information-Theoretic Foundations for Machine Learning ift.tt/u0yIVB8

Information-Theoretic Foundations for Machine Learning

The progress of machine learning over the past decade is undeniable. In retrospect, it is both remarkable and unsettling that this progress was achievable with little to no rigorous theory to...

arxiv.org

227

11,837

Stat.ML Papers · Apr 12, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

12 Apr 2024

An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization ift.tt/FJiwevm

An Overview of Diffusion Models: Applications, Guided Generation,...

Diffusion models, a powerful and universal generative AI technology, have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. In these...

arxiv.org

219

22,145

Stat.ML Papers · Dec 2, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

2 Dec 2024

The Return of Pseudosciences in Artificial Intelligence: Have Machine Learning and Deep Learning Forgotten Lessons from Statistics and History? ift.tt/EQLimlt

The Return of Pseudosciences in Artificial Intelligence: Have...

In today's world, AI programs powered by Machine Learning are ubiquitous, and have achieved seemingly exceptional performance across a broad range of tasks, from medical diagnosis and credit...

arxiv.org

210

18,511

Stat.ML Papers · Aug 26, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

26 Aug 2024

A Geometric Perspective on Diffusion Models ift.tt/IUlpHvW

A Geometric Perspective on Diffusion Models

Recent years have witnessed significant progress in developing effective training and fast sampling techniques for diffusion models. A remarkable advancement is the use of stochastic differential...

arxiv.org

214

13,752

Stat.ML Papers · Apr 12, 2023 · 1:46 AM UTC

Stat.ML Papers @StatMLPapers

12 Apr 2023

Automatic Gradient Descent: Deep Learning without Hyperparameters. (arXiv:2304.05187v1 [cs.LG]) ift.tt/Qkv9KRg

Automatic Gradient Descent: Deep Learning without Hyperparameters

The architecture of a deep neural network is defined explicitly in terms of the number of layers, the width of each layer and the general network topology. Existing optimisation frameworks neglect...

arxiv.org

195

27,091

Stat.ML Papers · Dec 23, 2024 · 5:05 AM UTC

Stat.ML Papers @StatMLPapers

23 Dec 2024

Lecture Notes on High Dimensional Linear Regression ift.tt/QoBcJUK

Lecture Notes on High Dimensional Linear Regression

These lecture notes cover advanced topics in linear regression, with an in-depth exploration of the existence, uniqueness, relations, computation, and non-asymptotic properties of the most...

arxiv.org

194

12,201

Stat.ML Papers · Jan 17, 2025 · 5:03 AM UTC

Stat.ML Papers @StatMLPapers

17 Jan 2025

Hidden Markov Neural Networks ift.tt/pG92uqo

Hidden Markov Neural Networks

We define an evolving in-time Bayesian neural network called a Hidden Markov Neural Network, which addresses the crucial challenge in time-series forecasting and continual learning: striking a...

arxiv.org

196

9,795

Stat.ML Papers · Nov 10, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

10 Nov 2025

On Flow Matching KL Divergence ift.tt/tOJ9MLa

On Flow Matching KL Divergence

We derive a deterministic, non-asymptotic upper bound on the Kullback-Leibler (KL) divergence of the flow-matching distribution approximation. In particular, if the $L_2$ flow-matching loss is...

arxiv.org

196

14,978

Stat.ML Papers · May 31, 2023 · 1:03 AM UTC

Stat.ML Papers @StatMLPapers

31 May 2023

Neural Fourier Transform: A General Approach to Equivariant Representation Learning. (arXiv:2305.18484v1 [stat.ML]) ift.tt/BJXMoLw

194

31,730

Stat.ML Papers · Oct 17, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

17 Oct 2024

Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion ift.tt/2FAhuw8

Manifolds, Random Matrices and Spectral Gaps: The geometric phases...

In this paper, we investigate the latent geometry of generative diffusion models under the manifold hypothesis. For this purpose, we analyze the spectrum of eigenvalues (and singular values) of...

arxiv.org

188

12,701

Stat.ML Papers · Aug 14, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

14 Aug 2024

How Transformers Learn Causal Structure with Gradient Descent ift.tt/Y51vxXk

How Transformers Learn Causal Structure with Gradient Descent

The incredible success of transformers on sequence modeling tasks can be largely attributed to the self-attention mechanism, which allows information to be transferred between different parts of a...

arxiv.org

186

11,729

Stat.ML Papers · Mar 21, 2025 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

21 Mar 2025

Manifold learning in metric spaces ift.tt/cwXi5zp

Manifold learning in metric spaces

Laplacian-based methods are popular for the dimensionality reduction of data lying in $\mathbb{R}^N$. Several theoretical results for these algorithms depend on the fact that the Euclidean...

arxiv.org

195

9,569

Stat.ML Papers · Aug 27, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

27 Aug 2024

Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning ift.tt/smbaZYl

185

10,105

Stat.ML Papers · Mar 6, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

6 Mar 2025

Applications of Entropy in Data Analysis and Machine Learning: A Review ift.tt/wX7Gioh

Applications of Entropy in Data Analysis and Machine Learning: A Review

Since its origin in the thermodynamics of the 19th century, the concept of entropy has also permeated other fields of physics and mathematics, such as Classical and Quantum Statistical Mechanics,...

arxiv.org

181

9,407

Stat.ML Papers · Apr 1, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

1 Apr 2025

NoProp: Training Neural Networks without Back-propagation or Forward-propagation ift.tt/UjN04y6

NoProp: Training Neural Networks without Full Back-propagation or...

The canonical deep learning approach for learning requires computing a gradient term at each block by back-propagating the error signal from the output towards each learnable parameter. Given the...

arxiv.org

180

19,740

Stat.ML Papers · Oct 10, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

10 Oct 2024

Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion ift.tt/O0pli5J

Manifolds, Random Matrices and Spectral Gaps: The geometric phases...

In this paper, we investigate the latent geometry of generative diffusion models under the manifold hypothesis. For this purpose, we analyze the spectrum of eigenvalues (and singular values) of...

arxiv.org

178

11,701

Stat.ML Papers · Apr 4, 2022 · 2:45 AM UTC

Stat.ML Papers @StatMLPapers

4 Apr 2022

From Statistical to Causal Learning. (arXiv:2204.00607v1 [cs.AI]) ift.tt/j12L3eT

180

Stat.ML Papers · Jun 19, 2023 · 1:03 AM UTC

Stat.ML Papers @StatMLPapers

19 Jun 2023

Gradient is All You Need? (arXiv:2306.09778v1 [cs.LG]) ift.tt/LtCQJM1

Gradient is All You Need? How Consensus-Based Optimization can be...

In this paper, we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently...

arxiv.org

176

37,131

Stat.ML Papers · Jul 9, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

9 Jul 2024

How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning ift.tt/h1pwz7c

How DNNs break the Curse of Dimensionality: Compositionality and...

We show that deep neural networks (DNNs) can efficiently learn any composition of functions with bounded $F_{1}$-norm, which allows DNNs to break the curse of dimensionality in ways that shallow...

arxiv.org

171

11,797

Stat.ML Papers · Apr 23, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

23 Apr 2025

Deep learning with missing data ift.tt/cCHwpOk

Deep learning with missing data

In the context of multivariate nonparametric regression with missing covariates, we propose Pattern Embedded Neural Networks (PENNs), which can be applied in conjunction with any existing...

arxiv.org

175

7,166

Stat.ML Papers · Feb 8, 2022 · 8:44 AM UTC

Stat.ML Papers @StatMLPapers

8 Feb 2022

On Neural Differential Equations. (arXiv:2202.02435v1 [cs.LG]) ift.tt/ehrXWpv

177

Stat.ML Papers · Aug 30, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

30 Aug 2024

A More Unified Theory of Transfer Learning ift.tt/fwixebk

Adaptive Sample Aggregation In Transfer Learning

Transfer Learning aims to optimally aggregate samples from a target distribution, with related samples from a so-called source distribution to improve target risk. Multiple procedures have been...

arxiv.org

174

9,892

Stat.ML Papers · Dec 25, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

25 Dec 2024

A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data ift.tt/wcySVzd

A Phase Transition in Diffusion Models Reveals the Hierarchical...

Understanding the structure of real data is paramount in advancing modern deep-learning methodologies. Natural data such as images are believed to be composed of features organized in a...

arxiv.org

175

10,038

Stat.ML Papers · Jul 1, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

1 Jul 2024

Kolmogorov-Smirnov GAN ift.tt/6DbXwzG

Kolmogorov-Smirnov GAN

We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN). Unlike existing approaches, KSGAN formulates the learning process as a minimization of the...

arxiv.org

167

12,770

Stat.ML Papers · Feb 15, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

15 Feb 2024

Neural Fourier Transform: A General Approach to Equivariant Representation Learning ift.tt/njDOl4g

Neural Fourier Transform: A General Approach to Equivariant...

Symmetry learning has proven to be an effective approach for extracting the hidden structure of data, with the concept of equivariance relation playing the central role. However, most of the...

arxiv.org

168

11,967

Stat.ML Papers · May 5, 2022 · 2:43 AM UTC

Stat.ML Papers @StatMLPapers

5 May 2022

Making SGD Parameter-Free. (arXiv:2205.02160v1 [math.OC]) ift.tt/b2Xm6T1

168

Stat.ML Papers · Nov 14, 2022 · 2:43 AM UTC

Stat.ML Papers @StatMLPapers

14 Nov 2022

Do Bayesian Neural Networks Need To Be Fully Stochastic? (arXiv:2211.06291v1 [cs.LG]) ift.tt/vLQG0XF

170

Stat.ML Papers · May 15, 2023 · 12:48 AM UTC

Stat.ML Papers @StatMLPapers

15 May 2023

Transformers in Time Series: A Survey. (arXiv:2202.07125v5 [cs.LG] UPDATED) ift.tt/fLnBqxr

Transformers in Time Series: A Survey

Transformers have achieved superior performances in many tasks in natural language processing and computer vision, which also triggered great interest in the time series community. Among multiple...

arxiv.org

173

21,091

Stat.ML Papers · Mar 17, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

17 Mar 2025

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective ift.tt/CNKfIZw

When Do Transformers Outperform Feedforward and Recurrent...

Theoretical efforts to prove advantages of Transformers in comparison with classical architectures such as feedforward and recurrent neural networks have mostly focused on representational power....

arxiv.org

176

7,781

Stat.ML Papers · Jul 11, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

11 Jul 2025

Deep Learning is Not So Mysterious or Different ift.tt/eIrljhX

Deep Learning is Not So Mysterious or Different

Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign...

arxiv.org

164

8,955

Stat.ML Papers · Jul 9, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

9 Jul 2024

Can Machines Learn the True Probabilities? ift.tt/ETqwk4h

167

11,089

Stat.ML Papers · Apr 2, 2024 · 7:09 PM UTC

Stat.ML Papers @StatMLPapers

2 Apr 2024

Bayesian Nonparametrics: An Alternative to Deep Learning ift.tt/nMp7V3o

172

13,951

Stat.ML Papers · Oct 27, 2023 · 1:39 AM UTC

Stat.ML Papers @StatMLPapers

27 Oct 2023

The statistical thermodynamics of generative diffusion models. (arXiv:2310.17467v1 [stat.ML]) ift.tt/dUlBzcn

167

25,031

Stat.ML Papers · Jun 3, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

3 Jun 2024

Understanding Encoder-Decoder Structures in Machine Learning Using Information Measures ift.tt/Fh2NZT6

170

14,487

Stat.ML Papers · Nov 12, 2018 · 1:44 AM UTC

Stat.ML Papers @StatMLPapers

12 Nov 2018

Gradient Descent Finds Global Minima of Deep Neural Networks. (arXiv:1811.03804v1 [cs.LG]) ift.tt/2qGzUGf

166

Stat.ML Papers · Jun 3, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

3 Jun 2024

Uncertainty Quantification for Deep Learning ift.tt/XeSC3jw

169

15,945

Stat.ML Papers · Apr 18, 2025 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

18 Apr 2025

Applications of Statistical Field Theory in Deep Learning ift.tt/ehUTb5w

Applications of Statistical Field Theory in Deep Learning

Deep learning algorithms have made incredible strides in the past decade, yet due to their complexity, the science of deep learning remains in its early stages. Being an experimentally driven...

arxiv.org

170

8,501

Stat.ML Papers · Sep 9, 2024 · 4:05 AM UTC

Stat.ML Papers @StatMLPapers

9 Sep 2024

Latent Space Energy-based Neural ODEs ift.tt/ycxJVQz

167

11,622

Stat.ML Papers · Aug 23, 2024 · 4:08 AM UTC

Stat.ML Papers @StatMLPapers

23 Aug 2024

Transformers are Minimax Optimal Nonparametric In-Context Learners ift.tt/4qgG8cb

Transformers are Minimax Optimal Nonparametric In-Context Learners

In-context learning (ICL) of large language models has proven to be a surprisingly effective method of learning a new task from only a few demonstrative examples. In this paper, we study the...

arxiv.org

164

22,692

Stat.ML Papers · Nov 27, 2023 · 2:43 AM UTC

Stat.ML Papers @StatMLPapers

27 Nov 2023

More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory. (arXiv:2311.14646v1 [cs.LG]) ift.tt/SqMcEBV

163

32,419

Stat.ML Papers · May 22, 2023 · 12:53 AM UTC

Stat.ML Papers @StatMLPapers

22 May 2023

Your diffusion model secretly knows the dimension of the data manifold. (arXiv:2212.12611v4 [cs.LG] UPDATED) ift.tt/vKfrBPt

164

35,404

Stat.ML Papers · Jun 22, 2023 · 1:03 AM UTC

Stat.ML Papers @StatMLPapers

22 Jun 2023

Any Deep ReLU Network is Shallow. (arXiv:2306.11827v1 [cs.LG]) ift.tt/HtvMuSO

163

23,836

Stat.ML Papers · Apr 24, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

24 Apr 2025

A Geometric Approach to Problems in Optimization and Data Science ift.tt/7MOVr0p

A Geometric Approach to Problems in Optimization and Data Science

We give new results for problems in computational and statistical machine learning using tools from high-dimensional geometry and probability. We break up our treatment into two parts. In Part...

arxiv.org

166

7,508

Stat.ML Papers · Mar 15, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

15 Mar 2024

The statistical thermodynamics of generative diffusion models: Phase transitions, symmetry breaking and critical instability ift.tt/McRFLQW

159

13,539

Stat.ML Papers · Jul 1, 2022 · 7:43 AM UTC

Stat.ML Papers @StatMLPapers

1 Jul 2022

Neural Networks can Learn Representations with Gradient Descent. (arXiv:2206.15144v1 [cs.LG]) ift.tt/j9oGHXa

160

Stat.ML Papers · May 15, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

15 May 2025

Online Learning of Neural Networks ift.tt/CRwg9Eq

162

7,284

Stat.ML Papers · May 6, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

6 May 2024

Understanding LLMs Requires More Than Statistical Generalization ift.tt/x3aMZDj

156

14,662

Stat.ML Papers · Sep 24, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

24 Sep 2024

Physics-informed kernel learning ift.tt/LUSy9qi

Physics-informed kernel learning

Physics-informed machine learning typically integrates physical priors into the learning process by minimizing a loss function that includes both a data-driven term and a partial differential...

arxiv.org

156

9,234

Stat.ML Papers · Nov 3, 2017 · 12:38 AM UTC

Stat.ML Papers @StatMLPapers

3 Nov 2017

Don't Decay the Learning Rate, Increase the Batch Size. (arXiv:1711.00489v1 [cs.LG]) ift.tt/2zgmsyb

Don't Decay the Learning Rate, Increase the Batch Size

It is common practice to decay the learning rate. Here we show one can usually obtain the same learning curve on both training and test sets by instead increasing the batch size during training....

arxiv.org

158

Stat.ML Papers · Mar 31, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

31 Mar 2025

Manifold learning in Wasserstein space ift.tt/QZxTqEw

160

7,455

Stat.ML Papers · Feb 27, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

27 Feb 2025

Applications of Statistical Field Theory in Deep Learning ift.tt/u0ZDSBA

Applications of Statistical Field Theory in Deep Learning

Deep learning algorithms have made incredible strides in the past decade, yet due to their complexity, the science of deep learning remains in its early stages. Being an experimentally driven...

arxiv.org

161

7,179

Stat.ML Papers · Sep 16, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

16 Sep 2024

Theoretical guarantees in KL for Diffusion Flow Matching ift.tt/m4uYVqa

Theoretical guarantees in KL for Diffusion Flow Matching

Flow Matching (FM) (also referred to as stochastic interpolants or rectified flows) stands out as a class of generative models that aims to bridge in finite time the target distribution...

arxiv.org

161

11,926

Stat.ML Papers · Feb 2, 2024 · 3:39 PM UTC

Stat.ML Papers @StatMLPapers

2 Feb 2024

Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI ift.tt/HdsyZ3U

152

14,706

Stat.ML Papers · Aug 1, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

1 Aug 2024

Manifold learning in Wasserstein space ift.tt/N7OurfM

156

10,306

Stat.ML Papers · Sep 22, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

22 Sep 2025

Information Geometry of Variational Bayes ift.tt/OU1Xptz

157

13,904

Stat.ML Papers · Oct 2, 2025 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

2 Oct 2025

Learn to Guide Your Diffusion Model ift.tt/xCednHj

Learn to Guide Your Diffusion Model

Classifier-free guidance (CFG) is a widely used technique for improving the perceptual quality of samples from conditional diffusion models. It operates by linearly combining conditional and...

arxiv.org

155

7,187

Stat.ML Papers · Jan 31, 2022 · 2:44 AM UTC

Stat.ML Papers @StatMLPapers

31 Jan 2022

Optimal Transport Tools (OTT): A JAX Toolbox for all things Wasserstein. (arXiv:2201.12324v1 [cs.LG]) ift.tt/Q3UYoDL58

152

Stat.ML Papers · Jun 10, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

10 Jun 2024

Towards a theory of out-of-distribution learning ift.tt/QmIM1Ts

155

12,310

Stat.ML Papers · Jun 17, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

17 Jun 2024

New algorithms for sampling and diffusion models ift.tt/OSQpd7m

158

14,462

Stat.ML Papers · Apr 10, 2025 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

10 Apr 2025

Hyperparameter Optimization in Machine Learning ift.tt/7P3cGaH

Hyperparameter Optimization in Machine Learning

Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence and the choice of their...

arxiv.org

151

5,945

Stat.ML Papers · Dec 25, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

25 Dec 2024

Towards understanding how attention mechanism works in deep learning ift.tt/n43pAEQ

153

8,502

Stat.ML Papers · Apr 5, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

5 Apr 2024

Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections ift.tt/p1uNPR3

Deep Generative Models through the Lens of the Manifold...

In recent years there has been increased interest in understanding the interplay between deep generative models (DGMs) and the manifold hypothesis. Research in this area focuses on understanding...

arxiv.org

149

14,188

Stat.ML Papers · Feb 26, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

26 Feb 2025

An Overview of Large Language Models for Statisticians ift.tt/Y2vcz4u

An Overview of Large Language Models for Statisticians

Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and...

arxiv.org

169

21,862

Stat.ML Papers · May 22, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

22 May 2024

Wav-KAN: Wavelet Kolmogorov-Arnold Networks ift.tt/D6CtgvO

148

12,855

Stat.ML Papers · Dec 13, 2023 · 2:24 AM UTC

Stat.ML Papers @StatMLPapers

13 Dec 2023

Can a Transformer Represent a Kalman Filter? (arXiv:2312.06937v1 [cs.LG]) ift.tt/UwpkBGn

153

18,789

Stat.ML Papers · Dec 30, 2024 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

30 Dec 2024

Neural Networks Perform Sufficient Dimension Reduction ift.tt/Sf2rVEo

Neural Networks Perform Sufficient Dimension Reduction

This paper investigates the connection between neural networks and sufficient dimension reduction (SDR), demonstrating that neural networks inherently perform SDR in regression tasks under...

arxiv.org

146

9,292

Stat.ML Papers · Aug 13, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

13 Aug 2024

Kernel Density Estimators in Large Dimensions ift.tt/5vmRkLO

153

10,602

Stat.ML Papers · Sep 2, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

2 Sep 2024

Transformers are Expressive, But Are They Expressive Enough for Regression? ift.tt/MeKtZva

Transformers are Expressive, But Are They Expressive Enough for Regression?

Transformers have become pivotal in Natural Language Processing, demonstrating remarkable success in applications like Machine Translation and Summarization. Given their widespread adoption,...

arxiv.org

156

12,129

Stat.ML Papers · Mar 7, 2025 · 5:04 AM UTC

Stat.ML Papers @StatMLPapers

7 Mar 2025

How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning ift.tt/KzWeqsm

How DNNs break the Curse of Dimensionality: Compositionality and...

We show that deep neural networks (DNNs) can efficiently learn any composition of functions with bounded $F_{1}$-norm, which allows DNNs to break the curse of dimensionality in ways that shallow...

arxiv.org

154

6,143

Stat.ML Papers · Jun 7, 2024 · 4:04 AM UTC

Stat.ML Papers @StatMLPapers

7 Jun 2024

Variational inference, Mixture of Gaussians, Bayesian Machine Learning ift.tt/D5LQuaG

Theoretical Guarantees for Variational Inference with...

Variational inference (VI) is a popular approach in Bayesian inference, that looks for the best approximation of the posterior distribution within a parametric family, minimizing a loss that is...

arxiv.org

145

11,787

Stat.ML Papers · May 7, 2024 · 4:03 AM UTC

Stat.ML Papers @StatMLPapers

7 May 2024

Causal K-Means Clustering ift.tt/vYIy9u1

Causal K-Means Clustering

Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup...

arxiv.org

145

13,470

Stat.ML Papers · Jan 3, 2019 · 1:46 AM UTC

Stat.ML Papers @StatMLPapers

3 Jan 2019

Elimination of All Bad Local Minima in Deep Learning. (arXiv:1901.00279v1 [cs.LG]) bit.ly/2Qig6TR

143