Experienced Data Science Leader | PhD in Machine Learning | 7x Author | Black Belt 🥋 in Time Series | Chief Conformal Prediction Promoter | Mathematician

London
NEW BOOK: Mastering Modern Forecasting: Principles and Practice in Python. 🔥🔥🔥 16 chapters. Every method from classical statistics to foundation models. Working Python code throughout. The definitive practitioner's reference. Preorder now:
6
46
355
51,898
Hidden deep in the depths of the internet, a treasure. #probability
50
639
5,475
643,030
“Algorithms for Decision Making”, a book from @MIT, is freely available.
21
373
3,027
278,520
Turns out, a simple sum operator might just render #deeplearning obsolete. Solid math-based ideas are proving once again to outperform gimmicks. #kolmogorovarnoldnetwork
75
200
2,534
472,671
RIP Monte Carlo. @GoogleDeepMind releases the code for Conformal Monte Carlo. #conformalprediction
20
295
2,539
336,705
The Holy Grail of forbidden knowledge about matrices' math happens to be on the web. #math #matrices
19
277
2,447
487,115
Karpathy was right: YouTube is infotainment at best and entertainment at worst. Real learning comes from structured courses with expert feedback. You can’t master hard subjects like math by watching videos—you learn math by solving problems. #learning
122
147
2,514
113,743
Step-by-Step Diffusion: An Elementary Tutorial
13
302
2,457
377,019
🎯 Tracking enemy aircraft with math? Enter the godfather of cybernetics Norbert Wiener. During WWII, MIT mathematician Norbert Wiener (1894–1964) was tasked with a critical challenge: predict the future position of German bomber aircraft using only past radar observations. 📡 The result?
19
354
2,411
129,033
“Algorithms for Decision Making”, a book from @MIT, is freely available.
14
224
2,090
131,941
LeCun is right: when enrolling in a PhD program, don’t work on today’s hype topic. It was true in 2015 for reinforcement learning, and it is true in 2025 for LLMs. The topic of tomorrow won’t be the hype topic of today; find a promising niche technology and work on it instead. Conformal prediction, for example. #research #conformalprediction
77
196
2,088
537,750
A 150-page review paper on the applications of machine learning in finance. #machinelearning #finance
11
303
2,027
260,042
Goldman Sachs has a #python #opensource package, gs-quant. It covers data access, pricing, markets, risk and hedging.
13
234
1,911
286,441
A 150-page review paper on the applications of machine learning in finance. #machinelearning #finance
38
334
1,928
186,337
The new edition of one of the most popular machine learning textbooks is coming in summer 2023 and, unlike all previous editions, is in Python 🐍 and available for free as a PDF. hastie.su.domains/ISLP/ISLP_… When it comes to machine learning, Python is clearly taking over; for time series it already did a couple of years ago, see my Medium article “Python vs R for time-series forecasting” medium.com/@valeman/python-v… #machinelearning #python
20
454
1,907
279,853
One of the best (the best?) books on linear algebra. #math
28
150
1,754
118,476
The best way to learn #machinelearning is to: Avoid popular content that oversimplifies concepts, especially when written by individuals with no real ML experience. Dive into high-quality books and work through them diligently. While it may be challenging at first, the effort will pay off in the long run. As @karpathy recently noted, steer clear of entertainment disguised as “courses.” True learning is a gradual process that requires genuine effort and dedication #machinelearning
28
200
1,742
156,956
The Holy Grail of forbidden knowledge about matrices' math happens to be on the web. #math #matrices
20
207
1,716
117,763
A 150-page review paper on the applications of machine learning in finance. #machinelearning #finance
267
235
1,681
146,685
Kolmogorov-Arnold network obliterates DeepMind's results with much smaller networks and much more automation. KANs also discovered new formulas for signature and new relations of knot invariants in unsupervised ways. Incredible 🔥🔥🔥🔥🔥 Has the bell 🔔 finally tolled for deep learning? #kan #deeplearning
35
182
1,622
222,855
The greatest math book of all time. #math
27
202
1,632
131,437
Hidden deep in the depths of the internet, a treasure. #probability
10
206
1,619
107,723
Did you know that people tried to prove the central limit theorem for over two centuries? It started with de Moivre (1733), followed almost a century later by Laplace; both used the binomial distribution. Then Poisson worked on the theorem, and Chebyshev (1890–1891) gave a rigorous demonstration of it in the late nineteenth century. At the beginning of the twentieth century, the Russian mathematician Aleksandr Mikhailovich Liapounov (1901) created the generally recognized form of the central limit theorem by introducing characteristic functions. Andrei Andreevich Markov (1908) also worked on it and was the first to generalize the theorem to the case of dependent variables. In 1924 Kolmogorov became interested in probability theory, and in 1928 he was the first to formulate necessary and sufficient conditions for the Law of Large Numbers (LLN), which had escaped the best mathematicians of the time for many decades. It took the best mathematicians almost two centuries to establish conditions for the LLN and to prove the CLT. In fact, there is an almost 500-page (!) book describing the history of the CLT. medicine.mcgill.ca/epidemiol… #statistics #machinelearning #gaussian
13
323
1,576
177,321
Good ones read this instead
do cs majors actually read this book?
41
136
1,536
104,529
KAN-GPT 🔥🔥🔥 #kan
32
188
1,523
226,257
RIP Monte Carlo. @GoogleDeepMind releases the code for Conformal Monte Carlo. github.com/google-deepmind/u… #conformalprediction
8
245
1,473
202,681
When it comes to explainable AI this is all one needs #xai
8
228
1,484
154,788
Scikit-learn has become an antique museum piece in machine learning. It is still paraded around as if it were modern, but in reality it lags far behind.
32
61
1,418
214,018
RIP Monte Carlo. @GoogleDeepMind releases the code for Conformal Monte Carlo. #conformalprediction
9
93
1,397
179,761
A 150-page review paper on the applications of machine learning in finance. papers.ssrn.com/sol3/papers.… #machinelearning #finance
8
288
1,375
109,291
Never ask a woman her age, a man his salary, or a Cambridge machine learning department why it wastes taxpayer funds on frameworks that neither work nor scale, like Gaussian processes or Bayesian deep nets. #bayesianism
42
146
1,363
350,030
One of the best stats textbooks by the great Larry Wasserman. #statistics
7
134
1,362
64,024
📈 Kolmogorov & Wiener: The Godfathers of Modern Forecasting Before the 1950s, forecasting was part art, part guesswork. That changed thanks to two brilliant minds—Andrey Kolmogorov and Norbert Wiener.
8
186
1,341
51,788
A 150-page review paper on the applications of machine learning in finance. papers.ssrn.com/sol3/papers.… #machinelearning #finance
6
306
1,312
131,036
A new paper from China 🇨🇳 introduces **Sundial**, a groundbreaking family of time series foundation models that **outperforms all previous foundation models** in time series forecasting.
15
162
1,341
127,594
🎯 Tracking enemy aircraft with math? Enter the godfather of cybernetics Norbert Wiener. During WWII, MIT mathematician Norbert Wiener (1894–1964) was tasked with a critical challenge: predict the future position of German bomber aircraft using only past radar observations. 📡 The result?
11
209
1,316
85,807
People often ask about a good book on statistics. Larry Wasserman is, in my opinion, one of the best (the best?) statistics professors out there. Larry is also the academic ambassador of conformal prediction: he was the first to introduce conformal prediction to leading academic circles in the USA 🇺🇸. His statistics book is available open access for free, and there is also companion R code for the book. stat.cmu.edu/~larry/all-of-s… #statistics
15
191
1,267
190,557
Hidden deep in the depths of the internet, a treasure. #probability
5
183
1,268
57,619
“Kolmogorov–Arnold Network” is one of the best innovations of 2024. “Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving PDEs based on Kolmogorov–Arnold Networks” “Our results demonstrate that KINN significantly outperforms MLP in terms of accuracy and convergence speed for numerous PDEs in computational solid mechanics, except for the complex geometry problem. This highlights KINN’s potential for more efficient and accurate PDE solutions in AI for PDEs” #KAN
10
213
1,232
120,830
What a gem 💎 written by a math grandee. #book #math
7
126
1,236
61,590
Great to see that cleared. #deeplearning
17
158
1,215
209,409
Looking for a good intro to econometrics? This book has it covered. #econometrics
6
159
1,218
57,152
“It must be emphasized that this book is not an ordinary textbook but one in which certain carefully selected topics of theory and an abundant amount of problem solving will enable the student to expand and deepen his knowledge of the school course of elementary mathematics and enable him better to begin the study of higher mathematics in higher educational institutions” #math #book
10
176
1,149
71,796
Fourier series are everywhere. Pure Gold. #math #book
4
131
1,125
48,688
A new 150-page review paper on the applications of machine learning in finance. #machinelearning #finance papers.ssrn.com/sol3/papers.…
13
281
1,114
94,214
The original LLMs.
6
89
1,070
42,333
“Are Transformers Effective for Time Series Forecasting?” represents a pivotal paper, decisively highlighting the shortcomings and deficiencies in research surrounding the use of transformers for #timeseries #forecasting. This paper effectively exposes the deceptive practices employed by various authors in their papers, such as inadequate benchmarking and other tactics, which have previously led to inflated claims regarding the performance of transformers in this domain.
15
190
1,084
141,562
More probability recommendations, this one is by the founder of the theory of probability itself. Short and sweet. #probability
12
134
1,070
72,314
Did you know that it has taken some of the best mathematicians over two centuries to prove the central limit theorem? It started with de Moivre (1733), followed almost a century later by Laplace; both used the binomial distribution. Then Poisson worked on the theorem, and Chebyshev (1890–1891) gave a rigorous demonstration of it in the late nineteenth century. At the beginning of the twentieth century, the Russian mathematician Aleksandr Mikhailovich Liapounov (1901) created the generally recognized form of the central limit theorem by introducing characteristic functions. Andrei Andreevich Markov (1908) also worked on it and was the first to generalize the theorem to the case of dependent variables. In 1924 Kolmogorov became interested in probability theory, and in 1928 he was the first to formulate necessary and sufficient conditions for the Law of Large Numbers (LLN), which had escaped the best mathematicians of the time for many decades. It took the best mathematicians almost two centuries to establish conditions for the LLN and to prove the CLT. In fact, there is an almost 500-page (!) book describing the history of the CLT. #statistics #machinelearning #gaussian
12
163
1,039
102,790
A lot of people are asking where one can find Python code for the best book to start with forecasting, by the great Rob Hyndman. Here are not one, not two but 4(!) companion repos for the book. 1. Python Read-Along: Forecasting: Principles and Practice 2.
8
172
1,046
67,015
Andrey Kolmogorov: The Soviet Counterpart to Claude Shannon When we think of information theory, the name that usually comes to mind is Claude Shannon, the American mathematician whose 1948 paper “A Mathematical Theory of Communication” founded the field.
19
171
1,037
84,049
A 150-page review paper on the applications of machine learning in finance. papers.ssrn.com/sol3/papers.… #machinelearning #finance
11
206
982
94,415
People often ask me what is the best way to start with time series and forecasting in 2024. Well the answer is still the same as in 2023, 2022 and 2021, the best way to start with forecasting in 2024 is to learn the fundamentals from a great book by @robjhyndman. #timeseries #forecasting #machinelearning
14
165
924
60,821
In 2023, there is no need to slap a Gaussian distribution on top of everything.

One can surely do better than using an XVIII century method invented by Gauss.

Most of the data encountered in industry and in socio-economic phenomena is not normal, whether it concerns retail, energy, finance, health or anything else really.

One should always challenge the use of the normality assumption by data scientists and researchers, given that one can always do better in 2023. Credit: whoever created this picture

#conformalprediction #machinelearning #gaussian #statistics
32
109
914
169,511
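For readers who want to see what "doing better than the normality assumption" can look like, here is a minimal sketch of split conformal prediction in plain Python. The function names, the toy calibration data and the 75% target level are all hypothetical choices made for illustration; they do not come from the post or from any library. The interval is built from absolute residuals on a held-out calibration set, so no distributional shape is assumed; the coverage guarantee relies on exchangeability, which typically needs extra care for time series.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested level:
        # the only valid interval is the whole real line.
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(point_prediction, calibration_actuals,
                             calibration_predictions, alpha=0.1):
    """Wrap a point prediction in an interval using held-out residuals."""
    # Nonconformity score: absolute error on the calibration set.
    scores = [abs(y - p)
              for y, p in zip(calibration_actuals, calibration_predictions)]
    q = conformal_quantile(scores, alpha)
    return point_prediction - q, point_prediction + q


# Toy calibration set (hypothetical numbers): the model always predicted 100.
calibration_predictions = [100] * 9
calibration_actuals = [101, 98, 103, 96, 105, 94, 107, 92, 109]

# Target coverage of 75% (alpha = 0.25) for a new point prediction of 50.
interval = split_conformal_interval(
    50, calibration_actuals, calibration_predictions, alpha=0.25
)
print(interval)  # (42, 58)
```

With nine calibration residuals of 1 to 9, the rank is ceil(10 * 0.75) = 8, so the interval half-width is the eighth smallest residual, 8.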
Fourier’s Vision, Kolmogorov’s Counterexample Joseph Fourier boldly claimed that any function could be represented as a sum of sines and cosines — a Fourier series. His insight revolutionized physics and mathematics, but it came with a major flaw: a lack of rigor.
11
125
948
59,021
All you need is Kolmogorov–Arnold Network! 🔥🔥🔥 complete with GitHub repo 🚀🚀🚀🚀🚀 'KAN: Kolmogorov–Arnold Networks' from @MIT and @Caltech h/t @illumattnati
10
144
910
103,647
How a Feud Between Mathematicians Birthed Markov Chains—and Revolutionized Probability Picture this: Russia, 1906. Two brilliant mathematicians are locked in a heated debate. On one side, Pavel Nekrasov insists that the Central Limit Theorem only works under strict independence.
7
146
934
55,147
The history of calculus #math
9
123
879
47,256
RIP Monte Carlo. @GoogleDeepMind releases the code for Conformal Monte Carlo. #conformalprediction
13
87
933
64,076
Algorithms for Optimization. A completely free 520-page book from MIT. #math
5
127
914
59,127
The best way to learn #machinelearning is to: Avoid popular content that oversimplifies concepts, especially when written by individuals with no real ML experience. Dive into high-quality books and work through them diligently.
4
106
892
44,462
Google-owned Kaggle. Flawed metrics, fewer tabular data competitions, and faux pas like synthetic (fake) data competitions… the result is predictable. Kaggle is no longer the #1 competitions platform. Not even second. CodaLab and Tianchi now host far more competitions, while other platforms like Grand Challenge and EvalAI are gaining traction. The competition landscape in 2024 looks very different—and it’s clear the community is voting with their feet. #machinelearning
17
72
894
61,047
Did you know that people tried to prove the central limit theorem for over two centuries, starting with de Moivre (1733) and then, almost a century later, Laplace, who both used the binomial distribution.
7
105
870
51,606
People often ask about a good book on statistics. Larry Wasserman is, in my opinion, one of the best (the best?) statistics professors out there.
6
108
878
45,469
By far the best intro book into #probability
7
77
869
48,461
Pattern Recognition and Machine Learning is a book written by a legendary machine learning researcher from Microsoft for people studying machine learning. If you want to learn machine learning from a machine learning perspective, this is the book that machine learning programs at good universities use. Free PDF courtesy of Microsoft #MachineLearning microsoft.com/en-us/research…
16
157
853
128,434
Data scientists 👨‍🔬 need to learn about forecasting; one can’t just call .fit and .predict, thinking that #timeseries is the same as iid data. Having worked with many data scientists and mathematicians with no previous exposure to time series, econometrics and forecasting, one is often bemused to see some of them either applying methods that were not designed for time series or reinventing things, for example by creating weird metrics or by not properly validating time series models. There is certainly no need to reinvent the wheel unless one is already very fluent in time series and is doing research that pushes the knowledge frontier further. One of the best tutorials tailored specifically for data scientists. #timeseries #machinelearning #datascientists #econometrics #datascience
12
164
870
108,279
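To make the point about validation concrete, here is a minimal sketch of rolling-origin (expanding-window) evaluation in plain Python. Everything in it is a hypothetical illustration rather than something taken from the tutorial mentioned above: the helper names, the toy series and the naive last-value forecaster are assumptions. The key property is that every training window ends strictly before its test window, unlike a shuffled iid split.

```python
def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_indices, test_indices) with train strictly before test."""
    origin = initial_train
    while origin + horizon <= n_obs:
        yield list(range(origin)), list(range(origin, origin + horizon))
        origin += step


def naive_forecast(train, horizon):
    """Baseline forecaster: repeat the last observed value."""
    return [train[-1]] * horizon


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy series (hypothetical numbers).
series = [10, 12, 13, 15, 14, 16, 18, 17, 19, 21]

fold_errors = []
for train_idx, test_idx in rolling_origin_splits(
    len(series), initial_train=6, horizon=2, step=2
):
    train = [series[i] for i in train_idx]
    test = [series[i] for i in test_idx]
    forecast = naive_forecast(train, len(test))
    fold_errors.append(mean_absolute_error(test, forecast))

print(fold_errors)  # [1.5, 3.0]
```

Any real model can be dropped in place of the naive forecaster; comparing against such a simple baseline on the same folds is a cheap sanity check.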
The book that led me into #conformalprediction
3
119
845
106,339
The paper we have been waiting for essentially shows that #timeseries #llms do not work in forecasting. Back in 2022, the paper “Are Transformers Effective for Time Series Forecasting?” challenged the emerging narrative that transformers are useful for forecasting. By removing transformer elements, the authors showed that performance went up ⬆️ And now people have done the same with time series LLMs. The paper demonstrated: - removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results; in most cases the results even improved! - in fact, even removing the language model entirely yields comparable or better performance! - these simpler methods, after removal of the LLM component, reduce training and inference time by up to three orders of magnitude while maintaining comparable performance! - the sequence modeling capabilities of LLMs do not transfer to time series. By shuffling the input time series, the authors find no appreciable change in performance. What this says is that LLMs can’t deal with critical features of time series: the time order is key, and if an LLM’s performance doesn’t change when the data is shuffled, it basically means it doesn’t model time series. These findings are as damning for time series LLMs as “Are Transformers Effective for Time Series Forecasting?” was for transformers. #timeseries #forecasting
11
181
854
120,089
Want an intro-level book for #probability? This one has it covered.
6
99
845
55,516
The people who fit linear regression to a cloud of points like this… what century are they living in? They call themselves “academics” and “researchers”, but the reality is that the toolkit they use is literally right out of the statistical equivalent of the Stone Age.
Needing glasses (myopia) is associated with higher IQ.
29
54
842
59,623
A new paper from Google that peddles a “foundational model” for time series forecasting combines beginner mistakes with deceptive “benchmarks.” In figure 6 the authors try to portray the performance of the new “wunderwaffe” in a positive light. The only problem - well, in fact there are several. 1. One should never evaluate performance visually. This is a beginner 101 mistake and has been explicitly mentioned in “Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices” by Christoph Bergmeir and Hansika Hewamalage. As the authors of this great tutorial explain, “The visual appeal of a generated forecast or the possibility of such a forecast to happen in general are not good criteria to judge forecasts.” The tutorial explains why in more detail. 2. The Google authors deployed a standard trick to embellish the performance of their new “foundational model.” They used classical datasets such as air passengers that can very easily be fit using classical models (almost to perfection). Did they use classical models as a benchmark? Of course not - what they did instead was use another bad model, llmtime. #timeseries #forecasting
16
151
829
153,146
The bible of #timeseries analysis. One might argue that machine learning has taken over. True, BUT until and unless someone creates a Machine Learning Time Series Analysis 'bible' similar to what Kevin Murphy did for general machine learning, Hamilton's book (800 pages!) is and will remain the real 'bible' of time series. For now, nothing remotely comparable exists on the market, no matter how many other books about time series are published. And if one does not understand critical concepts in econometrics (basically a form of machine learning for time series), one should never do machine learning for time series. Does one have to read the whole Hamilton book? No. Does one have to read a smaller and simpler book about econometrics? Yes. Doing time series without understanding basic concepts of #econometrics is like jumping into Deep Learning without understanding linear regression. #machinelearning #timeseries #deeplearning
10
134
827
132,628
Absolute gem, didn’t know it was translated. One of the best math schoolbooks ever. #math
6
105
802
65,369
Great to see that cleared. Vapnik for Nobel prize. 🏆 #deeplearning
4
91
820
64,711
If you are in your 20s, invest in this book. Get into debt if you have to, or just buy the student edition. #informationtheory
21
88
814
47,777
Hidden deep in the depths of the internet, a treasure. #probability
3
92
807
36,399
Quite simply, by far the best high school math textbook ever created. Written by a group of brilliant authors, including long-standing members of the MIPT mathematics department, this book sets a gold standard in clarity, rigor, and depth. It goes beyond rote learning, teaching students how to think mathematically—a rare quality in textbooks today. #math archive.org/details/g.-n.-ya… Both parts on -> github.com/valeman/Awesome_M…
8
144
796
34,851
📈 Kolmogorov & Wiener: The Godfathers of Modern Forecasting Before the 1950s, forecasting was part art, part guesswork. That changed thanks to two brilliant minds—Andrey Kolmogorov and Norbert Wiener.
4
138
802
40,827
One of the best (the best?) books on linear algebra. #math
7
72
754
38,971
Legendary book by Kolmogorov himself. #math #book
9
79
747
37,244
A lot of people are asking where one can find Python code for the best book to start with forecasting, by the great Rob Hyndman. Here are not one, not two but 4(!) companion repos for the book. 1. Python Read-Along: Forecasting: Principles and Practice 2. Exercises and other experiments in Python 3. The University of Sydney. Predictive Analytics (QBUS2820), undergraduate course at the University of Sydney Business School. 4. Nixtla/fpp3-python #timeseries #forecasting
6
130
748
69,546
The bible of #timeseries analysis. One might argue that machine learning has taken over. 

True, BUT until and unless someone creates a Machine Learning Time Series Analysis 'bible' similar to what @sirbayes did for general machine learning, Hamilton's book (800 pages!) is and will remain the indisputable bible of time series.

For the time being, nothing even remotely comparable exists on the market, no matter how many more books about time series are published.

And if one does not understand key concepts in econometrics (basically a form of machine learning for time series), one should never do machine learning for time series.

Does one have to read the whole Hamilton book? No. Does one have to read a smaller and simpler book about econometrics? Yes.

Doing time series without understanding basic concepts of #econometrics is like jumping into Deep Learning without understanding linear regression. 

#machinelearning #timeseries #machinelearning #deeplearning
8
121
754
85,511
Good math-based ideas 💡 will always outperform subjective and contrived approaches. Conformal Prediction was developed from ideas in Kolmogorov complexity, in conversations between Vovk (the last PhD student of Kolmogorov) and Kolmogorov himself. Kolmogorov was an absolute titan of XX century math; you can read about his larger-than-life contributions here, where I have tried to describe just a small fraction of what he did across numerous scientific fields. medium.com/@valeman/andrey-k… A lot of people have been asking about books on Kolmogorov complexity. This is the main “bible” on the subject. #kolmogorovcomplexity #conformalprediction
5
126
722
86,370
People often ask me what they should study when starting an MSc or PhD in machine learning. My advice: skip the books written by statisticians that masquerade as ML texts. For MSc, PhD, or even advanced BSc preparation, there’s a proven resource created by an actual ML expert that insiders rely on: the "Learning from Data" course by Caltech's Abu-Mostafa. It teaches the fundamentals that truly matter, using the language and perspective of machine learning itself. #machinelearning
11
76
754
48,118
Awesome Math Books got its first 100 GitHub stars 🌟
4
73
724
45,014
Legendary book by Kolmogorov himself. #math #book
2
100
707
26,742
One of the best books on #deeplearning is open access and will be published by MIT Press in December. udlbook.github.io/udlbook/
5
147
714
103,472
Hard to believe people are still posting about p-values from the mid-XX century when forward-looking researchers are moving to e-values.
P-value. P-value = 0.043 = 0.051
11
43
734
74,096
A study, “Why do tree-based models still outperform deep learning on tabular data?”, confirms that tree-based models outperform deep learning and explains some of the reasons why. Paper -> hal.science/hal-03723551 When it comes to #tabulardata and #timeseries (by far the majority of the data that matters for almost any real company), deep learning is not what one needs. #tabulardata #deeplearning #machinelearning #boostedtrees #XGBoost #LightGBM #CatBoost
7
152
703
119,422
“An Idiot’s guide to Support vector machines (SVMs)” #svm
8
70
700
75,288
A 150-page review paper on the applications of machine learning in finance. #machinelearning #finance
6
103
719
39,968
Comprehensive KAN paper walkthrough. #kan
5
95
701
52,676