Bringing together the global data science community to help foster the exchange of innovative ideas and encourage the growth of open source software.

Boston, MA
Google recently unveiled the Google Dataset Search, a new product in the beta phase that you can use to find datasets published online. The single interface allows you to search repositories worldwide. #Google #OpenData #ODSC hubs.ly/H0fmGCB0
At ODSC AI West 2025, Andrej Radonjic @0xdrej, Co-Founder & CEO of @Wyndlabs_ai, pulls back the curtain in: The Hidden Infrastructure Behind AI: Building High-Performance Systems for Planet-Scale Data Collection. 🔗 Register → hubs.li/Q03QhDhM0
Fraud detection: using relational graph learning to detect collusion. #DataScience #MachineLearning #DeepLearning #Uber @ubereng hubs.li/H0P8mc10
Netflix is using open-source Druid to help with real-time insights for optimal user experience. #DataScience hubs.ly/H0nJcVJ0
The same machine learning courses used to train Amazon employees are now available to all data scientists and data engineers through AWS for free. So, how is it? #AWS #MachineLearning #DataScience hubs.ly/H0h0t7S0
The same machine learning courses used to train Amazon employees are now available to all data scientists and data engineers through AWS for free. So, how is it? #AWS #MachineLearning #DataScience hubs.ly/H0h0t7T0
The same machine learning courses used to train Amazon employees are now available to all data scientists and data engineers through AWS for free. So, how is it? #AWS #MachineLearning #DataScience hubs.ly/H0jYXld0
arXiv now allows researchers to submit code with their manuscripts. #DataScience hubs.ly/H0y0rjZ0
Researchers from Google Research and UC San Diego have introduced PixelLLM, a sophisticated vision-language model. #DataScience #ArtificialIntelligence hubs.li/Q02dwWzz0
Netflix has officially open-sourced Metaflow, its data science framework designed for simplicity and ease-of-use. #DataScience hubs.ly/H0mcK1Y0
Here are a few Python libraries for data science tasks other than the commonly used ones like pandas, scikit-learn, matplotlib, and more. #DataScience #Python #MachineLearning hubs.ly/H0hc6D_0
In this article, we'll discuss and look into a new method of data mapping, including dimensionality reduction and network theory. #DataScience hubs.ly/H0lz1Ch0
Skewed data is common in data science; skew is the degree of distortion from a normal distribution. So, let's learn about transforming skewed data. #DataScience #MachineLearning hubs.ly/H0k1w2Q0
This free machine learning development stack is all you need to create end-to-end ML pipelines with ease. #DataScience #MachineLearning hubs.ly/H0yLqy_0
Let’s learn about Docker as a tool for data scientists, in particular in conjunction with the popular interactive programming platform, Jupyter, and AWS. @joshuacook #DataScience #ODSC hubs.ly/H0j14zh0
K-means is a helpful algorithm with lots of potential uses, so versatile it can be used for almost any kind of data grouping. Here’s a deeper dive into it. #Python #MachineLearning hubs.ly/H0gSnDc0
Google research finds a way to reduce noise in training data. #DataScience hubs.ly/H0kCDsP0
Google launches TensorFlow machine learning framework for graphical data. #DataScience #MachineLearning #TensorFlow hubs.ly/H0kKPL40
In this tutorial, you will discover how to choose activation functions for neural network models. #DataScience #NeuralNetworks hubs.ly/H0FbzNT0
Need some data for your next NLP initiatives? Here are 20 open datasets for natural language processing that anyone can use. #DataScience #NLP #NaturalLanguageProcessing #ODSC hubs.ly/H0k80V30
Twitter grants academics full access to public data, but not for suspended accounts. #DataScience #OpenData hubs.ly/H0FB6ck0
In this post, you will learn the basic concepts of how Recurrent Neural Networks work, what the biggest issues are, and how to solve them. #DataScience #MachineLearning hubs.ly/H0hB-D_0
As many data science professionals begin to work remotely, it's a good time to consider using Jupyter Notebooks for your machine learning projects. #DataScience #JupyterNotebooks #MachineLearning hubs.ly/H0rXFJ00
Learning to scrape websites for data is essential to becoming a great data scientist. If the data you want to work with isn’t readily available, there’s always a solution - and collecting the data yourself is one of them. #DataScience #ODSC hubs.ly/H0j5zpZ0
New AI programming language goes beyond deep learning - general-purpose language works for computer vision, robotics, statistics, and more. #DataScience #AI #ArtificialIntelligence @MIT hubs.ly/H0jJCtr0
Here are 10 machine learning projects which will boost your portfolio and will help you to get a job as a data scientist. #DataScience #MachineLearning hubs.ly/H0sxS060
NLP has many applications across both business and software development, but roadblocks in human language have made text challenging to analyze and replicate. Why is that? #NLP #DataScience #NaturalLanguageProcessing #ODSC hubs.ly/H0hc6yl0
Here are 10 machine learning projects which will boost your portfolio and will help you to get a job as a data scientist. #DataScience #MachineLearning hubs.ly/H0sxRfs0
Chatbots aren’t a gimmick, as they’re becoming widely used by organizations of all shapes and sizes. Learn the fundamentals of creating your own chatbot, from collecting data to training and testing. #Chatbot #ODSCWest #ODSC hubs.ly/H0f0ML10
Building and scaling data lineage at @Netflix to improve data infrastructure reliability and efficiency #DataScience #Netflix hubs.ly/H0hwp3T0
This article examines where we are with Bayesian Neural Networks and Bayesian Deep Learning by looking at some definitions, a little history, key areas of focus, current research efforts, and more. #DataScience #MachineLearning hubs.ly/H0KP26z0
Check out @PerceptiLabs’ low-code @TensorFlow tool - Visually build and interpret your deep learning model. It’s free for developers: hubs.ly/H0K_9R90 #ML #AI #MachineLearning #DataScience #TensorFlow #PerceptiLabs
Here’s how to read and run correlation plots in Python Pandas. #DataScience #Python #Pandas hubs.li/H0MK99w0
A simple application of Probabilistic Programming with PyMC3 in Python #Python #PythonLanguage #DataScience hubs.ly/H0fZKRJ0
Language processing is complicated, but these are a few trends and methods to help you perform NLP better in real-world scenarios. hubs.ly/Q017RlQ30
Why are so many people starting to use Keras in R for deep learning applications? Let’s see why. #DataScience #Keras #ODSC #DeepLearning @gdequeiroz hubs.ly/H0kh0880
Optimizing hyperparameters for machine learning models is a key step in making accurate predictions, as they define characteristics of the model that can impact model accuracy and computational efficiency. #DataScience #MachineLearning hubs.ly/H0jCWGR0
Hey @huggingface, we're having some fun with DALL-E 😎
In this article, we’ll go through the advantages of employing hierarchical Bayesian models and go through an exercise building one in R. #DataScience #Statistics hubs.ly/H0kvYP10
Uber open-sources Plato for developing and testing conversational AI. #DataScience #AI #ArtificialIntelligence @ubereng hubs.ly/H0jXYjT0
Microsoft introduces an open-source and cross-platform machine learning framework, ML.NET. #MachineLearning #DataScience hubs.ly/H0j0GWB0
A tutorial using pandas, matplotlib, and seaborn to produce digestible insights from dirty data #DataScience #Python hubs.ly/H0hwwWf0
As many data science professionals begin to work remotely, it's a good time to consider using Jupyter Notebooks for your machine learning projects. #DataScience #JupyterNotebooks #MachineLearning hubs.ly/H0rXFJ10
Alexa researchers improve AI error rate up to 30% by reducing data imbalance #DataScience #AI #ODSC @alexadevs hubs.ly/H0h5T4X0
Scikit-Learn is one of the premier tools in the machine learning community, used by academics and industry professionals alike. The most important thing to figure out from the get-go is what we’re actually trying to learn. #DataScience #ODSC hubs.ly/H0ft5Zd0
A tutorial using pandas, matplotlib, and seaborn to produce digestible insights from dirty data #DataScience #Python hubs.ly/H0hwvFQ0
Learning to scrape websites for data is essential to becoming a great data scientist. If the data you want to work with isn’t readily available, there’s always a solution, and collecting the data yourself is one of them. #ODSC #DataScience #OpenData hubs.ly/H0hVT_C0
Google research finds a way to reduce noise in training data. #DataScience hubs.ly/H0kCGkF0
Google announced the opening of access to Bard, its new AI-powered chatbot, and likely answer to OpenAI's wildly popular ChatGPT. #DataScience #Google #AI #ArtificialIntelligence #Bard hubs.li/Q01JDRBh0
This step-by-step guide will help you use R to build your first Bayesian model, a type of model that offers a method for making probabilistic predictions about the state of the world. #DataScience #ODSC #RProgramming #AI #MachineLearning hubs.ly/H0jJxnT0
GitHub is a playground for data science and AI projects. With all of the latest-and-greatest projects, what are a few that are viewed as the best in the community this summer? #DataScience #GitHub hubs.ly/H0kS2rM0
Fraud detection: using relational graph learning to detect collusion. #DataScience #MachineLearning #DeepLearning #Uber @ubereng hubs.li/H0P8lMc0
Covering topics from telling stories with data to exploring new ML frameworks, these are 21 free machine learning talks coming to ODSC East 2022. hubs.ly/Q017Rf-w0
PhD candidates often work on some fascinating data science projects. Here are 10 standout machine learning dissertations that may interest you. #DataScience #MachineLearning hubs.ly/H0ljt2M0
Let’s explore TensorFlow, a popular library often used for solving deep learning problems, from training and evaluation through to model deployment. #MachineLearning #DeepLearning #DataScience #ODSC @TensorFlow hubs.ly/H0hq3yJ0
As many data science professionals begin to work remotely, it's a good time to consider using Jupyter Notebooks for your machine learning projects. #DataScience #JupyterNotebooks hubs.ly/H0tWzWR0
This article examines where we are with Bayesian Neural Networks (BNNs) and Bayesian Deep Learning (BDL) by looking at some definitions, a little history, key areas of focus, current research efforts, and more. #DataScience #DeepLearning hubs.ly/H0tDYDr0
This article is the first in a four-part series that introduces three popular ensemble methods: bagging, boosting, and stacking. #DataScience #MachineLearning hubs.li/H0SXSKK0
In a leadup to his talk at ODSC West on ML algorithms and unique use cases, Kirk Borne gives a bit of background on what makes these use cases so novel. @KirkDBorne #DataScience #MachineLearning hubs.ly/H0kw90B0