Online communities are invaluable in machine learning, regardless of your skill level. The reason is that, as with programming, you never stop learning. You simply cannot know everything: there are always new algorithms, new data and new combinations to discover and practice.
Communities help. You can get your questions answered, learn by answering other people's questions and discover new areas by reading through the exchanges.
Machine learning communities have had a big impact on my education and in this blog post I want to list all of the online machine learning communities I know about so that you too can make the most of them.
The Stack Exchange sites are question and answer communities, so they are targeted towards problem solving. You can post the specific questions you have, answer questions to which you know the answer and (my favorite) read questions and answers to discover new methods and perspectives.
There are four sites I like to dip into:
- Cross Validated: This site is useful for low-level questions on algorithms and statistical methods.
- Quantitative Finance: (specifically the machine learning tag) This site is useful if you are operating in the financial domain, but also more generally if you are working with time series data.
- Programmers: (specifically the machine learning tag) Great for specific code questions, such as a problem with a given library or tool you are using.
- Stack Overflow: (specifically the machine learning tag) Again, like Programmers, great for specific questions on the implementation side of machine learning. It’s also the oldest site and can cover machine learning algorithms and libraries.
There is a new site that has started up, but it is still in beta, so it may not survive. It is called Data Science and I am finding it very interesting for the general concerns of applied machine learning (a mix of code and math).
Reddit is a community of communities called sub-reddits. A given sub-reddit can be a question and answer site, a link sharing site or (more typically) a mix of the two.
A few sub-reddits I frequent include:
- Machine Learning: Contains a mix of “how do I get started” questions and more advanced links to machine learning blog posts. Also good for linking to your own projects to get some feedback.
- Computer Vision: Mostly questions on computer vision, both theoretical and practical (such as libraries).
- Natural Language: Focus on natural language processing, providing a good mix of questions and links to relevant articles and blog posts.
- Statistics: Discussion on statistical software and methods, great for digging deeper into a given method or algorithm.
- Data Science: Mostly links to posts that straddle data analysis and machine learning.
- Big Data: Focused posts and discussions on the big data ecosystem.
There are other sub-reddits on relevant and related topics, but I have not found them as useful.
Quora is a question and answer site that is divided into topics, much like Reddit but with only questions and answers. The questions are typically good and the answers high quality. Unlike the Stack Exchange sites, the questions and answers are typically less technical, less problem focused and more meaty.
A few Quora topics I frequent include:
- Machine Learning: Useful for high-level questions on algorithms, processes, resources and getting started. A good mix.
- Statistics: Focus on deeper statistical methods and algorithms, but includes a lot of machine learning content.
- Data Mining: Good questions with a focus on the applied side of machine learning, but a lot of overlap with Machine Learning.
- Data Science: Much like the Data Mining and Machine Learning topics, but the questions are typically at a higher level.
There are many other topics that might be useful, including but not limited to Data Analysis, Predictive Analytics, NLP and Computer Vision. There are also topics on specific methods and tools such as SVM, Deep Learning, Classification, and R.
There are some other great communities around that I could not classify as easily.
- MetaOptimize Q+A: Like Cross Validated, this is a question and answer site that is great for lower level questions on specific algorithms and methods. Maths and theory heavy.
- Kaggle Forums: Great for discussion around specific competitions and datasets, and full of great nuggets of advice for feature engineering, ensembling and refining your test harnesses.
- DataTau: A social news site with a focus on links to posts on data and machine learning related topics. Low traffic and useful links.
Some social media websites have machine learning groups. I don’t use these as much, but I mention them because you might find them more useful than I do.
There are also some LinkedIn groups that might be interesting, specifically, Data Mining and Machine Learning and the Machine Learning Connection. Again, like Google+ groups, there are multiple LinkedIn groups for a given area without clear leaders.
There were Usenet groups I used to voraciously consume, but they all seem dead (or supplanted) these days.
Finally, consider communities in meatspace. Take a look at a site like Meetup and search for meet-ups in your area on subjects like machine learning, R, data analytics and data science. R user groups are typically a great place to learn and connect with professionals.
Do you know about another machine learning community? Leave a comment.