Getting started in deep learning is a struggle.
It’s a struggle because deep learning is taught by academics, for academics.
If you’re a developer (or practitioner), you’re different.
You want results.
The way practitioners learn new technologies is by developing prototypes that deliver value quickly.
This is a top-down approach to learning, but it is not the way that deep learning is taught.
There is another way. A way that works for top-down practitioners like you.
In this post, you will discover this other way.
You will believe that being successful with applied deep learning is possible. I hope it will inspire you to take your first step towards this goal.
Let’s dive in.
You Want To Get Started In Deep Learning…
But You’re Different
You don’t have a Masters or Ph.D. in advanced math.
You’re not a machine learning expert.
You’re a professional or a student with a keen interest in deep learning, and you’re eager to start using it.
Maybe You’re a Developer
- You want to know how to apply deep learning to solve complex problems.
- You want deep learning skills to improve your job prospects.
- You want to use deep learning as leverage to get into a data scientist (or similar) position.
Maybe You’re a Data Scientist
- You want to use deep learning on a future project.
- You have a sticky problem that you think deep learning can help with.
- You want deep learning skills to stay relevant and on top of your field.
Maybe You’re a Student
- You want deep learning skills to improve your prospects for getting a job.
- You have an interesting problem for which you think deep learning will be a good fit.
- You want to discover why deep learning is so popular.
Does one of these reasons fit you?
Let me know in the comments. I’d love to hear your reason.
Do you have a different reason for getting into deep learning?
Let me know in the comments and I will give you personal advice.
The reasons for getting into the field of deep learning are varied.
Regardless, you’re treated the same as everyone else. Like an academic.
Deep Learning Is For Academics… The Lie
Deep Learning is an academic field of study.
It has been this way for a long time. The field used to involve the study of small artificial neural networks. Now the focus is on much larger networks and more exotic network architectures. The breakthroughs in the field are still coming from academia. The field is young and this is to be expected.
This means that most of the information on deep learning is written by academics. And it is written for other academics, like Researchers, Masters and Ph.D. students.
It is not written for developers, like us.
This is why you see bad advice like:
You need a PhD to get into deep learning.
Or comments like:
You need 3 years of advanced math before you can get into deep learning.
This is why getting started in deep learning is such a struggle. It is a challenge that developers think they can only solve by going back to school, going into debt and investing 3-to-7 years of their life.
But you don’t need any of that to begin. You can work through deep learning tutorials in minutes. You can begin building a portfolio that you can use to show your growing skills in the field. And you can start today.
Programming Is Only For Computer Scientists (NOT)
Programming used to be hard and theoretical.
You needed to know a lot of math to understand programming before there were computers.
Things like computability and completeness theorems.
In the early days of programming, you had to define your own data structures and basic algorithms. This was before all the low-hanging fruit of algorithms and data structures was defined. This required a good understanding of discrete math.
Things like complexity theory.
These theoretical topics can help you be a better programmer and engineer today. They are still taught in computer science courses.
But you and I both know that you don’t need them to get started in programming. You don’t even need these topics if you’re working in most programming jobs.
You call the sort routine,
you don’t derive a new sort operation from first principles.
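To make the point concrete, here is a small Python sketch. The hand-written merge_sort function and the random test data are illustrative assumptions, not something you need day to day. It only shows that the one-line library call gives the same answer as the first-principles version.

```python
import random


def merge_sort(items):
    """Hand-written merge sort: the 'first principles' route (illustrative only)."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


random.seed(1)
data = [random.randint(0, 99) for _ in range(20)]

by_library = sorted(data)    # what most working programmers do
by_hand = merge_sort(data)   # what the theory course teaches

same_result = by_library == by_hand
print("Same result:", same_result)
```

Both routes sort the list. Only one of them is needed to get work done.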
Can we stretch this analogy to deep learning?
Do you need to derive the backpropagation equations from first principles and implement them from scratch? No. Instead, we can just call model.fit() on the deep learning API.
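As a rough sketch of what that looks like in practice, the snippet below trains a tiny network with Keras. The toy dataset, the layer sizes, the optimizer and the number of epochs are all assumptions chosen for illustration, and the import assumes Keras is installed as part of TensorFlow. Notice that no gradient is derived by hand anywhere.

```python
import numpy as np
from tensorflow import keras

# Toy problem (an assumption for illustration): is x1 + x2 greater than 1?
rng = np.random.default_rng(0)
X = rng.random((200, 2)).astype("float32")
y = (X[:, 0] + X[:, 1] > 1.0).astype("float32")

# A small Multilayer Perceptron; the layer sizes are arbitrary choices.
model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Backpropagation happens inside this one call.
history = model.fit(X, y, epochs=20, batch_size=16, verbose=0)

# Evaluated on the training data only, to keep the sketch short.
loss, accuracy = model.evaluate(X, y, verbose=0)
print(f"Training accuracy: {accuracy:.2f}")
```

A real project would hold back data for evaluation, but the shape of the code stays the same: define, compile, fit.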
Wait… What About The Top Engineers?
Yes, the top engineers can derive a new algorithm.
In fact, they are often hired to do just this. It’s their job. They can do the easy stuff and the harder stuff. They can call the sort routine and derive a new sort method for business data that is too big to fit into memory.
My point is that these capabilities do not have to come first, they can come later.
Top-down. Rather than bottom-up.
This is key.
Like practical programming in the real world.
The Top-Down Programmer (…that gets results)
Programming is fun in the beginning.
You learn this function. You learn that API. You stitch together your own programs and discover you can solve problems with your own ideas.
You’re productive early on and only get more productive with time. You can progress deeper into the theory to solve more challenging problems, or not. It’s up to you.
Being productive early is important for two reasons:
- It keeps you motivated, which keeps you engaged.
- It allows you to deliver value early, which feeds motivation.
It’s too easy to stop.
It’s too easy to give up.
Being productive is a superpower. Knowing that you can write a program to solve a specific problem. Then having the confidence to actually implement and deploy it.
Code and design will be crappy to begin with. Hard to maintain. Not viable for long-term operational use. But code gets better with experience, with mentors, with continual learning.
This is how the majority of IT operates. Top-down. Not bottom-up.
You don’t take a University course in computer language theory to learn Ruby on Rails for your next web development project. You work through some tutorials, make some mistakes and get familiar with the platform.
Repeat for the next framework, the next library. Again and again.
Repeat for Deep Learning.
Deep Learning Is NOT Just For The Academics
You can learn deep learning from the bottom-up.
It may take many years and a few higher degrees, but you will know a lot about the theory of deep learning techniques.
Even after all this effort, you may or may not know how to apply them in practice to real data. They generally don’t teach practical or vocational skills at University.
The academic textbooks, video courses, and journal papers are a fantastic resource to leverage. They are a gold mine of ideas. They are just not the place to begin when starting out with deep learning.
Focus on Delivering Value With Deep Learning
The value in deep learning to business and other fields of study is in reliable predictions.
Learn how to model problems with deep learning.
Develop (or steal) a systematic process for working through a predictive modeling problem. Then apply it again, and again and again, until you get really good at delivering this value.
Get good at applying deep learning.
We all like the things we’re good at.
If you can do this well and reliably, you will, in turn, have a valuable skill that is in great demand in the market.
Later, you will find yourself diving into the academic papers, parsing the Greek letters, and emailing or calling the authors. All to extract gold nuggets that you can use to get better model performance on your next project.
Time For You To Get Success With Deep Learning
Now, hopefully, you believe that you can get started and get good at applied deep learning.
It is time to take action. It is time to get started with deep learning.
1. Pick a Framework
I recommend the Keras platform. It is a Python library, meaning you can leverage scikit-learn and the whole SciPy ecosystem for your deep learning projects.
This is important.
You essentially get data preparation, model evaluation and hyperparameter optimization for free.
Keras also provides a practitioner-friendly API (i.e. simple and intuitive). It wraps the power (and unnecessary complexity) of the Theano and TensorFlow libraries. It gives you the speed and efficiency of bleeding edge frameworks, without the tens or hundreds of lines of code needed to make something work.
2. Pick a Process
Keep it simple, but pick a strong skeleton that you can add to and tailor to your preferred techniques and problem types.
A good set of general steps I like to use on predictive modeling projects is:
- Define Problem: What problem you are trying to solve and the data and framing you need to solve it.
- Prepare Data: What transforms to apply to the data to create views that best expose the structure of the prediction problem to the models.
- Evaluate Algorithms: What techniques to use to model the problem, and the metrics to filter good from bad solutions.
- Improve Results: What parameter tuning and even ensemble methods to use to get the most from what is working.
- Present Results: What results you achieved, lessons learned and the saved model or set of predictions of which you can make direct use.
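The five steps above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not a definitive template: the iris dataset, the two candidate models, the parameter grids and the accuracy metric are all choices made for the example, and scikit-learn's MLPClassifier stands in for a Keras model to keep the sketch short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# 1. Define Problem: classify iris species from four flower measurements.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# 2. Prepare Data: standardize inputs inside a pipeline, so the scaling is
#    learned only from the data each model is fit on.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "mlp": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
}

# 3. Evaluate Algorithms: compare candidates with 5-fold cross-validation.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# 4. Improve Results: tune one hyperparameter of whichever candidate did best.
param_grids = {
    "logistic_regression": {"logisticregression__C": [0.1, 1.0, 10.0]},
    "mlp": {"mlpclassifier__alpha": [1e-4, 1e-2, 1.0]},
}
search = GridSearchCV(
    candidates[best_name], param_grids[best_name], cv=5, scoring="accuracy"
)
search.fit(X_train, y_train)

# 5. Present Results: report the score on data held back from every step above.
test_accuracy = search.score(X_test, y_test)
print(f"Best candidate: {best_name}")
print(f"Best parameters: {search.best_params_}")
print(f"Held-out accuracy: {test_accuracy:.2f}")
```

The skeleton stays the same from project to project. What you swap in is the data, the candidate models and the metric.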
3. Pick a Problem
You need practice. Lots of practice.
If you are interested in predictive modeling with image data, find all the standard machine learning problems with image data and work your way through them.
Text data? Video data? Use the same approach.
Learn how to get a result using your process.
Then learn how to get a good result.
Then a world class result.
The good thing about standard machine learning datasets is that you have a benchmark score to compare your results to.
Not sure about your preferences yet? Start with Multilayer Perceptrons on standard datasets from the UCI Machine Learning Repository (here’s a tutorial). Then try Convolutional Neural Networks on standard object recognition problems (here’s a tutorial). Move on to Recurrent Neural Networks on simple time series problems (here’s a tutorial).
Later, graduate to more complex problems used in machine learning competitions like those on Kaggle. Graduate further to defining your own problems and gathering data from creative commons.
Your goal is to develop a portfolio of completed projects.
This portfolio will be a resource you can leverage as you take on larger and more challenging projects. It can also be a resource that you can use to show your growing skills with deep learning and your ability to deliver value.
You discovered that deep learning as you know it is “deep learning for academics”. Not “deep learning for developers”.
You now know that there is a whole world of libraries and tutorials designed for you and developers like you.
You discovered a simple 3-step process that you can use to get success in deep learning as a developer, summarized as:
- Pick a Framework (like Keras).
- Pick a Process (like the one listed above).
- Pick a Problem (then develop a portfolio).
Has this changed your mind about deep learning?
Leave a comment and let me know.