This is a rough list of my favorite deep learning resources. It has been
useful to me for learning how to do deep learning, and I use it for revisiting
topics or as a reference. I (Guillaume Chevalier) built this list and carefully
went through all of the content listed here.
Trends
Here are the all-time
Google Trends, from 2004 up to now, September 2017:
You might also want to look at Andrej Karpathy’s
new post
about trends in Machine Learning research.
I believe that Deep learning is the key to making computers think more like
humans, and that it has a lot of potential. Some hard automation tasks that
were impossible to achieve earlier with classical algorithms can now be
solved easily with it.
Moore’s Law about exponential progress rates in computer hardware now
affects GPUs more than CPUs, because of physical limits on how tiny a
transistor can be. We are shifting toward parallel
architectures [read more]. Deep learning exploits such parallel architectures under the hood
by using GPUs. On top of that, deep learning algorithms may use Quantum
Computing and apply to brain-machine interfaces in the future.
I find that the key to intelligence and cognition is a very interesting
subject to explore and is not yet well understood. These technologies are
promising.
Online Classes
-
DL&RNN Course
- I created this richly dense course on Deep Learning and Recurrent
Neural Networks.
-
Machine Learning by Andrew Ng on Coursera
- Renowned entry-level online class with
certificate. Taught by: Andrew Ng, Associate Professor, Stanford University; Chief
Scientist, Baidu; Chairman and Co-founder, Coursera.
-
Deep Learning Specialization by Andrew Ng on Coursera
- New series of 5 Deep Learning courses by Andrew Ng, now in Python
rather than Matlab/Octave, leading to a
specialization certificate.
-
Deep Learning by Google
- Good intermediate- to advanced-level course covering high-level deep
learning concepts. I found it helps to get creative once the basics are
acquired.
-
Machine Learning for Trading by Georgia Tech
- Interesting class for acquiring basic knowledge of machine learning
applied to trading, plus some AI and finance concepts. I especially liked
the section on Q-Learning (a minimal sketch of the update rule follows
this list).
-
Neural networks class by Hugo Larochelle, Université de Sherbrooke
- Interesting class about neural networks, available online for free by
Hugo Larochelle, though I have only watched a few of those videos so far.
-
GLO-4030/7030 Apprentissage par réseaux de neurones profonds
- This is a class given by Philippe Giguère, Professor at Université
Laval. I especially liked its rare visualization of the
multi-head attention mechanism, which can be seen on
slide 28 of week 13’s class.
-
Deep Learning & Recurrent Neural Networks (DL&RNN)
- The most richly dense, accelerated course on the topic of Deep
Learning & Recurrent Neural Networks (scroll to the end).
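Since the Q-Learning section of the trading class stood out to me, here is a minimal, self-contained sketch of the tabular Q-Learning update rule. The toy corridor environment, constants, and names below are mine, made up purely for illustration:

```python
import random

# Hypothetical toy task (mine, for illustration): a 1-D corridor where the
# agent starts at the left end and is rewarded upon reaching the right end.
N_STATES = 6
ACTIONS = [-1, +1]                     # step left or step right
alpha, gamma, epsilon = 0.1, 0.9, 0.3  # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy: explore sometimes, otherwise act greedily.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a_: Q[(s, a_)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # The Q-Learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
        best_next = max(Q[(s_next, a_)] for a_ in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned greedy policy should step right (+1) in every state.
print({s: max(ACTIONS, key=lambda a_: Q[(s, a_)]) for s in range(N_STATES - 1)})
```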
Books
-
Clean Code
- Get back to the basics, you fool! Learn how to write clean code for your
career. This is by far the best book I’ve read, even though this list is
about Deep Learning.
-
Clean Coder
- Learn how to be a professional coder and how to interact with your
manager. This is important for any coding career.
-
How to Create a Mind
- The audio version is nice to listen to while commuting. This is a
motivating book about reverse-engineering the mind and thinking about how
to code AI.
-
Neural Networks and Deep Learning
- This book covers many of the core concepts behind neural networks and
deep learning.
-
Deep Learning - An MIT Press book
- I am only halfway through this book, yet it already contains satisfying
math content on how to think about actual deep learning.
-
Some other books I have read
- Some books listed here are less related to deep learning but are still
somehow relevant to this list.
Posts and Articles
-
Predictions made by Ray Kurzweil
- List of mid to long term futuristic predictions made by Ray Kurzweil.
-
The Unreasonable Effectiveness of Recurrent Neural Networks
- MUST READ post by Andrej Karpathy - this is what motivated me to learn
RNNs. It demonstrates what RNNs can achieve in the most basic form of NLP.
-
Neural Networks, Manifolds, and Topology
- A fresh look at how neurons map information.
-
Understanding LSTM Networks
- Explains the LSTM cells’ inner workings; it also has interesting
links in its conclusion. The standard gate equations are written out
after this list.
-
Attention and Augmented Recurrent Neural Networks
- Interesting for its visual animations; it is a nice introduction to
attention mechanisms.
-
Recommending music on Spotify with deep learning
- Awesome for doing clustering on audio - post by an intern at Spotify.
-
Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open
Source
- Parsey McParseface’s birth, a neural syntax tree parser.
-
Improving Inception and Image Classification in TensorFlow
- Very interesting CNN architecture (e.g., the inception-style
convolutional layers are promising and efficient in terms of reducing the
number of parameters).
-
WaveNet: A Generative Model for Raw Audio
- Realistic talking machines: perfect voice generation.
-
François Chollet’s Twitter -
Author of Keras - has interesting Twitter posts and innovative ideas.
-
Neuralink and the Brain’s Magical Future
- Thought-provoking article about the future of the brain and
brain-computer interfaces.
-
Migrating to Git LFS for Developing Deep Learning Applications with
Large Files
- Easily manage huge files in your private Git projects.
-
The future of deep learning
- François Chollet’s thoughts on the future of deep learning.
-
Discover structure behind data with decision trees
- Grow decision trees and visualize them to infer the hidden logic behind
data.
-
Hyperopt tutorial for Optimizing Neural Networks’ Hyperparameters
- Learn to search hyperparameter spaces automatically rather than by
hand.
-
Estimating an Optimal Learning Rate For a Deep Neural Network
- Clever trick to estimate an optimal learning rate prior to any full
training run. A sketch of the procedure follows this list.
-
The Annotated Transformer
- Good for understanding the “Attention Is All You Need” (AIAYN) paper.
-
The Illustrated Transformer
- Also good for understanding the “Attention Is All You Need” (AIAYN)
paper.
-
Improving Language Understanding with Unsupervised Learning
- SOTA across many NLP tasks from unsupervised pretraining on a huge
corpus.
-
NLP’s ImageNet moment has arrived
- All hail NLP’s ImageNet moment.
-
The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer
Learning)
- Understand the different approaches used for NLP’s ImageNet moment.
-
Uncle Bob’s Principles Of OOD
- Not only are the SOLID principles needed for writing clean code, but
the lesser-known REP, CCP, CRP, ADP, SDP, and SAP principles are also very
important for developing large software that must be bundled into
separate packages.
-
Why do 87% of data science projects never make it into production?
- Data is not to be overlooked, and communication between teams and data
scientists is important to integrate solutions properly.
-
The real reason most ML projects fail
- Focus on clear business objectives, avoid pivoting between algorithms
unless you have really clean code, and know when what you coded is
“good enough”.
-
SOLID Machine Learning
- The SOLID principles applied to Machine Learning.
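As a quick companion to the “Understanding LSTM Networks” post above, here are the standard LSTM cell equations in the common notation (symbols vary slightly between papers): σ is the logistic sigmoid, ⊙ is element-wise multiplication, and [h_{t-1}, x_t] is the concatenation of the previous hidden state and the current input.

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f\,[h_{t-1}, x_t] + b_f\right) &&\text{(forget gate)}\\
i_t &= \sigma\!\left(W_i\,[h_{t-1}, x_t] + b_i\right) &&\text{(input gate)}\\
\tilde{C}_t &= \tanh\!\left(W_C\,[h_{t-1}, x_t] + b_C\right) &&\text{(candidate cell state)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t &&\text{(new cell state)}\\
o_t &= \sigma\!\left(W_o\,[h_{t-1}, x_t] + b_o\right) &&\text{(output gate)}\\
h_t &= o_t \odot \tanh(C_t) &&\text{(new hidden state)}
\end{aligned}
```

The forget gate decides what to erase from the cell state, the input gate decides what to write to it, and the output gate decides what part of it to expose as the hidden state.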
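The learning-rate estimation post above boils down to a simple procedure: increase the learning rate exponentially across mini-batches, record the loss at each step, and pick a rate slightly below the one where the loss was dropping fastest. Here is a minimal sketch of that sweep on a stand-in least-squares problem; the data, model, and constants are placeholders of mine, not from the post:

```python
import numpy as np

# Stand-in least-squares problem (data, model, and constants are mine).
rng = np.random.default_rng(0)
X = rng.normal(size=(1024, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=1024)

w = np.zeros(10)
lr, lrs, losses = 1e-6, [], []
for step in range(100):
    idx = rng.integers(0, len(X), size=32)     # sample a mini-batch
    err = X[idx] @ w - y[idx]
    loss = float(np.mean(err ** 2))
    if not np.isfinite(loss) or loss > 1e6:
        break                                  # diverged: end the sweep
    lrs.append(lr)
    losses.append(loss)
    w -= lr * (2 * X[idx].T @ err / len(idx))  # one SGD step at this rate
    lr *= 1.2                                  # exponentially increase the rate

# The loss drops fastest around a good learning rate; pick slightly below it.
steepest = int(np.argmin(np.diff(losses)))
print("suggested learning rate ≈", lrs[steepest])
```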
Practical Resources
Libraries and Implementations
-
Neuraxle, a framework for machine learning pipelines
- The best framework for structuring and deploying your machine learning
projects; it is also compatible with most frameworks (e.g.:
Scikit-Learn, TensorFlow, PyTorch, Keras, and so forth).
-
TensorFlow’s GitHub repository
- The best-known deep learning framework; both high-level and low-level,
while staying flexible.
-
skflow - TensorFlow
wrapper à la scikit-learn.
-
Keras - Keras is another interesting deep
learning framework like TensorFlow; it is mostly high-level. A minimal
usage sketch follows this list.
-
carpedm20’s repositories -
Many interesting neural network architectures are implemented by
Taehoon Kim, a.k.a. carpedm20.
-
carpedm20/NTM-tensorflow
- Neural Turing Machine TensorFlow implementation.
-
Deep learning for lazybones
- Transfer learning tutorial in TensorFlow for vision from high-level
embeddings of a pretrained CNN, AlexNet 2012.
-
LSTM for Human Activity Recognition (HAR)
- Tutorial of mine on using LSTMs on time series for classification.
-
Deep stacked residual bidirectional LSTMs for HAR
- Improvements on the previous project.
-
Sequence to Sequence (seq2seq) Recurrent Neural Network (RNN) for
Time Series Prediction
- Tutorial of mine on how to predict temporal sequences of numbers,
which may be multichannel.
-
Hyperopt for a Keras CNN on CIFAR-100
- Auto (meta) optimizing a neural net (and its architecture) on the
CIFAR-100 dataset. A minimal Hyperopt sketch follows this list.
-
ML / DL repositories I starred
- GitHub is full of nice code samples & projects.
-
Smoothly Blend Image Patches
- Smooth patch merger for
semantic segmentation with a U-Net.
-
Self Governing Neural Networks (SGNN): the Projection Layer
- With this, you can use words in your deep learning models without
training or loading embeddings.
-
Neuraxle - Neuraxle
is a Machine Learning (ML) library for building neat pipelines,
providing the right abstractions to ease research, development, and
deployment of your ML applications.
-
Clean Machine Learning, a Coding Kata
- Learn the right design patterns for doing Machine Learning the clean
way, by practicing.
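To illustrate how high-level Keras is, as mentioned in the list above, here is a minimal sketch of its Sequential API on placeholder data; the layer sizes and hyperparameters are arbitrary choices of mine:

```python
import numpy as np
from tensorflow import keras

# A tiny MLP classifier: layer sizes and hyperparameters are placeholders.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data, just to show the fit/predict workflow.
x = np.random.rand(256, 20).astype("float32")
y = np.random.randint(0, 10, size=256)
model.fit(x, y, epochs=3, batch_size=32)
print(model.predict(x[:5]).shape)  # (5, 10): class probabilities
```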
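And as a companion to the Hyperopt entries above, here is a minimal sketch of Hyperopt’s fmin/TPE API. The search space and the stand-in objective below are made up for illustration; in the actual tutorial, the objective trains and evaluates a Keras CNN:

```python
from hyperopt import fmin, tpe, hp, STATUS_OK

# Hypothetical search space: a learning rate and a layer width.
space = {
    "lr": hp.loguniform("lr", -10, 0),           # in [e^-10, 1]
    "units": hp.quniform("units", 32, 512, 32),  # multiples of 32
}

def objective(params):
    # Placeholder score standing in for a real validation loss.
    score = (params["lr"] - 0.01) ** 2 + (params["units"] - 256) ** 2 * 1e-6
    return {"loss": score, "status": STATUS_OK}

# TPE suggests new hyperparameters based on past trial results.
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)  # best hyperparameters found
```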
Some Datasets
These are resources I have found that seem interesting for developing
models on.
Other Math Theory
Gradient Descent Algorithms & Optimization Theory
Complex Numbers & Digital Signal Processing
Okay, signal processing might not be directly related to deep learning,
but studying it helps build intuition for developing neural
architectures that operate on signals.
Papers
Recurrent Neural Networks
Convolutional Neural Networks
-
What is the Best Multi-Stage Architecture for Object Recognition?
- Awesome for the use of “local contrast normalization”.
-
ImageNet Classification with Deep Convolutional Neural Networks
- AlexNet, 2012 ILSVRC, breakthrough of the ReLU activation function.
-
Visualizing and Understanding Convolutional Networks
- For the “deconvnet layer”.
-
Fast and Accurate Deep Network Learning by Exponential Linear
Units
- ELU activation function for CIFAR vision tasks.
-
Very Deep Convolutional Networks for Large-Scale Image Recognition
- Interesting idea of stacking multiple 3x3 conv+ReLU layers before
pooling to emulate a bigger filter size with just a few parameters; the
arithmetic is worked out after this list. There is also a nice
“ConvNet Configuration” table.
-
Going Deeper with Convolutions
- GoogLeNet: appearance of “Inception” layers/modules; the idea is to
parallelize conv layers into many mini-convolutions of different sizes
with “same” padding, concatenated on the depth axis.
-
Highway Networks -
Highway networks: gated skip connections, a precursor to residual
connections.
-
Batch Normalization: Accelerating Deep Network Training by Reducing
Internal Covariate Shift
- Batch normalization (BN): normalize a layer’s activations using the
mean and variance computed over the whole batch, then apply a trainable
linear rescaling and shifting. The equations are given after this list.
-
U-Net: Convolutional Networks for Biomedical Image Segmentation
- The U-Net is an encoder-decoder CNN that also has skip-connections,
good for image segmentation at a per-pixel level.
-
Deep Residual Learning for Image Recognition
- Very deep residual layers with batch normalization layers - a.k.a.
“how to overfit any vision dataset with too many layers and make any
vision model work properly at recognition given enough data”.
-
Inception-v4, Inception-ResNet and the Impact of Residual Connections
on Learning
- For improving GoogLeNet with residual connections.
-
WaveNet: a Generative Model for Raw Audio
- Epic raw voice/music generation with new architectures based on
dilated causal convolutions to capture more audio length.
-
Learning a Probabilistic Latent Space of Object Shapes via 3D
Generative-Adversarial Modeling
- 3D-GANs for 3D model generation and fun 3D furniture arithmetics from
embeddings (think like word2vec word arithmetics with 3D furniture
representations).
-
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
- Incredibly fast distributed training of a CNN.
-
Densely Connected Convolutional Networks
- Best Paper Award at CVPR 2017. This new neural network architecture,
named DenseNet, improves on state-of-the-art performance on the
CIFAR-10, CIFAR-100, and SVHN datasets.
-
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for
Semantic Segmentation
- Merging the ideas of the U-Net and the DenseNet, this new neural
network is especially good for huge datasets in image segmentation.
-
Prototypical Networks for Few-shot Learning
- Use a distance metric in the loss to determine which class an object
belongs to from only a few examples. The classification rule is written
out after this list.
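To make the VGG stacking argument above concrete: two stacked 3x3 convolutions see the same 5x5 receptive field as a single 5x5 convolution, but with fewer parameters. For C input and C output channels, ignoring biases:

```latex
\underbrace{2 \times \left(3^2 C^2\right)}_{\text{two stacked } 3 \times 3} = 18\,C^2
\quad < \quad
\underbrace{5^2 C^2}_{\text{one } 5 \times 5} = 25\,C^2
```

Similarly, three stacked 3x3 convolutions emulate a 7x7 receptive field with 27C² parameters instead of 49C², with extra ReLU non-linearities in between.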
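The batch normalization entry above corresponds to the following equations from the paper, for a mini-batch B = {x₁, …, x_m}, where γ and β are trainable parameters and ε is a small constant for numerical stability:

```latex
\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad
\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} \left(x_i - \mu_B\right)^2, \qquad
\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
y_i = \gamma\,\hat{x}_i + \beta
```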
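Finally, the prototypical networks entry above classifies by a softmax over negative distances d to class prototypes c_k, each being the mean embedding f_φ of the few support examples S_k of class k:

```latex
c_k = \frac{1}{|S_k|} \sum_{(x_i,\, y_i) \in S_k} f_\phi(x_i), \qquad
p_\phi(y = k \mid x) = \frac{\exp\!\left(-d\!\left(f_\phi(x),\, c_k\right)\right)}{\sum_{k'} \exp\!\left(-d\!\left(f_\phi(x),\, c_{k'}\right)\right)}
```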
Attention Mechanisms
Other
YouTube and Videos
Misc. Hubs & Links
-
Hacker News - Maybe how
I discovered ML. Interesting trends appear on that site well before they
become a big deal.
-
DataTau - This is a hub similar to
Hacker News, but specific to data science.
-
Naver - This is a Korean search
engine - best used with Google Translate, ironically. Surprisingly,
sometimes deep learning search results and comprehensible advanced math
content show up more easily there than on Google search.
-
Arxiv Sanity Preserver -
arXiv browser with TF/IDF features.
-
Awesome Neuraxle
- An awesome list for Neuraxle, an ML framework for coding clean
production-level ML pipelines.
License
To the extent possible under law,
Guillaume Chevalier
has waived all copyright and related or neighboring rights to this work.