DTSA 5511 Introduction to Deep Learning
- Specialization: Machine Learning
- Instructor: Geena Kim, Assistant Teaching Professor
- Prior knowledge needed: Calculus, Linear algebra, Python
Learning Outcomes
- Explain what multilayer perceptrons, convolutional neural networks and recurrent neural networks are and how they work.
- Build and train neural network models using deep learning libraries such as Keras.
- Tune hyperparameters of a neural network model and an optimizer to improve performance.
- Choose and/or design neural network architectures.
- Explain how optimization and weight update works in neural network training
Course Content
We are starting off the course with a busy week. This week's module has two parts. In the first part, after a quick introduction to Deep Learning's exciting applications in self-driving cars, medical imaging, and robotics, we will learn about artificial neurons called perceptrons. Interestingly, neural networks are loosely modeled on the human brain with perceptrons mimicking neurons. After we learn to train a simple perceptron (and become aware of its limitations), we will move on to more complex multilayer perceptrons. The second part of the module introduces the backpropagation algorithm, which trains a neural network through the chain rule. We will finish by learning how deep learning libraries like Tensorflow create computation graphs for gradient computation. This week, you will have two short quizzes, a Jupyter lab programming assignment, and an accompanying Peer Review assignment. This material, notably the backpropagation algorithm, is so foundational to Deep Learning that it is essential to take the time necessary to work through and understand it.
Last week, we built our Deep Learning foundation, learning about perceptrons and the backprop algorithm. This week, we are learning about optimization methods. We will start with Stochastic Gradient Descent (SGD). SGD has several design parameters that we can tweak, including learning rate, momentum, and decay. Then we will turn our attention to advanced gradient descent methods like learning rate scheduling and Nesterov momentum. Besides vanilla gradient descent, other optimization algorithms include AdaGrad, AdaDelta, RMSprop, and Adam. We will cover general tips to reduce overfitting while training neural networks, including regularization methods like dropout and batch normalization. This week, you will build your DL toolkit, gaining experience with the Python library Keras. Assessments for the week include a quiz and a Jupyter lab notebook with an accompanying Peer Review.
This assignment is your last Jupyter lab notebook for the course. For the next three weeks, you will build hands-on experience and complete weekly mini-projects that incorporate Kaggle challenges.
This module will teach a type of neural network called convolutional neural networks, suitable for image analysis tasks. We will learn about definitions, design parameters, operations, hyperparameter tuning, and applications. There is no Jupyter lab notebook this week. You will have a brief quiz and participate in a clinically relevant Kaggle challenge mini-project. It is critical to evaluate whether cancer has spread to the sentinel lymph node for staging breast cancer. You will build a CNN model to classify whether digital pathology images show that cancer has spread to the lymph nodes. This project utilizes the PCam dataset, which has an approachable size, with the authors noting that "Models can easily be trained on a single GPU in a couple of hours, and achieve competitive scores." As you prepare for the week, look over the rubric and develop a plan for how you will complete it. It will be necessary for a project like this to work on a timeframe that allows you to run experiments. The expectation is not that you will cram the equivalent of a final project into a single week or that you need to have a top leaderboard score to receive a good grade for this project. Hopefully, you will have time to achieve some exciting results to show off in your portfolio.
This module will teach you another neural network called recurrent neural networks (RNNs) to handle sequential data. So far, we have covered feed-forward neural networks, including Multi-layer Perceptrons and CNNs. However, in biological systems, information can flow backward and forwards. RNNs do a backward pass closer to biological systems. Using RNNs has excellent benefits, especially for text data, since RNN architectures reduce the number of parameters. We will learn about the vanishing and exploding gradient problems that can arise when working with vanilla RNNs and remedies for those problems, including GRU and LSTM cells.
We don't have a quiz this week, but we have a Kaggle challenge mini-project on NLP with Disaster Tweets. The project is a Getting Started competition designed for learners building their machine learning background. The challenge is very doable in a week, but make sure to start early to run experiments and iterate a bit.
You will complete a peer reviewed final project worth 45% of your grade. You must attempt the final in order to earn a grade in the course. If you've upgraded to the for-credit version of this course, please make sure you review the additional for-credit materials in the Introductory module and anywhere else they may be found.
Note: This page is periodically updated. Course information on the Coursera platform supersedes the information on this page. Click View on Coursera button above for the most up-to-date information.