Elastic Compute for Deep Learning

Cloud service providers such as Amazon Web Services are offering data scientists and machine learning specialists new compute options designed to accelerate specialized workloads such as training deep learning models. Amazon positions these resources as more cost-effective than GPU-based options, so if you currently train your PyTorch or TensorFlow models on GPU-enhanced instances, consider trying the new accelerated EC2 instances such as the DL1. To get the word out, Amazon is sponsoring a hackathon next month to attract attention from the developer community, and it comes with a $200 AWS credit to help you develop your eligible computer vision or natural language project idea.

Powered by 8 Gaudi Accelerators

Amazon claims its DL1 instances are up to 40% more cost-effective than GPU instances for training deep learning models, and a look under the hood suggests why. Each instance is powered by 8 Habana Labs Gaudi accelerators, cross-connected with 100 Gb/s networking, each with a programmable Tensor Processor Core (TPC). Their integration into the AWS ecosystem also makes it a snap to install the supporting software (Habana Labs’ SynapseAI software stack), frameworks, and drivers from an AMI or container. The Habana Labs developer resources site can jump-start your existing model migration, whether you leverage its model garden examples or develop your own custom kernels.

What is this about a Hackathon?

The AWS Deep Learning Challenge is scheduled for January 5, 2022, through February 14, 2022. You can sign up on the hackathon’s devpost.com website. Be sure to read the terms and conditions carefully, particularly the part about holding the intellectual property rights to any data your deep learning model trains on.

They are looking for computer vision or natural language processing project ideas, two areas for which there is no shortage of training materials for students looking to gain more experience in machine learning.

Developers and teams with an eligible project idea can request AWS credits, which should help defray the cost of training your deep learning model on DL1 instances.

Tip to Reduce DL1 Training Costs Further

AWS EC2 spot pricing changes constantly, but while DL1 instances are still relatively new, there is likely to be a surplus of instances available. You may be able to stretch your AWS credits in this hackathon by checking DL1 spot pricing and availability during off-hours. I was surprised to see them priced at about a two-thirds discount.
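If you would rather script that check than watch the console, here is a minimal sketch of one way to do it. It assumes you have boto3 installed and AWS credentials configured, that the instance type is dl1.24xlarge, and that DL1 is offered in the region you pass in (us-east-1 is only my default guess). The helper names fetch_spot_prices and summarize_discount are my own, and the on-demand price is something you have to look up and supply yourself.

```python
from datetime import datetime, timedelta, timezone


def fetch_spot_prices(region="us-east-1", instance_type="dl1.24xlarge", hours=24):
    """Return recent spot price records for one instance type.

    Needs boto3 and AWS credentials. The region default is an assumption;
    use whichever region offers DL1 for your account.
    """
    import boto3  # imported here so summarize_discount works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    start = datetime.now(timezone.utc) - timedelta(hours=hours)
    paginator = ec2.get_paginator("describe_spot_price_history")
    records = []
    for page in paginator.paginate(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],
        StartTime=start,
    ):
        records.extend(page["SpotPriceHistory"])
    return records


def summarize_discount(records, on_demand_price):
    """Map each availability zone to (lowest spot price, discount fraction).

    `records` are dicts shaped like the API response entries, where
    SpotPrice is a string. `on_demand_price` is the hourly on-demand
    price you supply; it is not fetched here.
    """
    lowest = {}
    for record in records:
        zone = record["AvailabilityZone"]
        price = float(record["SpotPrice"])
        if zone not in lowest or price < lowest[zone]:
            lowest[zone] = price
    return {
        zone: (price, 1 - price / on_demand_price)
        for zone, price in lowest.items()
    }
```

To use it, call summarize_discount(fetch_spot_prices(hours=48), on_demand_price=...) with the current on-demand hourly rate filled in, then look for the zone with the largest discount fraction. Running it at a few different times of day should tell you whether off-hours are really cheaper for you.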

Looking for a Better Result

Some of you may have attended my 2019 Seattle Code Camp demonstration of an OpenCV-based supervised machine learning model for identifying emotional sentiment from facial expressions. Using 160 training images of actor Patrick Stewart, could I identify the emotion in another 40 unseen images of the actor? Disappointingly, the results were little more accurate than chance. I chalk it up to irregularity across my training data set: not all of the head shots were taken from a fully frontal perspective, and lighting conditions and the actor’s age varied significantly.

This time I’m going into the hackathon with an even better project idea (one that actually overlaps both the computer vision and natural language processing categories). After the hackathon is over, I will be sure to describe my experience using PyTorch on the DL1 instance here for you all. Hopefully, I have convinced some of you to sign up for the hackathon yourselves.

Make it so.