Reduce Amazon SageMaker costs by shutting down notebook instances automatically
Amazon SageMaker is a powerful yet costly service. I’ve had projects where it accounted for two-thirds of the AWS bill. A less obvious issue is that a SageMaker notebook instance incurs costs for as long as it is running, and it does not shut down automatically when you stop working with it (at night, for example, though your schedule may vary).
It’s good practice to always shut down your Amazon SageMaker notebook instance when you are done, but if you want a safety net, set up a scheduled shutdown. SageMaker does not provide this out of the box, but you can achieve it with the following steps:
- Write a simple Lambda function that stops the notebook via the AWS SDK.
- Set up a Step Function that keeps only the events triggered by your notebook instance with the instance status “InService”, waits for a specified amount of time (say, 8–10 hours), and then triggers your Lambda function.
- Set up a CloudWatch event rule that triggers an event any time a SageMaker notebook instance changes its state. Set your newly created Step Function as its target.
You will need the Step Function for two reasons: to delay the execution of your Lambda by a specific interval, and because you cannot set up the CloudWatch event rule in a more fine-grained way; it will always trigger on all notebook instances and all state changes.
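To make the filtering step concrete, here is a minimal sketch in plain Node.js of the decision the Step Function has to make for each incoming event. The helper name shouldScheduleShutdown and the sample events are hypothetical and only illustrate the logic; the two field names are the ones used in the state machine definition further down.

```javascript
// Hypothetical helper mirroring the filter: only an "InService" event
// for the notebook we care about should start the shutdown timer.
function shouldScheduleShutdown(detail, notebookName) {
  return (
    detail.NotebookInstanceName === notebookName &&
    detail.NotebookInstanceStatus === "InService"
  );
}

// Illustrative events: the rule fires for every notebook and every state change.
const sampleEvents = [
  { NotebookInstanceName: "your-notebook-name", NotebookInstanceStatus: "Pending" },
  { NotebookInstanceName: "your-notebook-name", NotebookInstanceStatus: "InService" },
  { NotebookInstanceName: "another-notebook", NotebookInstanceStatus: "InService" },
  { NotebookInstanceName: "your-notebook-name", NotebookInstanceStatus: "Stopped" },
];

// Keep only the events that should lead to a delayed shutdown.
const scheduled = sampleEvents.filter((e) =>
  shouldScheduleShutdown(e, "your-notebook-name")
);
console.log(scheduled.length); // prints 1
```

Of the four sample events, only the second one passes the filter; everything else should end the execution without doing anything.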
Before we look at each of the steps, let’s set up the necessary IAM role.
The IAM part
You may go for a separate role for each of the three steps, but I prefer to combine them into one. It needs the following assume-role (trust) policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": [
"lambda.amazonaws.com",
"events.amazonaws.com",
"states.amazonaws.com"
]
},
"Effect": "Allow",
"Sid": ""
}
]
}
The role will need just three permissions:
- sagemaker:StopNotebookInstance for the Lambda function,
- lambda:InvokeFunction for your Step Function state machine,
- states:StartExecution for your CloudWatch event rule.
That’s it, you don’t need any permissions to list or describe anything.
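As a rough sketch, the three permissions could be assembled into a policy document like this. The helper name buildPermissionsPolicy and the three ARN placeholders are assumptions for illustration; scoping each statement to a single resource is my own choice here, not something the setup requires.

```javascript
// Hypothetical helper: builds the permissions policy for the shared role.
// The three ARNs are placeholders supplied by the caller.
function buildPermissionsPolicy({ notebookArn, functionArn, stateMachineArn }) {
  return {
    Version: "2012-10-17",
    Statement: [
      // Used by the Lambda function.
      { Effect: "Allow", Action: "sagemaker:StopNotebookInstance", Resource: notebookArn },
      // Used by the Step Function state machine.
      { Effect: "Allow", Action: "lambda:InvokeFunction", Resource: functionArn },
      // Used by the CloudWatch event rule.
      { Effect: "Allow", Action: "states:StartExecution", Resource: stateMachineArn },
    ],
  };
}

const policy = buildPermissionsPolicy({
  notebookArn: "your-notebook-arn",
  functionArn: "your-function-arn",
  stateMachineArn: "your-state-machine-arn",
});
console.log(JSON.stringify(policy, null, 2));
```

The printed JSON can be pasted into the role as an inline policy once the placeholders are replaced with real ARNs.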
Killer Lambda function
Create a Lambda function that uses AWS SDK to stop your notebook instance. For Node.js, its code looks like this:
const AWS = require("aws-sdk");
const sagemaker = new AWS.SageMaker({apiVersion: '2017-07-24'});

// Stops the notebook instance; replace 'your-instance-name' with your own.
module.exports.handler = (event, context, callback) => {
  sagemaker.stopNotebookInstance(
    { NotebookInstanceName: 'your-instance-name' },
    (err, data) => {
      // Pass failures on as errors so the invocation is marked as failed.
      if (err) return callback(err);
      return callback(null, data);
    }
  );
};
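If you want to try the handler logic without touching AWS, one option is to inject the SageMaker client. The following is a minimal sketch under that assumption: makeHandler and the stub client are hypothetical names, and reading NotebookInstanceName from the event is an optional extension, not part of the setup described here. It relies on the .promise() method that AWS SDK v2 requests provide.

```javascript
// Hypothetical factory: takes the SageMaker client as an argument, so the
// handler can be exercised with a stub instead of a real AWS call.
function makeHandler(sagemaker, defaultNotebookName) {
  return async (event) => {
    // Assumption: the caller may pass NotebookInstanceName in the event;
    // otherwise fall back to the configured default.
    const name = (event && event.NotebookInstanceName) || defaultNotebookName;
    await sagemaker.stopNotebookInstance({ NotebookInstanceName: name }).promise();
    return { stopped: name };
  };
}

// Wiring in the deployed function (assumes aws-sdk v2 is available):
//   const AWS = require("aws-sdk");
//   module.exports.handler = makeHandler(
//     new AWS.SageMaker({ apiVersion: "2017-07-24" }), "your-instance-name");

// Local dry run with a stub client that only records the calls it receives.
const calls = [];
const stubClient = {
  stopNotebookInstance: (params) => ({
    promise: async () => {
      calls.push(params);
      return {};
    },
  }),
};
const dryRun = makeHandler(stubClient, "your-instance-name")({});
dryRun.then((result) => console.log(result, calls));
```

Because the handler is an async function, a rejected SDK call surfaces as a failed invocation without any extra callback handling.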
Step Function state machine
Create a state machine with the following definition:
{
"Comment": "A state machine that shuts down notebook your-notebook-name after 10 hours",
"StartAt": "Filter",
"States": {
"Filter": {
"Type": "Choice",
"Choices": [
{
"And": [
{
"Variable": "$.NotebookInstanceName",
"StringEquals": "your-notebook-name"
},
{
"Variable": "$.NotebookInstanceStatus",
"StringEquals": "InService"
}
],
"Next": "Wait"
}
],
"Default": "DoNothing"
},
"Wait": {
"Type": "Wait",
"Seconds": 36000,
"Next": "Invoke Shutdown"
},
"Invoke Shutdown": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"FunctionName": "your-function-arn"
},
"End": true
},
"DoNothing": {
"Type": "Pass",
"End": true
}
}
}
The parameters that you will need to adjust are the notebook name (your-notebook-name), the wait time in Seconds, and your-function-arn. Note that the step invoking the Lambda function requires a parameter called FunctionName, but it is the ARN, not the name of the Lambda function, that you will need to provide. The step DoNothing is necessary because the choice step needs a next step to point to even if the condition is not fulfilled.
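If you manage several notebooks, you may prefer to generate the definition above rather than edit it by hand. Here is a minimal sketch; buildStateMachine and its option names (waitHours, inputRoot) are hypothetical. The inputRoot option covers the case where the state machine receives the whole event instead of just its detail part.

```javascript
// Hypothetical generator for the state machine definition shown above.
function buildStateMachine({ notebookName, functionArn, waitHours, inputRoot = "$" }) {
  return {
    Comment: `Shuts down notebook ${notebookName} after ${waitHours} hours`,
    StartAt: "Filter",
    States: {
      Filter: {
        Type: "Choice",
        Choices: [
          {
            And: [
              { Variable: `${inputRoot}.NotebookInstanceName`, StringEquals: notebookName },
              { Variable: `${inputRoot}.NotebookInstanceStatus`, StringEquals: "InService" },
            ],
            Next: "Wait",
          },
        ],
        // Every other event ends the execution right away.
        Default: "DoNothing",
      },
      // The Wait state takes whole seconds, so convert and round.
      Wait: { Type: "Wait", Seconds: Math.round(waitHours * 3600), Next: "Invoke Shutdown" },
      "Invoke Shutdown": {
        Type: "Task",
        Resource: "arn:aws:states:::lambda:invoke",
        Parameters: { FunctionName: functionArn },
        End: true,
      },
      DoNothing: { Type: "Pass", End: true },
    },
  };
}

const definition = buildStateMachine({
  notebookName: "your-notebook-name",
  functionArn: "your-function-arn",
  waitHours: 10,
});
console.log(JSON.stringify(definition, null, 2));
```

With waitHours set to 10 this produces the same 36000 seconds as the hand-written definition.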
CloudWatch event rule
Create an event rule with the event type “SageMaker Notebook Instance State Change” as source type and your step function state machine as a target. For input configuration, choose “Part of the matched event” and enter $.detail to narrow down the input to the relevant part. You may choose “Matched event”, but then you would need to adjust the two Variable attributes in the step named Filter.
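For reference, here is a sketch of what the rule's event pattern and the resulting state machine input might look like. The sample event is trimmed to the fields used in this article, and matchesPattern is a deliberately simplified, hypothetical stand-in for the real pattern matching, which supports far more than top-level string fields.

```javascript
// Event pattern for the rule: all SageMaker notebook instance state changes.
const eventPattern = {
  source: ["aws.sagemaker"],
  "detail-type": ["SageMaker Notebook Instance State Change"],
};

// Illustrative event, trimmed to the fields used in this article.
const sampleEvent = {
  source: "aws.sagemaker",
  "detail-type": "SageMaker Notebook Instance State Change",
  detail: {
    NotebookInstanceName: "your-notebook-name",
    NotebookInstanceStatus: "InService",
  },
};

// Simplified matcher: each pattern key lists the allowed values
// for the top-level event field of the same name.
function matchesPattern(event, pattern) {
  return Object.entries(pattern).every(([key, allowed]) =>
    allowed.includes(event[key])
  );
}

// "Part of the matched event" with $.detail hands only this object
// to the state machine.
const stateMachineInput = sampleEvent.detail;
console.log(matchesPattern(sampleEvent, eventPattern), stateMachineInput);
```

With “Matched event” the state machine would receive sampleEvent as a whole, which is why the two Variable paths would then have to start with $.detail.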
That’s it, you’re all set.
Image courtesy: Free-Photos from Pixabay