Scheduling AWS EMR cluster resizes

2019-07-22

Below is a sample of how to schedule an Amazon Elastic MapReduce (EMR) cluster resize. It is useful if you have a cluster that is used less at night or on weekends.

I used a Lambda function triggered by a CloudWatch rule. Here is my Python Lambda function:

import boto3

# Allowed range for the requested instance count
MIN = 1
MAX = 10

def lambda_handler(event, context):
    # All parameters come from the constant JSON input of the CloudWatch rule
    region = event["region"]
    ClusterId = event["ClusterId"]
    InstanceGroupId = event["InstanceGroupId"]
    InstanceCount = int(event["InstanceCount"])

    if MIN <= InstanceCount <= MAX:
        client = boto3.client("emr", region_name=region)
        # Ask EMR to resize the given instance group to the requested count
        response = client.modify_instance_groups(
            ClusterId=ClusterId,
            InstanceGroups=[{
                "InstanceGroupId": InstanceGroupId,
                "InstanceCount": InstanceCount
            }])
        return response
    else:
        # Reject counts outside [MIN, MAX] without calling EMR
        msg = "EMR cluster id %s (%s), instance group %s: InstanceCount=%d is NOT allowed [%d,%d]" % (
            ClusterId, region, InstanceGroupId, InstanceCount, MIN, MAX)
        return {"response": "ko", "message": msg}
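To try the range check without touching a real cluster, the same logic can be run against a stand-in client. This is only a sketch: `FakeEmrClient` and `resize` are hypothetical names that are not part of the function above, and the fake returns an empty dict as a placeholder rather than a real EMR response.

```python
MIN = 1
MAX = 10


class FakeEmrClient:
    """Hypothetical stand-in for the boto3 EMR client; it only records calls."""

    def __init__(self):
        self.calls = []

    def modify_instance_groups(self, **kwargs):
        self.calls.append(kwargs)
        return {}  # placeholder, not a real EMR response


def resize(event, client):
    """Same range check as the handler, with the client passed in."""
    count = int(event["InstanceCount"])
    if not MIN <= count <= MAX:
        msg = "InstanceCount=%d is NOT allowed [%d,%d]" % (count, MIN, MAX)
        return {"response": "ko", "message": msg}
    return client.modify_instance_groups(
        ClusterId=event["ClusterId"],
        InstanceGroups=[{
            "InstanceGroupId": event["InstanceGroupId"],
            "InstanceCount": count,
        }],
    )


event = {"region": "eu-west-1", "ClusterId": "j-dsds",
         "InstanceGroupId": "ig-sdsd", "InstanceCount": 8}
fake = FakeEmrClient()
accepted = resize(event, fake)                          # within range: one call recorded
rejected = resize(dict(event, InstanceCount=50), fake)  # out of range: no call made
```

Passing the client in as an argument is a design choice for this sketch only. It keeps the check testable on a laptop, while the deployed handler can keep creating its own boto3 client.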

The CloudWatch rule is configured to pass a constant JSON object as the input event, for example:

{"region": "eu-west-1","ClusterId": "j-dsds","InstanceGroupId": "ig-sdsd","InstanceCount": 8}



