Scheduling AWS EMR cluster resizes


Below is an example of how to schedule an Amazon Elastic MapReduce (EMR) cluster resize. It is useful if you have a cluster that is used less at night or on weekends.

I used a Lambda function triggered by a CloudWatch rule. Here is my Python Lambda function:

import boto3

# Allowed range for the instance count. These values are placeholders:
# set them to the limits that make sense for your cluster.
MIN = 1
MAX = 10


def lambda_handler(event, context):
    region = event["region"]
    ClusterId = event["ClusterId"]
    InstanceGroupId = event["InstanceGroupId"]
    InstanceCount = int(event["InstanceCount"])
    # Resize only when the requested count is inside the allowed range
    if MIN <= InstanceCount <= MAX:
        client = boto3.client("emr", region_name=region)
        response = client.modify_instance_groups(
            InstanceGroups=[{
                "InstanceGroupId": InstanceGroupId,
                "InstanceCount": InstanceCount
            }]
        )
        return response
    # Out of range: report the problem instead of resizing
    msg = "EMR cluster id %s (%s), instance group %s: InstanceCount=%d is NOT allowed [%d,%d]" % (
        ClusterId, region, InstanceGroupId, InstanceCount, MIN, MAX)
    return {"response": "ko", "message": msg}
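
The function needs the id of the instance group to resize. If you don't have it at hand, a minimal sketch like the one below can list the groups of a cluster with the boto3 `list_instance_groups` call. The helper name `find_instance_groups` is made up for this post, and the sketch reads only the first page of results, which should be enough for a cluster with a handful of groups.

```python
def find_instance_groups(emr_client, cluster_id):
    """List id, type and requested size of each instance group in a cluster.

    emr_client is expected to be a boto3 EMR client,
    e.g. boto3.client("emr", region_name="eu-west-1").
    Only the first page of results is read (no pagination).
    """
    response = emr_client.list_instance_groups(ClusterId=cluster_id)
    return [
        {
            "InstanceGroupId": group["Id"],
            "Type": group["InstanceGroupType"],  # MASTER, CORE or TASK
            "RequestedInstanceCount": group["RequestedInstanceCount"],
        }
        for group in response["InstanceGroups"]
    ]
```

Calling it with a real client and your cluster id returns one entry per group; pick the `InstanceGroupId` of the CORE or TASK group you want to resize.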

The CloudWatch rule passes a constant JSON object as the input event, like this:

{"region": "eu-west-1","ClusterId": "j-dsds","InstanceGroupId": "ig-sdsd","InstanceCount": 8}
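
The rule can also be created from code rather than from the console. Here is a rough sketch, assuming two rules (one that scales down in the evening, one that scales up in the morning) created with the boto3 `put_rule` and `put_targets` calls. The rule names, the Lambda ARN, the cron times (which are in UTC) and the instance counts are made-up values for illustration, not the ones I actually use.

```python
import json

# Hypothetical ARN of the Lambda function shown above
LAMBDA_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:emr-resize"


def build_schedule(name, cron, instance_count):
    """Build the put_rule and put_targets arguments for one scheduled resize."""
    event = {
        "region": "eu-west-1",
        "ClusterId": "j-dsds",
        "InstanceGroupId": "ig-sdsd",
        "InstanceCount": instance_count,
    }
    rule = {"Name": name, "ScheduleExpression": cron, "State": "ENABLED"}
    targets = {
        "Rule": name,
        # Input is the constant JSON string delivered to the Lambda function
        "Targets": [{"Id": "1", "Arn": LAMBDA_ARN, "Input": json.dumps(event)}],
    }
    return rule, targets


# Example schedule: shrink at 20:00 UTC and grow at 06:00 UTC, Monday to Friday
SCHEDULES = [
    build_schedule("emr-scale-down", "cron(0 20 ? * MON-FRI *)", 2),
    build_schedule("emr-scale-up", "cron(0 6 ? * MON-FRI *)", 8),
]


def apply_schedules(events_client, schedules=SCHEDULES):
    """Create or update each rule and attach the Lambda target to it."""
    for rule, targets in schedules:
        events_client.put_rule(**rule)
        events_client.put_targets(**targets)
```

To apply it, call `apply_schedules(boto3.client("events", region_name="eu-west-1"))`. The Lambda function also needs a resource-based permission that lets `events.amazonaws.com` invoke it, otherwise the rules fire but the function is never called.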
