Aidan Mala

IT professional

Automated Custom AWS SageMaker Images with Azure CI/CD pipelines

SageMaker (an AWS platform for machine learning and AI tasks) in its current form lets you use a default distribution (base image) of packages and plugins. However, if compliance requires you to specify the base image in order to limit the availability of particular packages, a custom “BYO image” becomes necessary. The AWS documentation goes over how to do this manually, but that process becomes long and arduous if the base image needs to be updated on a daily schedule. In this article, I will go over how to automate the process in an Azure DevOps CI/CD pipeline.

Overview of solution

  • Clear the custom image settings on each SageMaker domain in the account
  • For each of the following, check whether it has already been created and, if not, create it
    • ECR repository
    • SageMaker image
    • SageMaker app image configuration
  • Build the Docker image
  • Push the Docker image to ECR
  • Push the Docker image from ECR to the SageMaker image (as a new image version)
  • Update all SageMaker domains in the account with the new image added
---
title: Pipeline flow
---
flowchart TB
    Trigger([Weekly schedule trigger])

    subgraph Box1 ["**Job 1**"]
        direction LR

        A@{ shape: procs, label: "SageMaker domains in account"}
        B[Clear domain settings]
        A -- for each --> B
    end

    Trigger -- Triggers --> Box1

    subgraph Box2 ["**Job 2**"]
        direction LR
        C@{ shape: procs, label: "Custom Docker images"}

        subgraph Box3 ["Matrix process"]
            direction LR

            subgraph Box4 ["Check and create"]
                direction LR

                D([ECR repository])
                E([SageMaker image])
                F([SageMaker app image config])
                J@{ shape: diamond, label: "Created?" }
                K[Create]
                D --> J
                E --> J
                F --> J
                J -- no --> K
            end

            G[Build Docker image]
            H[Push Docker image to ECR]

            I@{ shape: procs, label: "SageMaker domains in account"}
            Box4 --> G --> H -- Update --> I
        end

        C -- for each --> Box3
    end

    Box1 -- Trigger --> Box2

Clear SageMaker domain settings

First, the custom image settings on each domain must be cleared.

# List every domain ID in the account and reset its JupyterLab custom images
aws sagemaker list-domains \
  | jq -r '.Domains[].DomainId' \
  | while read -r DomainId; do
      aws sagemaker update-domain \
        --domain-id "$DomainId" \
        --default-user-settings '{"JupyterLabAppSettings": {"CustomImages": []}}'
    done

Build image

REPOSITORY_NAME: 'data-analysis-ecr'
IMAGE_TAG: '$(Build.SourceVersion)'

Check if resources have already been created

You need to log in to ECR first; follow the AWS guide for this step.

**Your aws ecr docker authentication goes here (from above)**

# Note: $(name), $(repository), etc. are Azure Pipelines macros,
# substituted into the script text before the shell runs it.

# Check if the ECR repository needs to be created
REPOSITORY_CREATED=$(aws ecr describe-repositories \
    | jq -r 'any(.repositories[].repositoryName == "$(REPOSITORY_NAME)/$(repository)"; .)')
echo "##vso[task.setvariable variable=REPOSITORY_CREATED]$REPOSITORY_CREATED"

# Check if the SageMaker image needs to be created
SAGEMAKER_IMAGE_CREATED=$(aws sagemaker list-images | jq -r 'any(.Images[].ImageName == "$(name)"; .)')
echo "##vso[task.setvariable variable=SAGEMAKER_IMAGE_CREATED]$SAGEMAKER_IMAGE_CREATED"

# Set the APP_IMAGE_CONFIG_NAME variable for later use
APP_IMAGE_CONFIG_NAME=$(echo '$(APP_IMAGE_CONFIG_PREFIX)-$(name)' | awk '{print tolower($0)}')
echo "##vso[task.setvariable variable=APP_IMAGE_CONFIG_NAME]$APP_IMAGE_CONFIG_NAME"

# Check if the app image config needs to be created
APP_IMAGE_CONFIG_CREATED=$(aws sagemaker list-app-image-configs \
    | jq -r "any(.AppImageConfigs[].AppImageConfigName == \"$APP_IMAGE_CONFIG_NAME\"; .)")
echo "##vso[task.setvariable variable=APP_IMAGE_CONFIG_CREATED]$APP_IMAGE_CONFIG_CREATED"
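
The `any(generator; condition)` form used in these checks can be tried offline against canned JSON. The sketch below is only an illustration: the `repo_exists` helper and the sample repository names are made up, and the name is passed in with `--arg` rather than through an Azure macro.

```shell
#!/bin/sh
# repo_exists NAME: reads `aws ecr describe-repositories`-style JSON on stdin
# and prints "true" if a repository called NAME is listed, otherwise "false".
# (Hypothetical helper, not part of the pipeline above.)
repo_exists() {
  jq -r --arg name "$1" 'any(.repositories[].repositoryName; . == $name)'
}

# Canned response with made-up repository names.
sample='{"repositories":[{"repositoryName":"data-analysis-ecr/jupyter"},{"repositoryName":"other"}]}'

printf '%s\n' "$sample" | repo_exists "data-analysis-ecr/jupyter"   # prints: true
printf '%s\n' "$sample" | repo_exists "missing"                     # prints: false
printf '%s\n' '{"repositories":[]}' | repo_exists "missing"         # prints: false
```

An empty list yields `false`, which is what lets the very first pipeline run fall through to the create steps.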

Create ECR repo

condition: eq(variables['REPOSITORY_CREATED'], 'false')
inputs:
  script: |
    aws ecr create-repository --repository-name "$(REPOSITORY_NAME)/$(repository)"
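
The overview also lists a SageMaker image among the resources to create. A create step for it could follow the same "only when the check returned false" pattern. The sketch below is an assumption rather than part of the pipeline shown here: `run` and `create_if_missing` are hypothetical helpers, the image name and role ARN are placeholders, and `DRY_RUN` defaults to printing the command so the snippet can be tried without AWS credentials. `aws sagemaker create-image` does need an execution role passed with `--role-arn`.

```shell
#!/bin/sh
# run CMD...: print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# create_if_missing CREATED CMD...: run CMD only when CREATED is "false",
# i.e. when the earlier check did not find the resource.
create_if_missing() {
  created="$1"
  shift
  if [ "$created" = "false" ]; then
    run "$@"
  else
    echo "skip: already exists"
  fi
}

# Placeholder values for illustration only.
IMAGE_NAME="data-analysis"
ROLE_ARN="arn:aws:iam::123456789012:role/ExampleSageMakerRole"

create_if_missing "false" aws sagemaker create-image --image-name "$IMAGE_NAME" --role-arn "$ROLE_ARN"
create_if_missing "true"  aws sagemaker create-image --image-name "$IMAGE_NAME" --role-arn "$ROLE_ARN"
```

With the default dry-run setting, the first call prints the `create-image` command and the second prints `skip: already exists`.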
        

Create app image config

if [[ "$(appType)" = "JupyterLab" ]]; then
  # Make app image config for JupyterLab
  aws sagemaker create-app-image-config \
    --app-image-config-name "$(APP_IMAGE_CONFIG_NAME)" \
    --jupyter-lab-app-image-config "{}"
else
  # Make app image config for CodeEditor
  aws sagemaker create-app-image-config \
    --app-image-config-name "$(APP_IMAGE_CONFIG_NAME)" \
    --code-editor-app-image-config "{}"
fi

Build Docker image

cd "$(dockerFileLocation)"
docker buildx build -t "$(REGISTRY)/$(REPOSITORY_NAME)/$(repository):$(IMAGE_TAG)" .

Push Docker image to ECR

docker push "$(REGISTRY)/$(REPOSITORY_NAME)/$(repository):$(IMAGE_TAG)"
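
The image reference used by the build and push steps is assembled from four pipeline variables. As a quick illustration with made-up values (the account ID, region, `jupyter` repository, and tag below are placeholders, and `image_uri` is a hypothetical helper), the reference expands like this:

```shell
#!/bin/sh
# image_uri REGISTRY REPOSITORY_NAME REPOSITORY IMAGE_TAG
# Mirrors $(REGISTRY)/$(REPOSITORY_NAME)/$(repository):$(IMAGE_TAG).
image_uri() {
  printf '%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Placeholder values; a private ECR registry host has the form
# <account-id>.dkr.ecr.<region>.amazonaws.com.
image_uri "123456789012.dkr.ecr.eu-west-1.amazonaws.com" "data-analysis-ecr" "jupyter" "0a1b2c3"
# prints: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/data-analysis-ecr/jupyter:0a1b2c3
```

The middle two segments together form the ECR repository name, which is why the repository check and create steps use the same `REPOSITORY_NAME/repository` pair.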

Push Docker image from ECR to SageMaker image

# Register the pushed image as a new version of the SageMaker image
aws sagemaker create-image-version \
  --image-name "$(name)" \
  --base-image "$(REGISTRY)/$(REPOSITORY_NAME)/$(repository):$(IMAGE_TAG)"

# Without a version number, describe-image-version returns the latest version
IMAGE_VERSION_NUMBER=$(aws sagemaker describe-image-version \
  --image-name "$(name)" \
  | jq -r '.Version')

# Keep only the two newest versions: delete the one two behind the latest
PREVIOUS_VERSION=$((IMAGE_VERSION_NUMBER - 2))

if (( PREVIOUS_VERSION > 0 )); then
  aws sagemaker delete-image-version \
    --image-name "$(name)" \
    --version-number "$PREVIOUS_VERSION"
fi

PREVIOUS_IMAGE_VERSION_NUMBER=$((IMAGE_VERSION_NUMBER - 1))
echo "##vso[task.setvariable variable=IMAGE_VERSION_NUMBER]$IMAGE_VERSION_NUMBER"
echo "##vso[task.setvariable variable=PREVIOUS_IMAGE_VERSION_NUMBER]$PREVIOUS_IMAGE_VERSION_NUMBER"
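
The version arithmetic in this step amounts to a "keep the two newest versions" rule. The sketch below walks through it for the first few version numbers; `retention_plan` is a hypothetical helper written for this illustration, and it makes no AWS calls.

```shell
#!/bin/sh
# retention_plan LATEST: for a newly created version LATEST, print which
# version is kept as "previous" and which one (if any) gets deleted.
retention_plan() {
  latest="$1"
  to_delete=$((latest - 2))
  previous=$((latest - 1))
  # Version numbers start at 1, so anything <= 0 means "no such version".
  [ "$to_delete" -gt 0 ] || to_delete=none
  [ "$previous" -gt 0 ] || previous=none
  echo "latest=$latest previous=$previous delete=$to_delete"
}

for v in 1 2 3 4; do
  retention_plan "$v"
done
# prints:
# latest=1 previous=none delete=none
# latest=2 previous=1 delete=none
# latest=3 previous=2 delete=1
# latest=4 previous=3 delete=2
```

From the third run onward, each new version pushes exactly one old version out, so a domain never has more than two versions of an image to choose from.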

Update the SageMaker domains

aws sagemaker list-domains | jq -r '.Domains[].DomainId' | while read -r DomainId; do

  # Add two images unless there is no previous version
  if (( $(PREVIOUS_IMAGE_VERSION_NUMBER) > 0 )); then
    updated_settings=$(aws sagemaker describe-domain --domain-id "$DomainId" \
      | jq -c '.DefaultUserSettings.$(appType)AppSettings.CustomImages | . + [{"ImageName":"$(name)","ImageVersionNumber":$(IMAGE_VERSION_NUMBER),"AppImageConfigName":"$(APP_IMAGE_CONFIG_NAME)"},{"ImageName":"$(name)","ImageVersionNumber":$(PREVIOUS_IMAGE_VERSION_NUMBER),"AppImageConfigName":"$(APP_IMAGE_CONFIG_NAME)"}]')
  else # If there is no previous version, only add one image
    updated_settings=$(aws sagemaker describe-domain --domain-id "$DomainId" \
      | jq -c '.DefaultUserSettings.$(appType)AppSettings.CustomImages | . + [{"ImageName":"$(name)","ImageVersionNumber":$(IMAGE_VERSION_NUMBER),"AppImageConfigName":"$(APP_IMAGE_CONFIG_NAME)"}]')
  fi

  app_type_name="$(appType)AppSettings"

  # Wrap the image list under the app-type key, e.g. JupyterLabAppSettings
  updated_settings_formatted=$(jq -n \
    --arg AppType "$app_type_name" \
    --argjson CustomImages "$updated_settings" \
    '{($AppType): {"CustomImages": $CustomImages}}')

  aws sagemaker update-domain --domain-id "$DomainId" --default-user-settings "$updated_settings_formatted"
done
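
To see the shape of the JSON that ends up in `--default-user-settings`, the payload can be built offline with `jq` alone. This is a minimal sketch under stated assumptions: `build_settings` is a hypothetical helper, the image and config names are placeholders, and it takes the domain's existing `CustomImages` array as an argument instead of calling `describe-domain`.

```shell
#!/bin/sh
# build_settings APP_TYPE IMAGE CONFIG LATEST PREVIOUS EXISTING
#   EXISTING is the domain's current CustomImages as JSON (an array or null).
#   Versions <= 0 are skipped, so PREVIOUS=0 adds only the latest version.
build_settings() {
  app_type="$1"; image="$2"; config="$3"
  latest="$4"; previous="$5"; existing="$6"
  jq -cn \
    --arg key "${app_type}AppSettings" \
    --arg image "$image" \
    --arg config "$config" \
    --argjson latest "$latest" \
    --argjson previous "$previous" \
    --argjson existing "$existing" \
    '[$latest, $previous]
     | map(select(. > 0)
           | {ImageName: $image, ImageVersionNumber: ., AppImageConfigName: $config}) as $new
     | {($key): {CustomImages: (($existing // []) + $new)}}'
}

# Placeholder names; versions 3 and 2 are attached to an empty domain setting.
build_settings JupyterLab data-analysis studio-data-analysis 3 2 '[]'
# First run for a CodeEditor image: no previous version, nothing set yet.
build_settings CodeEditor data-analysis studio-data-analysis 1 0 'null'
```

Passing values through `--arg` and `--argjson` avoids nesting quotes inside the filter, which is the part of the inline version that is easiest to get wrong.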