Commit e27afc5

Fix canary failures (#31) (#34)
Description of changes:
- Remove dockerhub login step and use ECR public gallery ubuntu image: Individual user credentials need to be created and registered for all regions to pull from dockerhub. However, this step is redundant when we are already using credentials vended by ECR and can pull from the ECR public gallery. Making this change to facilitate region expansion by avoiding having to create new dockerhub users for each region.
- Update eksctl latest git release URL: The URL has been updated and the old URL no longer works.
- Implement per region instance type config for canary and e2e tests: Canary tests in the eu-north-1 region are failing since the currently specified instance type requires a limit increase to be used. Implementing a config to specify instance type per region that is within the default limits.

Precedence for changes:
- Remove dockerhub login step and use ECR public gallery ubuntu image: aws-controllers-k8s/sagemaker-controller@2ae5c33
- Update eksctl latest git release URL: aws-controllers-k8s/sagemaker-controller#87
- Implement per region instance type config for canary and e2e tests: aws-controllers-k8s/sagemaker-controller#87

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
1 parent 0892219 commit e27afc5

4 files changed: 9 additions, 6 deletions

test/canary/Dockerfile.canary

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-FROM ubuntu:18.04
+FROM public.ecr.aws/ubuntu/ubuntu:18.04
 
 # Build time parameters
 ARG SERVICE=applicationautoscaling
@@ -30,7 +30,7 @@ RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.18.6/b
 && cp ./kubectl /bin
 
 # Install eksctl
-RUN curl --silent --location "https://github.com/weaveworks/eksctl/releases/download/latest_release/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp && mv /tmp/eksctl /bin
+RUN curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp && mv /tmp/eksctl /bin
 
 # Install Helm
 RUN curl -q -L "https://get.helm.sh/helm-v3.2.4-linux-amd64.tar.gz" | tar zxf - -C /usr/local/bin/ \
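
The eksctl change above swaps the `releases/download/latest_release/...` path for `releases/latest/download/...`, which GitHub resolves to the matching asset of the most recent release. As a rough illustration only, the same URL could be assembled in Python like this; `eksctl_download_url` and `EKSCTL_RELEASES` are hypothetical names that do not appear in the commit, and `platform.system()` stands in for `$(uname -s)` on Linux and macOS.

```python
import platform

# Base releases URL taken from the Dockerfile line in the diff above.
EKSCTL_RELEASES = "https://github.com/weaveworks/eksctl/releases"


def eksctl_download_url(system=None, arch="amd64"):
    """Build the latest-release asset URL in the form used by the updated Dockerfile.

    Hypothetical helper for illustration; the commit itself only edits the
    URL string inside the Dockerfile's curl command.
    """
    # On Linux and macOS, platform.system() reports the same kernel name
    # as `uname -s` (for example "Linux" or "Darwin").
    system = system or platform.system()
    return f"{EKSCTL_RELEASES}/latest/download/eksctl_{system}_{arch}.tar.gz"


if __name__ == "__main__":
    print(eksctl_download_url("Linux"))
```

Keeping the `latest/download` segment in one place makes a future URL change a one-line fix, which is the kind of breakage this commit addresses.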

test/canary/canary.buildspec.yaml

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ phases:
 - aws ecr get-login-password --region $CLUSTER_REGION | docker login --username AWS --password-stdin $ECR_CACHE_URI || true
 - docker pull ${ECR_CACHE_URI}:latest --quiet || true
 
-# Login to dockerhub to avoid hitting throttle limit
-- docker login -u $DOCKER_CONFIG_USERNAME -p $DOCKER_CONFIG_PASSWORD
-
 # Build test image
 - >
   docker build -f ./test/canary/Dockerfile.canary . -t ${ECR_CACHE_URI}:latest

test/e2e/common/sagemaker_utils.py

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ def sagemaker_make_endpoint_config(model_name, endpoint_config_name):
         "VariantName": "variant-1",
         "ModelName": model_name,
         "InitialInstanceCount": 1,
-        "InstanceType": "ml.c5.large",
+        "InstanceType": REPLACEMENT_VALUES["ENDPOINT_INSTANCE_TYPE"],
     }
   ],
 }

test/e2e/replacement_values.py

Lines changed: 6 additions & 0 deletions
@@ -42,8 +42,14 @@
     "sa-east-1": "737474898029.dkr.ecr.sa-east-1.amazonaws.com",
 }
 
+ENDPOINT_INSTANCE_TYPES = {
+    "eu-west-3": "ml.m5.large",
+    "eu-north-1": "ml.m5.large",
+}
+
 REPLACEMENT_VALUES = {
     "SAGEMAKER_DATA_BUCKET": get_bootstrap_resources().SageMakerDataBucketName,
     "SAGEMAKER_EXECUTION_ROLE_ARN": get_bootstrap_resources().SageMakerExecutionRoleARN,
     "SAGEMAKER_XGBOOST_IMAGE_URI": f"{SAGEMAKER_XGBOOST_IMAGE_URIS[get_region()]}/sagemaker-xgboost:1.0-1-cpu-py3",
+    "ENDPOINT_INSTANCE_TYPE": ENDPOINT_INSTANCE_TYPES.get(get_region(), 'ml.c5.large'),
 }
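
The per-region lookup added above is a dictionary `.get()` with a default: regions listed in `ENDPOINT_INSTANCE_TYPES` get their override, and every other region keeps `ml.c5.large`. The following is a minimal standalone sketch of that behaviour, not the repository's code. It assumes the region is passed in as an argument instead of being read from `get_region()`, and `DEFAULT_INSTANCE_TYPE`, `endpoint_instance_type`, and `make_production_variant` are hypothetical names introduced for illustration.

```python
# Fallback used for any region without an explicit override (value from the diff).
DEFAULT_INSTANCE_TYPE = "ml.c5.large"

# Per-region overrides, copied from the diff above.
ENDPOINT_INSTANCE_TYPES = {
    "eu-west-3": "ml.m5.large",
    "eu-north-1": "ml.m5.large",
}


def endpoint_instance_type(region):
    """Return the override for `region`, or the default when none is listed."""
    return ENDPOINT_INSTANCE_TYPES.get(region, DEFAULT_INSTANCE_TYPE)


def make_production_variant(model_name, region):
    """Hypothetical helper mirroring the variant dict in sagemaker_utils.py."""
    return {
        "VariantName": "variant-1",
        "ModelName": model_name,
        "InitialInstanceCount": 1,
        "InstanceType": endpoint_instance_type(region),
    }


if __name__ == "__main__":
    for region in ("eu-north-1", "us-west-2"):
        print(region, make_production_variant("demo-model", region)["InstanceType"])
```

With this shape, supporting a new region that cannot use the default type only requires adding one entry to the mapping; no test code has to change.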
