Commit 5f6a5d6

Merge branch 'main' into epolon/oidc-x509
2 parents dd01b5f + 0e97c15 commit 5f6a5d6

21 files changed with 3817 additions and 0 deletions

packages/@aws-cdk/aws-sagemaker/README.md

Lines changed: 39 additions & 0 deletions
@@ -156,3 +156,42 @@ import * as sagemaker from '@aws-cdk/aws-sagemaker';
 const bucket = new s3.Bucket(this, 'MyBucket');
 const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
 ```
+
+## Model Hosting
+
+Amazon SageMaker provides model hosting services for model deployment. It exposes an HTTPS
+endpoint where your machine learning model is available to serve inference requests.
+
+### Endpoint Configuration
+
+The `EndpointConfig` construct defines an endpoint configuration that can be used to provision
+one or more endpoints. In this configuration, you identify one or more models to deploy and the
+resources that you want Amazon SageMaker to provision for them. You do so by defining one or
+more production variants, each of which identifies a model and describes the resources to
+provision for it. If you are hosting multiple models, you also assign each variant a weight
+that determines its share of traffic: a variant receives its weight divided by the sum of all
+variant weights. For example, suppose that you want to host two models, A and B, and you
+assign a traffic weight of 2 to model A and 1 to model B. Amazon SageMaker then routes
+two-thirds of the traffic to model A and one-third to model B:
+
+```typescript
+import * as sagemaker from '@aws-cdk/aws-sagemaker';
+
+declare const modelA: sagemaker.Model;
+declare const modelB: sagemaker.Model;
+
+const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
+  instanceProductionVariants: [
+    {
+      model: modelA,
+      variantName: 'modelA',
+      initialVariantWeight: 2.0,
+    },
+    {
+      model: modelB,
+      variantName: 'variantB',
+      initialVariantWeight: 1.0,
+    },
+  ]
+});
+```
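
The weight arithmetic described in the README hunk can be checked with a small standalone sketch. It is not part of the commit and does not use the CDK: the `VariantWeight` interface and the `trafficShares` function are hypothetical names introduced here only to show how a 2:1 weighting turns into two-thirds and one-third of the traffic.

```typescript
// Hypothetical shape for illustration only; this is not a type exported by
// @aws-cdk/aws-sagemaker. It keeps just the two fields the arithmetic needs.
interface VariantWeight {
  variantName: string;
  initialVariantWeight: number;
}

// Computes each variant's traffic share as its weight divided by the sum of
// all variant weights.
function trafficShares(variants: VariantWeight[]): Record<string, number> {
  const total = variants.reduce((sum, v) => sum + v.initialVariantWeight, 0);
  if (total <= 0) {
    throw new Error('the sum of variant weights must be positive');
  }
  const shares: Record<string, number> = {};
  for (const v of variants) {
    shares[v.variantName] = v.initialVariantWeight / total;
  }
  return shares;
}

// Same names and weights as the README example above.
const shares = trafficShares([
  { variantName: 'modelA', initialVariantWeight: 2.0 },
  { variantName: 'variantB', initialVariantWeight: 1.0 },
]);

// Print each share as a percentage with one decimal place.
Object.keys(shares).forEach((name) => {
  console.log(`${name}: ${(shares[name] * 100).toFixed(1)}%`);
});
```

Because only the ratio matters, weights of 2 and 1 produce the same split as 20 and 10 or 0.5 and 0.25.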
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+import * as cdk from '@aws-cdk/core';
+
+/**
+ * Supported Elastic Inference (EI) instance types for SageMaker instance-based production variants.
+ * EI instances provide on-demand GPU computing for inference.
+ */
+export class AcceleratorType {
+  /**
+   * ml.eia1.large
+   */
+  public static readonly EIA1_LARGE = AcceleratorType.of('ml.eia1.large');
+
+  /**
+   * ml.eia1.medium
+   */
+  public static readonly EIA1_MEDIUM = AcceleratorType.of('ml.eia1.medium');
+
+  /**
+   * ml.eia1.xlarge
+   */
+  public static readonly EIA1_XLARGE = AcceleratorType.of('ml.eia1.xlarge');
+
+  /**
+   * ml.eia2.large
+   */
+  public static readonly EIA2_LARGE = AcceleratorType.of('ml.eia2.large');
+
+  /**
+   * ml.eia2.medium
+   */
+  public static readonly EIA2_MEDIUM = AcceleratorType.of('ml.eia2.medium');
+
+  /**
+   * ml.eia2.xlarge
+   */
+  public static readonly EIA2_XLARGE = AcceleratorType.of('ml.eia2.xlarge');
+
+  /**
+   * Builds an AcceleratorType from a given string or token (such as a CfnParameter).
+   * @param acceleratorType An accelerator type as string
+   * @returns A strongly typed AcceleratorType
+   */
+  public static of(acceleratorType: string): AcceleratorType {
+    return new AcceleratorType(acceleratorType);
+  }
+
+  private readonly acceleratorTypeIdentifier: string;
+
+  constructor(acceleratorType: string) {
+    if (cdk.Token.isUnresolved(acceleratorType) || acceleratorType.startsWith('ml.')) {
+      this.acceleratorTypeIdentifier = acceleratorType;
+    } else {
+      throw new Error(`accelerator type must start with 'ml.'; (got ${acceleratorType})`);
+    }
+  }
+
+  /**
+   * Return the accelerator type as a string
+   * @returns The accelerator type as a string
+   */
+  public toString(): string {
+    return this.acceleratorTypeIdentifier;
+  }
+}
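
To see how the constructor's check behaves, here is a simplified standalone sketch. It is not the committed class. `AcceleratorTypeSketch` and `isUnresolvedSketch` are hypothetical names. The token check is only an assumed stand-in for `cdk.Token.isUnresolved`, included so the example runs without the CDK installed, and the wording of the error message belongs to the sketch.

```typescript
// Assumed stand-in for cdk.Token.isUnresolved. The marker test below is a
// simplification for this sketch, not the CDK's actual implementation.
function isUnresolvedSketch(value: string): boolean {
  return value.indexOf('${Token[') !== -1;
}

// Simplified local copy of the validation performed by the constructor above.
class AcceleratorTypeSketch {
  public static of(acceleratorType: string): AcceleratorTypeSketch {
    return new AcceleratorTypeSketch(acceleratorType);
  }

  private readonly acceleratorTypeIdentifier: string;

  constructor(acceleratorType: string) {
    // Accept unresolved tokens as-is; otherwise require the 'ml.' prefix.
    // indexOf(...) === 0 is used here as an equivalent of startsWith.
    if (isUnresolvedSketch(acceleratorType) || acceleratorType.indexOf('ml.') === 0) {
      this.acceleratorTypeIdentifier = acceleratorType;
    } else {
      throw new Error(`accelerator type must start with 'ml.'; (got ${acceleratorType})`);
    }
  }

  public toString(): string {
    return this.acceleratorTypeIdentifier;
  }
}

// An identifier with the 'ml.' prefix is accepted and round-trips unchanged.
const accepted = AcceleratorTypeSketch.of('ml.eia2.medium').toString();
console.log(accepted);

// An identifier without the prefix is rejected at construction time.
let rejectedMessage = '';
try {
  AcceleratorTypeSketch.of('eia2.medium');
} catch (e) {
  rejectedMessage = (e as Error).message;
}
console.log(rejectedMessage);
```

Note that the check validates only the prefix, so any string beginning with `ml.` is accepted, whether or not it names one of the six predefined accelerator types.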