
Commit 7e2cc1c

Docs: Versions the quickstart guide
Signed-off-by: Daneyon Hansen <[email protected]>
1 parent a0cdd8b commit 7e2cc1c

1 file changed: +43 -42 lines changed

site-src/guides/index.md

Lines changed: 43 additions & 42 deletions
@@ -10,13 +10,15 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 ## **Prerequisites**

 A cluster with:
-- Support for services of type `LoadBalancer`. For kind clusters, follow [this guide](https://kind.sigs.k8s.io/docs/user/loadbalancer)
+
+- Support for services of type `LoadBalancer`. For kind clusters, follow [this guide](https://kind.sigs.k8s.io/docs/user/loadbalancer)
 to get services of type LoadBalancer working.
-- Support for [sidecar containers](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/) (enabled by default since Kubernetes v1.29)
+- Support for [sidecar containers](https://kubernetes.io/docs/concepts/workloads/pods/sidecar-containers/) (enabled by default since Kubernetes v1.29)
 to run the model server deployment.

 Tooling:
-- [Helm](https://helm.sh/docs/intro/install/) installed
+
+- [Helm](https://helm.sh/docs/intro/install/) installed.

 ## **Steps**

@@ -44,7 +46,7 @@ Tooling:

 ```bash
 kubectl create secret generic hf-token --from-literal=token=$HF_TOKEN # Your Hugging Face Token with access to the set of Llama models
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/gpu-deployment.yaml
 ```

 === "CPU-Based Model Server"
@@ -63,7 +65,7 @@ Tooling:
 Deploy a sample vLLM deployment with the proper protocol to work with the LLM Instance Gateway.

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/cpu-deployment.yaml
 ```

 === "vLLM Simulator Model Server"
@@ -74,14 +76,14 @@ Tooling:
 To deploy the vLLM simulator, run the following command.

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/sim-deployment.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/sim-deployment.yaml
 ```

 ### Install the Inference Extension CRDs

-```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/latest/download/manifests.yaml
-```
+```bash
+kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.0/manifests.yaml
+```

 ### Deploy the InferencePool and Endpoint Picker Extension

@@ -144,7 +146,7 @@ Tooling:
 2. Deploy Inference Gateway:

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -157,15 +159,15 @@ Tooling:
 3. Deploy the HTTPRoute

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/httproute.yaml
 ```

 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:

 ```bash
 kubectl get httproute llm-route -o yaml
 ```
-
+
 === "Istio"

 Please note that this feature is currently in an experimental phase and is not intended for production use.
@@ -195,13 +197,13 @@ Tooling:
 3. If you run the Endpoint Picker (EPP) with the `--secure-serving` flag set to `true` (the default mode), it is currently using a self-signed certificate. As a security measure, Istio does not trust self-signed certificates by default. As a temporary workaround, you can apply the destination rule to bypass TLS verification for EPP. A more secure TLS implementation in EPP is being discussed in [Issue 582](https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/582).

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/destination-rule.yaml
 ```

 4. Deploy Gateway

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -211,13 +213,13 @@ Tooling:
 inference-gateway inference-gateway <MY_ADDRESS> True 22s
 ```

-6. Deploy the HTTPRoute
+5. Deploy the HTTPRoute

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/httproute.yaml
 ```

-7. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
+6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:

 ```bash
 kubectl get httproute llm-route -o yaml
@@ -250,7 +252,7 @@ Tooling:
 4. Deploy the Gateway

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/kgateway/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -263,7 +265,7 @@ Tooling:
 5. Deploy the HTTPRoute

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/kgateway/httproute.yaml
 ```

 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -297,7 +299,7 @@ Tooling:
 4. Deploy the Gateway

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/agentgateway/gateway.yaml
 ```

 Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
@@ -310,7 +312,7 @@ Tooling:
 5. Deploy the HTTPRoute

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/agentgateway/httproute.yaml
 ```

 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
@@ -328,10 +330,9 @@ Tooling:
 Deploy the sample InferenceObjective which allows you to specify priority of requests.

 ```bash
-kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/inferenceobjective.yaml
 ```

-
 ### Try it out

 Wait until the gateway is ready.
@@ -357,36 +358,36 @@ Tooling:

 ```bash
 helm uninstall vllm-llama3-8b-instruct
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/inferenceobjective.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/cpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/gpu-deployment.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/sim-deployment.yaml --ignore-not-found
 kubectl delete secret hf-token --ignore-not-found
 ```

 1. Uninstall the Gateway API Inference Extension CRDs

 ```bash
-kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd --ignore-not-found
+kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v1.0.0/manifests.yaml --ignore-not-found
 ```

 1. Choose one of the following options to cleanup the Inference Gateway.

 === "GKE"

 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/healthcheck.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/gcp-backend-policy.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/gke/httproute.yaml --ignore-not-found
 ```

 === "Istio"

 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/destination-rule.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/istio/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
@@ -397,7 +398,7 @@ Tooling:
 istioctl uninstall -y --purge
 ```

-1. Remove the Istio namespace
+2. Remove the Istio namespace

 ```bash
 kubectl delete ns istio-system
@@ -406,8 +407,8 @@ Tooling:
 === "Kgateway"

 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
@@ -418,13 +419,13 @@ Tooling:
 helm uninstall kgateway -n kgateway-system
 ```

-1. Uninstall the Kgateway CRDs.
+2. Uninstall the Kgateway CRDs.

 ```bash
 helm uninstall kgateway-crds -n kgateway-system
 ```

-1. Remove the Kgateway namespace.
+3. Remove the Kgateway namespace.

 ```bash
 kubectl delete ns kgateway-system
@@ -433,8 +434,8 @@ Tooling:
 === "Agentgateway"

 ```bash
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml --ignore-not-found
-kubectl delete -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/agentgateway/gateway.yaml --ignore-not-found
+kubectl delete -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/gateway/agentgateway/httproute.yaml --ignore-not-found
 ```

 The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
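
The change this commit makes throughout the guide, swapping `raw/main` links for URLs pinned to the `v1.0.0` tag, could be expressed as a small shell helper so that the version lives in one place. This is only an illustrative sketch: `IGW_VERSION`, `IGW_REPO`, and `manifest_url` are hypothetical names that appear in neither the guide nor the commit.

```bash
# Hypothetical helper, not part of the commit: builds a tag-pinned raw URL
# for a file under config/manifests in the repository.
IGW_VERSION="${IGW_VERSION:-v1.0.0}"   # release tag the commit pins to
IGW_REPO="kubernetes-sigs/gateway-api-inference-extension"

manifest_url() {
  # $1 is a path relative to config/manifests, e.g. vllm/gpu-deployment.yaml
  printf 'https://raw.githubusercontent.com/%s/refs/tags/%s/config/manifests/%s\n' \
    "$IGW_REPO" "$IGW_VERSION" "$1"
}

# Print the URL for the GPU model server manifest.
manifest_url vllm/gpu-deployment.yaml
```

With such a helper, `kubectl apply -f "$(manifest_url gateway/istio/gateway.yaml)"` would resolve to the same URL the commit hard-codes for the Istio gateway, and a later release would only require changing `IGW_VERSION`.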
