
Serving focus #180


Merged · 33 commits · Jun 25, 2019
Commits
ca1c29a
Rename app to deployment
deliahu Jun 22, 2019
95390b7
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 22, 2019
5b0c319
Update docs
ospillinger Jun 24, 2019
670030d
Reorganize files
ospillinger Jun 24, 2019
e9815ba
Update summary.md
ospillinger Jun 24, 2019
4cc318a
Update docs
ospillinger Jun 24, 2019
3a00a29
Update iris example
ospillinger Jun 24, 2019
5c98890
Update examples
ospillinger Jun 24, 2019
a502b20
Update docs
ospillinger Jun 24, 2019
f865cb1
Update docs
ospillinger Jun 24, 2019
8e36687
Update cli.md
ospillinger Jun 24, 2019
e086e9b
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 24, 2019
0a242b6
Update README.md
ospillinger Jun 25, 2019
0c859e9
Update README.md
ospillinger Jun 25, 2019
79bade7
Update README.md
ospillinger Jun 25, 2019
a5a1ca1
Update iris examples
deliahu Jun 25, 2019
b5b58bc
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 25, 2019
d241a17
Update docs
deliahu Jun 25, 2019
1d43535
Update docs
deliahu Jun 25, 2019
528a893
Update import.md
deliahu Jun 25, 2019
20a6ccb
Update docs
deliahu Jun 25, 2019
94f41a0
Update cli.md
deliahu Jun 25, 2019
3ddbdfb
Hide resource types from CLI help
deliahu Jun 25, 2019
dfaf564
Update README.md
ospillinger Jun 25, 2019
52144fa
Update docs
deliahu Jun 25, 2019
eb2e6a9
Update docs
deliahu Jun 25, 2019
30dcda3
Delete python-packages.md
ospillinger Jun 25, 2019
20a9727
Update tutorial.md
deliahu Jun 25, 2019
99c40ef
Update summary.md
ospillinger Jun 25, 2019
2413b0b
Update docs
deliahu Jun 25, 2019
dcf2605
Update docs
deliahu Jun 25, 2019
9b78aae
Update docs
deliahu Jun 25, 2019
0fc19c2
Update tutorial.md
deliahu Jun 25, 2019
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug-report.md
@@ -11,9 +11,9 @@ assignees: ''

[Description of the bug]

### Application Configuration
### Configuration

[If applicable, any relevant resource configuration or the name of the example application]
[If applicable, any relevant resource configuration or the name of the example]

### To Reproduce

98 changes: 27 additions & 71 deletions README.md
@@ -2,106 +2,62 @@

<br>

**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • [Demo Video](https://www.youtube.com/watch?v=tgMjCOD_ufo) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)
**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)

**Learn more:** [Website](https://cortex.dev) • [FAQ](https://docs.cortex.dev/faq) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])
**Learn more:** [Website](https://cortex.dev) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])

<br>

## Deploy, manage, and scale machine learning applications

Deploy machine learning applications without worrying about setting up infrastructure, managing dependencies, or orchestrating data pipelines.
Cortex deploys your machine learning models to your cloud infrastructure. You define your deployment with simple declarative configuration; Cortex containerizes your models, deploys them as scalable JSON APIs, and manages their lifecycle in production.

Cortex is actively maintained by Cortex Labs. We're a venture-backed team of infrastructure engineers and [we're hiring](https://angel.co/cortex-labs-inc/jobs).

<br>

## How it works

1. **Define your app:** define your app using Python, TensorFlow, and PySpark.

2. **`$ cortex deploy`:** deploy end-to-end machine learning pipelines to AWS with one command.

3. **Serve predictions:** serve real time predictions via horizontally scalable JSON APIs.

<br>

## End-to-end machine learning workflow

**Data ingestion:** connect to your data warehouse and ingest data.
**Define** your deployment using declarative configuration:

```yaml
- kind: environment
  name: dev
  data:
    type: csv
    path: s3a://my-bucket/data.csv
    schema: [@col1, @col2, ...]
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
    gpu: 2
```

**Data validation:** prevent data quality issues early.
**Deploy** to your cloud infrastructure:

```yaml
- kind: raw_column
  name: col1
  type: INT_COLUMN
  min: 0
  max: 10
```
$ cortex deploy

**Data transformation:** use custom Python and PySpark code to transform data.

```yaml
- kind: transformed_column
  name: col1_normalized
  transformer_path: normalize.py # Python / PySpark code
  input: @col1
```

**Model training:** train models with custom TensorFlow code.

```yaml
- kind: model
  name: my_model
  estimator_path: dnn.py # TensorFlow code
  target_column: @label_col
  input: [@col1_normalized, @col2_indexed, ...]
  hparams:
    hidden_units: [16, 8]
  training:
    batch_size: 32
    num_steps: 10000
Deploying ...
Ready! https://amazonaws.com/my-api
```

**Prediction serving:** serve real time predictions via JSON APIs.
**Serve** real time predictions via scalable JSON APIs:

```yaml
- kind: api
  name: my-api
  model: @my_model
  compute:
    replicas: 3
```
$ curl -d '{"a": 1, "b": 2, "c": 3}' https://amazonaws.com/my-api

**Deployment:** Cortex deploys your pipeline on scalable cloud infrastructure.

```
$ cortex deploy
Ingesting data ...
Transforming data ...
Training models ...
Deploying API ...
Ready! https://abc.amazonaws.com/my-api
{ prediction: "def" }
```

<br>

## Key features

- **Machine learning pipelines as code:** Cortex applications are defined using a simple declarative syntax that enables flexibility and reusability.
- **Machine learning deployments as code:** Cortex deployments are defined using declarative configuration.

- **Multi framework support:** Cortex supports TensorFlow models with more frameworks coming soon.

- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.

- **End-to-end machine learning workflow:** Cortex spans the machine learning workflow from feature management to model training to prediction serving.
- **Scalability:** Cortex can scale APIs to handle production workloads.

- **TensorFlow and PySpark support:** Cortex supports custom [TensorFlow](https://www.tensorflow.org) code for model training and custom [PySpark](https://spark.apache.org/docs/latest/api/python/index.html) code for data processing.
- **Rolling updates:** Cortex updates deployed APIs without any downtime.

- **Built for the cloud:** Cortex can handle production workloads and can be deployed in any AWS account in minutes.
- **Cloud native:** Cortex can be deployed on any AWS account in minutes.
4 changes: 2 additions & 2 deletions cli/cmd/delete.go
@@ -30,12 +30,12 @@ import (
var flagKeepCache bool

func init() {
    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the app")
    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the deployment")
    addEnvFlag(deleteCmd)
}

var deleteCmd = &cobra.Command{
Use: "delete [APP_NAME]",
Use: "delete [DEPLOYMENT_NAME]",
Short: "delete a deployment",
Long: "Delete a deployment.",
Args: cobra.MaximumNArgs(1),
4 changes: 2 additions & 2 deletions cli/cmd/deploy.go
@@ -37,8 +37,8 @@ func init() {

var deployCmd = &cobra.Command{
Use: "deploy",
Short: "deploy an application",
Long: "Deploy an application.",
Short: "create or update a deployment",
Long: "Create or update a deployment.",
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
deploy(flagDeployForce, false)
4 changes: 2 additions & 2 deletions cli/cmd/errors.go
@@ -95,7 +95,7 @@ func (e Error) Error() string {
func ErrorCliAlreadyInAppDir(dirPath string) error {
    return Error{
        Kind:    ErrCliAlreadyInAppDir,
        message: fmt.Sprintf("your current working directory is already in a cortex app directory (%s)", dirPath),
        message: fmt.Sprintf("your current working directory is already in a cortex directory (%s)", dirPath),
    }
}

@@ -123,6 +123,6 @@ func ErrorFailedToConnect(urlStr string) error {
func ErrorCliNotInAppDir() error {
    return Error{
        Kind:    ErrCliNotInAppDir,
        message: "your current working directory is not in or under a cortex app directory (identified via a top-level cortex.yaml file)",
        message: "your current working directory is not in or under a cortex directory (identified via a top-level cortex.yaml file)",
    }
}
2 changes: 1 addition & 1 deletion cli/cmd/get.go
@@ -42,7 +42,7 @@ func init() {
    addEnvFlag(getCmd)
    addWatchFlag(getCmd)
    addSummaryFlag(getCmd)
    addResourceTypesToHelp(getCmd)
    // addResourceTypesToHelp(getCmd)
}

var getCmd = &cobra.Command{
2 changes: 1 addition & 1 deletion cli/cmd/logs.go
@@ -27,7 +27,7 @@ func init() {
    addAppNameFlag(logsCmd)
    addEnvFlag(logsCmd)
    addVerboseFlag(logsCmd)
    addResourceTypesToHelp(logsCmd)
    // addResourceTypesToHelp(logsCmd)
}

var logsCmd = &cobra.Command{
2 changes: 1 addition & 1 deletion cli/cmd/root.go
@@ -96,7 +96,7 @@ func addWatchFlag(cmd *cobra.Command) {
}

func addAppNameFlag(cmd *cobra.Command) {
    cmd.PersistentFlags().StringVarP(&flagAppName, "app", "a", "", "app name")
    cmd.PersistentFlags().StringVarP(&flagAppName, "deployment", "d", "", "deployment name")
}

func addVerboseFlag(cmd *cobra.Command) {
47 changes: 47 additions & 0 deletions docs/apis/apis.md
@@ -0,0 +1,47 @@
# APIs

Serve models at scale and use them to build smarter applications.

## Config

```yaml
- kind: api
  name: <string> # API name (required)
  external_model:
    path: <string> # path to a zipped model dir (e.g. s3://my-bucket/model.zip)
    region: <string> # S3 region (default: us-west-2)
  compute:
    replicas: <int> # number of replicas to launch (default: 1)
    cpu: <string> # CPU request per replica (default: Null)
    gpu: <string> # gpu request per replica (default: Null)
    mem: <string> # memory request per replica (default: Null)
```

See [packaging models](packaging-models.md) for how to create the zipped model.

## Example

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
    gpu: 2
```

## Integration

APIs can be integrated into other applications or services via their JSON endpoints. The endpoint for any API has the following format: `{apis_endpoint}/{deployment_name}/{api_name}`.

The fields in the request payload for a particular API should match the raw columns that were used to train the model that it is serving. Cortex automatically applies the same transformers that were used at training time when responding to prediction requests.
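
For example, a prediction request to an API named `my-api` in a deployment named `my_deployment` might look like the following (the endpoint hostname and payload fields are illustrative):

```text
$ curl -d '{"col1": 1, "col2": 2.5, "col3": "a"}' https://<apis_endpoint>/my_deployment/my-api
```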

## Horizontal Scalability

The number of replicas for an API is set via `replicas` in its `compute` field. Replicas determine how much compute is allocated to serving prediction requests for that API: APIs with low request volumes need only a few replicas, while APIs that handle large request volumes should run more.
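
As a sketch, a high-traffic API might simply request more replicas in its configuration (the values below are illustrative):

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
  compute:
    replicas: 10
```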

## Rolling Updates

When the model that an API is serving gets updated, Cortex will update the API with the new model without any downtime.
28 changes: 28 additions & 0 deletions docs/apis/compute.md
@@ -0,0 +1,28 @@
# Compute

Compute resource requests in Cortex follow the syntax and meaning of [compute resources in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/).

For example:

```yaml
- kind: model
  ...
  compute:
    cpu: "2"
    mem: "1Gi"
    gpu: 1
```

CPU and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the training job will only be scheduled once 2 CPUs and 1Gi of memory are available, and the job will be guaranteed to have access to those resources throughout its execution. In some cases, a Cortex compute resource request can be (or may default to) `Null`.

## CPU

One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).

## Memory

One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.

## GPU

One unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed. Here's some information on [adding GPU-enabled nodes on EKS](https://docs.aws.amazon.com/en_ca/eks/latest/userguide/gpu-ami.html).
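
Putting these notations together, a `compute` block might look like the following (values are illustrative):

```yaml
compute:
  cpu: "200m"  # 0.2 virtual CPUs; equivalent to cpu: 0.2
  mem: "1Gi"   # 1 gibibyte; 1073741824 (bytes) would also be accepted
  gpu: 1       # whole GPUs only; fractional GPU requests are not allowed
```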
17 changes: 17 additions & 0 deletions docs/apis/deployment.md
@@ -0,0 +1,17 @@
# Deployment

The deployment resource groups a set of APIs that are deployed as a single unit. It must be defined in the top-level `cortex.yaml` file of every Cortex directory.

## Config

```yaml
- kind: deployment
  name: <string> # deployment name (required)
```

## Example

```yaml
- kind: deployment
  name: my_deployment
```
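
For example, a minimal `cortex.yaml` might pair a deployment with the API it groups (the API shown is illustrative and follows the config described in [APIs](apis.md)):

```yaml
- kind: deployment
  name: my_deployment

- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
```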
28 changes: 28 additions & 0 deletions docs/apis/packaging-models.md
@@ -0,0 +1,28 @@
# Packaging Models

## TensorFlow

Zip the exported estimator output in your checkpoint directory, e.g.

```text
$ ls export/estimator
saved_model.pb variables/

$ zip -r model.zip export/estimator
```

Upload the zipped file to Amazon S3, e.g.

```text
$ aws s3 cp model.zip s3://my-bucket/model.zip
```

Specify `external_model` in an API, e.g.

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/model.zip
    region: us-west-2
```
18 changes: 18 additions & 0 deletions docs/apis/statuses.md
@@ -0,0 +1,18 @@
# Resource Statuses

## Statuses

| Status | Meaning |
|----------------------|---|
| ready | API is deployed and ready to serve prediction requests |
| pending | API is waiting for another resource to be ready, or is initializing |
| updating | API is performing a rolling update |
| update pending | API will be updated when the new model is ready; a previous version of this API is ready |
| stopping | API is stopping |
| stopped | API is stopped |
| error | API was not created due to an error; run `cortex logs -v <name>` to view the logs |
| skipped | API was not created due to an error in another resource |
| update skipped | API was not updated due to an error in another resource; a previous version of this API is ready |
| upstream error | API was not created due to an error in one of its dependencies; a previous version of this API may be ready |
| upstream termination | API was not created because one of its dependencies was terminated; a previous version of this API may be ready |
| compute unavailable | API could not start due to insufficient memory, CPU, or GPU in the cluster; some replicas may be ready |
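
For example, to view the logs for an API in the `error` state (the API name is illustrative):

```text
$ cortex logs -v my-api
```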