
Serving focus #180


Merged · 33 commits · Jun 25, 2019
Commits
ca1c29a
Rename app to deployment
deliahu Jun 22, 2019
95390b7
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 22, 2019
5b0c319
Update docs
ospillinger Jun 24, 2019
670030d
Reorganize files
ospillinger Jun 24, 2019
e9815ba
Update summary.md
ospillinger Jun 24, 2019
4cc318a
Update docs
ospillinger Jun 24, 2019
3a00a29
Update iris example
ospillinger Jun 24, 2019
5c98890
Update examples
ospillinger Jun 24, 2019
a502b20
Update docs
ospillinger Jun 24, 2019
f865cb1
Update docs
ospillinger Jun 24, 2019
8e36687
Update cli.md
ospillinger Jun 24, 2019
e086e9b
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 24, 2019
0a242b6
Update README.md
ospillinger Jun 25, 2019
0c859e9
Update README.md
ospillinger Jun 25, 2019
79bade7
Update README.md
ospillinger Jun 25, 2019
a5a1ca1
Update iris examples
deliahu Jun 25, 2019
b5b58bc
Merge branch 'master' of github.com:cortexlabs/cortex into serving-focus
deliahu Jun 25, 2019
d241a17
Update docs
deliahu Jun 25, 2019
1d43535
Update docs
deliahu Jun 25, 2019
528a893
Update import.md
deliahu Jun 25, 2019
20a6ccb
Update docs
deliahu Jun 25, 2019
94f41a0
Update cli.md
deliahu Jun 25, 2019
3ddbdfb
Hide resource types from CLI help
deliahu Jun 25, 2019
dfaf564
Update README.md
ospillinger Jun 25, 2019
52144fa
Update docs
deliahu Jun 25, 2019
eb2e6a9
Update docs
deliahu Jun 25, 2019
30dcda3
Delete python-packages.md
ospillinger Jun 25, 2019
20a9727
Update tutorial.md
deliahu Jun 25, 2019
99c40ef
Update summary.md
ospillinger Jun 25, 2019
2413b0b
Update docs
deliahu Jun 25, 2019
dcf2605
Update docs
deliahu Jun 25, 2019
9b78aae
Update docs
deliahu Jun 25, 2019
0fc19c2
Update tutorial.md
deliahu Jun 25, 2019
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/bug-report.md
@@ -11,9 +11,9 @@ assignees: ''

[Description of the bug]

### Application Configuration
### Configuration

[If applicable, any relevant resource configuration or the name of the example application]
[If applicable, any relevant resource configuration or the name of the example]

### To Reproduce

98 changes: 27 additions & 71 deletions README.md
@@ -2,106 +2,62 @@

<br>

**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • [Demo Video](https://www.youtube.com/watch?v=tgMjCOD_ufo) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)
**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)

**Learn more:** [Website](https://cortex.dev) • [FAQ](https://docs.cortex.dev/faq) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])
**Learn more:** [Website](https://cortex.dev) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])

<br>

## Deploy, manage, and scale machine learning applications

Deploy machine learning applications without worrying about setting up infrastructure, managing dependencies, or orchestrating data pipelines.
Cortex deploys your machine learning models to your cloud infrastructure. You define your deployment with simple declarative configuration; Cortex containerizes your models, deploys them as scalable JSON APIs, and manages their lifecycle in production.

Cortex is actively maintained by Cortex Labs. We're a venture-backed team of infrastructure engineers and [we're hiring](https://angel.co/cortex-labs-inc/jobs).

<br>

## How it works

1. **Define your app:** define your app using Python, TensorFlow, and PySpark.

2. **`$ cortex deploy`:** deploy end-to-end machine learning pipelines to AWS with one command.

3. **Serve predictions:** serve real time predictions via horizontally scalable JSON APIs.

<br>

## End-to-end machine learning workflow

**Data ingestion:** connect to your data warehouse and ingest data.
**Define** your deployment using declarative configuration:

```yaml
- kind: environment
  name: dev
  data:
    type: csv
    path: s3a://my-bucket/data.csv
    schema: [@col1, @col2, ...]
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
    gpu: 2
```

**Data validation:** prevent data quality issues early.
**Deploy** to your cloud infrastructure:

```yaml
- kind: raw_column
  name: col1
  type: INT_COLUMN
  min: 0
  max: 10
```
$ cortex deploy

**Data transformation:** use custom Python and PySpark code to transform data.

```yaml
- kind: transformed_column
  name: col1_normalized
  transformer_path: normalize.py # Python / PySpark code
  input: @col1
```

**Model training:** train models with custom TensorFlow code.

```yaml
- kind: model
  name: my_model
  estimator_path: dnn.py # TensorFlow code
  target_column: @label_col
  input: [@col1_normalized, @col2_indexed, ...]
  hparams:
    hidden_units: [16, 8]
  training:
    batch_size: 32
    num_steps: 10000
Deploying ...
Ready! https://amazonaws.com/my-api
```

**Prediction serving:** serve real time predictions via JSON APIs.
**Serve** real time predictions via scalable JSON APIs:

```yaml
- kind: api
  name: my-api
  model: @my_model
  compute:
    replicas: 3
```
$ curl -d '{"a": 1, "b": 2, "c": 3}' https://amazonaws.com/my-api

**Deployment:** Cortex deploys your pipeline on scalable cloud infrastructure.

```
$ cortex deploy
Ingesting data ...
Transforming data ...
Training models ...
Deploying API ...
Ready! https://abc.amazonaws.com/my-api
{ prediction: "def" }
```

<br>

## Key features

- **Machine learning pipelines as code:** Cortex applications are defined using a simple declarative syntax that enables flexibility and reusability.
- **Machine learning deployments as code:** Cortex deployments are defined using declarative configuration.

- **Multi framework support:** Cortex supports TensorFlow models with more frameworks coming soon.

- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.

- **End-to-end machine learning workflow:** Cortex spans the machine learning workflow from feature management to model training to prediction serving.
- **Scalability:** Cortex can scale APIs to handle production workloads.

- **TensorFlow and PySpark support:** Cortex supports custom [TensorFlow](https://www.tensorflow.org) code for model training and custom [PySpark](https://spark.apache.org/docs/latest/api/python/index.html) code for data processing.
- **Rolling updates:** Cortex updates deployed APIs without any downtime.

- **Built for the cloud:** Cortex can handle production workloads and can be deployed in any AWS account in minutes.
- **Cloud native:** Cortex can be deployed on any AWS account in minutes.
4 changes: 2 additions & 2 deletions cli/cmd/delete.go
@@ -30,12 +30,12 @@ import (
var flagKeepCache bool

func init() {
    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the app")
    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the deployment")
    addEnvFlag(deleteCmd)
}

var deleteCmd = &cobra.Command{
Use: "delete [APP_NAME]",
Use: "delete [DEPLOYMENT_NAME]",
Short: "delete a deployment",
Long: "Delete a deployment.",
Args: cobra.MaximumNArgs(1),
4 changes: 2 additions & 2 deletions cli/cmd/deploy.go
@@ -37,8 +37,8 @@ func init() {

var deployCmd = &cobra.Command{
Use: "deploy",
Short: "deploy an application",
Long: "Deploy an application.",
Short: "create or update a deployment",
Long: "Create or update a deployment.",
Args: cobra.NoArgs,
Run: func(cmd *cobra.Command, args []string) {
deploy(flagDeployForce, false)
4 changes: 2 additions & 2 deletions cli/cmd/errors.go
@@ -95,7 +95,7 @@ func (e Error) Error() string {
func ErrorCliAlreadyInAppDir(dirPath string) error {
    return Error{
        Kind:    ErrCliAlreadyInAppDir,
        message: fmt.Sprintf("your current working directory is already in a cortex app directory (%s)", dirPath),
        message: fmt.Sprintf("your current working directory is already in a cortex directory (%s)", dirPath),
    }
}

@@ -123,6 +123,6 @@ func ErrorFailedToConnect(urlStr string) error {
func ErrorCliNotInAppDir() error {
    return Error{
        Kind:    ErrCliNotInAppDir,
        message: "your current working directory is not in or under a cortex app directory (identified via a top-level cortex.yaml file)",
        message: "your current working directory is not in or under a cortex directory (identified via a top-level cortex.yaml file)",
    }
}
2 changes: 1 addition & 1 deletion cli/cmd/get.go
@@ -42,7 +42,7 @@ func init() {
    addEnvFlag(getCmd)
    addWatchFlag(getCmd)
    addSummaryFlag(getCmd)
    addResourceTypesToHelp(getCmd)
    // addResourceTypesToHelp(getCmd)
}

var getCmd = &cobra.Command{
2 changes: 1 addition & 1 deletion cli/cmd/logs.go
@@ -27,7 +27,7 @@ func init() {
    addAppNameFlag(logsCmd)
    addEnvFlag(logsCmd)
    addVerboseFlag(logsCmd)
    addResourceTypesToHelp(logsCmd)
    // addResourceTypesToHelp(logsCmd)
}

var logsCmd = &cobra.Command{
2 changes: 1 addition & 1 deletion cli/cmd/root.go
@@ -96,7 +96,7 @@ func addWatchFlag(cmd *cobra.Command) {
}

func addAppNameFlag(cmd *cobra.Command) {
    cmd.PersistentFlags().StringVarP(&flagAppName, "app", "a", "", "app name")
    cmd.PersistentFlags().StringVarP(&flagAppName, "deployment", "d", "", "deployment name")
}

func addVerboseFlag(cmd *cobra.Command) {
47 changes: 47 additions & 0 deletions docs/apis/apis.md
@@ -0,0 +1,47 @@
# APIs

Serve models at scale and use them to build smarter applications.

## Config

```yaml
- kind: api
  name: <string> # API name (required)
  external_model:
    path: <string> # path to a zipped model dir (e.g. s3://my-bucket/model.zip)
    region: <string> # S3 region (default: us-west-2)
  compute:
    replicas: <int> # number of replicas to launch (default: 1)
    cpu: <string> # CPU request per replica (default: Null)
    gpu: <string> # gpu request per replica (default: Null)
    mem: <string> # memory request per replica (default: Null)
```

See [packaging models](packaging-models.md) for how to create the zipped model.

## Example

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
    gpu: 2
```

## Integration

APIs can be integrated into other applications or services via their JSON endpoints. The endpoint for any API has the following format: `{apis_endpoint}/{deployment_name}/{api_name}`.

The fields in the request payload for a particular API should match the raw columns that were used to train the model that it is serving. Cortex automatically applies the same transformers that were used at training time when responding to prediction requests.
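
For example, a prediction request to an API named `my-api` in a deployment named `my_deployment` might look like the following (the endpoint hostname and payload fields are illustrative):

```text
$ curl -d '{"col1": 1, "col2": 2.5, "col3": "a"}' https://<apis_endpoint>/my_deployment/my-api
```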

## Horizontal Scalability

The number of replicas for an API is set via `replicas` in its `compute` field. Replicas determine how much compute is allocated to serving prediction requests for that API: APIs with low request volumes need only a few replicas, while APIs that handle large request volumes should run more.
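
As a sketch, a high-traffic API might simply request more replicas in its configuration (the values below are illustrative):

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
  compute:
    replicas: 10
```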

## Rolling Updates

When the model that an API is serving gets updated, Cortex will update the API with the new model without any downtime.
28 changes: 28 additions & 0 deletions docs/apis/compute.md
@@ -0,0 +1,28 @@
# Compute

Compute resource requests in Cortex follow the syntax and meaning of [compute resources in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/).

For example:

```yaml
- kind: model
  ...
  compute:
    cpu: "2"
    mem: "1Gi"
    gpu: 1
```

CPU and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the training job will only be scheduled once 2 CPUs and 1Gi of memory are available, and the job will be guaranteed to have access to those resources throughout its execution. In some cases, a Cortex compute resource request can be (or may default to) `Null`.

## CPU

One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).

## Memory

One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of-two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.

## GPU

One unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed. Here's some information on [adding GPU-enabled nodes on EKS](https://docs.aws.amazon.com/en_ca/eks/latest/userguide/gpu-ami.html).
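
Putting these notations together, a `compute` block might look like the following (values are illustrative):

```yaml
compute:
  cpu: "200m"  # 0.2 virtual CPUs; equivalent to cpu: 0.2
  mem: "1Gi"   # 1 gibibyte; 1073741824 (bytes) would also be accepted
  gpu: 1       # whole GPUs only; fractional GPU requests are not allowed
```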
17 changes: 17 additions & 0 deletions docs/apis/deployment.md
@@ -0,0 +1,17 @@
# Deployment

The deployment resource groups a set of APIs that are deployed as a single unit. It must be defined in the top-level `cortex.yaml` file of every Cortex directory.

## Config

```yaml
- kind: deployment
  name: <string> # deployment name (required)
```

## Example

```yaml
- kind: deployment
  name: my_deployment
```
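
For example, a minimal `cortex.yaml` might pair a deployment with the API it groups (the API shown is illustrative and follows the config described in [APIs](apis.md)):

```yaml
- kind: deployment
  name: my_deployment

- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/my-model.zip
    region: us-west-2
  compute:
    replicas: 3
```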
28 changes: 28 additions & 0 deletions docs/apis/packaging-models.md
@@ -0,0 +1,28 @@
# Packaging Models

## TensorFlow

Zip the exported estimator output in your checkpoint directory, e.g.

```text
$ ls export/estimator
saved_model.pb variables/

$ zip -r model.zip export/estimator
```

Upload the zipped file to Amazon S3, e.g.

```text
$ aws s3 cp model.zip s3://my-bucket/model.zip
```

Specify `external_model` in an API, e.g.

```yaml
- kind: api
  name: my-api
  external_model:
    path: s3://my-bucket/model.zip
    region: us-west-2
```
18 changes: 18 additions & 0 deletions docs/apis/statuses.md
@@ -0,0 +1,18 @@
# Resource Statuses

## Statuses

| Status | Meaning |
|----------------------|---|
| ready | API is deployed and ready to serve prediction requests |
| pending | API is waiting for another resource to be ready, or is initializing |
| updating | API is performing a rolling update |
| update pending | API will be updated when the new model is ready; a previous version of this API is ready |
| stopping | API is stopping |
| stopped | API is stopped |
| error | API was not created due to an error; run `cortex logs -v <name>` to view the logs |
| skipped | API was not created due to an error in another resource |
| update skipped | API was not updated due to an error in another resource; a previous version of this API is ready |
| upstream error | API was not created due to an error in one of its dependencies; a previous version of this API may be ready |
| upstream termination | API was not created because one of its dependencies was terminated; a previous version of this API may be ready |
| compute unavailable | API could not start due to insufficient memory, CPU, or GPU in the cluster; some replicas may be ready |
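
For example, to view the logs for an API in the `error` state (the API name is illustrative):

```text
$ cortex logs -v my-api
```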