Closed
Labels: bug (Something isn't working)
Description
I'm finding that my TensorFlow model is ~6X slower when run from the local server than when it is run from a Jupyter notebook. I've checked nvtop while the model is running, and the GPU does appear to be used, although only for a very brief portion of the overall time. I've also tried running the model in BentoML; in that case it's also slower, but only 3X. Speeds are comparable when I run from AWS, although I'm using a T4 in that case rather than the RTX 2080 Ti that I use locally. Any suggestions on how I might diagnose the cause of the slowdown? Here are my config files (and, at the bottom, a rough sketch of the timing comparison I have in mind):
- name: Foo
  kind: RealtimeAPI
  predictor:
    type: tensorflow
    path: serving/cortex_server.py
    models:
      path: foo
      signature_key: serving_default
    image: quay.io/robertlucian/tensorflow-predictor:0.25.0-tfs
    tensorflow_serving_image: quay.io/robertlucian/cortex-tensorflow-serving-gpu-tf2.4:0.25.0
  compute:
    gpu: 1
  autoscaling:
    min_replicas: 1
    max_replicas: 1
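
For context, the predictor path above (serving/cortex_server.py) implements Cortex's TensorFlow Predictor interface; a stripped-down sketch of that shape (not my exact file) looks like this:

# simplified sketch of a Cortex 0.25 TensorFlow Predictor (not the actual serving/cortex_server.py)
class TensorFlowPredictor:
    def __init__(self, tensorflow_client, config):
        # tensorflow_client forwards inputs to the TensorFlow Serving container
        self.client = tensorflow_client

    def predict(self, payload):
        # payload is the parsed request body; this call is what actually hits TF Serving
        return self.client.predict(payload)
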
# cluster.yaml
# EKS cluster name
cluster_name: foo
# AWS region
region: us-east-1
# list of availability zones for your region
availability_zones: # default: 3 random availability zones in your region, e.g. [us-east-1a, us-east-1b, us-east-1c]
# instance type
instance_type: g4dn.xlarge
# minimum number of instances
min_instances: 1
# maximum number of instances
max_instances: 1
# disk storage size per instance (GB)
instance_volume_size: 50
# instance volume type [gp2 | io1 | st1 | sc1]
instance_volume_type: gp2
# instance volume iops (only applicable to io1)
# instance_volume_iops: 3000
# subnet visibility [public (instances will have public IPs) | private (instances will not have public IPs)]
subnet_visibility: private
# NAT gateway (required when using private subnets) [none | single | highly_available (a NAT gateway per availability zone)]
nat_gateway: single
# API load balancer scheme [internet-facing | internal]
api_load_balancer_scheme: internal
# operator load balancer scheme [internet-facing | internal]
# note: if using "internal", you must configure VPC Peering to connect your CLI to your cluster operator
operator_load_balancer_scheme: internet-facing
# to install Cortex in an existing VPC, you can provide a list of subnets for your cluster to use
# subnet_visibility (specified above in this file) must match your subnets' visibility
# this is an advanced feature (not recommended for first-time users) and requires your VPC to be configured correctly; see https://eksctl.io/usage/vpc-networking/#use-existing-vpc-other-custom-configuration
# here is an example:
# subnets:
#   - availability_zone: us-west-2a
#     subnet_id: subnet-060f3961c876872ae
#   - availability_zone: us-west-2b
#     subnet_id: subnet-0faed05adf6042ab7
# additional tags to assign to AWS resources (all resources will automatically be tagged with cortex.dev/cluster-name: <cluster_name>)
tags: # <string>: <string> map of key/value pairs
# whether to use spot instances in the cluster (default: false)
spot: true
spot_config:
  # additional instance types with identical or better specs than the primary cluster instance type (defaults to only the primary instance type)
  instance_distribution: # [similar_instance_type_1, similar_instance_type_2]
  # minimum number of on demand instances (default: 0)
  on_demand_base_capacity: 0
  # percentage of on demand instances to use after the on demand base capacity has been met [0, 100] (default: 50)
  # note: setting this to 0 may hinder cluster scale up when spot instances are not available
  on_demand_percentage_above_base_capacity: 0
  # max price for spot instances (default: the on-demand price of the primary instance type)
  max_price: # <float>
  # number of spot instance pools across which to allocate spot instances [1, 20] (default: number of instances in instance distribution)
  instance_pools: 3
  # fallback to on-demand instances if spot instances were unable to be allocated (default: true)
  on_demand_backup: true
# SSL certificate ARN (only necessary when using a custom domain)
ssl_certificate_arn:
# primary CIDR block for the cluster's VPC
vpc_cidr: 192.168.0.0/16
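
For reference, this is the kind of timing comparison I can run to separate raw model compute from serving overhead; the endpoint URL, input shape, payload format, and model directory below are placeholders, not my real values:

import time

import numpy as np
import requests
import tensorflow as tf

MODEL_DIR = "foo"                   # placeholder: local copy of the exported SavedModel
ENDPOINT = "http://localhost:8888"  # placeholder: deployed API endpoint
BATCH = np.random.rand(1, 224, 224, 3).astype("float32")  # placeholder input

# 1) time the model directly (roughly what the notebook measures)
model = tf.saved_model.load(MODEL_DIR)
infer = model.signatures["serving_default"]
infer(tf.constant(BATCH))  # warm-up so graph/CUDA kernel setup isn't counted
t0 = time.perf_counter()
for _ in range(20):
    infer(tf.constant(BATCH))
print("direct inference:", (time.perf_counter() - t0) / 20, "s/request")

# 2) time the full HTTP round trip through the served API
payload = {"inputs": BATCH.tolist()}   # adjust to whatever the predictor expects
requests.post(ENDPOINT, json=payload)  # warm-up request
t0 = time.perf_counter()
for _ in range(20):
    requests.post(ENDPOINT, json=payload)
print("end-to-end request:", (time.perf_counter() - t0) / 20, "s/request")

If the direct numbers match the notebook but the end-to-end numbers are ~6X worse, the extra time is presumably in serialization/transport or in the predictor/TF Serving hop rather than in the model itself.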