diff --git a/python/agents/model_garden_agent/.gitignore b/python/agents/model_garden_agent/.gitignore
new file mode 100644
index 00000000..4c49bd78
--- /dev/null
+++ b/python/agents/model_garden_agent/.gitignore
@@ -0,0 +1 @@
+.env
diff --git a/python/agents/model_garden_agent/README.md b/python/agents/model_garden_agent/README.md
new file mode 100644
index 00000000..28ad7e5a
--- /dev/null
+++ b/python/agents/model_garden_agent/README.md
@@ -0,0 +1,126 @@
+# Model Garden Deploy Agent
+## Overview of the Agent
+
+This project implements an ADK-based agent that provides a seamless, conversational interface to Google's Vertex AI Model Garden, a Google Cloud service for discovering, customizing, and deploying a variety of models from Google and Google Cloud partners.
+Existing ways of interacting with Model Garden, while powerful, require users to write code or navigate a complex web console. This agent bridges that gap by letting users discover models, deploy them to endpoints, and run inference on the deployed models, all through natural language prompts.
+
+## Agent Details
+### Features
+The key features of the Model Garden Agent include:
+
+| Feature | Description |
+| :--- | :--- |
+| Interaction Type | Conversational |
+| Complexity | Advanced |
+| Agent Type | Multi Agent |
+| Components | Tools, AgentTools |
+| Vertical | LLMOps |
+
+## Setup and Installation
+### Prerequisites
+
+* Python 3.11+
+* Poetry
+  * For dependency management and packaging. Please follow the instructions on the official Poetry website for installation.
+```
+pip install poetry
+```
+* A project on Google Cloud Platform
+* Google Cloud CLI
+  * For installation, please follow the instructions on the official [Google Cloud website](https://cloud.google.com/sdk/docs/install).
+
+### Installation
+```
+# Clone this repository.
+git clone https://github.com/google/adk-samples.git
+cd adk-samples/python/agents/model_garden_agent
+# Install the package and dependencies.
+poetry install
+```
+### Configuration
+Set up Google Cloud credentials.
+* Set the following environment variables in your shell, or put the same variables (without `export`) in a `.env` file.
+
+```
+export GOOGLE_GENAI_USE_VERTEXAI=true
+export GOOGLE_CLOUD_PROJECT=<your-project-id>
+export GOOGLE_CLOUD_LOCATION=<your-region>
+# Only required for deployment on Agent Engine.
+export GOOGLE_CLOUD_STORAGE_BUCKET=<your-storage-bucket>
+```
+* Authenticate your GCloud account.
+
+```
+gcloud auth application-default login
+gcloud auth application-default set-quota-project $GOOGLE_CLOUD_PROJECT
+```
+## Running the Agent
+ADK provides convenient ways to bring up agents locally and interact with them. You may talk to the agent using the `adk` CLI:
+```
+adk run model_garden_agent
+```
+Or on a web interface:
+```
+adk web
+```
+
+The `adk web` command starts a web server on your machine and prints its URL. Open the URL, select "model_garden_agent" in the top-left drop-down menu, and a chatbot interface will appear on the right. The conversation is initially blank.
+
+## Example Interaction
+Here are some example requests you can use to verify the agent:
+```
+Who are you?
+```
+```
+I am a helpful agent that assists users in deploying and managing AI models using Vertex AI Model Garden. I can help you with tasks like discovering models, getting setup recommendations, deploying models to endpoints, and running inference.
+```
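+
+You can also drive the agent programmatically. The sketch below is our addition, not part of the sample itself; it assumes ADK's `Runner` and `InMemorySessionService` APIs, and the `app_name`, user ID, and prompt are illustrative:
+```
+import asyncio
+
+from google.adk.runners import Runner
+from google.adk.sessions import InMemorySessionService
+from google.genai import types
+
+from model_garden_agent.agent import root_agent
+
+
+async def main() -> None:
+    # In-memory sessions are enough for a local smoke test.
+    session_service = InMemorySessionService()
+    runner = Runner(
+        agent=root_agent,
+        app_name="model_garden_agent",
+        session_service=session_service,
+    )
+    session = await session_service.create_session(
+        app_name="model_garden_agent", user_id="user"
+    )
+    message = types.Content(role="user", parts=[types.Part(text="Who are you?")])
+    # run_async yields intermediate events; print only the final response.
+    async for event in runner.run_async(
+        user_id="user", session_id=session.id, new_message=message
+    ):
+        if event.is_final_response() and event.content:
+            print(event.content.parts[0].text)
+
+
+asyncio.run(main())
+```
+This mirrors what `adk run` does under the hood; on older ADK versions where `create_session` is synchronous, drop that `await`.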
+
+## Running Test/Eval
+For running tests and evaluation, install the extra dependencies:
+```
+poetry install --with dev
+```
+Then the tests and evaluation can be run from the model_garden_agent directory using the pytest module:
+```
+python3 -m pytest tests
+python3 -m pytest eval
+```
+`tests` runs the agent on a sample request and makes sure that every component is functional. `eval` demonstrates how to evaluate the agent using the `AgentEvaluator` in ADK: it sends a couple of requests to the agent and expects the agent's responses to match pre-defined responses reasonably well.
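+
+An eval case boils down to a small pytest test. The sketch below is our illustration (it assumes the `AgentEvaluator.evaluate` API and `pytest-asyncio`; the dataset path is hypothetical), not the verbatim contents of the `eval` directory:
+```
+import pytest
+
+from google.adk.evaluation.agent_evaluator import AgentEvaluator
+
+
+@pytest.mark.asyncio
+async def test_basic_interaction():
+    # Replays the recorded requests in the dataset against the agent module
+    # and scores response similarity and tool usage against expectations.
+    await AgentEvaluator.evaluate(
+        agent_module="model_garden_agent",
+        eval_dataset_file_path_or_dir="eval/data/example.test.json",
+    )
+```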
+
+## Deployment
+The Model Garden Agent can be deployed to Vertex AI Agent Engine using the following commands:
+```
+poetry install --with deployment
+python3 deployment/deploy.py --create
+```
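+
+A deploy script of this kind typically wraps the agent in an `AdkApp` and registers it with Agent Engine. The sketch below is our illustration of that pattern (assuming the `vertexai.agent_engines` and `AdkApp` APIs), not the verbatim contents of `deployment/deploy.py`:
+```
+import os
+
+import vertexai
+from vertexai import agent_engines
+from vertexai.preview.reasoning_engines import AdkApp
+
+from model_garden_agent.agent import root_agent
+
+vertexai.init(
+    project=os.environ["GOOGLE_CLOUD_PROJECT"],
+    location=os.environ["GOOGLE_CLOUD_LOCATION"],
+    staging_bucket=f"gs://{os.environ['GOOGLE_CLOUD_STORAGE_BUCKET']}",
+)
+
+# Wrap the ADK agent so Agent Engine can host it.
+app = AdkApp(agent=root_agent, enable_tracing=True)
+
+remote_agent = agent_engines.create(
+    agent_engine=app,
+    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
+)
+print(f"Created remote agent: {remote_agent.resource_name}")
+```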
+When the deployment finishes, it will print a line like this:
+```
+Created remote agent: projects/<PROJECT_NUMBER>/locations/<LOCATION>/reasoningEngines/<AGENT_ENGINE_ID>
+```
+If you forget the AGENT_ENGINE_ID, you can list existing agents using:
+```
+python3 deployment/deploy.py --list
+```
+The output will look like:
+```
+All remote agents:
+
+123456789 ("model_garden_deploy_agent")
+- Create time: 2025-05-10 09:33:46.188760+00:00
+- Update time: 2025-05-10 09:34:32.763434+00:00
+```
+You may interact with the deployed agent using the `test_deployment.py` script:
+```
+$ export USER_ID=<any-string>
+$ python3 deployment/test_deployment.py --resource_id=${AGENT_ENGINE_ID} --user_id=${USER_ID}
+Found agent with resource ID: ...
+Created session for user ID: ...
+Type 'quit' to exit.
+Input: Hello. What can you do for me?
+Response: Hello! I can help you discover models in Vertex AI Model Garden, get setup recommendations, deploy models to endpoints, and run inference on them.
+```
+To delete the deployed agent, you may run the following command:
+```
+python3 deployment/deploy.py --delete --resource_id=${AGENT_ENGINE_ID}
+```
\ No newline at end of file
diff --git a/python/agents/model_garden_agent/__init__.py b/python/agents/model_garden_agent/__init__.py
new file mode 100644
index 00000000..9bf51936
--- /dev/null
+++ b/python/agents/model_garden_agent/__init__.py
@@ -0,0 +1,3 @@
+"""Model Garden Agent."""
+
+from . import agent
diff --git a/python/agents/model_garden_agent/agent.py b/python/agents/model_garden_agent/agent.py
new file mode 100644
index 00000000..c4516b97
--- /dev/null
+++ b/python/agents/model_garden_agent/agent.py
@@ -0,0 +1,131 @@
+"""Root agent for the Model Garden multi-agent system."""
+
+import os
+
+from google.adk.agents import Agent
+from google.adk.agents import SequentialAgent
+from google.adk.tools import agent_tool
+from google.adk.tools import google_search
+import vertexai
+
+from . import deploy_model_agent
+from . import model_discov_agent
+from . import model_inference_agent
+from . import setup_rec_agent
+
+vertexai.init(
+    project=os.environ.get("GOOGLE_CLOUD_PROJECT"),
+    location=os.environ.get("GOOGLE_CLOUD_LOCATION"),
+)
+
+search_agent = Agent(
+    model="gemini-2.5-flash",
+    name="search_agent",
+    description="""
+    Searches the web, preferring the Vertex AI Model Garden platform, to help users find AI models for specific tasks using public information.
+    """,
+    instruction="""
+    You're a specialist in Google Search.
+    Your purpose is to help users discover and compare AI models from Vertex AI Model Garden.
+    ALWAYS cite your sources: name the model and the source of the information directly.
+    Don't return any information that is not directly available in the sources.
+
+    When a user asks about models to use for a specific task (e.g., image generation), your job is to:
+    - Search the Vertex AI Model Garden for relevant models
+    - Return a clean, bulleted list of multiple model options
+    - Include a short 1-sentence description of each model
+    - Only include what's necessary: the model name and what it's good at
+    - Avoid making up any model names or capabilities not found in documentation
+
+    Preferred sources:
+    - Vertex AI Model Garden documentation
+    - Google Cloud blog/model comparison posts (only if relevant to Vertex AI)
+    - GitHub repos linked from Vertex AI Model Garden
+
+    Output example:
+    - **Imagen 2**: High-quality text-to-image generation, fast to deploy via Vertex AI with notebooks.
+    - **SDXL Lite**: Lightweight version of Stable Diffusion, optimized for cost-effective and fast deployment.
+    - **DreamBooth (Vertex Fine-Tuned)**: Customizable image generation, fine-tuned on your own data.
+
+    Stick to concise summaries and avoid general platform details or features unrelated to the models themselves.
+    """,
+    tools=[google_search],
+)
+
+search_agent_tool = agent_tool.AgentTool(agent=search_agent)
+discovery_agent_tool = agent_tool.AgentTool(
+    agent=model_discov_agent.model_discovery_agent
+)
+deploy_model_agent_tool = agent_tool.AgentTool(
+    agent=deploy_model_agent.deploy_model_agent
+)
+setup_rec_agent_tool = agent_tool.AgentTool(
+    agent=setup_rec_agent.setup_rec_agent
+)
+model_inference_agent_tool = agent_tool.AgentTool(
+    agent=model_inference_agent.model_inference_agent
+)
+
+# The guided workflow runs the full journey (search, discovery, setup
+# recommendations, deployment, inference) as one sequential pipeline. It is
+# defined before root_agent so it can be wired in as a tool below.
+guided_agent = SequentialAgent(
+    name="guided_agent",
+    sub_agents=[
+        search_agent,
+        model_discov_agent.model_discovery_agent,
+        setup_rec_agent.setup_rec_agent,
+        deploy_model_agent.deploy_model_agent,
+        model_inference_agent.model_inference_agent,
+    ],
+)
+guided_agent_tool = agent_tool.AgentTool(agent=guided_agent)
+
+root_agent = Agent(
+    model="gemini-2.5-flash",
+    name="model_garden_deploy_agent",
+    tools=[
+        search_agent_tool,
+        deploy_model_agent_tool,
+        model_inference_agent_tool,
+        discovery_agent_tool,
+        setup_rec_agent_tool,
+        guided_agent_tool,
+    ],
+    description=("""
+An agent that helps users deploy and manage AI models using Vertex AI Model Garden.
+It coordinates between multiple domain-specific agents to complete tasks such as model
+discovery, retrieving setup recommendations, deploying models to endpoints, running inference
+on deployed models, listing endpoints, and deleting endpoints.
+"""),
+    instruction=("""
+You are the primary interface for users interacting with the Vertex AI Model Garden Assistant.
+
+Your goal is to help users:
+- Discover, compare, and understand available models
+- Get recommendations for deployment setups
+- Deploy models to endpoints
+- Generate inference code samples
+
+You should act as a unified assistant: do not reveal sub-agents, tools, or system internals. The user should always feel like they are speaking to a single smart assistant.
+
+Depending on the user's request, route the task to the appropriate tool or full workflow.
+
+Use the following guidance:
+- If the user asks for a full deployment journey (e.g., "Help me deploy a model that can generate images"), use the `guided_agent` tool, a sequential workflow that runs search, discovery, setup, deployment, and inference in order.
+- If the user makes a targeted request (e.g., "List deployable models," "Give me setup recommendations for Gemma"), call the specific tool that handles that task.
+- Use natural conversation. Ask clarifying questions if the request is ambiguous.
+- Never say you're using another agent. Just respond with helpful, friendly answers as if you're doing it all.
+
+You have access to tools that allow you to:
+- Search and discover models
+- Get configuration recommendations
+- Deploy models and list endpoints
+- Generate inference examples
+- Run full workflows (search, setup, deploy, inference)
+
+Always maintain context and guide users smoothly through the model lifecycle.
+"""),
+)
+"""), +) + +guided_agent = SequentialAgent( + name="guided_agent", + sub_agents=[ + search_agent, + model_discov_agent.model_discovery_agent, + setup_rec_agent.setup_rec_agent, + deploy_model_agent.deploy_model_agent, + model_inference_agent.model_inference_agent, + ], +) diff --git a/python/agents/model_garden_agent/deploy_model_agent.py b/python/agents/model_garden_agent/deploy_model_agent.py new file mode 100644 index 00000000..c9712844 --- /dev/null +++ b/python/agents/model_garden_agent/deploy_model_agent.py @@ -0,0 +1,320 @@ +from datetime import datetime +import os +from typing import Optional +from google.adk.agents import Agent +from google.api_core import exceptions +from google.cloud import aiplatform +import vertexai +from vertexai import model_garden + +NotFound = exceptions.NotFound +InvalidArgument = exceptions.InvalidArgument +GoogleAPIError = exceptions.GoogleAPIError +ServiceUnavailable = exceptions.ServiceUnavailable + +vertexai.init( + project=os.environ.get("GOOGLE_CLOUD_PROJECT", None), + location=os.environ.get("GOOGLE_CLOUD_LOCATION", None), +) + + +def deploy_model_to_endpoint( + model_id: str, + endpoint_display_name: Optional[str] = None, + model_display_name: Optional[str] = None, + option_index: Optional[int] = None, +) -> dict: + """Deploys a Vertex AI Model Garden model to an endpoint. + + Args: + model_id: The ID of the model in Model Garden (e.g., + "google/gemma@gemma-2b"). + endpoint_display_name: The display name for the new endpoint. + model_display_name: The display name for the deployed model. + option_index: The index of the deployment option to use. If not provided, + the default deployment option will be used. + + Returns: + dict: status and content or error message. + """ + print(f"[DEBUG] Option index: {option_index}") + + project_id = os.environ["GOOGLE_CLOUD_PROJECT"].lower() + location = os.environ["GOOGLE_CLOUD_LOCATION"].lower() + model_id = model_id.lower() + if endpoint_display_name: + endpoint_display_name = endpoint_display_name.lower() + if model_display_name: + model_display_name = model_display_name.lower() + + aiplatform.init(project=project_id, location=location) + + try: + model = model_garden.OpenModel(model_id) + if option_index is not None: + deploy_options = model.list_deploy_options() + if option_index >= len(deploy_options): + return { + "status": "error", + "error_message": ( + f"Invalid option index {option_index} for model '{model_id}'." 
+ ), + } + selected_option = deploy_options[option_index] + + print(f"[DEBUG] Selected option: {selected_option}") + machine_type = ( + selected_option.dedicated_resources.machine_spec.machine_type + ) + accelerator_type = ( + selected_option.dedicated_resources.machine_spec.accelerator_type + ) + accelerator_count = ( + selected_option.dedicated_resources.machine_spec.accelerator_count + ) + + print(f"[DEBUG] Machine type: {machine_type}") + print(f"[DEBUG] Accelerator type: {accelerator_type}") + print(f"[DEBUG] Accelerator count: {accelerator_count}") + endpoint = model.deploy( + endpoint_display_name=endpoint_display_name, + model_display_name=model_display_name, + machine_type=machine_type, + accelerator_type=accelerator_type, + accelerator_count=accelerator_count, + ) + else: + endpoint = model.deploy( + endpoint_display_name=endpoint_display_name, + model_display_name=model_display_name, + ) + return { + "status": "success", + "Deployed model to endpoint: ": endpoint.resource_name, + } + except InvalidArgument as e: + # If the model_id format is incorrect or the model doesn't exist. + return { + "status": "error", + "error_message": ( + f"Invalid model ID or deployment parameters: {e}. Please check the" + " model ID and try again." + ), + } + except NotFound as e: + # If the model_id cannot be found. + return { + "status": "error", + "error_message": ( + f"Model '{model_id}' not found in Model Garden. Please verify the" + f" model ID and try again. Details: {e}" + ), + } + except ServiceUnavailable as e: # Specific catch for 503 errors + return { + "status": "error", + "error_message": ( + "Deployment failed due to service unavailability (503 error) for" + f" model '{model_id}'. This often means the requested resources" + " (based on the model's default/recommended configuration) are" + f" temporarily overloaded or unavailable in the '{location}'" + " region. Please try deploying again, or consider exploring" + " different deployment configurations or regions using the" + " 'get_recommended_deployment_config' tool if the issue persists." + f" Details: {e}" + ), + } + except GoogleAPIError as e: + # Catch broader API errors, e.g., permission issues, quota limits, etc. + return { + "status": "error", + "error_message": ( + f"Google Cloud API error during deployment: {e}. Please check your" + " project's permissions and quota." + ), + } + except Exception as e: + # Catch any other unexpected errors during deployment + return { + "status": "error", + "error_message": ( + f"An unexpected error occurred during model deployment: {e}" + ), + } + + +def list_endpoints() -> dict: + """Lists all Vertex AI Model Garden Endpoints in the current project and location. + + Returns: + dict: A dictionary containing status and a list of endpoint details, + or an error message. + """ + project_id = os.environ["GOOGLE_CLOUD_PROJECT"].lower() + location = os.environ["GOOGLE_CLOUD_LOCATION"].lower() + + aiplatform.init(project=project_id, location=location) + + try: + filter_str = "labels.mg-deploy:* OR labels.mg-one-click-deploy:*" + endpoints = aiplatform.Endpoint.list(filter=filter_str, location=location) + + if not endpoints: + return { + "status": "success", + "content": ( + "No Model Garden endpoints found in this project and location." 
+ ), + } + + endpoint_list = [] + print(f"[DEBUG] endpoints: {endpoints}") + for ep in endpoints: + raw_time = ep.create_time.isoformat() + dt = datetime.fromisoformat(raw_time.replace("Z", "+00:00")) + formatted_time = dt.strftime("%B %d, %Y at %I:%M %p %Z") + + # Determine deployment status + if ep.traffic_split: + status = "Active" + else: + status = "Inactive" + + endpoint_list.append( + f"- ID: {ep.name.split('/')[-1]}\n" + f" Display Name: {ep.display_name}\n" + f" Status: {status}\n" + f" Created: {formatted_time}" + ) + + formatted_output = ( + "Here are your Model Garden endpoints:\n\n" + "\n\n".join(endpoint_list) + ) + + return { + "status": "success", + "content": formatted_output, + } + except GoogleAPIError as e: + return { + "status": "error", + "error_message": ( + f"Google Cloud API error while listing endpoints: {e}. Please check" + " your project's permissions and network connectivity." + ), + } + except Exception as e: + return { + "status": "error", + "error_message": ( + f"An unexpected error occurred while listing endpoints: {e}" + ), + } + + +def delete_endpoint(endpoint_id: str) -> str: + """Deletes a Vertex AI endpoint by ID. + + Args: + endpoint_id: The ID of the endpoint to delete. + + Returns: + A confirmation string if successful. + """ + project_id = os.environ["GOOGLE_CLOUD_PROJECT"].lower() + location = os.environ["GOOGLE_CLOUD_LOCATION"].lower() + endpoint_id = endpoint_id.lower() + + aiplatform.init(project=project_id, location=location) + try: + endpoint = aiplatform.Endpoint( + endpoint_name=( + f"projects/{project_id}/locations/{location}/endpoints/{endpoint_id}" + ) + ) + endpoint.delete(force=True) + return {"status": "success", "content": f"Deleted endpoint: {endpoint_id}"} + except NotFound as e: + # This exception is raised if the endpoint with the given ID doesn't exist. + return { + "status": "error", + "error_message": ( + f"Endpoint with ID '{endpoint_id}' not found. Please verify the" + f" endpoint ID and try again. Details: {e}" + ), + } + except InvalidArgument as e: + # This could happen if the endpoint_id format is malformed + return { + "status": "error", + "error_message": ( + f"Invalid endpoint ID format: {e}. Please provide a valid endpoint" + " ID." + ), + } + except GoogleAPIError as e: + # Catch broader API errors for deletion + return { + "status": "error", + "error_message": ( + f"Google Cloud API error during endpoint deletion: {e}. Please" + " check your project's permissions." + ), + } + except Exception as e: + # Catch any other unexpected errors during deletion + return { + "status": "error", + "error_message": ( + f"An unexpected error occurred during endpoint deletion: {e}" + ), + } + + +deploy_model_agent = Agent( + model="gemini-2.5-flash", + name="deploy_model_agent", + description=( + "A helpful agent for deploying AI models with Vertex Model Garden and" + " deletes them when no longer needed." + ), + instruction=(""" +You are a sub-agent in a multi-agent system that helps users deploy and manage AI models using Vertex AI Model Garden. +User requests are routed to this agent when they mention deploying or deleting endpoints. +Do not refer to yourself as a sub-agent or mention transfers. +Only respond to requests that fall within the scope of this agent. +If the user asks for something outside of this agent's scope, return control to the main agent. +Your purpose is to deploy AI models on Vertex Model Garden to Vertex AI endpoints, +using either a default or a recommended configuration. 
+
+You are capable of the following functions:
+- Deploying selected models using a default or recommended configuration.
+- Listing all endpoints in the current project and location.
+- Deleting deployed endpoints when the user is done with them.
+
+When deploying:
+- If the user selects a specific option (e.g., "option 1"), use that exact configuration
+  from the recommendations.
+- DO NOT fall back to the default deployment if a config is specified but fails,
+  unless the user explicitly asks for it.
+- Assume the default endpoint and model display names are sufficient.
+
+After deploying:
+- Inform the user that you can help them run inference on the model they just deployed.
+
+When listing Model Garden endpoints:
+- If there are no endpoints, return a friendly message informing the user
+  that they have no Model Garden endpoints in this project and location.
+- If there are Model Garden endpoints, return a list of endpoints with their ID, display name,
+  and create time.
+
+Before deleting an endpoint:
+- Always ask the user to confirm the endpoint ID and their intent to delete.
+- Do not call the deletion tool without explicit confirmation.
+"""),
+    tools=[
+        deploy_model_to_endpoint,
+        delete_endpoint,
+        list_endpoints,
+    ],
+)
diff --git a/python/agents/model_garden_agent/model_discov_agent.py b/python/agents/model_garden_agent/model_discov_agent.py
new file mode 100644
index 00000000..e5e445ba
--- /dev/null
+++ b/python/agents/model_garden_agent/model_discov_agent.py
@@ -0,0 +1,187 @@
+"""Agent and tools for discovering models in Vertex AI Model Garden."""
+
+import os
+import subprocess
+
+from google.adk.agents import Agent
+import requests
+import vertexai
+from vertexai import model_garden
+
+vertexai.init(
+    project=os.environ.get("GOOGLE_CLOUD_PROJECT"),
+    location=os.environ.get("GOOGLE_CLOUD_LOCATION"),
+)
+
+
+def make_model_garden_search_request(query: str) -> dict:
+    """Makes an authenticated GET request to the Model Garden search API.
+
+    Args:
+        query: The search query string.
+
+    Returns:
+        dict: the raw search results under "raw_data" and a formatted,
+        user-facing summary under "formatted_list". On failure, "raw_data"
+        is empty and "formatted_list" carries the error message.
+ """ + # Get the Google Cloud access token + try: + process = subprocess.run( + ["gcloud", "auth", "print-access-token"], + capture_output=True, + text=True, + check=True, + ) + access_token = process.stdout.strip() + except subprocess.CalledProcessError as e: + print(f"Error getting access token: {e}") + return None + + # Define the request parameters + project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", None) + publisher = "google" + endpoint = "us-central1-aiplatform.googleapis.com" + + # Construct the full URL with query parameters + url = f"https://{endpoint}/ui/publishers/{publisher}:search" + params = {"query": query} + + print(f"Making request to: {url} with query: '{query}'") + + headers = { + "Authorization": f"Bearer {access_token}", + "Content-Type": "application/json", + "X-Goog-User-Project": project_id, + } + + # Make the GET request + try: + response = requests.get(url, headers=headers, params=params) + response.raise_for_status() # Raise an HTTPError for bad responses (4xx or 5xx) + + search_results = response.json() + raw_data = search_results.get("publisherModels", []) + + # formatted list for direct output + formatted_output = "" + if raw_data: + formatted_lines = [ + f"I found {len(raw_data)} model(s) matching your request:" + ] + for model in raw_data: + display_name = model.get("displayName", "N/A") + overview = model.get("overview", "No description available.") + formatted_lines.append(f"\n- **Model Name:** {display_name}") + formatted_lines.append(f" **Description:** {overview}") + formatted_output = "\n".join(formatted_lines) + else: + formatted_output = ( + f"I could not find any models matching the query: '{query}'." + ) + + return { + "raw_data": raw_data, + "formatted_list": formatted_output, + } + + except requests.exceptions.RequestException as e: + print(f"API request error: {e}") + return { + "raw_data": [], + "formatted_list": f"An API request error occurred: {e}", + } + except Exception as e: + print(f"An unexpected error occurred: {e}") + return { + "raw_data": [], + "formatted_list": f"An unexpected error occurred: {e}", + } + + +def list_deployable_models(model_filter: str) -> dict: + """Lists all deployable models on vertex model garden filtered by the given filter string. + + Args: + model_filter (str): A string for filtering the resulting list of deployable + models. The string can only contain letters, numbers, hyphens (-), + underscores (_), and periods (.) The string will be matched against + specific model names and must therefore not include anything that would + not be found in a model name. + + Returns: + dict: status and content or error message. + """ + result = {} + try: + all_model_garden_models = model_garden.list_deployable_models( + model_filter="", list_hf_models=False + ) + model_garden_results = [ + model + for model in all_model_garden_models + if model_filter.lower() in model + ] + huggingface_results = model_garden.list_deployable_models( + model_filter=model_filter.lower(), list_hf_models=True + ) + model_search_results = model_garden_results + huggingface_results + if not model_search_results: + result["status"] = "error" + result["error_message"] = ( + "No deployable models with the given filter were found. Please try" + " searching again with a different filter." + ) + else: + result["status"] = "success" + result["content"] = ( + f"The number of models found is {len(model_search_results)}." 
+ f" The models found are :{model_search_results}" + ) + + except ValueError as e: + result["status"] = "error" + result["error_message"] = f"{e}" + + return result + + +model_discovery_agent = Agent( + model="gemini-2.5-flash", + name="model_discovery_agent", + description=( + "A helpful agent for discovering deployable models from Vertex AI Model" + " Garden using a filter." + ), + instruction=(""" +You are a specialized agent within a multi-agent system, focused on helping users find and reason about models in the Vertex AI Model Garden catalog. +You should not perform any web searches or answer general knowledge questions. Your knowledge is strictly limited to the Model Garden catalog. + +Your primary role is to interpret a user's request and intelligently use the `make_model_garden_search_request` tool to find and present model information. + +When a user asks to find a model, follow these steps: + +- Step 1: Use the `make_model_garden_search_request` tool to retrieve data. + - Call the `make_model_garden_search_request` tool with the user's query as the `query` argument. + +- Step 2: Determine how to present the output based on user intent. + - If the user's query was a Direct Search** (e.g., "list models with keyword `gemma`" or a specific model name), + take the `formatted_list` string from the tool's output and present it directly to the user. Do not add any extra analysis. + - If the user's query was a Reasoning Search** (e.g., "best lightweight model for text generation"), ignore the `formatted_list`. + Instead, analyze the `raw_data` list to make a recommendation. + - Filter and rank the models based on the user's criteria (e.g., "lightweight" implies low resource requirements, + "best" implies a high trending score or popular downloads). + - Construct a final conversational response that recommends the model(s) and explains the reasoning behind the choice. + +- Step 3: Handle failures and out-of-scope requests. + - If the `make_model_garden_search_request` tool's output indicates that no models were found, state that clearly. + - If the user's request is completely outside the scope of Model Garden (e.g., "What is the weather?"), + indicate that you cannot help with that specific request and return control to the main agent. +"""), + tools=[ + list_deployable_models, + make_model_garden_search_request, + ], +) diff --git a/python/agents/model_garden_agent/model_inference_agent.py b/python/agents/model_garden_agent/model_inference_agent.py new file mode 100644 index 00000000..aea08cb5 --- /dev/null +++ b/python/agents/model_garden_agent/model_inference_agent.py @@ -0,0 +1,251 @@ +import os +from google import genai +from google.adk.agents import Agent +from google.api_core.exceptions import GoogleAPIError +from google.api_core.exceptions import NotFound +from google.api_core.exceptions import ServiceUnavailable +import vertexai +from vertexai import model_garden + +PROJECT_ID = os.environ.get("GOOGLE_CLOUD_PROJECT", None) +LOCATION = os.environ.get("GOOGLE_CLOUD_LOCATION", None) + +vertexai.init( + project=PROJECT_ID, + location=LOCATION, +) + + +def run_inference(endpoint_id: str, prompt: str) -> dict: + """Runs inference on a deployed model given the model name and a text prompt + + and returns a dict containing the model's response as a string if successful. + + Args: + endpoint_id (str): A string that is the endpoint ID of the Vertex AI + endpoint to which the model was deployed. It typically follows the format: + mg-[0-9]{10} (e.g. mg-endpoint-1234567890). 
+            It is concatenated to construct the full endpoint resource name,
+            which follows the format
+            projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint_id}.
+        prompt (str): A string prompt to be used to run inference on the model.
+
+    Returns:
+        dict: status and content containing the model's response, or an error
+        message if unsuccessful.
+    """
+    try:
+        client = genai.Client(
+            vertexai=True,
+            project=PROJECT_ID,
+            location=LOCATION,
+        )
+        response = client.models.generate_content(
+            model=f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint_id}",
+            contents=prompt,
+        ).text
+        return {"status": "success", "content": response}
+
+    except NotFound as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "This error is likely due to an invalid endpoint ID being used"
+                " to run inference. Please ensure the endpoint ID provided is"
+                f" valid. Details: {e}"
+            ),
+        }
+
+    except ServiceUnavailable as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "The Vertex AI service or the specific endpoint on which"
+                " inference is being run is temporarily unavailable. Please"
+                f" try again. Details: {e}"
+            ),
+        }
+
+    except GoogleAPIError as e:
+        return {
+            "status": "error",
+            "error_message": (
+                f"A Google API error occurred while running inference."
+                f" Details: {e}"
+            ),
+        }
+
+    except Exception as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "An unexpected error occurred while running inference."
+                f" Details: {e}"
+            ),
+        }
+
+
+def inference_request_guide(model_name: str, endpoint_id: str) -> dict:
+    """Returns detailed instructions on how to run inference on a specific deployed model,
+
+    given the model name and endpoint ID of the model.
+    It specifically shows code snippets on how to run inference on a deployed
+    model through:
+    1. The Vertex AI SDK
+    2. The ChatCompletion API of the OpenAI SDK
+    3. The GenAI SDK
+
+    Args:
+        model_name (str): Model Garden model resource name in the format
+            publishers/{publisher}/models/{model}@{version}, a simplified
+            resource name in the format {publisher}/{model}@{version}, or a
+            Hugging Face model ID in the format {organization}/{model}.
+        endpoint_id (str): The endpoint ID of the Vertex AI endpoint to which
+            the model was deployed. It typically follows the format
+            mg-endpoint-[0-9]{10,} (e.g. mg-endpoint-1234567890).
+
+    Returns:
+        dict: status and content or error message.
+        If successful, the content will be a string with detailed
+        instructions on how the user can run inference on the deployed model.
+    """
+    response = f"""This is how you can run inference on the model {model_name} deployed
+to the endpoint {endpoint_id}:
+
+"""
+
+    try:
+        sample_request = (
+            model_garden.OpenModel(model_name)
+            .list_deploy_options()[0]
+            .deploy_metadata.sample_request
+        )
+
+        # The slice below strips the sample request's outer braces so it can
+        # be spliced into the endpoint.predict(...) call in the snippet.
+        response += f"""The sample request for the model is as follows:
+
+```{sample_request}```
+
+
+Based on this sample request, you can run inference on the model using:
+(1) The Vertex AI Python SDK:
+    The following code snippet demonstrates how to do this:
+
+```
+    from google.cloud import aiplatform
+
+    endpoint_name = "projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint_id}"
+    endpoint = aiplatform.Endpoint(endpoint_name=endpoint_name)
+    prediction = endpoint.predict(\n{sample_request[1:-2]}\n)
+    print(prediction.predictions[0])
+```
+
+
+(2) The ChatCompletion API of the OpenAI SDK:
+    The following code snippet demonstrates this:
+```
+    import openai
+    import google.auth
+    import google.auth.transport.requests
+
+    creds, project = google.auth.default()
+    auth_req = google.auth.transport.requests.Request()
+    creds.refresh(auth_req)
+
+    endpoint_url = f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint_id}"
+
+    client = openai.OpenAI(base_url=endpoint_url, api_key=creds.token)
+
+
+    # TODO: replace with the prompt you would like to use to run inference.
+    prompt = "Tell me a joke"
+
+    prediction = client.chat.completions.create(
+        model="",
+        messages=[{{"role": "user", "content": prompt}}],
+    )
+    print(prediction.choices[0].message.content)
+```
+
+
+(3) The GenAI Python SDK:
+    The code snippet below also demonstrates how to run inference using the GenAI Python SDK:
+```
+    from google import genai
+
+    client = genai.Client(
+        vertexai=True,
+        project="{PROJECT_ID}",
+        location="{LOCATION}",
+    )
+
+    # TODO: replace with the prompt you would like to use to run inference.
+    prompt = "Tell me a joke"
+
+    response = client.models.generate_content(
+        model=f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{endpoint_id}",
+        contents=prompt,
+    ).text
+    print(response)
+```
+"""
+        return {"status": "success", "content": response}
+
+    except ValueError as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "This error is likely due to an invalid model_name. Please"
+                f" ensure the model name provided is valid. Details: {e}"
+            ),
+        }
+
+    except GoogleAPIError as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "A Google API error occurred while building the inference"
+                f" guide. Details: {e}"
+            ),
+        }
+
+    except Exception as e:
+        return {
+            "status": "error",
+            "error_message": (
+                "An unexpected error occurred while building the inference"
+                f" guide. Details: {e}"
+            ),
+        }
+
+
+model_inference_agent = Agent(
+    model="gemini-2.5-flash",
+    name="model_inference_agent",
+    description=(
+        "A helpful agent for assisting the user to run inference on a"
+        " deployed model."
+    ),
+    instruction=("""
+You are a sub-agent in a multi-agent system that helps users deploy and manage AI models using Vertex AI Model Garden.
+User requests are routed to this agent when they mention running inference on a deployed model.
+Do not refer to yourself as a sub-agent or mention transfers, and only respond to requests that fall within the scope of this agent.
+If the user asks for something outside of this agent's scope, return control to the main agent.
+Your purpose is to run inference on a deployed model and to guide the user on how they can run inference on a model they have deployed.
+
+You are currently capable of:
+  - Running inference directly on a deployed model given the model's endpoint ID and the string prompt to be used to run inference.
+  - Giving detailed instructions to the user on how they can run inference requests through one of the following methods:
+    the Vertex AI Python SDK, the OpenAI SDK, and the GenAI Python SDK.
+
+RULES
+  1. In your interactions with the user, start by clarifying whether the user would like you to run inference directly on the deployed model or would
+     instead like you to guide them on how they can run inference using the Vertex AI SDK, OpenAI SDK, or GenAI SDK.
+  2. If the user provides an endpoint ID that is a full endpoint resource name following the format
+     projects/[PROJECT_ID]/locations/[LOCATION]/endpoints/[endpoint_id], extract `endpoint_id` specifically
+     from the resource name and use that as the ID when calling the `run_inference` tool or the `inference_request_guide` tool.
+  3. If, after asking the user to provide a prompt for running inference, you are unsure whether their response is a direct
+     question to you or a prompt for running inference, ask the user for clarification before running inference or answering their question.
+     For example, you can ask "Is the above the prompt you would like to use to run inference?"
+  4. When guiding the user on how to run inference requests, be sure to extract the appropriate model name and endpoint ID from your previous conversations with the user before calling the `inference_request_guide` tool.
+  5. When guiding the user on how to run inference requests, format all content nested within backticks ``` as a code block.
+     Do not include any backticks ``` literally in your output.
+  6. When guiding the user on how to run inference requests, do not format pound signs # as headings. Use them as literal pound signs in your output.
+"""),
+    tools=[run_inference, inference_request_guide],
+)
diff --git a/python/agents/model_garden_agent/setup_rec_agent.py b/python/agents/model_garden_agent/setup_rec_agent.py
new file mode 100644
index 00000000..d5281ad3
--- /dev/null
+++ b/python/agents/model_garden_agent/setup_rec_agent.py
@@ -0,0 +1,136 @@
+"""Agent and tools for recommending Model Garden deployment configurations."""
+
+import os
+
+from google.adk.agents import Agent
+from google.api_core import exceptions
+from google.cloud import aiplatform
+import vertexai
+from vertexai import model_garden
+
+NotFound = exceptions.NotFound
+InvalidArgument = exceptions.InvalidArgument
+GoogleAPIError = exceptions.GoogleAPIError
+ServiceUnavailable = exceptions.ServiceUnavailable
+
+vertexai.init(
+    project=os.environ.get("GOOGLE_CLOUD_PROJECT"),
+    location=os.environ.get("GOOGLE_CLOUD_LOCATION"),
+)
+
+
+def get_recommended_deployment_config(model_id: str) -> dict:
+    """Fetches and formats the recommended deployment configurations for a Model Garden model.
+
+    Args:
+        model_id: The ID of the model in Model Garden (e.g.,
+            "google/gemma@gemma-2b").
+
+    Returns:
+        dict: status and content or error message, with deployment options
+        listed and indexed.
+ """ + project_id = os.environ["GOOGLE_CLOUD_PROJECT"].lower() + location = os.environ["GOOGLE_CLOUD_LOCATION"].lower() + model_id = model_id.lower() + + aiplatform.init(project=project_id, location=location) + + try: + model = model_garden.OpenModel(model_id) + deploy_options = model.list_deploy_options() + + if not deploy_options: + return { + "status": "warning", + "content": ( + f"No specific deployment options found for model '{model_id}'." + " This might mean the model has default configurations or is not" + " directly deployable via this method." + ), + } + + formatted_options = [] + for i, option in enumerate(deploy_options): + option_str = [f"**Option {i}:**"] + if option.dedicated_resources: + spec = option.dedicated_resources.machine_spec + option_str.append(f" - Machine Type: {spec.machine_type}") + if spec.accelerator_type and spec.accelerator_count: + option_str.append( + f" - Accelerator Type: {spec.accelerator_type.name}" + ) + option_str.append(f" - Accelerator Count: {spec.accelerator_count}") + if option.container_spec: + option_str.append( + f" - Container Image: {option.container_spec.image_uri}" + ) + formatted_options.append("\n".join(option_str)) + + return { + "status": "success", + "content": ( + f"Recommended deployment options for '{model_id}':\n\n" + + "\n\n".join(formatted_options) + ), + } + + except NotFound as e: + return { + "status": "error", + "error_message": ( + f"Model '{model_id}' not found in Model Garden. Cannot fetch" + f" deployment recommendations. Details: {e}" + ), + } + except InvalidArgument as e: + return { + "status": "error", + "error_message": ( + f"Invalid model ID format: {e}. Please provide a valid model ID to" + " get deployment recommendations." + ), + } + except GoogleAPIError as e: + return { + "status": "error", + "error_message": ( + "Google Cloud API error when fetching deployment recommendations:" + f" {e}. Please check your project's permissions." + ), + } + except Exception as e: + return { + "status": "error", + "error_message": ( + "An unexpected error occurred while fetching deployment" + f" recommendations: {e}" + ), + } + + +setup_rec_agent = Agent( + model="gemini-2.5-flash", + name="setup_rec_agent", + description=( + "A helpful agent for providing setup recommendations for deploying AI" + " models." + ), + instruction=(""" +You are a sub-agent in a multi-agent system that helps users deploy and manage AI models using Vertex AI Model Garden. +User requests are routed to this agent when they mention deploying or deleting endpoints. +Do not refer to yourself as a sub-agent or mention transfers. +Only respond to requests that fall within the scope of this agent. +If the user asks for something outside of this agent's scope, return control to the main agent. +Your purpose is to provide setup recommendations for deploying AI models. + +You are capable of the following: +- Listing all recommended deployment configurations for a given model ID. + +When listing deployment options: +- Clearly show each one with a numbered index (e.g., "Option 0", "Option 1"). +- Include relevant details like machine type and accelerator (if available). +-Once the user selects an option and wants to deploy or do something else, transfer control to the root agent. +"""), + tools=[ + get_recommended_deployment_config, + ], +)