Stagehand incorrectly identifies a regular DOM input element as being inside a Shadow DOM

Stagehand's `act()` method incorrectly identifies a regular DOM input element as being inside a Shadow DOM, returning `"method": "not-supported"` and `"selector": "not-supported"` for an element that is accessible via standard DOM queries.

## Environment

- **Stagehand Version**: `^2.4.2`
- **Browser**: Chromium (Playwright)
- **LLM**: Ollama (gemma3:12b) using CustomOpenAIClient
- **FE Framework**: React with Material-UI (MUI) components
- **OS**: macOS (darwin 24.5.0)


The LLM correctly identifies the target element, but Stagehand then reports it as being inside a Shadow DOM:

```
[2025-08-06 18:12:10.123 +0300] INFO: Acting on instruction: click on create new
[2025-08-06 18:12:10.456 +0300] INFO: Found element: button "Create New"
[2025-08-06 18:12:10.789 +0300] INFO: Action completed successfully
[2025-08-06 18:12:12.091 +0300] INFO: Acting on instruction: fill the name input field
[2025-08-06 18:12:12.091 +0300] INFO: LLM identified: textbox: Enter name
[2025-08-06 18:12:12.091 +0300] ERROR: Element is inside a shadow DOM: 868
    category: "observation"
[2025-08-06 18:12:12.091 +0300] INFO: found elements
    category: "observation"
    elements: [
      {
        "description": "an element inside a shadow DOM",
        "method": "not-supported",
        "selector": "not-supported"
      }
    ]
```

CustomOpenAI.ts (complete implementation):
```typescript
/**
 * Based on the official Stagehand custom OpenAI client template
 * Modified for Ollama integration via OpenAI-compatible API
 */
import {
  AvailableModel,
  CreateChatCompletionOptions,
  LLMClient,
} from "@browserbasehq/stagehand";
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import type {
  ChatCompletion,
  ChatCompletionAssistantMessageParam,
  ChatCompletionContentPartImage,
  ChatCompletionContentPartText,
  ChatCompletionCreateParamsNonStreaming,
  ChatCompletionMessageParam,
  ChatCompletionSystemMessageParam,
  ChatCompletionUserMessageParam,
} from "openai/resources/chat/completions";
import { z } from "zod";

class CreateChatCompletionResponseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CreateChatCompletionResponseError";
  }
}

function validateZodSchema(schema: z.ZodTypeAny, data: unknown) {
  try {
    schema.parse(data);
    return true;
  } catch {
    return false;
  }
}

export class CustomOpenAIClient extends LLMClient {
  public type = "openai" as const;
  private client: OpenAI;

  constructor({ modelName, client }: { modelName: string; client: OpenAI }) {
    super(modelName as AvailableModel);
    this.client = client;
    this.modelName = modelName as AvailableModel;
  }

  async createChatCompletion<T = ChatCompletion>({
    options,
    retries = 3,
    logger,
  }: CreateChatCompletionOptions): Promise<T> {
    const { image, requestId, ...optionsWithoutImageAndRequestId } = options;

    if (image) {
      console.warn(
        "Image provided. Vision is not currently supported for openai"
      );
    }

    logger({
      category: "openai",
      message: "creating chat completion",
      level: 1,
      auxiliary: {
        options: {
          value: JSON.stringify({
            ...optionsWithoutImageAndRequestId,
            requestId,
          }),
          type: "object",
        },
        modelName: {
          value: this.modelName,
          type: "string",
        },
      },
    });

    let responseFormat: any = undefined;
    if (options.response_model) {
      responseFormat = zodResponseFormat(
        options.response_model.schema,
        options.response_model.name
      );
    }

    const { response_model, ...openaiOptions } = {
      ...optionsWithoutImageAndRequestId,
      model: this.modelName,
    };

    const formattedMessages: ChatCompletionMessageParam[] =
      options.messages.map((message) => {
        if (Array.isArray(message.content)) {
          const contentParts = message.content.map((content) => {
            if ("image_url" in content && content.image_url) {
              const imageContent: ChatCompletionContentPartImage = {
                image_url: {
                  url: content.image_url.url,
                },
                type: "image_url",
              };
              return imageContent;
            } else {
              const textContent: ChatCompletionContentPartText = {
                text: content.text || "",
                type: "text",
              };
              return textContent;
            }
          });

          if (message.role === "system") {
            const formattedMessage: ChatCompletionSystemMessageParam = {
              ...message,
              role: "system",
              content: contentParts.filter(
                (content): content is ChatCompletionContentPartText =>
                  content.type === "text"
              ),
            };
            return formattedMessage;
          } else if (message.role === "user") {
            const formattedMessage: ChatCompletionUserMessageParam = {
              ...message,
              role: "user",
              content: contentParts,
            };
            return formattedMessage;
          } else {
            const formattedMessage: ChatCompletionAssistantMessageParam = {
              ...message,
              role: "assistant",
              content: contentParts.filter(
                (content): content is ChatCompletionContentPartText =>
                  content.type === "text"
              ),
            };
            return formattedMessage;
          }
        }

        const formattedMessage: ChatCompletionUserMessageParam = {
          role: "user",
          content: message.content || "",
        };

        return formattedMessage;
      });

    const body: ChatCompletionCreateParamsNonStreaming = {
      ...openaiOptions,
      model: this.modelName,
      messages: formattedMessages,
      response_format: responseFormat,
      stream: false,
      tools: options.tools?.map((tool) => ({
        function: {
          name: tool.name,
          description: tool.description,
          parameters: tool.parameters,
        },
        type: "function",
      })),
    };

    const response = await this.client.chat.completions.create(body);

    logger({
      category: "openai",
      message: "response",
      level: 1,
      auxiliary: {
        response: {
          value: JSON.stringify(response),
          type: "object",
        },
        requestId: {
          value: requestId || "",
          type: "string",
        },
      },
    });

    if (options.response_model) {
      const extractedData = response.choices[0].message.content;
      if (!extractedData) {
        throw new CreateChatCompletionResponseError("No content in response");
      }
      const parsedData = JSON.parse(extractedData);

      if (!validateZodSchema(options.response_model.schema, parsedData)) {
        if (retries > 0) {
          return this.createChatCompletion({
            options,
            logger,
            retries: retries - 1,
          });
        }

        throw new CreateChatCompletionResponseError("Invalid response schema");
      }

      return {
        data: parsedData,
        usage: {
          prompt_tokens: response.usage?.prompt_tokens ?? 0,
          completion_tokens: response.usage?.completion_tokens ?? 0,
          total_tokens: response.usage?.total_tokens ?? 0,
        },
      } as T;
    }

    return {
      data: response.choices[0].message.content,
      usage: {
        prompt_tokens: response.usage?.prompt_tokens ?? 0,
        completion_tokens: response.usage?.completion_tokens ?? 0,
        total_tokens: response.usage?.total_tokens ?? 0,
      },
    } as T;
  }
}
```

stagehand.config.ts:
```typescript
import { Stagehand } from "@browserbasehq/stagehand";
import OpenAI from "openai";
import { CustomOpenAIClient } from "./external_clients/customOpenAI";

export const createStagehandWithOllama = () => {
  return new Stagehand({
    env: "LOCAL" as const,
    llmClient: new CustomOpenAIClient({
      modelName: "gemma3:12b",
      client: new OpenAI({
        apiKey: "ollama",
        baseURL: "http://localhost:11434/v1",
      }),
    }),
    localBrowserLaunchOptions: {
      headless: false,
    },
    domSettleTimeoutMs: 30000,
    enableCaching: true,
    verbose: 1 as const,
    selfHeal: true,
  });
};
```

Some more info I gathered:

### DOM Analysis Shows No Shadow DOM

Manual inspection via Playwright reveals:

- **0 iframes** on the page
- **0 Shadow DOM hosts** on the page
- **Regular input element** accessible at: `Input 20: placeholder="Enter name", type="text", name="null"`
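
For anyone who wants to repeat this check, here is a minimal sketch of how the counts above could be gathered. The names `ElementSummary`, `DomReport`, and `summarizeDom` are hypothetical and are not part of Stagehand or Playwright. It also assumes that only open shadow roots matter, since closed shadow roots are not visible through `element.shadowRoot`.

```typescript
// Serializable summary of one DOM element, so it can be returned from a
// Playwright `page.evaluate` call. (Hypothetical shape, for illustration.)
interface ElementSummary {
  tagName: string;
  hasOpenShadowRoot: boolean;
}

interface DomReport {
  total: number;
  iframes: number;
  shadowHosts: number;
}

// Hypothetical helper: count iframes and open shadow hosts in a snapshot.
function summarizeDom(elements: ElementSummary[]): DomReport {
  let iframes = 0;
  let shadowHosts = 0;
  for (const el of elements) {
    if (el.tagName.toUpperCase() === "IFRAME") iframes += 1;
    if (el.hasOpenShadowRoot) shadowHosts += 1;
  }
  return { total: elements.length, iframes, shadowHosts };
}

// Collecting the snapshot in the browser with Playwright, shown as a comment
// so the block stays runnable without a browser:
//
//   const snapshot = await page.evaluate(() =>
//     Array.from(document.querySelectorAll("*"), (el) => ({
//       tagName: el.tagName,
//       hasOpenShadowRoot: el.shadowRoot !== null,
//     }))
//   );
//   console.log(summarizeDom(snapshot));
```

Note that `document.querySelectorAll("*")` does not descend into shadow trees, so this only finds hosts in the light DOM, which is enough to confirm whether any exist at all.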

### LLM Correctly Identifies Element

The LLM successfully identifies the target: `textbox: Enter name` - indicating the accessibility tree parsing is working correctly.

### Playwright Can Interact Successfully

Standard Playwright can successfully interact with the same element:

```typescript
const nameInput = page.locator('input[placeholder="Enter name"]');
await nameInput.fill("test"); // Works perfectly
```

### Test Setup

```typescript
// Using CustomOpenAI client for Ollama integration
const stagehand = createStagehandWithOllama();
await stagehand.init();
```

Since I can't share the web app code or URL (it's an internal tool at the company I work for), here is the scenario I tested:

1. Navigate to a form containing MUI input components
2. Use Stagehand to interact with buttons/dropdowns (these work fine)
3. Attempt to use `await page.act("fill the input field")` on a text input
4. Observe the Shadow DOM error despite no Shadow DOM being present

### Form Structure

The failing input element is within:

- **React application**
- **Material-UI (MUI) form components**
- **Standard HTML form element** (no custom shadow roots)
- **Regular DOM hierarchy** (verified via `document.querySelectorAll('*')`)
- **Part of MUI TextField component** (but rendered as standard DOM)


### Element Details

```html
<input placeholder="Enter name" type="text" name="null" />
```

This suggests an issue in:

- Shadow DOM detection logic (false positive)
- XPath/selector generation for certain input patterns
- Element traversal within form contexts


## Workaround in case someone needs it, as I don't think it happens with a standard input element

I am currently using a hybrid approach:

```typescript
// Use Stagehand for elements it handles well
await stagehand.page.act("click on create new");
await stagehand.page.act("click on item1");

// Fallback to Playwright for problematic inputs
const input = stagehand.page.locator('input[placeholder="Enter name"]');
await input.fill("test");
```

Thanks.
