4 changes: 2 additions & 2 deletions test/test-markdown-frontmatter.js
@@ -6,7 +6,7 @@ const chalk = require('chalk')
// accepted data field values
const sdk_languages = ['nodejs', 'scala', 'python', 'swift', 'csharp', 'objective-c', 'android-java', 'any', 'java', 'kotlin', 'dart', 'golang', 'c++']

-const tags = ['Ottoman', 'Ktor', 'REST API', 'Express', 'Flask', 'TLS', 'Configuration', 'Next.js', 'iOS', 'Xcode', '.NET', 'Xamarin', 'Authentication', 'OpenID', 'Keycloak', 'Android', 'P2P', 'UIKit', 'Installation', 'Spring Boot', 'Spring Data', 'Transactions', 'SQL++ (N1QL)', 'Optimization', 'Community Edition', 'Docker', 'Data Modeling', 'Metadata', 'Best Practices', 'Data Ingestion', 'Kafka', 'Support', 'Customer', 'Prometheus', 'Monitoring', 'Observability', 'Metrics', 'Query Workbench', 'ASP.NET', 'linq', 'DBaaS', 'App Services', 'Flutter', 'Gin Gonic', 'FastAPI', 'LangChain', "OpenAI", "Streamlit", 'Google Gemini', 'Nvidia NIM', 'LLama3', 'AWS', 'Artificial Intelligence', 'Cohere', 'Jina AI', 'Mistral AI', 'Ragas', 'Haystack', 'LangGraph', 'Amazon Bedrock', 'CrewAI', 'PydanticAI', 'C++', 'C++ SDK', 'smolagents', 'Ag2', 'Autogen', 'Couchbase Edge Server', 'Deepseek', 'OpenRouter', 'mastra']
+const tags = ['Ottoman', 'Ktor', 'REST API', 'Express', 'Flask', 'TLS', 'Configuration', 'Next.js', 'iOS', 'Xcode', '.NET', 'Xamarin', 'Authentication', 'OpenID', 'Keycloak', 'Android', 'P2P', 'UIKit', 'Installation', 'Spring Boot', 'Spring Data', 'Transactions', 'SQL++ (N1QL)', 'Optimization', 'Community Edition', 'Docker', 'Data Modeling', 'Metadata', 'Best Practices', 'Data Ingestion', 'Kafka', 'Support', 'Customer', 'Prometheus', 'Monitoring', 'Observability', 'Metrics', 'Query Workbench', 'ASP.NET', 'linq', 'DBaaS', 'App Services', 'Flutter', 'Gin Gonic', 'FastAPI', 'LangChain', "OpenAI", "Streamlit", 'Google Gemini', 'Nvidia NIM', 'LLama3', 'AWS', 'Artificial Intelligence', 'Cohere', 'Jina AI', 'Mistral AI', 'Ragas', 'Haystack', 'LangGraph', 'Amazon Bedrock', 'CrewAI', 'PydanticAI', 'C++', 'C++ SDK', 'smolagents', 'Ag2', 'Autogen', 'Couchbase Edge Server', 'Deepseek', 'OpenRouter', 'mastra', 'Looker Studio', 'Google Data Studio', 'Connector', 'Couchbase Columnar', 'Views-only', 'Data API']

const technologies = ['connectors', 'kv', 'query', 'capella', 'server', 'index', 'mobile', 'fts', 'sync gateway', 'eventing', 'analytics', 'udf', 'vector search', 'react', 'edge-server', 'app-services']

@@ -95,7 +95,7 @@ const test = (data, path) => {
process.exit(1)
}
//testing title length
-if (data.title?.length > 72) {
+if (data.title?.length > 100) {
makeResponseFailure(data, path, 'Invalid title Length', data.title?.length, 'Post title must be less than 100 characters long')
process.exit(1)
}
166 changes: 166 additions & 0 deletions tutorial/markdown/connectors/looker-studio/columnar/readme.md
@@ -0,0 +1,166 @@
---
# frontmatter
path: "/tutorial-looker-studio-columnar"
# title and description do not need to be added to markdown, start with H2 (##)
title: Connect Looker Studio to Couchbase Columnar using Views and Custom Queries
short_title: Columnar Looker Studio Connector
description:
- Connect Google Looker Studio to Couchbase Columnar using Tabular Analytics Views (TAVs) or custom queries
- Create Tabular Analytics Views in Capella for stable datasets or use custom SQL++ queries for flexibility
- Learn authentication, configuration, schema inference, and troubleshooting
content_type: tutorial
filter: connectors
technology:
- server
- query
tags:
- Looker Studio
- Couchbase Columnar
- Connector
- Views-only
sdk_language:
- nodejs
length: 20 Mins
---

<!-- [abstract] -->

## Overview

Connect Looker Studio to Couchbase Columnar for data analysis and visualization. This connector supports two modes: Tabular Analytics Views (TAVs) for stable, optimized data sources, and custom queries for flexible data exploration.

**Workflow**: Either create TAVs in Couchbase Capella for consistent reporting, or use custom SQL++ queries for ad-hoc analysis. TAVs provide a stable, schema-defined interface that's optimized for BI tools, while custom queries offer maximum flexibility for complex data operations.

The connector authenticates with Basic Auth to the Columnar API (`/api/v1/request`) and infers schema automatically using `array_infer_schema` so Looker Studio fields are created with reasonable types.
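
A minimal sketch of what that implies, written as Apps Script (the environment Looker Studio community connectors run in). The host, credentials, and response handling here are illustrative assumptions, not the connector's actual source:

```javascript
// Illustrative sketch: issuing a SQL++ statement to the Columnar API with Basic Auth.
// Host and credentials are placeholders; the response is assumed to carry a `results` array.
function runColumnarStatement(statement) {
  var host = 'cb.example.cloud.couchbase.com'; // placeholder host
  var response = UrlFetchApp.fetch('https://' + host + '/api/v1/request', {
    method: 'post',
    contentType: 'application/json',
    headers: {
      Authorization: 'Basic ' + Utilities.base64Encode('username:password')
    },
    payload: JSON.stringify({ statement: statement }),
    muteHttpExceptions: true
  });
  return JSON.parse(response.getContentText()).results;
}

// The credential check described under Authentication reduces to something like:
// runColumnarStatement('SELECT 1 AS test;');
```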

## Prerequisites

- A Couchbase Columnar deployment reachable from Looker Studio. For setup information, see [Getting Started with Couchbase Columnar](https://www.couchbase.com/products/analytics/).
- A database user with permissions to read from the target Tabular Analytics Views (TAVs) and execute queries.
- Network access from Looker Studio to your Columnar host.

## Authentication

When adding the data source, provide:

- Path: The Columnar host. Example:
- Capella host: `cb.<your-host>.cloud.couchbase.com`
- Username and Password: Database credentials.

The connector validates credentials by running a lightweight test query (`SELECT 1 AS test;`).

When you add the Couchbase Columnar connector in Looker Studio, you'll see the authentication screen:

![Authentication Screen](step-0.png "Couchbase Columnar connector authentication screen in Looker Studio")

## Create Tabular Analytics Views (TAVs) in Capella (Recommended)

For the "By View" mode, create Tabular Analytics Views in Capella:

1. Open your Capella cluster, go to the Analytics tab, and launch the Analytics Workbench.
2. Prepare a SQL++ query that returns a tabular result. For simple, flat data structures, basic SELECT statements work well. For nested objects, consider flattening them for better BI tool compatibility. For example:

```sql
-- Example with flattening for nested data structures
SELECT a.airportname AS airport_name,
a.city AS city,
a.country AS country,
a.geo.lat AS latitude,
a.geo.lon AS longitude,
a.faa AS faa_code
FROM `travel-sample`.`inventory`.`airport` AS a
WHERE a.country = 'United States'
LIMIT 100;
```

3. Run the query, then click Save as View → Annotate for Tabular View. Define the schema (column names, data types, and primary keys) and save with a descriptive name.

- For details, see [Tabular Analytics Views](https://docs.couchbase.com/columnar/query/views-tavs.html) and [Buckets, Scopes, and Collections](https://docs.couchbase.com/cloud/clusters/data-service/about-buckets-scopes-collections.html).

## Configuration

Choose your mode in the configuration screen:

- Configuration Mode: Choose between `By View` or `Use Custom Query`.

### Mode: By View (TAV)

- Couchbase Database, Scope, View: Selected from dropdowns that automatically discover your available databases, scopes, and views from your Columnar instance.
- Maximum Rows: Optional limit for returned rows; leave blank for no limit.

What runs:

- Data: ``SELECT <requested fields or *> FROM `database`.`scope`.`view` [LIMIT n]``
- Schema: ``SELECT array_infer_schema((SELECT VALUE t FROM `database`.`scope`.`view` [LIMIT n])) AS inferred_schema;`` (see the sketch below)
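
A concrete instance, assuming a hypothetical view `travel`.`inventory`.`airport_view` and Maximum Rows set to 1000 (names are illustrative):

```javascript
// Illustrative only: the two statements as they might be assembled
// for a hypothetical view with Maximum Rows = 1000.
var keyspace = '`travel`.`inventory`.`airport_view`'; // placeholder names
var maxRows = 1000;
var dataStatement = 'SELECT * FROM ' + keyspace + ' LIMIT ' + maxRows;
var schemaStatement =
  'SELECT array_infer_schema((SELECT VALUE t FROM ' + keyspace +
  ' LIMIT ' + maxRows + ')) AS inferred_schema;';
```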

### Mode: Use Custom Query

- Custom Columnar Query: Enter your own SQL++ query directly in a text area.
- Maximum Rows: Not applicable (control limits within your query using `LIMIT`).

**Example custom query**:
```sql
SELECT airline.name AS airline_name,
airline.iata AS iata_code,
airline.country AS country,
COUNT(*) AS route_count
FROM `travel-sample`.`inventory`.`airline` AS airline
JOIN `travel-sample`.`inventory`.`route` AS route
ON airline.iata = route.airline
WHERE airline.country = "United States"
GROUP BY airline.name, airline.iata, airline.country
LIMIT 100;
```

What runs:

- Data: Your exact custom query, as entered
- Schema: `SELECT array_infer_schema((your_custom_query)) AS inferred_schema;` (see the sketch below)
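
A sketch of that wrapping step, with a hypothetical helper name (the connector's internals may differ):

```javascript
// Hypothetical helper: wrap a user-supplied query for schema inference.
function buildSchemaStatement(customQuery) {
  var query = customQuery.trim().replace(/;\s*$/, ''); // drop a trailing semicolon
  return 'SELECT array_infer_schema((' + query + ')) AS inferred_schema;';
}
```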

After authentication, configure the connector by selecting your database, scope, and view:

![Database Scope View Configuration](step-1.png "Configuring database, scope, and view selection in Looker Studio")

## Schema and Field Types

- The connector converts inferred types to Looker Studio types (sketched below):
- number → NUMBER (metric)
- boolean → BOOLEAN (dimension)
- string/objects/arrays/null → STRING/TEXT (dimension)
- Nested fields are flattened using dot and array index notation where possible (for example, `address.city`, `schedule[0].day`). Unstructured values may be stringified.
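
A compact way to picture that mapping (a sketch of the rule above, not the connector's actual code):

```javascript
// Inferred type → Looker Studio field, per the mapping described above.
function toLookerField(inferredType) {
  switch (inferredType) {
    case 'number':
      return { type: 'NUMBER', role: 'METRIC' };
    case 'boolean':
      return { type: 'BOOLEAN', role: 'DIMENSION' };
    default:
      // strings, objects, arrays, and null all fall back to text
      return { type: 'TEXT', role: 'DIMENSION' };
  }
}
```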

> **⚠️ Schema Inference Notes**: For TAVs, schema inference uses `array_infer_schema` on the entire dataset unless you specify Maximum Rows (which adds a LIMIT clause for sampling). Field types are inferred from the analyzed data and may miss variations (e.g., fields containing both text and numbers in different documents). If schema inference fails, ensure your TAV contains data and consider adding a Maximum Rows limit for faster sampling during testing.

Once your schema is configured, you can customize the fields in your Looker Studio dashboard:

![Field Configuration in Dashboard](step-2.png "Configuring fields in Looker Studio dashboard before creating reports")

## Data Retrieval

- Only requested fields are projected. For nested fields, the connector fetches the required base fields and extracts values within the connector's Apps Script environment (see the sketch after this list).
- Row limits:
- View mode: `Maximum Rows` controls `LIMIT` (blank = no limit).
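
A sketch of the path extraction mentioned above, assuming a hypothetical helper (the real implementation may differ):

```javascript
// Hypothetical helper: resolve dot/array-index paths such as
// "address.city" or "schedule[0].day" against a document.
function extractPath(doc, path) {
  var parts = path.split('.');
  var value = doc;
  for (var i = 0; i < parts.length && value != null; i++) {
    var match = parts[i].match(/^([^\[]+)(?:\[(\d+)\])?$/); // name + optional [index]
    if (!match) return undefined;
    value = value[match[1]];
    if (match[2] !== undefined && value != null) {
      value = value[Number(match[2])];
    }
  }
  // Remaining structured values are stringified, as noted above.
  return value !== null && typeof value === 'object' ? JSON.stringify(value) : value;
}
```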

## Tips and Best Practices

- **Prefer Tabular Analytics Views for BI tooling**: TAVs provide a stable, optimized interface with predefined schemas, making them ideal for consistent reporting and visualization. They also offer better performance than ad-hoc queries.
- **Use `LIMIT` while exploring**: Start with smaller datasets (e.g., `LIMIT 1000`) to test connectivity and schema inference quickly. Remove or increase limits once you're satisfied with the data structure.

## Troubleshooting

- **Authentication failure**: Check host, credentials, and network reachability to Columnar.
- **Schema inference errors**: Ensure your TAV exists and contains data. Try adding a `LIMIT` clause for faster sampling (e.g., `LIMIT 100`).
- **API error from Columnar**: Review the response message in Looker Studio and verify TAV names, permissions, and that the view is properly created in Capella.
- **Empty or missing TAV**: Verify that your Tabular Analytics View was saved correctly in the Analytics Workbench and contains data.
- **Mixed data types**: If fields appear as STRING when they should be NUMBER, your data may have mixed types. Consider modifying your TAV creation query to cast fields to consistent types (e.g., `CAST(price AS NUMBER)`) or filter out inconsistent records.

## Next Steps

Once your connector is configured and fields are set up, create reports by dragging and dropping tables from the side pane onto the main canvas:

![Creating Reports in Looker Studio](step-3.png "Creating reports by dragging and dropping tables onto the canvas in Looker Studio")

- Build charts in Looker Studio using your TAV-backed fields.
- Iterate on Views/queries to shape the dataset for analytics.
- Explore the rich visualization options available in Looker Studio to create compelling dashboards from your Columnar data.


158 changes: 158 additions & 0 deletions tutorial/markdown/connectors/looker-studio/dataapi/readme.md
@@ -0,0 +1,158 @@
---
# frontmatter
path: "/tutorial-looker-studio-dataapi"
# title and description do not need to be added to markdown, start with H2 (##)
title: Looker Studio with Couchbase Data API
short_title: Data API Connector
description:
- Connect Google Looker Studio to Couchbase through the Data API
- Configure auth, select collections or use custom SQL++ queries
- Learn schema inference, limits, and troubleshooting tips
content_type: tutorial
filter: connectors
technology:
- server
- query
tags:
- Looker Studio
- Google Data Studio
- Data API
- Connector
sdk_language:
- nodejs
length: 20 Mins
---

<!-- [abstract] -->

## Overview

Use this connector to build Looker Studio reports directly on Couchbase via the Data API. You can:

- Query by selecting a specific `bucket.scope.collection`.
- Or run a custom SQL++ query.

Behind the scenes, the connector authenticates with Basic Auth and talks to the Data API endpoints for caller identity checks and to the Query Service for SQL++ execution. Schema is inferred automatically from sampled data to make fields available in Looker Studio.

## Prerequisites

- A Couchbase Capella cluster or a self-managed cluster with the Query Service reachable from Looker Studio.
- A database user with permissions to read the target collections and run queries.
- Network access from Looker Studio to your cluster host.

## Authentication

When you add the data source in Looker Studio, you will be prompted for:

- Path: The cluster host (optionally with port). Examples:
- Capella: `cb.<your-host>.cloud.couchbase.com`
- Self-managed: `my.host:18095` (specify a non-443 port explicitly)
- Username and Password: Database credentials.

The connector validates credentials against the Data API (`/v1/callerIdentity`). If validation fails, verify host, port, credentials, and network access.
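
A minimal sketch of that check in Apps Script terms (host and error handling are illustrative assumptions):

```javascript
// Illustrative sketch: validate credentials against the Data API.
function checkCredentials(host, username, password) {
  var response = UrlFetchApp.fetch('https://' + host + '/v1/callerIdentity', {
    headers: {
      Authorization: 'Basic ' + Utilities.base64Encode(username + ':' + password)
    },
    muteHttpExceptions: true // inspect the status code instead of throwing
  });
  return response.getResponseCode() === 200;
}
```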

## Configuration

After authentication, choose a configuration mode:

- Configuration Mode: `Query by Collection` or `Use Custom Query`.

### Mode: Query by Collection

- Couchbase Collection: Pick a `bucket > scope > collection` from the dropdown. The connector discovers collections for you.
- Maximum Rows: Optional limit for returned rows (default 100).

What runs:

- Data: ``SELECT RAW collection FROM `bucket`.`scope`.`collection` LIMIT <maxRows>``
- Schema: ``INFER `bucket`.`scope`.`collection` WITH {"sample_size": 100, "num_sample_values": 3, "similarity_metric": 0.6}`` (see the sketch below)
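
A concrete instance, using `travel-sample`.`inventory`.`airport` and the default Maximum Rows of 100 (illustrative only):

```javascript
// Illustrative only: the statements assembled in collection mode.
var keyspace = '`travel-sample`.`inventory`.`airport`';
var dataStatement = 'SELECT RAW airport FROM ' + keyspace + ' LIMIT 100';
var schemaStatement =
  'INFER ' + keyspace +
  ' WITH {"sample_size": 100, "num_sample_values": 3, "similarity_metric": 0.6}';
```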

### Mode: Use Custom Query

- Custom SQL++ Query: Paste any valid SQL++ statement. Include a `LIMIT` for performance.

What runs:

- Schema inference first attempts to run `INFER` on your query (a `LIMIT 100` is added if absent): `INFER (<yourQuery>) WITH {"sample_size": 10000, "num_sample_values": 2, "similarity_metric": 0.1}`
- If that fails, it runs your query with `LIMIT 1` and infers the schema from the single sample document (sketched below).
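
A sketch of that two-step flow, with hypothetical helpers `runQuery` and `inferFromDoc`; the connector's actual handling of existing `LIMIT` clauses and semicolons may differ:

```javascript
// Sketch of the INFER-then-fallback flow described above.
function inferCustomQuerySchema(customQuery) {
  // Add a LIMIT 100 for sampling if the query doesn't already have one.
  var query = /\blimit\b/i.test(customQuery) ? customQuery : customQuery + ' LIMIT 100';
  try {
    return runQuery(
      'INFER (' + query + ') WITH ' +
        '{"sample_size": 10000, "num_sample_values": 2, "similarity_metric": 0.1}'
    );
  } catch (e) {
    // INFER can fail on some queries; fall back to one sample document.
    var sampleRow = runQuery(customQuery + ' LIMIT 1')[0];
    return inferFromDoc(sampleRow); // hypothetical helper
  }
}
```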

## Schema and Field Types

- Fields are inferred from sampled data. Types map to Looker Studio as:
- NUMBER → metric
- BOOLEAN → dimension
- STRING (default for text, objects, arrays) → dimension
- Nested fields use dot notation (for example, `address.city`). Arrays and objects that are not expanded become stringified values.
- If the collection has no documents or your query returns no rows, schema inference will fail.

> **⚠️ Schema Inference Limitations**: Field types are inferred from sampled data and may not capture all variations in your dataset. Common issues include:
> - **Mixed data types**: Fields containing both numbers and text will be typed as STRING (see the sketch after this note)
> - **Incomplete sampling**: Fields present only in unsampled documents may not be detected
> - **Array complexity**: Arrays of objects become stringified JSON rather than individual fields
> - **Nested object depth**: Very deep object hierarchies may not be fully expanded
> - **Empty or null values**: Fields with only null values may not be detected or may be typed incorrectly
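
The mixed-type case is the most common. Conceptually, when sampled documents disagree about a field's type, the only safe merged type is text; a sketch of the idea:

```javascript
// Sketch: merge a field's type across sampled documents. Any disagreement
// widens to 'string', which is why mixed fields surface as STRING/TEXT.
function mergeFieldType(current, observed) {
  if (current === undefined) return observed;
  return current === observed ? current : 'string';
}

// mergeFieldType('number', 'number') → 'number'
// mergeFieldType('number', 'string') → 'string'
```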

## Data Retrieval

- Only the fields requested by Looker Studio are returned. Nested values are extracted using dot paths where possible.
- Row limits:
- Collection mode: `Maximum Rows` controls the `LIMIT` (default 100).
- Custom query mode: You control `LIMIT` inside your query.

## Tips and Best Practices

- **Prefer `Query by Collection` for quick starts and simpler schemas**: Collection mode provides more predictable schema inference than custom queries.
- **Always add a `LIMIT` when exploring with custom queries**: Start with a `LIMIT` between 100 and 1000 so schema inference and data retrieval stay fast.
- **Ensure your user has at least query and read access** on the target collections and system catalogs for metadata discovery.
- **For consistent schema inference**: Structure your data with consistent field types across documents. Avoid mixing numbers and strings in the same field.
- **Handle complex nested data**: Consider flattening deeply nested objects in your SQL++ queries for better Looker Studio compatibility.
- **Test schema inference separately**: Use small LIMIT clauses first to verify schema detection before processing large datasets.

## Troubleshooting

### Authentication and Connection Issues
- **Authentication error**: Check host/port, credentials, and that the Data API is reachable from Looker Studio.
- **Timeout or network errors**: Verify network connectivity and firewall settings between Looker Studio and your Couchbase cluster.

### Schema Inference Problems
- **Empty schema or no fields detected**:
- Ensure the collection contains documents and is not empty
- For custom queries, verify the statement returns results and add appropriate `LIMIT` clauses
- Check that your user has permissions to read the collection and execute queries

- **INFER statement failures**:
- The connector first attempts `INFER collection` or `INFER (customQuery)` with sampling options
- If INFER fails, it falls back to executing your query with `LIMIT 1` and inferring from a single document
- INFER may fail on very large collections or complex queries; the fallback usually resolves this

- **Fields appear as STRING when they should be NUMBER**:
- Your data has mixed types (some documents have numbers, others have strings) in the same field
- The connector defaults to STRING for safety when types are inconsistent
- Consider data cleanup or use SQL++ functions to cast types consistently

- **Missing fields that exist in your data**:
- Schema inference is sample-based; fields present only in unsampled documents may not be detected
- Try increasing the collection size or adjusting your query to ensure representative sampling
- For custom queries, ensure your query includes all the fields you want to expose

- **Nested fields not working correctly**:
- Very deep object hierarchies may not be fully expanded by the INFER process
- Arrays of objects become stringified JSON instead of individual fields
- Consider flattening complex structures in your SQL++ query for better field detection

- **"No properties in any INFER flavors" error**:
- The INFER statement succeeded but found no recognizable field structures
- This typically happens with collections containing only primitive values or very inconsistent document structures
- Try a custom query that shapes the data into a more consistent structure

### Query and Data Issues
- **Query errors from the service**: Review the error text surfaced in Looker Studio; fix syntax, permissions, or keyspace names.
- **Permission errors during schema inference**: Ensure your user can execute INFER statements and read from system catalogs.
- **Performance issues**: Add appropriate `LIMIT` clauses and avoid very complex JOINs for better connector performance.

## Next Steps

- Create charts and tables in Looker Studio from the exposed fields.
- Iterate on custom SQL++ queries to shape the dataset for your dashboards.