Add ArangoDB Instrumentation #3829

mdgilene · 2025-10-10T17:51:44Z

Description

Adds instrumentation for the python-arango library

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Originally implemented this instrumentation for a project at work. Unit tests verify ArangoDB client executions are intercepted and relevant request/response information is attached to the span.

Does This PR Require a Core Repo Change?

Yes. - Link to PR:
No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

Followed the style guidelines of this project
Changelogs have been updated
Unit tests have been added
Documentation has been updated

linux-foundation-easycla · 2025-10-10T17:51:50Z

The committers listed above are authorized under a signed CLA.

✅ login: mdgilene / name: Matt Gilene (3b74aca, 41fbbaa, 55b9552, c197328)

mdgilene · 2025-10-10T18:00:41Z

First time contributing here so looking for feedback.

thompson-tomo · 2025-10-14T07:27:53Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+        attributes = {
+            db_attributes.DB_SYSTEM_NAME: "arangodb",
+            db_attributes.DB_NAMESPACE: instance.db_name,
+            db_attributes.DB_OPERATION_NAME: request.endpoint,
+            db_attributes.DB_QUERY_TEXT: textwrap.dedent(query.strip("\n")),
+        }


As per https://opentelemetry.io/docs/specs/semconv/database/database-spans/ server.address/server.port should also be included.

Added server address and port info
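
For reference, a minimal sketch of how the two attributes could be derived, assuming the client's host is available as a URL string such as `http://localhost:8529`. The helper name `server_attributes` is hypothetical and is not code from this PR; `server.address` and `server.port` are the attribute keys named in the semantic conventions linked above.

```python
from urllib.parse import urlsplit


def server_attributes(host_url: str) -> dict:
    """Build server.address / server.port from a host URL (hypothetical helper)."""
    parts = urlsplit(host_url)
    attributes = {}
    if parts.hostname:
        attributes["server.address"] = parts.hostname
        # Only report a port that is explicit in the URL; guessing a default
        # could be wrong for deployments behind a proxy.
        if parts.port is not None:
            attributes["server.port"] = parts.port
    return attributes
```

The returned dict could be merged into the span attributes alongside the `db.*` keys shown in the diff.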

thompson-tomo · 2025-10-14T07:31:20Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+        else:
+            query = str(request.data)
+
+        attributes = {


Is there any case the table/collection could be extracted from the request and set?

Potentially? It would likely only be a guess at best. ArangoDB queries use a "SQL-ish" language called AQL. AQL queries can operate on one or more collections at a time. I'm not sure how this is handled in other database implementations, but open to suggestions.

So it would only be set if the query is known to operate on a single table/collection. Looking at the HTTP API, there are APIs for collections and documents; would those be traced? If so, you should be able to get the collection from the URL.

Yes, those should be traced. However, it looks like those APIs take the collection name as part of the URL path, so we would likely need to parse it out:

_api/collection/<collection-name> or _api/document/<collection-name>
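
A rough sketch of that parsing, assuming the request endpoint is a path string shaped like the two patterns above. The function name `collection_from_endpoint` is hypothetical, and the instrumentation would only set a collection attribute (for example `db.collection.name`) when this returns a value.

```python
import re
from typing import Optional

# Matches ".../_api/collection/<name>" or ".../_api/document/<name>[/<key>]".
_COLLECTION_PATH = re.compile(r"/_api/(?:collection|document)/([^/?#]+)")


def collection_from_endpoint(endpoint: str) -> Optional[str]:
    """Return the collection name embedded in the URL path, if any."""
    match = _COLLECTION_PATH.search(endpoint)
    return match.group(1) if match else None
```

Endpoints that do not carry a collection in the path, such as the cursor API used for AQL queries, yield `None`, so nothing is guessed for multi-collection queries.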

thompson-tomo · 2025-10-14T07:40:12Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            for key, value in bind_vars.items():
+                attributes[f"db.query.parameter.{key}"] = json.dumps(value)
+
+        attributes["db.query.options"] = json.dumps(options)


This feels too generic to me. I would recommend identifying whether there are any options that would be important and defining them as arangodb.*.

I went through and explicitly defined which query options to pull into the attributes. There are more than what I have defined but these seemed to me to be the "most useful"
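
The general shape of such an explicit allowlist might look like the sketch below. The mapping and the function name `option_attributes` are illustrative assumptions; the attribute keys are the ones that appear in this PR's `arangodb_attributes.py` at the time of review.

```python
# Hypothetical allowlist: AQL query option name -> span attribute key.
QUERY_OPTION_ATTRIBUTES = {
    "cache": "arangodb.options.cache",
    "stream": "arangodb.options.stream",
    "maxRuntime": "arangodb.options.maxRuntime",
}


def option_attributes(options: dict) -> dict:
    """Copy only the allowlisted query options into span attributes."""
    attributes = {}
    for name, key in QUERY_OPTION_ATTRIBUTES.items():
        if name in options:
            attributes[key] = options[name]
    return attributes
```

Options that are not in the mapping are simply dropped, which keeps the attribute set small and predictable.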

thompson-tomo · 2025-10-14T07:46:39Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            stats = extra.get("stats")
+            for key, value in stats.items():
+                attributes["db.execution.stats." + key] = value


I would remove this, as it is too generic. I would suggest exposing the key stats as metrics instead.

I just removed the stats part of this. I haven't really worked with metrics a whole lot so I will leave that as a future enhancement if needed and just focus on the traces.

thompson-tomo · 2025-10-14T07:50:57Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            warnings = extra.get("warnings")
+            attributes["db.execution.warnings"] = json.dumps(warnings)


This feels more like data that would be on an event. I would leave it off, as nowhere else do we log warnings on the span.

Moved these warnings to span events
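
One way this could look, as a sketch only: it assumes each warning is a dict with `code` and `message` keys, and the event name `arangodb.warning` and its attribute keys are hypothetical. The `span` argument is anything exposing the OpenTelemetry `add_event(name, attributes=...)` method.

```python
def record_warnings(span, warnings) -> None:
    """Attach each ArangoDB warning to the span as a separate event."""
    for warning in warnings or []:
        candidate = {
            "arangodb.warning.code": warning.get("code"),
            "arangodb.warning.message": warning.get("message"),
        }
        # Drop missing fields, since None is not a valid attribute value.
        attributes = {k: v for k, v in candidate.items() if v is not None}
        span.add_event("arangodb.warning", attributes=attributes)
```

Passing `None` or an empty list records nothing, so the call is safe when the response carries no warnings.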

thompson-tomo

Looks better but I feel we need to be careful about how many attributes we are adding to ensure that they all bring value as each attribute has a cost associated with adding them.

thompson-tomo · 2025-10-14T23:47:15Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+        if "count" in response.body:
+            attributes["arangodb.response.count"] = response.body.get("count")


What does count represent? Number of items?

thompson-tomo · 2025-10-14T23:49:58Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+        if "hasMore" in response.body:
+            attributes["arangodb.response.hasMore"] = response.body.get(
+                "hasMore"
+            )


I assume that hasMore is related to pagination? If so, I don't see much benefit in it.

thompson-tomo · 2025-10-14T23:51:54Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "failOnWarning" in options:
+                attributes[arangodb_attributes.FAIL_ON_WARNING] = options.get(
+                    "failOnWarning"
+                )


This shouldn't be needed, as the trace should convey this, i.e. the trace errored with reason xyz and the trace has a warning.

thompson-tomo · 2025-10-14T23:53:53Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "maxRuntime" in options:
+                attributes[arangodb_attributes.MAX_RUNTIME] = options.get(
+                    "maxRuntime"
+                )


Is this the max execution time of the query? If so, I would leave it out, as the error reason should convey when the limit has been hit.

thompson-tomo · 2025-10-14T23:58:16Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "fullCount" in options:
+                attributes[arangodb_attributes.FULL_COUNT] = options.get(
+                    "fullCount"
+                )


Leave this out, as it relates to stats.

thompson-tomo · 2025-10-15T00:21:46Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "allowRetry" in options:
+                attributes[arangodb_attributes.ALLOW_RETRY] = options.get(
+                    "allowRetry"
+                )


I would leave this out as the usage is limited to batch reads & network failures.

thompson-tomo · 2025-10-15T00:36:15Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "cache" in options:
+                attributes[arangodb_attributes.CACHE] = options.get("cache")


Rather than using true/false as the value, I would use "skip" for false and "use" for true.
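
A tiny sketch of that mapping, assuming the `cache` query option arrives as a boolean. The function name `cache_attribute_value` is hypothetical.

```python
def cache_attribute_value(cache_option: bool) -> str:
    """Translate the boolean cache option into a descriptive attribute value."""
    return "use" if cache_option else "skip"
```

The call site in the diff above would then store `cache_attribute_value(options.get("cache"))` instead of the raw boolean.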

thompson-tomo · 2025-10-15T00:37:14Z

...y-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/arangodb_attributes.py

+
+ALLOW_RETRY = "arangodb.options.allowRetry"
+
+CACHE = "arangodb.options.cache"


Suggest naming this arangodb.query.cache.

thompson-tomo · 2025-10-15T00:42:58Z

...y-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/arangodb_attributes.py

+
+MAX_RUNTIME = "arangodb.options.maxRuntime"
+
+STREAM = "arangodb.options.stream"


Suggest naming this arangodb.query.stream.

thompson-tomo · 2025-10-15T00:44:55Z

...pentelemetry-instrumentation-arangodb/src/opentelemetry/instrumentation/arangodb/__init__.py

+            if "usePlanCache" in options:
+                attributes[arangodb_attributes.USE_PLAN_CACHE] = options.get(
+                    "usePlanCache"
+                )


I can't see this option in the JS SDK; I suggest removing it.

mdgilene requested a review from a team as a code owner October 10, 2025 17:51

mdgilene added 2 commits October 10, 2025 13:54

implement instrumentation

55b9552

Update changelog

3b74aca

mdgilene force-pushed the feat/arangodb-instrumentor branch from 9aa83e6 to 3b74aca Compare October 10, 2025 17:55

mdgilene changed the title ~~[WIP] Add ArangoDB Instrumentation~~ Add ArangoDB Instrumentation Oct 10, 2025

Merge branch 'main' into feat/arangodb-instrumentor

c197328

thompson-tomo reviewed Oct 14, 2025

Be more explicit about options and warnings

41fbbaa

thompson-tomo reviewed Oct 15, 2025

xrmx added this to @xrmx's Python PR digest Oct 15, 2025

xrmx moved this to Ready for review in @xrmx's Python PR digest Oct 15, 2025

xrmx moved this from Ready for review to Reviewed PRs that need fixes in @xrmx's Python PR digest Oct 15, 2025
