
@cathteng
Member

@cathteng cathteng commented Mar 15, 2024

Opsgenie has a complex rate limiting strategy: https://docs.opsgenie.com/docs/api-rate-limiting 😞

Currently, when someone saves a metric alert that has Opsgenie trigger actions, each action is validated in turn by POSTing to the authenticate integration API. We appear to do this only to validate the integration key, since we don't do anything with the response. Opsgenie doesn't have an API to verify integration keys, so our approach has been to hit an API and check whether the request is authorized.

Due to Opsgenie's rate limiting strategy, it is easy to get rate limited on this API because we call this POST repeatedly when saving an alert rule. The current response is "Invalid integration key" regardless of the API's actual status code, which is not helpful.

We should validate the integration key when it is saved rather than on alert save, because people might have multiple Opsgenie trigger actions per alert. That way we prevent invalid integration keys from being saved in the first place. I also switched the API we hit to check the key's validity from a POST to a GET, to hopefully get a higher rate limit.
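
A minimal sketch of the save-time validation idea, not the code in this PR. It assumes that only new or changed keys need to be checked, and every name in it (`keys_needing_validation`, `validate_on_save`, the injected `check_key` callable) is hypothetical. `check_key` stands in for whatever authorized GET is used and is expected to raise on an invalid key.

```python
from __future__ import annotations

from typing import Callable, Mapping


def keys_needing_validation(
    saved: Mapping[str, str], submitted: Mapping[str, str]
) -> list[str]:
    """Return the team names whose integration key is new or has changed."""
    return [team for team, key in submitted.items() if saved.get(team) != key]


def validate_on_save(
    saved: Mapping[str, str],
    submitted: Mapping[str, str],
    check_key: Callable[[str], object],
) -> int:
    """Check only new or changed keys and return how many checks were made.

    `check_key` is a hypothetical hook; it should raise if the key is rejected.
    """
    calls = 0
    for team in keys_needing_validation(saved, submitted):
        check_key(submitted[team])
        calls += 1
    return calls
```

With this shape, saving an alert rule makes no Opsgenie calls at all, and saving the integration form makes one call per key that actually changed, which keeps the request count low under the rate limit.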

Also modified the error parsing so that the error messages shown when filling out the form are more informative.

Screen.Recording.2024-03-15.at.14.50.54.mov

@cathteng cathteng requested review from a team and leeandher March 15, 2024 20:37
@cathteng cathteng requested a review from a team as a code owner March 15, 2024 20:37
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Mar 15, 2024
Comment on lines -36 to -42
# This doesn't work if the team name is "." or "..", which Opsgenie allows for some reason
# despite their API not working with these names.
def get_team_id(self, team_name: str) -> BaseApiResponseX:
    params = {"identifierType": "name"}
    quoted_name = quote(team_name)
    path = f"/teams/{quoted_name}"
    return self.get(path=path, headers=self._get_auth_headers(), params=params)
Member Author


This is unused.

Comment on lines 185 to 196
except ApiError as e:
    logger.info(
        "opsgenie.authorization_error",
        extra={"error": str(e), "status_code": e.code},
    )
    if e.code == 429:
        raise ApiRateLimitedError(
            "Too many requests. Please try updating one team/key at a time."
        )
    elif e.code == 401:
        raise ApiUnauthorized(f"Invalid integration key {integration_key}")
    raise
Contributor


Do we know how this affects our UX on the FE/product?

Member Author


Good catch. I improved the UX by slightly changing what we return from the API if we raise an error.
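
For readers following this thread, here is a self-contained sketch of the status-code mapping in the snippet under review. The three exception classes are local stand-ins defined only so the example runs on its own; they are not Sentry's real classes, and `translate_error` is a hypothetical helper name.

```python
from typing import Optional


class ApiError(Exception):
    """Stand-in for an API error that carries an HTTP status code."""

    def __init__(self, text: str, code: Optional[int] = None) -> None:
        super().__init__(text)
        self.code = code


class ApiRateLimitedError(ApiError):
    """Stand-in raised when the upstream API answers 429."""


class ApiUnauthorized(ApiError):
    """Stand-in raised when the upstream API answers 401."""


def translate_error(e: ApiError, integration_key: str) -> ApiError:
    """Map a raw API error to a more specific one based on its status code."""
    if e.code == 429:
        return ApiRateLimitedError(
            "Too many requests. Please try updating one team/key at a time.",
            code=429,
        )
    if e.code == 401:
        return ApiUnauthorized(f"Invalid integration key {integration_key}", code=401)
    # Any other status is passed through unchanged.
    return e
```

Separating 429 from 401 is what lets the form tell a user "slow down" instead of wrongly reporting that a valid key is invalid.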

@codecov

codecov bot commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 90.62500%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 84.32%. Comparing base (12e816b) to head (a1d596a).
Report is 10 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #67081   +/-   ##
=======================================
  Coverage   84.32%   84.32%           
=======================================
  Files        5306     5306           
  Lines      237081   237100   +19     
  Branches    41001    41008    +7     
=======================================
+ Hits       199907   199938   +31     
+ Misses      36956    36944   -12     
  Partials      218      218           
Files                                                    Coverage Δ
.../integrations/organization_integrations/details.py   98.38% <100.00%> (+3.22%) ⬆️
src/sentry/incidents/logic.py                            95.52% <100.00%> (+0.38%) ⬆️
src/sentry/integrations/opsgenie/client.py               100.00% <100.00%> (+6.25%) ⬆️
src/sentry/integrations/opsgenie/integration.py          93.85% <88.00%> (-1.65%) ⬇️

... and 5 files with indirect coverage changes

Contributor

@vartec vartec left a comment


I have some opinions, but nothing blocking.

@cathteng cathteng force-pushed the cathy/opsgenie/verify-integration-key branch from b73a53b to a1d596a Compare March 18, 2024 18:25
@cathteng cathteng merged commit bd52d45 into master Mar 18, 2024
@cathteng cathteng deleted the cathy/opsgenie/verify-integration-key branch March 18, 2024 19:32
@github-actions github-actions bot locked and limited conversation to collaborators Apr 3, 2024


4 participants