Skip to content

Conversation

@ilyasabdellaoui
Copy link

Introduced BaseOffice365User abstract base class to standardize Office 365 user handling. Added MultiOffice365Users to manage multiple emails from config. Added client_emails (comma-separated) in OutlookDataSource config. Resolved issue with fetching too many users causing SMTP server not found error.

Closes #2843

Checklists

Pre-Review Checklist

  • this PR does NOT contain credentials of any kind, such as API keys or username/passwords (double check config.yml.example)
  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • this PR has a thorough description
  • Covered the changes with automated tests
  • Tested the changes locally
  • Considered corresponding documentation changes
  • Contributed any configuration settings changes to the configuration reference
  • if you added or changed Rich Configurable Fields for a Native Connector, you made a corresponding PR in Kibana

@ilyasabdellaoui ilyasabdellaoui requested a review from a team September 23, 2024 02:07
@cla-checker-service
Copy link

cla-checker-service bot commented Sep 23, 2024

💚 CLA has been signed

@ilyasabdellaoui ilyasabdellaoui marked this pull request as draft September 23, 2024 02:14
@ilyasabdellaoui ilyasabdellaoui marked this pull request as ready for review September 23, 2024 02:15
Introduced BaseOffice365User abstract base class to standardize Office 365 user handling.
Added MultiOffice365Users to manage multiple emails from config.
Added client_emails (comma-separated) in OutlookDataSource config.
Resolved issue with fetching too many users causing SMTP server not found error.
@ilyasabdellaoui ilyasabdellaoui force-pushed the feature/office365-multi-user-support branch from 37f7483 to 897fed8 Compare September 23, 2024 02:32
@seanstory
Copy link
Member

buildkite test this

Copy link
Member

@seanstory seanstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Thanks for submitting a patch! I've left a few comments, but overall I think this is a pretty clean approach and solves a clear ask.

Outside of the below comments, I'd also request that you change the

Closes #2843

in your PR description to use "Relates to" or "Part of", since another part of the issue you filed is that the current solution is limited to 999 users. I expect that's something we can better control with the graph API and another separate configuration, for folks who want "all users", and have more than 999.

@seanstory
Copy link
Member

Screenshot 2024-09-24 at 3 45 50 PM

The tests pass, but the linter is currently failing. You can fix this by running make autoformat when you're done making changes.

@ilyasabdellaoui ilyasabdellaoui requested a review from a team as a code owner September 29, 2024 12:24
@ilyasabdellaoui
Copy link
Author

ilyasabdellaoui commented Sep 29, 2024

in your PR description to use "Relates to" or "Part of", since another part of the issue you filed is that the current solution is limited to 999 users. I expect that's something we can better control with the graph API and another separate configuration, for folks who want "all users", and have more than 999.

@seanstory After rechecking the code, I realized I was mistaken in my initial description. The code isn't limited to fetching the top 999 users—it handles pagination using the @odata.nextLink property from the Microsoft Graph API, which retrieves all users. I missed that detail. I'll update my issue description to reflect the correct behavior.

async def get_users(self):
access_token = await self._fetch_token()
url = f"https://graph.microsoft.com/v1.0/users?$top={TOP}"
while True:
try:
async with self._get_session.get(
url=url,
headers={
"Authorization": f"Bearer {access_token}",
"Content-Type": "application/json",
},
) as response:
json_response = await response.json()
yield json_response
url = json_response.get("@odata.nextLink")
if url is None:
break
except Exception:
raise

@ilyasabdellaoui ilyasabdellaoui marked this pull request as draft September 29, 2024 13:25
…Cloud connector

- Refactored the  method to utilize Microsoft Graph's JSON batching feature.
- The method now processes client emails in batches of up to 20 to optimize API requests.
- Added error handling for individual user fetch errors within the batch response.
- If any batch request fails, the method collects the errors and raises an exception with details.
@seanstory
Copy link
Member

buildkite test this

@ilyasabdellaoui
Copy link
Author

ilyasabdellaoui commented Sep 30, 2024

buildkite test this

Lint still fails due to async-related issues. Autoformat doesn't solve the lint errors. I'll investigate further after completing test functions. If unable to resolve, I'll seek assistance

@artem-shelkovnikov
Copy link
Member

@ilyasabdellaoui do you want to make this PR ready for review, or is there anything we can help with?

@ilyasabdellaoui ilyasabdellaoui marked this pull request as ready for review November 11, 2024 03:03
@ilyasabdellaoui
Copy link
Author

@artem-shelkovnikov Thanks for checking in! I encountered some linting issues, which I've now resolved in my latest commit, particularly related to async iteration. Everything is fixed and ready for review.

@artem-shelkovnikov
Copy link
Member

buildkite test this

Copy link
Member

@artem-shelkovnikov artem-shelkovnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Left a comment about configurable field to make it consistent with other connectors, but otherwise great change.

Additional question - have you done any benchmarks on how faster the connector gets?

I know that you're saving a lot of time on network, but throttling-wise I believe this change makes not impact, is it true?

@ilyasabdellaoui
Copy link
Author

Looks good! Left a comment about configurable field to make it consistent with other connectors, but otherwise great change.

Additional question - have you done any benchmarks on how faster the connector gets?

I know that you're saving a lot of time on network, but throttling-wise I believe this change makes not impact, is it true?

I haven't run any benchmarks yet, but I expect this change will improve performance by reducing unnecessary user fetches.

As for throttling, the client_emails configuration is most effective at avoiding it when the list of emails is small, by limiting the API calls to only the relevant users, it reduces the load on the API. For larger email lists, though, the impact on throttling might not be as important!

@artem-shelkovnikov
Copy link
Member

buildkite test this

@artem-shelkovnikov
Copy link
Member

Change looks good, but I'll not merge immediately, will address a bit later - we need to do same change in Kibana to make it properly work for native connectors. I'll follow-up and merge both at the same time

@artem-shelkovnikov
Copy link
Member

Hi @ilyasabdellaoui, it's on my radar - we currently don't have a running subscription with. Outlook to be able to manually test it, it's ongoing work and we'll get back to this PR once it's been properly tested.

@ilyasabdellaoui
Copy link
Author

Thanks for the update @artem-shelkovnikov! Let me know if I can assist with anything to facilitate testing.

@seanstory seanstory removed their request for review April 1, 2025 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhancement: Allow Outlook Cloud connector to specify client emails instead of defaulting to all users

3 participants