-
Notifications
You must be signed in to change notification settings - Fork 0
Summarise Subreddits #71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… trailing slash or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
18-22: URI parsing optimization implemented.The previous review comment about double parsing has been addressed. The current implementation parses each URI once using
TryCreatewith an out parameter, then filters and extracts hosts.tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (1)
12-12: Class name vs filename mismatch.This issue was previously flagged. If the filename is
RedditPostContentExtractorIntegrationTests.cs, the class name should match.
🧹 Nitpick comments (2)
tests/Elzik.Breef.Api.Tests.Functional/BreefTestsDocker.cs (1)
160-179: LGTM! Timeout handling is well-implemented.The timeout mechanism using
CancellationTokenSourcewith proper exception handling is solid. The descriptiveTimeoutExceptionmessage will aid debugging container startup issues.Optionally, consider making the timeout configurable via an environment variable for flexibility in different CI/CD environments:
- private const int ContainerStartTimeoutSeconds = 30; + private static readonly int ContainerStartTimeoutSeconds = + int.TryParse(Environment.GetEnvironmentVariable("BREEF_CONTAINER_TIMEOUT_SECONDS"), out var timeout) + ? timeout + : 30;src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawRedditPostClient.cs (1)
7-9: Consider externalizing the version string.The User-Agent header contains a hardcoded version "1.0.0" which could become stale as the project evolves. Consider extracting this to a configuration option or deriving it from assembly metadata.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
README.md(3 hunks)src/Elzik.Breef.Application/BreefGenerator.cs(0 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRedditPostClient.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawRedditPostClient.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)src/Elzik.Breef.Infrastructure/Wallabag/TokenResponse.cs(1 hunks)src/Elzik.Breef.Infrastructure/Wallabag/WallabagEntry.cs(0 hunks)src/Elzik.Breef.Infrastructure/Wallabag/WallabagEntryCreateRequest.cs(1 hunks)tests/Elzik.Breef.Api.Tests.Functional/BreefTestsBase.cs(0 hunks)tests/Elzik.Breef.Api.Tests.Functional/BreefTestsDocker.cs(3 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/Wallabag/WallabagClientTests.cs(0 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/Wallabag/WallabagOptionsTests.cs(0 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/RawRedditPostTransformerTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/RedditOptionsTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs(1 hunks)
💤 Files with no reviewable changes (5)
- tests/Elzik.Breef.Api.Tests.Functional/BreefTestsBase.cs
- src/Elzik.Breef.Application/BreefGenerator.cs
- src/Elzik.Breef.Infrastructure/Wallabag/WallabagEntry.cs
- tests/Elzik.Breef.Infrastructure.Tests.Integration/Wallabag/WallabagOptionsTests.cs
- tests/Elzik.Breef.Infrastructure.Tests.Integration/Wallabag/WallabagClientTests.cs
🚧 Files skipped from review as they are similar to previous changes (4)
- src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs
- README.md
- tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/RawRedditPostTransformerTests.cs
- src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs
🧰 Additional context used
🧬 Code graph analysis (6)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (7)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)tests/Elzik.Breef.Infrastructure.Tests.Integration/TestOutputFakeLogger.cs (1)
TestOutputFakeLogger(7-43)src/Elzik.Breef.Infrastructure/HttpDownloaderOptions.cs (1)
HttpDownloaderOptions(5-11)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (1)
SubRedditContentExtractor(7-75)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (2)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubRedditContentExtractor(7-75)Task(30-40)Task(42-46)Task(48-74)CanHandle(12-28)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/RedditOptionsTests.cs (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (3)
List(107-130)List(132-163)List(165-177)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawRedditPostClient.cs (3)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRedditPostClient.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
Task(7-11)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPost.cs (1)
RawRedditPost(5-7)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRedditPostClient.cs (2)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RedditPost(8-41)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPost.cs (1)
RedditPost(3-7)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Test report
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
🔇 Additional comments (9)
src/Elzik.Breef.Infrastructure/Wallabag/TokenResponse.cs (1)
1-1: LGTM! Good cleanup.Removing the unused
Refitimport is appropriate cleanup. The TokenResponse DTO remains correct.src/Elzik.Breef.Infrastructure/Wallabag/WallabagEntryCreateRequest.cs (1)
1-1: LGTM! Good cleanup.Removing the unused
Refitimport is appropriate cleanup. The WallabagEntryCreateRequest DTO remains correct.tests/Elzik.Breef.Api.Tests.Functional/BreefTestsDocker.cs (3)
11-11: LGTM! Timeout constant improves clarity.Adding a named constant for the container startup timeout enhances code readability and maintainability.
83-83: LGTM! Wait strategy update is more precise.Using
UntilInternalTcpPortIsAvailable(8080)provides a clearer and more direct readiness check for the containerized service.
162-166: LGTM! Null checks are appropriate defensive programming.The null checks for
_testContainerare correct defensive programming for a nullable field. While the current logic ensures the container is initialized when!_dockerIsUnavailable, these checks satisfy null safety requirements and guard against future code modifications.Also applies to: 186-189
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRedditPostClient.cs (1)
1-7: LGTM!Clean interface definition following async conventions and maintaining single responsibility.
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (1)
32-207: LGTM!Comprehensive integration test coverage with proper skip logic for GitHub Actions, varied URL format testing, and thorough validation of the Reddit post extraction pipeline.
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/RedditOptionsTests.cs (1)
1-180: LGTM!Well-structured unit tests with comprehensive coverage of RedditOptions behavior, including default values, configuration binding, domain extraction, and edge cases. Tests follow AAA pattern and use appropriate assertions.
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (1)
1-509: LGTM!Excellent test coverage with comprehensive validation of SubRedditContentExtractor behavior, including URL validation, image extraction with fallbacks, edge cases for invalid/inaccessible images, and proper exception handling. The mocking strategy is clean and the helper method improves test maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs (2)
44-44: Hard-coded base URL should use configuration.Same issue as in the first test - the base URL is hard-coded. Apply the same refactor suggested for line 17.
12-12: Typo in method name: 'SubReddit' should be 'Subreddit'.The method name contains inconsistent casing:
GetNewInSubReddit_ValidSubreddit_ReturnsNewInSubreddit. The first instance uses "SubReddit" (capital R) while the second uses "Subreddit" (lowercase r). The correct casing should be "Subreddit" throughout to match C# naming conventions.Apply this diff:
- public async Task GetNewInSubReddit_ValidSubreddit_ReturnsNewInSubreddit() + public async Task GetNewInSubreddit_ValidSubreddit_ReturnsNewInSubreddit()
🧹 Nitpick comments (14)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (2)
62-111: Consider documenting brittleness of hardcoded URLs and expected image URLs.These tests validate the image resolution fallback chain correctly. However, the hardcoded Reddit post URLs (lines 68, 85, 102) and exact image URL assertions (lines 76, 93, 110) are brittle and may break if:
- Reddit changes image CDN URLs or query parameters
- Posts or subreddit images are modified or deleted
This is acceptable for integration tests, but consider adding comments explaining why these specific posts were chosen and documenting that these tests may require updates if the external data changes.
141-169: Good error handling coverage.The tests properly validate error scenarios for invalid URLs and non-existent posts.
Note: Line 168 asserts the exact Refit error message format, which could be brittle if Refit updates its error messages. Consider asserting only the status code or exception type if this becomes problematic in the future.
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRawNewInSubredditTransformer.cs (1)
3-6: Async naming + cancellation on public API.Rename to TransformAsync and accept CancellationToken to allow upstream cancellation. Non-breaking within this PR scope if propagated to implementer and call sites.
Apply:
-public interface IRawNewInSubredditTransformer +public interface IRawNewInSubredditTransformer { - Task<NewInSubreddit> Transform(RawNewInSubreddit rawNewInSubreddit); + Task<NewInSubreddit> TransformAsync(RawNewInSubreddit rawNewInSubreddit, CancellationToken cancellationToken = default); }src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (1)
16-26: Harden transform: dedupe IDs, skip nulls, tolerate per‑item failures.Avoid failing the whole batch if a single post fetch throws; also filter nulls and remove duplicates. Keeps order stable.
Apply:
- var postIds = rawNewInSubreddit.Data.Children + var postIds = rawNewInSubreddit.Data.Children .Where(child => child.Data?.Id != null) - .Select(child => child.Data!.Id!) - .ToList(); + .Select(child => child.Data!.Id!) + .Distinct() + .ToList(); - var postTasks = postIds.Select(id => redditPostClient.GetPost(id)); - var posts = await Task.WhenAll(postTasks); + var postTasks = postIds.Select(async id => + { + try + { + return await redditPostClient.GetPost(id); + } + catch + { + return null; + } + }); + var posts = (await Task.WhenAll(postTasks)).Where(p => p is not null)!; - newInSubreddit.Posts.AddRange(posts); + newInSubreddit.Posts.AddRange(posts!);If you adopt CancellationToken on the interface, thread it through with redditPostClient support.
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs (1)
3-6: Follow async naming and support cancellation.Use GetNewInSubredditAsync and add CancellationToken.
Apply:
-public interface ISubredditClient +public interface ISubredditClient { - Task<NewInSubreddit> GetNewInSubreddit(string subRedditName); + Task<NewInSubreddit> GetNewInSubredditAsync(string subredditName, CancellationToken cancellationToken = default); }src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawSubredditClient.cs (1)
7-13: Expose query params and cancellation on Refit API.Add raw_json=1 to avoid HTML escaping issues; surface limit for paging; include CancellationToken for resilience.
Apply:
public interface IRawSubredditClient { - [Get("/r/{subRedditName}/new.json")] + [Get("/r/{subredditName}/new.json?raw_json=1&limit={limit}")] [Headers("User-Agent: breef/1.0.0 (https://github.com/elzik/breef)")] - Task<RawNewInSubreddit> GetNewInSubreddit(string subRedditName); + Task<RawNewInSubreddit> GetNewInSubreddit(string subredditName, int limit = 25, CancellationToken cancellationToken = default); - [Get("/r/{subRedditName}/about.json")] + [Get("/r/{subredditName}/about.json?raw_json=1")] [Headers("User-Agent: breef/1.0.0 (https://github.com/elzik/breef)")] - Task<AboutSubreddit> GetAboutSubreddit(string subRedditName); + Task<AboutSubreddit> GetAboutSubreddit(string subredditName, CancellationToken cancellationToken = default); }src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
7-12: Validate input and thread cancellation.Guard empty subreddit names, and plumb CancellationToken (if you adopt it on interfaces).
Apply:
- public async Task<NewInSubreddit> GetNewInSubreddit(string subRedditName) + public async Task<NewInSubreddit> GetNewInSubreddit(string subRedditName) { - var rawNewInSubreddit = await rawSubredditClient.GetNewInSubreddit(subRedditName); + ArgumentException.ThrowIfNullOrWhiteSpace(subRedditName); + var rawNewInSubreddit = await rawSubredditClient.GetNewInSubreddit(subRedditName); return await transformer.Transform(rawNewInSubreddit); }If you rename to GetNewInSubredditAsync/TransformAsync and add CancellationToken, update calls accordingly.
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (1)
5-39: Limit raw DTO visibility.These are transport-layer shapes; consider making them internal to reduce public API surface.
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/SubredditClientTests.cs (1)
62-74: Verify interactions with mocks.Add Received assertions to ensure the pipeline is exercised once and with expected args.
Apply:
var result = await _client.GetNewInSubreddit(subRedditName); // Assert result.ShouldNotBeNull(); result.Posts.ShouldNotBeNull(); result.Posts.Count.ShouldBe(1); result.Posts[0].Post.Id.ShouldBe("post1"); result.Posts[0].Post.Title.ShouldBe("Test Post"); + await _mockRawClient.Received(1).GetNewInSubreddit(subRedditName); + await _mockTransformer.Received(1).Transform(Arg.Is<RawNewInSubreddit>(x => + x.Data?.Children?.Count == 1 && + x.Data.Children[0].Data?.Id == "post1"));src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (1)
53-79: Make image extraction resilient (network/JSON errors) and prefer Uri constants.Guard DownloadAsync/JSON parsing, use TryGetProperty for data, keep fallback on any failure, and use Uri.UriScheme* constants.
Apply:
- private async Task<string> ExtractImageUrlAsync(Uri subRedditBaseUri) + private async Task<string> ExtractImageUrlAsync(Uri subRedditBaseUri) { - Uri subRedditAboutUri = new(subRedditBaseUri, "about.json"); - var jsonContent = await _httpDownloader.DownloadAsync(subRedditAboutUri.AbsoluteUri); - - string[] imageKeys = ["banner_background_image", "banner_img", "mobile_banner_image", "icon_img", "community_icon"]; - - using var doc = JsonDocument.Parse(jsonContent); - var data = doc.RootElement.GetProperty("data"); - - foreach (var imageKey in imageKeys) - { - if (data.TryGetProperty(imageKey, out var prop)) - { - var imageUrl = prop.GetString(); - if (!string.IsNullOrWhiteSpace(imageUrl) && - Uri.TryCreate(imageUrl, UriKind.Absolute, out var uri) && - (uri.Scheme == "http" || uri.Scheme == "https") && - await _httpDownloader.TryGet(imageUrl)) - { - return imageUrl; - } - } - } - - return _redditOptions.FallbackImageUrl; + try + { + Uri subRedditAboutUri = new(subRedditBaseUri, "about.json"); + var jsonContent = await _httpDownloader.DownloadAsync(subRedditAboutUri.AbsoluteUri); + + string[] imageKeys = ["banner_background_image", "banner_img", "mobile_banner_image", "icon_img", "community_icon"]; + + using var doc = JsonDocument.Parse(jsonContent); + if (!doc.RootElement.TryGetProperty("data", out var data) || data.ValueKind != JsonValueKind.Object) + { + return _redditOptions.FallbackImageUrl; + } + + foreach (var imageKey in imageKeys) + { + if (data.TryGetProperty(imageKey, out var prop)) + { + var imageUrl = prop.GetString(); + if (!string.IsNullOrWhiteSpace(imageUrl) && + Uri.TryCreate(imageUrl, UriKind.Absolute, out var uri) && + (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps) && + await _httpDownloader.TryGet(imageUrl)) + { + return imageUrl; + } + } + } + } + catch (Exception) // Http/JSON/cancellation—fallback below + { + } + return _redditOptions.FallbackImageUrl; }tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (1)
11-11: Consider renaming for clarity.The class name
SubRedditExtractorTestscould be more explicit asSubRedditContentExtractorTeststo clearly indicate the class under test.- public class SubRedditExtractorTests + public class SubRedditContentExtractorTeststests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs (3)
17-17: Hard-coded base URL should use configuration.The test directly instantiates the Refit client with a hard-coded URL. For consistency with the production setup (see
Program.cswhich usesRedditOptions.DefaultBaseAddress), consider extracting the base URL to a test configuration or constant.Example refactor:
+ private const string RedditBaseUrl = "https://www.reddit.com/"; + [SkippableFact] public async Task GetNewInSubreddit_ValidSubreddit_ReturnsNewInSubreddit() { // Arrange Skip.If(IsRunningInGitHubWorkflow, "Skipped because requests to reddit.com from GitHub workflows are " + "always blocked meaning this test case always fails. This must be run locally instead."); - var client = RestService.For<IRawSubredditClient>("https://www.reddit.com/"); + var client = RestService.For<IRawSubredditClient>(RedditBaseUrl);
11-36: Consider adding error case tests.The test only validates the happy path with a valid subreddit name. Consider adding test cases for error scenarios such as:
- Invalid/non-existent subreddit names
- Network failures or timeouts
- Rate limiting responses from Reddit's API
53-57: Image URL assertions may be overly strict.The test asserts that all image URLs (
IconImg,BannerImg,BannerBackgroundImage,MobileBannerImage,CommunityIcon) are not null. However, not all subreddits have all these images configured. This could cause the test to fail for subreddits without complete image metadata.Consider changing to:
aboutSubreddit.ShouldNotBeNull(); aboutSubreddit.Data.ShouldNotBeNull(); aboutSubreddit.Data.PublicDescription.ShouldNotBeNull(); - aboutSubreddit.Data.IconImg.ShouldNotBeNull(); - aboutSubreddit.Data.BannerImg.ShouldNotBeNull(); - aboutSubreddit.Data.BannerBackgroundImage.ShouldNotBeNull(); - aboutSubreddit.Data.MobileBannerImage.ShouldNotBeNull(); - aboutSubreddit.Data.CommunityIcon.ShouldNotBeNull(); + // Image URLs are optional and may be null or empty for some subreddits + // Just verify the properties exist (no assertion on values)Or verify that at least one image URL is populated if the test specifically requires it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
src/Elzik.Breef.Api/Program.cs(2 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRawNewInSubredditTransformer.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawSubredditClient.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs(1 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/RawNewInSubredditTransformerTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/SubredditClientTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs
🧰 Additional context used
🧬 Code graph analysis (12)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRawNewInSubredditTransformer.cs (2)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (1)
RawNewInSubreddit(5-9)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (3)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRawNewInSubredditTransformer.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (1)
RawNewInSubreddit(5-9)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs (2)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/RedditPostClientTests.cs (1)
SkippableFact(12-72)src/Elzik.Breef.Api/Program.cs (1)
Task(26-134)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (6)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/HttpDownloaderOptions.cs (1)
HttpDownloaderOptions(5-11)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubRedditContentExtractor(8-80)Task(34-45)Task(47-51)Task(53-79)CanHandle(16-32)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (4)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (2)
CanHandle(15-32)Task(34-73)src/Elzik.Breef.Api/Program.cs (1)
Task(26-134)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/ISubredditImageExtractor.cs (1)
Task(5-5)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/RawNewInSubredditTransformerTests.cs (4)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (2)
RawNewInSubredditTransformer(3-28)Task(5-27)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/IRawNewInSubredditTransformer.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (4)
RawNewInSubreddit(5-9)RawListingData(11-15)RawChild(17-21)RawPostData(23-39)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPost.cs (2)
RedditPostContent(9-19)RedditComment(21-29)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (5)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (2)
RedditOptions(5-33)List(24-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubRedditContentExtractor(8-80)Task(34-45)Task(47-51)Task(53-79)CanHandle(16-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPost.cs (3)
RedditPost(3-7)RedditPostContent(9-19)RedditComment(21-29)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/ISubredditImageExtractor.cs (1)
Task(5-5)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/IRawSubredditClient.cs (5)
src/Elzik.Breef.Api/Program.cs (1)
Task(26-134)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
Task(7-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (1)
RawNewInSubreddit(5-9)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/AboutSubreddit.cs (1)
AboutSubreddit(5-9)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (3)
src/Elzik.Breef.Api/Program.cs (1)
Task(26-134)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)
src/Elzik.Breef.Api/Program.cs (11)
src/Elzik.Breef.Infrastructure/HttpDownloaderOptions.cs (1)
HttpDownloaderOptions(5-11)src/Elzik.Breef.Infrastructure/HttpDownloader.cs (2)
HttpDownloader(7-39)HttpDownloader(11-19)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (1)
RawNewInSubredditTransformer(3-28)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
SubredditClient(5-13)src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (1)
HtmlContentExtractor(6-83)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (1)
SubRedditContentExtractor(8-80)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/ContentExtractorStrategy.cs (2)
ContentExtractorStrategy(6-36)ContentExtractorStrategy(11-24)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/SubredditClientTests.cs (5)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (2)
SubredditClient(5-13)Task(7-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/ISubredditClient.cs (1)
Task(5-5)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubreddit.cs (4)
RawNewInSubreddit(5-9)RawListingData(11-15)RawChild(17-21)RawPostData(23-39)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPost.cs (1)
RedditPostContent(9-19)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (17)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (5)
1-17: LGTM! Clean test setup.The imports are appropriate, and the GitHub Actions detection pattern is a good approach for conditionally skipping integration tests in CI environments.
36-60: Excellent test coverage for URL format variations.The test properly validates the complete extraction pipeline, from URL parsing through to deserialized content structure. Good use of multiple URL formats in the theory data.
113-139: LGTM! Comprehensive structure validation.The test thoroughly validates that the complete Reddit data structure is properly extracted and serialized, including defensive handling for posts that may have no comments.
171-204: Excellent URL validation coverage!These tests comprehensively validate the
CanHandlemethod logic, covering:
- Case insensitivity
- Trailing slash variations
- Optional title segment
- Invalid domains (including
old.reddit.comand non-Reddit domains)- Invalid path structures
Correctly uses non-skippable
[Theory]since these tests don't require network calls.
206-210: LGTM! Clean skip helper.The helper method provides a clear, reusable way to skip tests in CI with good documentation explaining why Reddit requests fail in GitHub workflows.
src/Elzik.Breef.Api/Program.cs (5)
7-10: LGTM!The new using statements properly support the Reddit content extractor components registered in the DI container.
64-68: LGTM!HttpDownloader configuration and registration are properly set up with validation.
70-87: LGTM!Reddit options configuration and Refit client registrations are properly configured with consistent base address setup.
89-92: LGTM!Transformers and typed Reddit clients are properly registered with appropriate dependencies.
94-107: LGTM!Content extractor registrations properly implement the strategy pattern with specific and default extractors. The dual registration of
SubRedditContentExtractor(as concrete type and asISubredditImageExtractor) is intentional and correct for supporting different injection scenarios.tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (3)
41-113: Excellent test coverage!The
CanHandletests comprehensively validate URL parsing, domain matching, and configurable base address support across various scenarios.
115-254: LGTM!The
ExtractAsynctests thoroughly validate image resolution, content serialization, title formatting, and URL parsing with clear test scenarios.
256-487: Excellent comprehensive test coverage!The
GetSubredditImageUrlAsynctests are exceptionally thorough, covering image priority, fallback logic, URL validation, and importantly, security concerns by explicitly rejecting non-HTTP schemes (javascript:, file://, data: URIs). The test coverage includes edge cases and error propagation.tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/Client/RawNewInSubredditTransformerTests.cs (3)
18-162: LGTM!The transformation tests thoroughly validate both successful transformations with complete data and graceful handling of null/empty edge cases.
164-357: LGTM!The filtering tests properly validate that invalid children (null data or null IDs) are skipped, and the single post test comprehensively validates all fields including nested comments.
359-406: Excellent testing of critical scenarios!The null argument validation is properly tested with ParamName verification. The concurrent fetching test is particularly well-designed, using
TaskCompletionSourceto coordinate and verify that posts are fetched concurrently viaTask.WhenAll, ensuring both correctness and performance.tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs (1)
7-59: AI summary describes different content than what exists in the code.The AI summary mentions a test class
SubredditClientTestswith a test methodGetNewInSubreddit_ValidSubredditName_ReturnsTransformedResult(), but the actual code contains a classRawSubredditClientTestswith test methodsGetNewInSubReddit_ValidSubreddit_ReturnsNewInSubreddit()andGetAboutSubreddit_ValidSubreddit_ReturnsAboutSubreddit(). The summary appears to be describing a different file or an earlier version.
....Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/RawSubredditClientTests.cs
Show resolved
Hide resolved
...eef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/SubredditClientTests.cs
Outdated
Show resolved
Hide resolved
...Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (7)
src/Elzik.Breef.Api/Program.cs (1)
75-87: Refit clients: add User-Agent and a timeout; dedupe config.Reddit expects a UA, and lack of a timeout can hang request threads. Reuse HttpDownloaderOptions.UserAgent and set a reasonable timeout. Also, both clients share identical setup; factor to a local helper to avoid drift.
Apply this diff:
builder.Services.AddRefitClient<IRawRedditPostClient>() .ConfigureHttpClient((provider, client) => { var redditOptions = provider.GetRequiredService<IOptions<RedditOptions>>().Value; - client.BaseAddress = new Uri(redditOptions.DefaultBaseAddress); + var dlOpts = provider.GetRequiredService<IOptions<HttpDownloaderOptions>>().Value; + client.BaseAddress = new Uri(redditOptions.DefaultBaseAddress); + client.DefaultRequestHeaders.UserAgent.ParseAdd(dlOpts.UserAgent); + client.Timeout = TimeSpan.FromSeconds(30); }); builder.Services.AddRefitClient<IRawSubredditClient>() .ConfigureHttpClient((provider, client) => { var redditOptions = provider.GetRequiredService<IOptions<RedditOptions>>().Value; - client.BaseAddress = new Uri(redditOptions.DefaultBaseAddress); + var dlOpts = provider.GetRequiredService<IOptions<HttpDownloaderOptions>>().Value; + client.BaseAddress = new Uri(redditOptions.DefaultBaseAddress); + client.DefaultRequestHeaders.UserAgent.ParseAdd(dlOpts.UserAgent); + client.Timeout = TimeSpan.FromSeconds(30); });Optional: extract the lambda into a local method:
static void ConfigureRedditHttpClient(IServiceProvider provider, HttpClient client) { var reddit = provider.GetRequiredService<IOptions<RedditOptions>>().Value; var dl = provider.GetRequiredService<IOptions<HttpDownloaderOptions>>().Value; client.BaseAddress = new Uri(reddit.DefaultBaseAddress); client.DefaultRequestHeaders.UserAgent.ParseAdd(dl.UserAgent); client.Timeout = TimeSpan.FromSeconds(30); }Then reference with
.ConfigureHttpClient(ConfigureRedditHttpClient).tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (6)
1-9: Add Http namespaces and drop unused using.You add HttpStatusCode and HttpClient; also remove the unused Xunit.Abstractions import.
using Elzik.Breef.Infrastructure.ContentExtractors.Reddit; using Elzik.Breef.Infrastructure.ContentExtractors.Reddit.Client; using Elzik.Breef.Infrastructure.ContentExtractors.Reddit.Client.Raw; using Microsoft.Extensions.Options; +using System.Net; +using System.Net.Http; using Refit; using Shouldly; using System.Text.Json; -using Xunit.Abstractions;
16-17: Keep clients as fields for assertions in tests.Having these as fields lets you compute expected images without hard-coding URLs.
- private readonly RedditPostContentExtractor _extractor; + private readonly RedditPostContentExtractor _extractor; + private readonly IRedditPostClient _redditPostClient; + private readonly ISubredditImageExtractor _subredditImageExtractor;
18-34: Set explicit User‑Agent and timeout for Refit clients (reduces flakiness/429s).Reddit often requires a UA; add UA and sane timeouts to Refit HttpClients. Also assign fields for later assertions.
public RedditPostContentExtractorTests() { - var rawRedditClient = RestService.For<IRawRedditPostClient>("https://www.reddit.com/"); + var redditHttpClient = new HttpClient + { + BaseAddress = new Uri("https://www.reddit.com/"), + Timeout = TimeSpan.FromSeconds(15) + }; + redditHttpClient.DefaultRequestHeaders.UserAgent.ParseAdd("breef-integration-tests/1.0 (+https://github.com/elzik/breef)"); + var rawRedditClient = RestService.For<IRawRedditPostClient>(redditHttpClient); var transformer = new RawRedditPostTransformer(); var redditPostClient = new RedditPostClient(rawRedditClient, transformer); - var rawSubredditClient = RestService.For<IRawSubredditClient>("https://www.reddit.com/"); + var subredditHttpClient = new HttpClient + { + BaseAddress = new Uri("https://www.reddit.com/"), + Timeout = TimeSpan.FromSeconds(15) + }; + subredditHttpClient.DefaultRequestHeaders.UserAgent.ParseAdd("breef-integration-tests/1.0 (+https://github.com/elzik/breef)"); + var rawSubredditClient = RestService.For<IRawSubredditClient>(subredditHttpClient); var rawNewInSubredditTransformer = new RawNewInSubredditTransformer(redditPostClient); var subredditClient = new SubredditClient(rawSubredditClient, rawNewInSubredditTransformer); var redditOptions = Options.Create(new RedditOptions()); var httpDownloaderOptions = Options.Create(new HttpDownloaderOptions { UserAgent = "breef-integration-tests" }); var httpDownloader = new HttpDownloader(new Microsoft.Extensions.Logging.Abstractions.NullLogger<HttpDownloader>(), httpDownloaderOptions); var subredditImageExtractor = new SubredditContentExtractor(subredditClient, httpDownloader, redditOptions); + _redditPostClient = redditPostClient; + _subredditImageExtractor = subredditImageExtractor; _extractor = new RedditPostContentExtractor(redditPostClient, subredditImageExtractor, redditOptions); }
96-111: Use the configured fallback constant instead of duplicating the URL literal.Reduces brittleness if the fallback URL changes in options.
// Act var result = await _extractor.ExtractAsync(urlWithNoImage); // Assert - result.ShouldNotBeNull(); - result.PreviewImageUrl.ShouldNotBeNull(); - result.PreviewImageUrl.ShouldBe("https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Lockup_Logo.svg"); + var expected = new RedditOptions().FallbackImageUrl; + + result.ShouldNotBeNull(); + result.PreviewImageUrl.ShouldNotBeNull(); + result.PreviewImageUrl.ShouldBe(expected);
156-169: Assert ApiException.StatusCode, not Message.Error messages vary by Refit/version; StatusCode is stable.
// Assert - ex.Message.ShouldBe("Response status code does not indicate success: 404 (Not Found)."); + ex.StatusCode.ShouldBe(HttpStatusCode.NotFound);
14-15: Generalize CI skip detection beyond GitHub Actions.Also honor CI=true to avoid failures on other CI providers.
- private static bool IsRunningInGitHubWorkflow => Environment.GetEnvironmentVariable("GITHUB_ACTIONS") == "true"; + private static bool IsRunningInCi => + string.Equals(Environment.GetEnvironmentVariable("GITHUB_ACTIONS"), "true", StringComparison.OrdinalIgnoreCase) || + string.Equals(Environment.GetEnvironmentVariable("CI"), "true", StringComparison.OrdinalIgnoreCase); @@ - Skip.If(IsRunningInGitHubWorkflow, reason); + Skip.If(IsRunningInCi, reason);Also applies to: 206-210
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/Elzik.Breef.Api/Program.cs(2 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/RawSubredditClientTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs
🧰 Additional context used
🧬 Code graph analysis (4)
src/Elzik.Breef.Api/Program.cs (11)
src/Elzik.Breef.Infrastructure/HttpDownloaderOptions.cs (1)
HttpDownloaderOptions(5-11)src/Elzik.Breef.Infrastructure/HttpDownloader.cs (2)
HttpDownloader(7-39)HttpDownloader(11-19)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (1)
RawNewInSubredditTransformer(3-28)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
SubredditClient(5-13)src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (1)
HtmlContentExtractor(6-83)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (1)
SubredditContentExtractor(8-80)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/ContentExtractorStrategy.cs (2)
ContentExtractorStrategy(6-36)ContentExtractorStrategy(11-24)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/RawSubredditClientTests.cs (2)
src/Elzik.Breef.Api/Program.cs (1)
Task(26-134)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (3)
Task(34-45)Task(47-51)Task(53-79)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (3)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (2)
RedditOptions(5-33)List(24-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-80)Task(34-45)Task(47-51)Task(53-79)CanHandle(16-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (5)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-80)Task(34-45)Task(47-51)Task(53-79)CanHandle(16-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (3)
src/Elzik.Breef.Api/Program.cs (3)
7-10: Imports look good and scoped to the new Reddit pipeline.
70-74: RedditOptions binding/validation is solid.
94-107: Strategy wiring is correct; default extractor last. Add light tests to lock behavior.Order places specific extractors before the catch‑all HtmlContentExtractor; good. Consider a couple of unit tests asserting selection for:
- https://www.reddit.com/r/csharp
- https://www.reddit.com/r/csharp/comments/POSTID/title
- a non‑Reddit URL falling back to HTML.
....Infrastructure.Tests.Integration/ContentExtractors/Reddit/Client/RawSubredditClientTests.cs
Show resolved
Hide resolved
...Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs
Outdated
Show resolved
Hide resolved
...Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs
Outdated
Show resolved
Hide resolved
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (2)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (2)
69-84: Hard-coded image URL makes test brittle.Line 83 asserts against a specific Reddit image URL that may change over time (e.g., CDN rotation, re-encoding, query parameter changes). This makes the test fragile.
The previous review suggested dynamically fetching the expected URL from the API instead of hard-coding it. This approach would make the test resilient to URL changes while still verifying that the extractor correctly retrieves the post image.
86-101: Hard-coded subreddit fallback URL makes test brittle.Line 100 asserts against a specific subreddit image URL. Subreddit images can change when moderators update banners, icons, or community graphics.
The previous review suggested fetching the expected fallback URL dynamically via
ISubredditImageExtractor.GetSubredditImageUrlAsync("bristol")to make the test resilient to subreddit image changes.
🧹 Nitpick comments (2)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (2)
103-118: Consider reading fallback URL from configuration.Line 117 hard-codes the Reddit fallback URL, which duplicates the default value from
RedditOptions.FallbackImageUrl. If the configuration default changes, this test will need updating.While this assertion is less brittle than the previous two (since it's a configuration value rather than live Reddit data), reading from the configured options would eliminate duplication:
// Assert result.ShouldNotBeNull(); result.PreviewImageUrl.ShouldNotBeNull(); - result.PreviewImageUrl.ShouldBe("https://redditinc.com/hubfs/Reddit%20Inc/Brand/Reddit_Lockup_Logo.svg"); + var expectedFallback = new RedditOptions().FallbackImageUrl; + result.PreviewImageUrl.ShouldBe(expectedFallback);
213-217: Consider extracting the skip reason to a constant.The default reason string spans lines 213-214. Extracting it to a class-level constant would improve readability.
public class RedditPostContentExtractorTests { private static bool IsRunningInGitHubWorkflow => Environment.GetEnvironmentVariable("GITHUB_ACTIONS") == "true"; + private const string GitHubWorkflowSkipReason = "Skipped because requests to reddit.com from GitHub workflows " + + "are always blocked meaning this test case always fails. This must be run locally instead."; private readonly RedditPostContentExtractor _extractor; // ... rest of class ... - private static void SkipIfInGitHubWorkflow(string reason = "Skipped because requests to reddit.com from GitHub workflows " + - "are always blocked meaning this test case always fails. This must be run locally instead.") + private static void SkipIfInGitHubWorkflow(string reason = GitHubWorkflowSkipReason) { Skip.If(IsRunningInGitHubWorkflow, reason); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
README.md(3 hunks)src/Elzik.Breef.Api/Program.cs(2 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs(2 hunks)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)src/Elzik.Breef.Infrastructure/HttpClientOptions.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/HtmlContentExtractorTests.cs(3 hunks)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs(1 hunks)tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
README.md (1)
src/Elzik.Breef.Infrastructure/WebPageDownLoaderOptions.cs (1)
WebPageDownLoaderOptions(5-11)
src/Elzik.Breef.Infrastructure/HttpClientOptions.cs (2)
src/Elzik.Breef.Infrastructure/WebPageDownLoaderOptions.cs (1)
WebPageDownLoaderOptions(5-11)tests/Elzik.Breef.Infrastructure.Tests.Integration/WebPageDownLoaderOptionsTests.cs (1)
WebPageDownLoaderOptionsTests(7-44)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (4)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (2)
CanHandle(15-32)Task(34-73)src/Elzik.Breef.Api/Program.cs (1)
Task(26-141)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/ISubredditImageExtractor.cs (1)
Task(5-5)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (4)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (2)
RedditOptions(5-33)List(24-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-85)CanHandle(16-32)Task(34-45)Task(47-51)Task(53-84)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPost.cs (3)
RedditPost(3-7)RedditPostContent(9-19)RedditComment(21-29)
src/Elzik.Breef.Api/Program.cs (10)
src/Elzik.Breef.Infrastructure/HttpClientOptions.cs (1)
HttpClientOptions(5-14)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RawNewInSubredditTransformer.cs (1)
RawNewInSubredditTransformer(3-28)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
SubredditClient(5-13)src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (1)
HtmlContentExtractor(6-84)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (1)
SubredditContentExtractor(8-85)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/ContentExtractorStrategy.cs (2)
ContentExtractorStrategy(6-36)ContentExtractorStrategy(11-24)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/HtmlContentExtractorTests.cs (3)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (12)
MockHttpMessageHandler(549-581)MockHttpMessageHandler(556-562)Fact(163-179)Fact(181-197)Fact(199-242)Fact(308-324)Fact(326-343)Fact(345-371)Fact(373-399)Fact(401-416)Task(564-580)Task(592-595)src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (3)
HtmlContentExtractor(6-84)CanHandle(83-83)Task(8-21)src/Elzik.Breef.Api/Program.cs (1)
Task(26-141)
src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (4)
src/Elzik.Breef.Api/Program.cs (1)
Task(26-141)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (4)
Task(34-45)Task(47-51)Task(53-84)CanHandle(16-32)tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/HtmlContentExtractorTests.cs (1)
Task(67-74)src/Elzik.Breef.Infrastructure/ContentExtractors/ContentExtractorStrategy.cs (2)
Task(28-35)CanHandle(26-26)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (7)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/SubredditClient.cs (1)
SubredditClient(5-13)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/HttpClientOptions.cs (1)
HttpClientOptions(5-14)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-85)Task(34-45)Task(47-51)Task(53-84)CanHandle(16-32)
🪛 markdownlint-cli2 (0.18.1)
README.md
144-144: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (23)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (6)
43-67: LGTM!Good test coverage of URL format variations (with/without trailing slash, with/without title). The assertions properly validate both the extract structure and the deserialized Reddit post content.
120-146: LGTM!Thorough validation of the complete Reddit structure. The conditional check for comments (lines 140-145) is appropriate since not all posts have comments.
148-161: LGTM!Good coverage of invalid URL formats. The test cases appropriately cover malformed URLs, wrong paths, wrong domains, and incorrect segment structures.
163-176: LGTM!Proper validation of 404 handling for non-existent posts. The assertion on the exception message ensures the correct error is thrown.
178-194: LGTM!Comprehensive coverage of valid URL patterns, including case-insensitivity (line 183-184) and trailing slash variations.
196-211: LGTM!Good coverage of invalid URL patterns. The test on line 202 correctly validates that
old.reddit.comis rejected, which aligns with the currentRedditOptionsdomain allowlist (www.reddit.comandreddit.comonly).src/Elzik.Breef.Infrastructure/HttpClientOptions.cs (1)
1-14: LGTM! Clean configuration class.The
HttpClientOptionsclass is well-structured with appropriate data annotations and sensible defaults. The timeout validation ensures positive values, and the browser-like User-Agent string is a standard practice for HTTP clients.README.md (1)
83-109: Excellent Reddit configuration documentation.The Reddit configuration section is comprehensive and clearly explains the exact domain matching behavior, which is crucial for users to understand. The examples and warnings about domain variants (reddit.com vs www.reddit.com) will prevent common configuration mistakes.
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/HtmlContentExtractorTests.cs (3)
17-44: Good test migration to IHttpClientFactory.The test has been properly updated to use
IHttpClientFactorywith aMockHttpMessageHandler, which is the correct pattern for testing HTTP interactions without actual network calls.
46-58: Appropriate test for fallback extractor behavior.The
CanHandle_AnyString_CanHandletest correctly verifies thatHtmlContentExtractoraccepts any string, which aligns with its role as the default/fallback extractor in theContentExtractorStrategy.
65-75: Clean mock implementation.The
MockHttpMessageHandlerimplementation is simple and effective, using modern C# primary constructor syntax for clean test setup.src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (3)
8-32: Solid URL validation logic.The
CanHandlemethod correctly validates Reddit subreddit URLs by checking:
- URI validity
- Domain matching against configured Reddit domains
- Path structure (/r/{subreddit})
The domain validation using
RedditOptions.AllDomainsprovides flexible configuration support.
34-45: Clear extraction logic.The
ExtractAsyncmethod properly handles URL normalization (trailing slash), extracts the subreddit name, fetches content, and resolves the image URL. The flow is clean and easy to follow.
47-51: LGTM!The
GetSubredditImageUrlAsyncmethod provides a clean public interface for getting subreddit images, properly constructing the URI and delegating to the shared extraction logic.src/Elzik.Breef.Api/Program.cs (3)
64-75: Excellent IHttpClientFactory configuration.The migration from manual HttpClient management to
IHttpClientFactorywith a named client is the correct approach. The configuration properly sets timeout and user agent fromHttpClientOptions, following best practices for HttpClient lifetime management.
77-94: Well-structured Reddit client configuration.The Reddit options and Refit client registrations are properly configured with validation and consistent base address setup. The pattern is clean and maintainable.
96-114: Clean strategy pattern implementation.The content extractor registration uses a proper factory pattern to compose the
ContentExtractorStrategywith specific extractors (Reddit) and a fallback (HTML). The ordering ensures Reddit URLs are handled by specialized extractors while other URLs fall back to generic HTML extraction.tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (4)
19-113: Comprehensive CanHandle test coverage.The test suite thoroughly validates URL handling including:
- Various URL formats (with/without trailing slash, case-insensitive)
- Invalid URLs and paths
- Custom Reddit instances
- Domain validation
This coverage ensures the extractor correctly identifies Reddit subreddit URLs.
115-259: Thorough ExtractAsync test coverage.The tests validate all key aspects of content extraction:
- Image URL resolution and fallback behavior
- Title formatting
- JSON content serialization
- Subreddit name extraction from various URL formats
261-496: Excellent edge case and security testing.The
GetSubredditImageUrlAsynctests cover critical security scenarios:
- URL validation (data URIs, FTP, file://, javascript:, mailto:)
- Invalid URI formats
- Mixed valid/invalid URL prioritization
- Exception propagation
This thorough testing helps prevent security issues and ensures robust fallback behavior.
534-596: Clean test infrastructure.The helper methods and mock handlers are well-designed:
CreateJsonWithImageKeysimplifies test data creationMockHttpMessageHandlersupports both success and failure pathsThrowingMockHttpMessageHandlerenables exception testingThis infrastructure makes tests readable and maintainable.
src/Elzik.Breef.Infrastructure/ContentExtractors/HtmlContentExtractor.cs (2)
4-21: Excellent refactor to IHttpClientFactory.The migration from direct
IWebPageDownloaderdependency toIHttpClientFactoryfollows .NET best practices:
- Uses named client ("BreefDownloader") configured centrally
- Avoids HttpClient lifetime management issues
- Maintains clean separation of concerns
83-83: Correct implementation for default extractor.The
CanHandlemethod returningtrueunconditionally is appropriate forHtmlContentExtractor's role as the default/fallback handler in theContentExtractorStrategy. This ensures any URL not matched by specialized extractors will be handled.
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs
Outdated
Show resolved
Hide resolved
...Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (3)
69-84: Consider deriving expected image URL from API response.The hard-coded image URL on line 83 makes this test brittle. If Reddit modifies its CDN URLs, changes image processing parameters, or the image is removed, the test will fail.
For improved resilience, consider fetching the post via the client and asserting against the actual
ImageUrlproperty rather than a fixed string:// Act var result = await _extractor.ExtractAsync(urlWithKnownGoodImage); // Assert + var uri = new Uri(urlWithKnownGoodImage); + var postId = uri.AbsolutePath.Trim('/').Split('/')[3]; + var expectedPost = await redditPostClient.GetPost(postId); + result.ShouldNotBeNull(); result.PreviewImageUrl.ShouldNotBeNull(); - result.PreviewImageUrl.ShouldBe("https://preview.redd.it/olmpl5vmp3tf1.jpeg?auto=webp&s=1cb106a6fab1ddd48bcf8e9afdd2a06ca22d46ba"); + result.PreviewImageUrl.ShouldBe(expectedPost.Post.ImageUrl);Note: You'll need access to
redditPostClient(currently local in the constructor). Consider promoting it to a field.
86-101: Consider deriving expected subreddit image from extractor.The hard-coded subreddit image URL on line 100 creates the same brittleness concern. Subreddit icons and banners frequently change, which would break this assertion.
A previous review suggested resolving the expected URL via
subredditImageExtractorat runtime:// Act var result = await _extractor.ExtractAsync(urlWithNoImage); // Assert + var subredditName = new Uri(urlWithNoImage).AbsolutePath.Trim('/').Split('/')[1]; + var expectedImageUrl = await subredditImageExtractor.GetSubredditImageUrlAsync(subredditName); + result.ShouldNotBeNull(); result.PreviewImageUrl.ShouldNotBeNull(); - result.PreviewImageUrl.ShouldBe("https://b.thumbs.redditmedia.com/fMCtUDLMEEt1SrDtRyg1v1xiXVoXmP_3dxScj1kgzoE.png"); + result.PreviewImageUrl.ShouldBe(expectedImageUrl);Note: You'll need access to
subredditImageExtractor(currently local in the constructor). Consider promoting it to a field.
163-176: Consider asserting on exception properties rather than message text.The hard-coded exception message on line 175 couples this test to Refit's exact message format. If Refit changes its error message template, this test will break.
For improved resilience, consider asserting on the status code or exception type instead:
// Act - var ex = await Should.ThrowAsync<ApiException>(() => _extractor.ExtractAsync(url)); + var ex = await Should.ThrowAsync<ApiException>(() => _extractor.ExtractAsync(url)); // Assert - ex.Message.ShouldBe("Response status code does not indicate success: 404 (Not Found)."); + ex.StatusCode.ShouldBe(System.Net.HttpStatusCode.NotFound);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (6)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (1)
RedditPostContentExtractor(8-74)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/Raw/RawRedditPostTransformer.cs (1)
RawRedditPostTransformer(6-178)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
RedditPostClient(5-12)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/HttpClientOptions.cs (1)
HttpClientOptions(5-14)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-85)Task(34-45)Task(47-51)Task(53-84)CanHandle(16-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (10)
tests/Elzik.Breef.Infrastructure.Tests.Integration/ContentExtractors/Reddit/RedditPostContentExtractorTests.cs (10)
1-17: LGTM!The class structure is well-organized with appropriate usings, the
sealedkeyword prevents unintended inheritance, andIDisposableis properly implemented for resource cleanup.
19-41: LGTM!The constructor properly wires up real Reddit clients with a mocked
IHttpClientFactory. TheHttpClientis now stored in a field and disposed in theDispose()method, correctly addressing the resource leak concern from previous reviews.
43-67: LGTM!Good coverage of URL format variations with comprehensive assertions validating both the
Extractresult and the deserializedRedditPoststructure.
103-118: LGTM!This test correctly validates the global fallback scenario by asserting against the default
RedditOptions.FallbackImageUrl. This is appropriate as it's testing configuration-driven behavior.
120-146: LGTM!Comprehensive validation of the complete Reddit data structure, with appropriate conditional checks for optional elements like comments.
148-161: LGTM!Solid negative test coverage with various invalid URL patterns, ensuring proper exception handling.
178-194: LGTM!Excellent coverage of valid URL patterns, including case-insensitivity, optional title segments, and trailing slashes.
196-211: LGTM!Comprehensive negative test coverage including missing segments, wrong path structures, extra segments, and unsupported domains.
213-217: LGTM!Clean helper method with an informative default message explaining why these tests must run locally.
219-222: LGTM!Proper implementation of
IDisposablewith safe disposal of theHttpClientresource.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (3)
34-45: Consider adding input validation for consistency.Unlike
RedditPostContentExtractor.ExtractAsync(which validates the URL format and domain before processing), this method assumesCanHandlehas already been called. While the strategy pattern likely ensures this, adding validation would make the method more robust and provide clearer error messages if called directly.Consider adding validation similar to
RedditPostContentExtractor:public async Task<Extract> ExtractAsync(string webPageUrl) { + if (!Uri.TryCreate(webPageUrl, UriKind.Absolute, out Uri? webPageUri)) + throw new InvalidOperationException($"Invalid URL format: '{webPageUrl}'. " + + $"URL must be a valid absolute URI."); + + var requestDomain = webPageUri.Host; + + if (!_redditOptions.AllDomains.Any(allowedDomain => + requestDomain.Equals(allowedDomain, StringComparison.OrdinalIgnoreCase))) + { + var supportedDomains = string.Join(", ", _redditOptions.AllDomains); + throw new InvalidOperationException($"Unsupported domain: '{requestDomain}'. " + + $"Supported domains: {supportedDomains}"); + } + var webPageUri = new Uri(webPageUrl.EndsWith(UrlPathSeparator) ? webPageUrl : webPageUrl + UrlPathSeparator, UriKind.Absolute); var webPageParts = webPageUri.AbsolutePath.Trim(UrlPathSeparator).Split(UrlPathSeparator); + + if (!(webPageParts.Length == 2 && + webPageParts[0].Equals("r", StringComparison.OrdinalIgnoreCase))) + { + throw new InvalidOperationException($"Unsupported Reddit URL format: '{webPageUrl}'. " + + $"Expected format: 'https://[reddit-domain]/r/[subreddit]'."); + } + var subredditName = webPageParts[^1];
47-51: Add input validation for the subredditName parameter.The method doesn't validate the
subredditNameparameter before using it to construct a URI. Invalid input (null, empty, or containing invalid characters) will cause unclear exceptions.public async Task<string> GetSubredditImageUrlAsync(string subredditName) { + if (string.IsNullOrWhiteSpace(subredditName)) + throw new ArgumentException("Subreddit name cannot be null or whitespace.", nameof(subredditName)); + var subRedditBaseUri = new Uri($"{_redditOptions.DefaultBaseAddress}/r/{subredditName}/"); return await ExtractImageUrlAsync(subRedditBaseUri); }
53-84: Consider reusing the HttpClient instance.The method creates an
HttpClientinstance twice (lines 56 and 73) with the same name. WhileIHttpClientFactorymanages the underlying handlers efficiently, creating the client once and reusing it would be cleaner and avoid unnecessary allocations.private async Task<string> ExtractImageUrlAsync(Uri subRedditBaseUri) { Uri subRedditAboutUri = new(subRedditBaseUri, "about.json"); var httpClient = _httpClientFactory.CreateClient("BreefDownloader"); var jsonContent = await httpClient.GetStringAsync(subRedditAboutUri.AbsoluteUri); string[] imageKeys = ["banner_background_image", "banner_img", "mobile_banner_image", "icon_img", "community_icon"]; using var doc = JsonDocument.Parse(jsonContent); var data = doc.RootElement.GetProperty("data"); foreach (var imageKey in imageKeys) { if (data.TryGetProperty(imageKey, out var prop)) { var imageUrl = prop.GetString(); if (!string.IsNullOrWhiteSpace(imageUrl) && Uri.TryCreate(imageUrl, UriKind.Absolute, out var uri) && (uri.Scheme == "http" || uri.Scheme == "https")) { - var client = _httpClientFactory.CreateClient("BreefDownloader"); - using var response = await client.GetAsync(imageUrl); + using var response = await httpClient.GetAsync(imageUrl); if (response.IsSuccessStatusCode) { return imageUrl; } } } } return _redditOptions.FallbackImageUrl; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (2)
CanHandle(15-32)Task(34-73)src/Elzik.Breef.Api/Program.cs (1)
Task(26-141)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/RedditPostClient.cs (1)
Task(7-11)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/ISubredditImageExtractor.cs (1)
Task(5-5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (2)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (2)
8-14: LGTM - Clean primary constructor usage.The primary constructor syntax is well-applied, and the selective field storage is appropriate since
subredditClientis only referenced once.
16-32: LGTM - Robust URL validation.The validation logic correctly identifies subreddit URLs with appropriate domain and path checks.
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs
Show resolved
Hide resolved
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (4)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (1)
RedditOptions(5-33)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditPostContentExtractor.cs (2)
CanHandle(15-32)Task(34-73)src/Elzik.Breef.Api/Program.cs (1)
Task(26-141)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/ISubredditImageExtractor.cs (1)
Task(5-5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (2)
537-538: Minor: Redundant mock setup.This mock configuration duplicates the setup in the constructor (line 22-23). While harmless, it's unnecessary since the constructor already configures this default behavior.
You can remove these lines:
- _mockSubredditClient.GetNewInSubreddit("subreddit") - .Returns(new NewInSubreddit { Posts = new List<RedditPost>() }); -
285-306: Weak test assertion noted.The comment at lines 304-305 acknowledges that this test cannot verify the exact URL called due to
MockHttpMessageHandlerlimitations. While the test is valid (it ensures no exception is thrown), it provides limited verification of the actual behavior.Consider enhancing
MockHttpMessageHandlerto capture the last requested URL for verification, or use a more sophisticated HTTP mocking library that provides request inspection capabilities.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (3)
src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/RedditOptions.cs (2)
RedditOptions(5-33)List(24-32)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/SubRedditContentExtractor.cs (5)
SubredditContentExtractor(8-85)CanHandle(16-32)Task(34-45)Task(47-51)Task(53-84)src/Elzik.Breef.Infrastructure/ContentExtractors/Reddit/Client/NewInSubreddit.cs (1)
NewInSubreddit(3-6)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-ubuntu
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
🔇 Additional comments (5)
tests/Elzik.Breef.Infrastructure.Tests.Unit/ContentExtractors/Reddit/SubRedditExtractorTests.cs (5)
19-39: Excellent test setup!The constructor properly initializes all mocks with sensible defaults. The use of
MockHttpMessageHandlerto simulate HTTP responses is a clean approach that avoids external dependencies.
41-113: Comprehensive URL validation coverage!The
CanHandletests thoroughly validate URL parsing across various scenarios including case-insensitivity, trailing slashes, custom Reddit instances, and invalid formats. The parameterized tests usingTheoryandInlineDataare well-chosen.
115-259: Thorough ExtractAsync test coverage!These tests comprehensively validate the extraction workflow including image URL resolution, fallback behavior, content serialization, and subreddit name parsing. The test at lines 199-242 that verifies the serialize/deserialize roundtrip is particularly good practice.
261-496: Outstanding security-aware test coverage!The
GetSubredditImageUrlAsynctests are exceptionally thorough, particularly the security-focused scenarios testing unsuitable URL schemes (lines 418-428:javascript:,data:,file:,ftp:,mailto:). This demonstrates defense-in-depth thinking. The tests also properly validate image priority order, fallback chains, and error propagation.
551-613: Well-designed test infrastructure!The helper methods and custom HTTP message handlers provide clean, reusable test infrastructure. The
MockHttpMessageHandlerwith conditional response support (lines 566-598) elegantly handles both success and failure scenarios for specific URLs, and theThrowingMockHttpMessageHandler(lines 600-613) enables clean exception propagation testing.



Summary by CodeRabbit
New Features
Tests
Documentation
Chores