Improve Per-extractor Instructions #170
Walkthrough
The SubredditContent.md instruction file was updated to replace general summarization guidance with a new structured specification for summarizing Reddit subreddit JSON data, defining input formats, output requirements in HTML, specific inclusion/exclusion rules, and post-level attribution patterns.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Pull Request Overview
This PR updates the SubredditContent.md summarization instructions to provide more detailed and structured guidance for summarizing Reddit subreddit data. The update transforms the brief instructions into a comprehensive guide with clear sections for task description, input structure, requirements, and output format.
Key Changes:
- Restructured instructions from a simple bullet list to a well-organized document with markdown sections
- Added detailed specifications for handling Reddit-specific data structures (posts, comments, nested replies)
- Introduced HTML link formatting requirements for post titles and comment author attributions
- Clarified summary length constraints and high-scoring content prioritization
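The link-formatting requirements listed above can be illustrated with a minimal sketch. The two patterns, `<a href="POST_URL">Title</a>` and `<a href="COMMENT_URL">@author</a>`, come from the instruction file under review; the function names `post_link` and `comment_attribution` are hypothetical and not part of the PR, and escaping the inputs is an assumption about what "well-formatted HTML" implies.

```python
from html import escape


def post_link(title: str, post_url: str) -> str:
    """Render a top-level post title as an HTML link (hypothetical helper)."""
    # Escape both the URL and the title so the output stays well-formed HTML.
    return f'<a href="{escape(post_url, quote=True)}">{escape(title)}</a>'


def comment_attribution(author: str, comment_url: str) -> str:
    """Attribute a comment to its author, linking to the comment itself
    rather than to the author's profile (hypothetical helper)."""
    return f'<a href="{escape(comment_url, quote=True)}">@{escape(author)}</a>'


if __name__ == "__main__":
    print(post_link("Example post", "https://example.com/post/1"))
    print(comment_attribution("alice", "https://example.com/post/1/comment/2"))
```

The URLs in the usage lines are placeholders; in practice they would be taken from the post and comment entries of the input JSON.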
## Requirements

1. **Overview**: Describe the general state/themes of the subreddit
2. **Posts**: Sumarise every post with a thematic summary of its comments
Copilot AI, Nov 1, 2025
Corrected spelling of 'Sumarise' to 'Summarise'.
-2. **Posts**: Sumarise every post with a thematic summary of its comments
+2. **Posts**: Summarise every post with a thematic summary of its comments
- Strictly well-formatted HTML output
- Brief overview of themes covered in this specific JSON document
- DO not include a general description of the subreddit itself
Copilot AI, Nov 1, 2025
Inconsistent capitalization: 'DO not' should be 'Do not' to match the standard capitalization pattern used throughout the document.
-- DO not include a general description of the subreddit itself
+- Do not include a general description of the subreddit itself
- Maximum 200 words OR 10% of original length (whichever is shorter)
- For a top-level post, include post title as HTML link: `<a href="POST_URL">Title</a>`
- Where a post's replies are highly scoring, also summarise them and include author attribution with comment links: `<a href="COMMENT_URL">@author</a>`. Link to the author's comment, not to the author's profile.
1. **Exclude**:
Copilot AI, Nov 1, 2025
Inconsistent numbering in the ordered list. This item is numbered '1' but should be '4' to continue the sequence from the previous items (1. Overview, 2. Posts, 3. Summaries).
-1. **Exclude**:
+4. **Exclude**:
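The length rule quoted in the hunk above ("Maximum 200 words OR 10% of original length (whichever is shorter)") amounts to taking the smaller of two limits. A minimal sketch follows, assuming "original length" is measured in words and that the 10% figure is rounded down; the instruction file states neither, and the function name `max_summary_words` is hypothetical.

```python
def max_summary_words(original_word_count: int) -> int:
    """Return the word budget for a summary under the quoted rule.

    Assumes the original length is counted in words and that the
    10% limit is rounded down (both are assumptions, not from the PR).
    """
    ten_percent = original_word_count // 10  # integer 10%, rounded down
    return min(200, ten_percent)             # whichever limit is shorter


if __name__ == "__main__":
    print(max_summary_words(1500))  # 10% (150 words) is the shorter limit
    print(max_summary_words(5000))  # the 200-word cap is the shorter limit
```

Under these assumptions the 200-word cap only takes effect once the source exceeds 2,000 words; shorter inputs are bounded by the 10% limit.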
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/Elzik.Breef.Infrastructure/SummarisationInstructions/SubredditContent.md(1 hunks)
🧰 Additional context used
🪛 LanguageTool
src/Elzik.Breef.Infrastructure/SummarisationInstructions/SubredditContent.md
[grammar] ~14-~14: Ensure spelling is correct
Context: ...e/themes of the subreddit 2. Posts: Sumarise every post with a thematic summary of i...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~30-~30: Use a hyphen to join words.
Context: ...reddit itself - Summaries of the highest scoring top-level posts
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-ubuntu
- GitHub Check: Analyze (csharp)
## Requirements

1. **Overview**: Describe the general state/themes of the subreddit
2. **Posts**: Sumarise every post with a thematic summary of its comments
Fix spelling: "Sumarise" → "Summarise" (or "Summarize").
Line 14 contains a spelling error. Please use the standard English spelling for your project's convention.
-2. **Posts**: Sumarise every post with a thematic summary of its comments
+2. **Posts**: Summarise every post with a thematic summary of its comments
🤖 Prompt for AI Agents
In src/Elzik.Breef.Infrastructure/SummarisationInstructions/SubredditContent.md
around line 14, fix the misspelling "Sumarise" by replacing it with the
project's chosen standard spelling ("Summarise" or "Summarize") and ensure the
same spelling is used consistently elsewhere in this file or the repo's
documentation.
- Maximum 200 words OR 10% of original length (whichever is shorter)
- For a top-level post, include post title as HTML link: `<a href="POST_URL">Title</a>`
- Where a post's replies are highly scoring, also summarise them and include author attribution with comment links: `<a href="COMMENT_URL">@author</a>`. Link to the author's comment, not to the author's profile.
1. **Exclude**:
Fix numbered list ordering: restart at "4" instead of "1".
Line 19 restarts the list numbering when it should continue the sequence from the "Summaries" requirement above. This creates confusion about the hierarchy and structure of requirements.
-1. **Exclude**:
+4. **Exclude**:
🤖 Prompt for AI Agents
In src/Elzik.Breef.Infrastructure/SummarisationInstructions/SubredditContent.md
around line 19, the numbered list incorrectly restarts at "1" for the "Exclude"
section; update the markdown list numbering so it continues the sequence (use
"4." instead of "1.") to maintain proper ordered-list flow and hierarchy with
the preceding "Summaries" items.
- DO not include a general description of the subreddit itself
- Summaries of the highest scoring top-level posts
Complete the instruction on line 30.
Line 30 ("Summaries of the highest scoring top-level posts") is a noun phrase without a verb, making it an incomplete instruction. Clarify whether this is meant to instruct inclusion, focus, or filtering of output.
Consider revising to:
-- DO not include a general description of the subreddit itself
-- Summaries of the highest scoring top-level posts
+- DO not include a general description of the subreddit itself
+- Include summaries of the highest scoring top-level posts
🤖 Prompt for AI Agents
In src/Elzik.Breef.Infrastructure/SummarisationInstructions/SubredditContent.md
around lines 29-30, line 30 is an incomplete noun phrase ("Summaries of the
highest scoring top-level posts"); change it to an imperative instruction
clarifying intent, e.g., "Include concise summaries of the highest-scoring
top-level posts (limit to top N posts or top posts from the past X timeframe)";
ensure it states whether to include, focus on, or filter posts and optionally
specify limits/time range for consistency.