Fix FAQ schema text extraction and expand coverage #16214

asafashirov · 2025-10-09T01:26:37Z

Summary

Fixed FAQ schema answer text extraction bug where trim function was being misused in pipeline mode
Expanded FAQ schema coverage from 5 pages to 46+ pages by including what-is pages
Implemented context-aware H3 detection to prevent non-question section headers from being treated as FAQ questions
Removed 9 unused legacy schema template files

Changes

Replace buggy pipeline usage of trim with proper function call syntax using trim $var " \t\n\r" (7 instances)
Expand FAQ schema coverage to what-is pages and remove restrictive type checking
Add context-aware logic: FAQ pages treat all H3s as questions, other pages require H3 to end with ?
Delete unused course-entity.html collector and 8 legacy content schema files

Technical Details

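The bug was caused by incorrect usage of Hugo's trim function, which takes the input string first and the cutset second. In a pipeline, the piped value is passed as the last argument, so (expression) | trim " \t\n\r" treated the cutset literal as the input and the page content as the cutset. The result was what remained of the literal ("\t\n\r") instead of the trimmed content. Fixed by using proper function call syntax: trim (expression) " \t\n\r".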

The FAQ structured data was generating empty answer text (just '\t\n\r') for all FAQ pages, causing Google Search Console validation errors. The issue was with Hugo's trim function when used in pipeline mode with the cutset ' \t\n\r' - it was returning the cutset itself instead of the trimmed text. Solution: Replace all instances of 'trim " \t\n\r"' with 'strings.TrimSpace' which properly trims whitespace characters. This fixes the 'Missing field "text" (in "mainEntity.acceptedAnswer")' error reported by Google Search Console.

Previously, FAQ schema was limited to 'docs' type pages with /faq in URL or 'faq' in title. This was too restrictive and missed valuable Q&A content.

Changes:
- Remove 'docs' type restriction to capture Q&A content anywhere
- Add support for 'what-is' pages (41 pages with consistent Q&A format)
- Add detection for pages with 'frequently asked' in title
- Maintain support for existing FAQ pages

This expansion allows FAQ structured data to appear on:
- 5 existing FAQ pages (already working)
- 41 what-is pages (newly added)
- Any future pages with FAQ/Frequently Asked in title

Benefits:
- Better SEO with rich snippets for question-based searches
- Improved discovery through Google's 'People also ask' features
- Better machine readability for AI/LLM tools
- More comprehensive Q&A coverage across the site
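The eligibility rules above can be sketched as a single predicate. This is an illustration in Go, not the Hugo partial from this PR. The page type, the function name, and the example paths are hypothetical. Two details are assumptions that the description does not state: title matching is case-insensitive, and what-is pages are recognized by a /what-is/ path segment.

```go
package main

import (
	"fmt"
	"strings"
)

// page is a hypothetical stand-in for the two fields the rules inspect.
type page struct {
	URLPath string
	Title   string
}

// eligibleForFAQSchema mirrors the expanded coverage rules. It never
// looks at the page type, reflecting the removed 'docs' restriction.
func eligibleForFAQSchema(p page) bool {
	title := strings.ToLower(p.Title) // assumption: case-insensitive match
	switch {
	case strings.Contains(p.URLPath, "/faq"):
		return true // existing FAQ pages
	case strings.Contains(title, "faq"), strings.Contains(title, "frequently asked"):
		return true // FAQ or "Frequently Asked" in the title
	case strings.Contains(p.URLPath, "/what-is/"):
		return true // assumption: what-is pages detected by path
	}
	return false
}

func main() {
	// Hypothetical example pages, not real site URLs.
	pages := []page{
		{URLPath: "/example/faq/", Title: "Example FAQ"},
		{URLPath: "/what-is/example-topic/", Title: "What is an example topic?"},
		{URLPath: "/example/guide/", Title: "Frequently Asked Questions"},
		{URLPath: "/example/post/", Title: "Release notes"},
	}
	for _, p := range pages {
		fmt.Printf("%-28s %v\n", p.URLPath, eligibleForFAQSchema(p))
	}
}
```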

Previously, ALL H3 headers were treated as questions, causing non-question section headers to be incorrectly included in FAQ structured data.

Changes:
- In FAQ pages (/faq in URL): All H3s treated as questions (backward compatible)
- In other pages: H3 must end with '?' to be a question
- H2 questions still require '?' everywhere (unchanged)

Results:
- FAQ pages: No change, still extract all H3s (9-21 questions)
- What-is IaC: Reduced from 12 to 4 questions (removed false positives)
- What-is Secrets: Reduced from 38 to 18 questions (only real questions)

This ensures FAQ schema only contains actual questions that users might search for, improving SEO quality and search engine understanding.
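The heading rules above reduce to a small decision function. The following Go sketch restates them for illustration and is not the template logic from this PR. The function names and example paths are hypothetical. Trimming surrounding whitespace before checking for '?' and rejecting heading levels other than H2 and H3 are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// isFAQPage reports whether the URL path marks a dedicated FAQ page.
func isFAQPage(urlPath string) bool {
	return strings.Contains(urlPath, "/faq")
}

// isQuestion applies the context-aware rules:
//   - H2: must end with '?' on every page
//   - H3: always a question on FAQ pages, otherwise must end with '?'
//
// Other heading levels are rejected (assumption, not stated in the PR).
func isQuestion(level int, heading, urlPath string) bool {
	endsWithQ := strings.HasSuffix(strings.TrimSpace(heading), "?")
	switch level {
	case 2:
		return endsWithQ
	case 3:
		return isFAQPage(urlPath) || endsWithQ
	default:
		return false
	}
}

func main() {
	// Hypothetical paths and headings for illustration.
	fmt.Println(isQuestion(3, "Pricing and billing", "/example/faq/"))       // true
	fmt.Println(isQuestion(3, "Key benefits", "/what-is/example-topic/"))    // false
	fmt.Println(isQuestion(3, "How does it work?", "/what-is/example-topic/")) // true
	fmt.Println(isQuestion(2, "Overview", "/example/faq/"))                  // false
}
```

The second call shows the kind of false positive this change removes: a section header on a what-is page that is not phrased as a question.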

Deleted 9 unused schema templates:
- course-entity.html collector (never referenced)
- 8 legacy content schemas (article, blog, code, course, event, howto, product-software, qa)

These files were part of old non-@graph schema implementation and are no longer used. All current schemas (BlogPosting, HowTo, FAQPage, TechArticle, etc.) continue to work correctly. Kept course-list.html as it's actively used by tutorials/section.html.

Changed from incorrect pipeline usage of trim to proper function call syntax. Hugo's trim function requires 2 arguments: the string and the cutset.

Replace all remaining strings.TrimSpace with trim function using proper Hugo syntax

pulumi-bot · 2025-10-09T01:45:11Z

Your site preview for commit f4291ae is ready! 🎉

http://www-testing-pulumi-docs-origin-pr-16214-f4291ae1.s3-website.us-west-2.amazonaws.com.

asafashirov added 4 commits October 8, 2025 09:58

asafashirov had a problem deploying to testing October 9, 2025 01:26 — with GitHub Actions Failure

Fix Hugo trim function syntax error

9cb61d3

Changed from incorrect pipeline usage of trim to proper function call syntax. Hugo's trim function requires 2 arguments: the string and the cutset.

asafashirov had a problem deploying to testing October 9, 2025 01:33 — with GitHub Actions Failure

Fix remaining strings.TrimSpace instances

f4291ae

Replace all remaining strings.TrimSpace with trim function using proper Hugo syntax

asafashirov temporarily deployed to testing October 9, 2025 01:38 — with GitHub Actions Inactive

asafashirov enabled auto-merge (squash) October 9, 2025 01:51
