Extend API doc generation to include 12 additional SDK modules #396
Added documentation for the following new SDK modules:
- context: AgentContext, Skills, SkillKnowledge, Triggers
- hooks: Event-driven hooks for automation and control
- critic: Critics for iterative refinement
- mcp: MCP (Model Context Protocol) integration
- plugin: Plugin system and marketplace
- subagent: Sub-agent delegation
- io: File storage abstractions (FileStore, LocalFileStore, etc.)
- testing: Test utilities (TestLLM)
- secret: Secret management (SecretSource, StaticSecret, LookupSecret)
- skills: Skill management utilities
- observability: Observability utilities
- logger: Logging utilities
Changes:
- Updated scripts/generate-api-docs.py to include 12 new modules
- Expanded class_to_module mapping for cross-reference resolution
- Regenerated all API reference documentation
- Auto-updated docs.json navigation with new pages
Co-authored-by: openhands <openhands@all-hands.dev>
- Remove testing module from documentation (as requested)
- Fix blockquote markers (> characters) that break MDX parsing
- Fix malformed Example code blocks with proper wrapping
- Remove <br/> tags and standalone backtick patterns
- Handle orphaned code block openers
- Add module filtering to only process allowed modules
- Clean up multiple levels of nested blockquotes
Co-authored-by: openhands <openhands@all-hands.dev>
Added remove_malformed_examples() function to detect and remove Example sections containing raw JSON/code without proper code block formatting. These caused MDX parsing errors due to curly braces being interpreted as JSX expressions. Co-authored-by: openhands <openhands@all-hands.dev>
1. Add horizontal rules (---) before class headers to fix spacing issue where CSS margin-top: 0 was applied when h4 followed by h3.
2. Fix malformed *args and **kwargs patterns from Sphinx output. Sphinx was breaking these into weird code blocks like: ``` * ``` args. Now properly renders as `*args` and `**kwargs`.
Co-authored-by: openhands <openhands@all-hands.dev>
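For illustration only, a minimal sketch of how the `*args` / `**kwargs` repair could look. The helper name `fix_star_params` is hypothetical, and the pattern assumes the broken shape is a fence holding only `*` or `**` followed directly by the parameter name; the script's actual implementation may differ.

```python
import re

# Hypothetical helper, not the PR's code. Assumes the broken Sphinx output is
# a fenced block containing only '*' or '**', followed by the parameter name.
_STAR_BLOCK = re.compile(
    r"```[ \t]*\n[ \t]*(\*{1,2})[ \t]*\n[ \t]*```[ \t]*\n?[ \t]*(\w+)"
)


def fix_star_params(text: str) -> str:
    """Collapse the stray fence plus name into inline code such as `*args`."""
    return _STAR_BLOCK.sub(lambda m: f"`{m.group(1)}{m.group(2)}`", text)
```

Using `[ \t]` instead of `\s` keeps each part of the pattern on its own line, so unrelated blank lines around the block are not swallowed.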
Add 'collapsed: true' to the API Reference group to reduce visual clutter in the navigation sidebar. Co-authored-by: openhands <openhands@all-hands.dev>
Add fix_shell_config_examples() to wrap shell-style configuration blocks (KEY=VALUE with # comments) in bash code blocks, preventing # comments from being rendered as markdown headers. Co-authored-by: openhands <openhands@all-hands.dev>
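A rough sketch of the idea behind this kind of fix, under stated assumptions: `wrap_shell_config` is a hypothetical name, configuration keys are assumed to be upper-case identifiers, and a run of `# comment` lines is only wrapped when it contains at least one KEY=VALUE line, so ordinary h1 headers are left alone. It is not the PR's fix_shell_config_examples().

```python
import re

# Assumption: config keys look like UPPER_CASE identifiers followed by '='.
_ASSIGN = re.compile(r"^[A-Z_][A-Z0-9_]*=\S*")


def wrap_shell_config(text: str) -> str:
    """Wrap runs of KEY=VALUE / '# comment' lines in a bash fence."""
    out, run, in_fence = [], [], False

    def flush():
        # Wrap the pending run only if it holds an assignment; a lone
        # '# ...' line may be a genuine heading and is emitted unchanged.
        if any(_ASSIGN.match(line) for line in run):
            out.extend(["```bash", *run, "```"])
        else:
            out.extend(run)
        run.clear()

    for line in text.split("\n"):
        if line.strip().startswith("```"):
            flush()
            in_fence = not in_fence  # never touch existing code blocks
            out.append(line)
        elif not in_fence and (_ASSIGN.match(line) or line.startswith("# ")):
            run.append(line)
        else:
            flush()
            out.append(line)
    flush()
    return "\n".join(out)
```

A header that is immediately followed by a KEY=VALUE line would still be pulled into the fence, so real input may need a stricter trigger such as a preceding "Example configuration:" line.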
- API Reference (collapsed) > Python SDK (collapsed) > modules
- Provides better organization for future expansion (e.g., REST API docs)
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot left a comment
🟡 Acceptable - The script extension works and solves a real problem (12 undocumented modules). However, there are maintainability concerns around duplicated data structures and growing complexity in markdown post-processing. Missing evidence that generated docs render correctly in Mintlify.
| # Additional important modules | ||
| 'context', 'hooks', 'critic', 'mcp', 'plugin', | ||
| 'subagent', 'io', 'secret', 'skills', | ||
| 'observability', 'logger', |
🟠 Important: This module list is duplicated again at line 236 in clean_generated_docs(). You have to manually keep them in sync, which is fragile and error-prone.
Better approach: Define ALL_MODULES as a module-level constant above the class, then reference it in both methods. Eliminates the duplication and the risk of them drifting out of sync.
| 'observability', 'logger', | |
| # Define after imports, before the class | |
| ALL_MODULES = [ | |
| # Core modules (original) | |
| 'agent', 'conversation', 'event', 'llm', | |
| 'tool', 'workspace', 'security', 'utils', | |
| # Additional important modules | |
| 'context', 'hooks', 'critic', 'mcp', 'plugin', | |
| 'subagent', 'io', 'secret', 'skills', | |
| 'observability', 'logger', | |
| ] |
Then use ALL_MODULES in both places instead of redefining it.
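As a sketch of what "both places" could look like, assuming a stand-in class and method names (`DocGenerator`, `toctree_entries`, `is_allowed` are hypothetical, as are the `sdk.` prefix and the `<prefix>.<module>.md` file naming):

```python
# Single source of truth for the documented modules (list taken from the diff).
ALL_MODULES = (
    # Core modules (original)
    "agent", "conversation", "event", "llm",
    "tool", "workspace", "security", "utils",
    # Additional important modules
    "context", "hooks", "critic", "mcp", "plugin",
    "subagent", "io", "secret", "skills",
    "observability", "logger",
)


class DocGenerator:  # hypothetical stand-in for the script's class
    def toctree_entries(self) -> list[str]:
        """Entries for the RST toctree, derived from ALL_MODULES."""
        return [f"sdk.{name}" for name in ALL_MODULES]

    def is_allowed(self, filename: str) -> bool:
        """Filter used during cleanup; assumes '<prefix>.<module>.md' names."""
        stem = filename.rsplit(".", 1)[0]
        return stem.rsplit(".", 1)[-1] in ALL_MODULES
```

A tuple is used here so the constant cannot be mutated by accident; a list would work equally well.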
| line = lines[i] | ||
|
|
||
| # Check if this is an Example header followed by unformatted code | ||
| # Handle both header-style and plain "Example:" format | ||
| is_example_header = ( | ||
| line.strip() in ['#### Example', '### Example', '## Example'] or | ||
| line.strip() == 'Example:' or | ||
| line.strip() == 'Example' | ||
| ) | ||
|
|
||
| if is_example_header: | ||
| # Normalize to h4 header | ||
| result_lines.append('#### Example') |
🟡 Suggestion: This function has 4+ levels of nesting with complex state tracking (i, j, k pointers, in_code_block flag). Violates the "3 levels max" complexity guideline.
Consider breaking this into smaller, focused functions:
- is_example_header(line) -> bool
- collect_code_until_header(lines, start_idx) -> (code_lines, next_idx)
- detect_language(code_lines) -> str
- wrap_in_code_block(code_lines, lang) -> list[str]
Each helper handles one responsibility, making the logic easier to follow and test.
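One possible shape for that split, as a sketch rather than a drop-in replacement. The helper names follow the suggestion above; the language heuristic, the header regex, and the handling of already-fenced examples are assumptions. As with the original look-ahead, a `# comment` line inside an example is treated as a header.

```python
import re

EXAMPLE_HEADERS = {"#### Example", "### Example", "## Example", "Example:", "Example"}
_MD_HEADER = re.compile(r"#{1,6} ")


def is_example_header(line: str) -> bool:
    return line.strip() in EXAMPLE_HEADERS


def collect_code_until_header(lines: list[str], start_idx: int) -> tuple[list[str], int]:
    """Gather lines from start_idx up to the next markdown header (or EOF)."""
    idx = start_idx
    while idx < len(lines) and not _MD_HEADER.match(lines[idx]):
        idx += 1
    code_lines = lines[start_idx:idx]
    # Trim blank lines at both ends so the fence hugs the code.
    while code_lines and not code_lines[0].strip():
        code_lines = code_lines[1:]
    while code_lines and not code_lines[-1].strip():
        code_lines = code_lines[:-1]
    return code_lines, idx


def detect_language(code_lines: list[str]) -> str:
    # Crude heuristic (assumption): JSON starts with a brace or bracket.
    first = code_lines[0].lstrip() if code_lines else ""
    return "json" if first.startswith(("{", "[")) else "python"


def wrap_in_code_block(code_lines: list[str], lang: str) -> list[str]:
    return [f"```{lang}", *code_lines, "```"]


def fix_example_blocks(content: str) -> str:
    lines = content.split("\n")
    result: list[str] = []
    i = 0
    while i < len(lines):
        if not is_example_header(lines[i]):
            result.append(lines[i])
            i += 1
            continue
        code_lines, i = collect_code_until_header(lines, i + 1)
        result += ["#### Example", ""]  # normalize to h4
        if code_lines and code_lines[0].lstrip().startswith("```"):
            result += code_lines + [""]  # already fenced: leave as is
        elif code_lines:
            result += wrap_in_code_block(code_lines, detect_language(code_lines)) + [""]
    return "\n".join(result)
```

The main loop now has two levels of nesting, and each helper can be unit-tested without constructing a full document.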
| @@ -453,6 +754,98 @@ def clean_markdown_content(self, content: str, filename: str) -> str: | |||
| # Fix header hierarchy (Example sections should be h4 under class headers) | |||
| content = self.fix_header_hierarchy(content) | |||
|
|
|||
| # Fix example code blocks that are not properly formatted | |||
| content = self.fix_example_blocks(content) | |||
|
|
|||
| # Fix shell-style configuration examples that have # comments being interpreted as headers | |||
| # Pattern: lines like "Example configuration:" followed by KEY=VALUE and "# comment" lines | |||
| content = self.fix_shell_config_examples(content) | |||
|
|
|||
| # Remove all <br/> tags (wrapped in backticks or not) | |||
| content = content.replace('`<br/>`', '') | |||
| content = content.replace('<br/>', '') | |||
|
|
|||
| # Clean up malformed code blocks with weird backtick patterns | |||
| # These come from Sphinx's markdown output | |||
| content = re.sub(r'```\s*\n``\s*\n```', '', content) # Empty weird block | |||
| content = re.sub(r'```\s*\n`\s*\n```', '', content) # Another weird pattern | |||
| content = re.sub(r'^## \}', '}', content, flags=re.MULTILINE) # Fix closing brace with header prefix | |||
|
|
|||
| # Handle any remaining standalone code blocks with just * or ** (cleanup) | |||
| content = re.sub(r'\s*```\s*\n\s*\*\*\s*\n\s*```\s*', ' ', content) | |||
| content = re.sub(r'\s*```\s*\n\s*\*\s*\n\s*```\s*', ' ', content) | |||
|
|
|||
| # Clean up blockquote markers that break MDX parsing | |||
| # Convert ' > text' to ' text' (indented blockquotes to plain indented text) | |||
| # Handle multiple levels of nesting like '> > text' | |||
| # BUT: Don't remove >>> which are Python REPL prompts! | |||
| # Run multiple times to handle nested blockquotes | |||
| prev_content = None | |||
| while prev_content != content: | |||
| prev_content = content | |||
| # Only match single > at start of line (not >>> or >>) | |||
| # Pattern: start of line, optional whitespace, single > not followed by > | |||
| content = re.sub(r'^(\s*)>(?!>)\s*', r'\1', content, flags=re.MULTILINE) | |||
|
|
|||
| # Remove duplicate Example: lines after #### Example header | |||
| content = re.sub(r'(#### Example\n\n)Example:\n', r'\1', content) | |||
|
|
|||
| # Remove malformed standalone backtick patterns | |||
| content = re.sub(r'^``\s*$', '', content, flags=re.MULTILINE) | |||
| content = re.sub(r'^`\s*$', '', content, flags=re.MULTILINE) | |||
|
|
|||
| # Clean up multiple consecutive blank lines (more than 2) | |||
| content = re.sub(r'\n{4,}', '\n\n\n', content) | |||
|
|
|||
| # Remove orphaned code block openers (but not closers!) | |||
| # Pattern: ``` opener followed by content that doesn't have a matching closing ``` | |||
| # This handles Sphinx's broken JSON/code examples | |||
| # We track whether we're inside a code block to distinguish openers from closers | |||
| lines = content.split('\n') | |||
| cleaned = [] | |||
| in_code_block = False | |||
| i = 0 | |||
| while i < len(lines): | |||
| line = lines[i] | |||
|
|
|||
| # Check for code block markers | |||
| if line.strip().startswith('```'): | |||
| if not in_code_block: | |||
| # This is an opener - check if it has a matching close | |||
| if line.strip() == '```': | |||
| # Standalone opener - look ahead for close | |||
| j = i + 1 | |||
| has_close = False | |||
| while j < len(lines): | |||
| if lines[j].strip() == '```': | |||
| has_close = True | |||
| break | |||
| if lines[j].startswith('#'): # Hit a header - no proper close | |||
| break | |||
| j += 1 | |||
|
|
|||
| if not has_close: | |||
| # Skip this orphaned opener | |||
| i += 1 | |||
| continue | |||
| # It's a valid opener (either has language or has close) | |||
| in_code_block = True | |||
| else: | |||
| # This is a closer | |||
| in_code_block = False | |||
|
|
|||
| cleaned.append(line) | |||
| i += 1 | |||
| content = '\n'.join(cleaned) | |||
|
|
|||
| # Remove malformed Example sections that contain raw JSON/code without proper formatting | |||
| # These cause MDX parsing errors due to curly braces being interpreted as JSX | |||
| content = self.remove_malformed_examples(content) | |||
|
|
|||
| # Add horizontal rules before class headers to ensure proper spacing | |||
| # This fixes the issue where h4 (method) followed by h3 (class) loses margin-top | |||
| content = self.add_class_separators(content) | |||
|
|
|||
| lines = content.split('\n') | |||
| cleaned_lines = [] | |||
🟡 Suggestion: This method applies 15+ sequential transformations with fragile ordering dependencies. Any change risks breaking earlier or later transformations.
Root cause: Fighting poor Sphinx markdown output with regex surgery. Consider:
- Can Sphinx configuration be improved to produce cleaner output?
- Would a different doc generator (sphinx-autodoc + myst-parser tweaks) reduce the need for post-processing?
- Could transformations be consolidated into fewer, more robust passes?
Current approach works but will be painful to maintain as edge cases accumulate.
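To make the consolidation idea concrete, here is a hedged sketch of an ordered table of named passes. All names (`Pass`, `PASSES`, `run_passes`, and the three pass functions) are hypothetical. The blockquote pattern uses `[ \t]` rather than the `\s` in the diff so it cannot consume newlines; that narrowing is an assumption of this sketch, not the PR's code.

```python
import re
from typing import Callable

Pass = Callable[[str], str]


def strip_br_tags(text: str) -> str:
    return text.replace("`<br/>`", "").replace("<br/>", "")


def strip_blockquote_markers(text: str) -> str:
    # Remove one leading '>' per iteration; '>>>' REPL prompts are kept.
    pattern = re.compile(r"^([ \t]*)>(?!>)[ \t]*", re.MULTILINE)
    prev = None
    while prev != text:
        prev = text
        text = pattern.sub(r"\1", text)
    return text


def collapse_blank_lines(text: str) -> str:
    return re.sub(r"\n{4,}", "\n\n\n", text)


# The ordering lives in one place, so dependencies between passes are visible.
PASSES: list[tuple[str, Pass]] = [
    ("strip_br_tags", strip_br_tags),
    ("strip_blockquote_markers", strip_blockquote_markers),
    ("collapse_blank_lines", collapse_blank_lines),
]


def run_passes(text: str, passes: list[tuple[str, Pass]] = PASSES) -> str:
    for _name, transform in passes:
        text = transform(text)
    return text
```

With this layout each pass can be tested in isolation, and a simple check that `run_passes(run_passes(x)) == run_passes(x)` on sample pages would flag passes that interfere with each other.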
🟠 Important - Missing Evidence: The PR description claims "I have run the documentation site locally and confirmed it renders as expected" but provides no concrete proof. For a documentation PR that generates 12 new API reference pages, please add an Evidence section showing:
This helps reviewers verify the post-processing logic actually produces valid MDX that Mintlify can parse.
Summary of changes
This PR extends the API reference documentation automation in scripts/generate-api-docs.py to generate documentation for 12 additional SDK modules that were previously undocumented.
Background
The existing automation used Sphinx with sphinx-markdown-builder to generate API docs for 8 core modules. However, the SDK has grown significantly and many important modules weren't being documented.
Changes
Extended module list from 8 to 20 modules:
Original modules (8):
agent, conversation, event, llm, tool, workspace, security, utils
New modules added (12):
- context - AgentContext, Skills, SkillKnowledge, Triggers
- hooks - Event-driven hooks for automation and control
- critic - Critics for iterative refinement
- mcp - MCP (Model Context Protocol) integration
- plugin - Plugin system and marketplace types
- subagent - Sub-agent delegation and registration
- io - File storage abstractions (FileStore, LocalFileStore, InMemoryFileStore)
- testing - Test utilities (TestLLM, TestLLMExhaustedError)
- secret - Secret management (SecretSource, StaticSecret, LookupSecret)
- skills - Skill management utilities
- observability - Observability utilities
- logger - Logging utilities
Technical changes:
- Updated scripts/generate-api-docs.py to use a single all_modules list for both RST toctree generation and module processing
- Expanded class_to_module mapping to include all classes from new modules for proper cross-reference resolution
- Auto-updates docs.json navigation when run
Files changed:
- scripts/generate-api-docs.py - Extended module list and class mappings
- docs.json - Navigation auto-updated with 12 new pages
- scripts/mint-config-snippet.json - Updated navigation snippet
- .mdx files in sdk/api-reference/
- .mdx files regenerated with current SDK version