Stub File Detection Script
Overview
The scripts/detect_stub_files.py script is a Python tool that detects documentation stub files requiring completion. It replaces the original bash-based detection and filters out markers that appear in code blocks and documentation examples, which avoids false positives.
Purpose
This script is part of the automated documentation workflow that:
- Identifies incomplete documentation files marked with stub indicators
- Prioritizes stub files based on location and content markers
- Generates detailed analysis for GitHub issue creation
- Avoids flagging code examples and meta-documentation as stubs
Functions
is_in_code_block(content, marker_pos)
Checks if a position in markdown content is within a code block to avoid false positive detection.
Parameters:
- content (str): The full content of the file
- marker_pos (int): Position of the marker to check
Returns:
bool: True if the position is within any type of code block
Detects:
- Fenced code blocks (``` or ~~~)
- Inline code blocks (`code`)
- Indented code blocks (4+ spaces)
Example:
content = "This is a STUB: marker outside code\n```\nSTUB: inside code\n```"
pos = content.find("STUB:")
is_code = is_in_code_block(content, pos)  # False: find() returns the first marker, which is outside the fence; the second marker's position would give True
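The fenced-block part of this check can be sketched as a line-by-line scan that tracks whether a fence is currently open. This is only an illustration, not the script's actual implementation: the name in_fenced_block is hypothetical, and inline and indented code are deliberately left out.

```python
import re

# A fence is a run of 3+ backticks or tildes, indented by at most 3 spaces.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")


def in_fenced_block(content: str, marker_pos: int) -> bool:
    """Return True if marker_pos lies on a line inside an open fenced block."""
    open_fence = None  # fence character of the currently open block, if any
    offset = 0
    for line in content.splitlines(keepends=True):
        line_end = offset + len(line)
        if offset <= marker_pos < line_end:
            # The marker is on this line: it is in code only if a fence is open.
            return open_fence is not None
        match = FENCE_RE.match(line)
        if match:
            fence_char = match.group(1)[0]
            if open_fence is None:
                open_fence = fence_char  # opening fence
            elif fence_char == open_fence:
                open_fence = None  # matching closing fence
        offset = line_end
    return False
```

Tracking the fence character means a ~~~ line inside a backtick-fenced block does not close it.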
is_documentation_about_stubs(file_path, content)
Determines if a file documents the stub/workflow system itself to exclude it from stub detection.
Parameters:
- file_path (str): Path to the file being analyzed
- content (str): File content
Returns:
bool: True if file is meta-documentation about stub systems
Detection criteria:
- Filename contains: DOCUMENTATION_WORKFLOW, AUTO_DOCUMENTATION, WORKFLOW, STUB_SYSTEM
- Content mentions multiple workflow indicators
Example:
is_meta = is_documentation_about_stubs(
"docs/development/AUTO_DOCUMENTATION_WORKFLOWS.md",
content_with_workflow_docs
) # Returns True
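A minimal sketch of the two detection criteria might look like the following. The filename hints come from the list above. The content phrases, the threshold of two, and the function name looks_like_meta_documentation are assumptions made for illustration; the script's real indicators may differ.

```python
from pathlib import Path

# Filename hints listed in the detection criteria.
FILENAME_HINTS = (
    "DOCUMENTATION_WORKFLOW",
    "AUTO_DOCUMENTATION",
    "WORKFLOW",
    "STUB_SYSTEM",
)
# Hypothetical content phrases and threshold, chosen only for illustration.
CONTENT_HINTS = (
    "stub detection",
    "stub marker",
    "documentation workflow",
    "github actions",
)
MIN_CONTENT_HITS = 2


def looks_like_meta_documentation(file_path: str, content: str) -> bool:
    """Return True if the file appears to document the stub system itself."""
    name = Path(file_path).name.upper()
    if any(hint in name for hint in FILENAME_HINTS):
        return True
    lowered = content.lower()
    hits = sum(1 for hint in CONTENT_HINTS if hint in lowered)
    return hits >= MIN_CONTENT_HITS
```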
find_stub_markers_outside_code_blocks(file_path)
Finds stub markers in a file that are not within code blocks or meta-documentation.
Parameters:
file_path(str): Path to the markdown file to analyze
Returns:
List[str]: List of stub markers found outside code blocks
Detected markers:
- STUB:, TODO:, PLACEHOLDER, COMING SOON, TBD
- NOT YET IMPLEMENTED, UNDER CONSTRUCTION, DRAFT, INCOMPLETE
Example:
markers = find_stub_markers_outside_code_blocks("docs/features/NEW_FEATURE.md")
# Returns ["STUB:", "TODO:"] if those markers exist outside code blocks
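One possible way to get this behavior is to strip code spans from the text first and then look for markers in what remains. The sketch below takes content rather than a file path, assumes case-sensitive matching, and ignores indented code blocks. The name markers_outside_code is hypothetical and the real function may work differently.

```python
import re

# Markers from the list above.
STUB_MARKERS = [
    "STUB:", "TODO:", "PLACEHOLDER", "COMING SOON", "TBD",
    "NOT YET IMPLEMENTED", "UNDER CONSTRUCTION", "DRAFT", "INCOMPLETE",
]

# A fenced block: an opening fence line up to a closing line with the same fence.
FENCED_RE = re.compile(r"^(`{3,}|~{3,}).*?^\1[ \t]*$", re.MULTILINE | re.DOTALL)
# An inline code span on a single line.
INLINE_RE = re.compile(r"`[^`\n]+`")


def markers_outside_code(content: str) -> list:
    """Return the markers that appear in prose, in STUB_MARKERS order."""
    prose = FENCED_RE.sub("", content)
    prose = INLINE_RE.sub("", prose)
    return [marker for marker in STUB_MARKERS if marker in prose]
```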
find_stub_files_by_name(docs_dir="docs")
Finds stub files using naming convention patterns.
Parameters:
docs_dir (str): Directory to search (defaults to "docs")
Returns:
Set[str]: Set of file paths matching stub naming patterns
Patterns detected:
- *_STUB.md, STUB_*.md
- TODO_*.md, *_TODO.md
- *_DRAFT.md, DRAFT_*.md
Example:
stub_files = find_stub_files_by_name()
# Returns {"docs/features/STUB_NEW_FEATURE.md", "docs/TODO_API_DOCS.md"}
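Name-based matching of this kind can be sketched with pathlib's recursive globbing. The patterns are the ones listed above. The function name stub_files_by_name and the choice to return POSIX-style path strings are assumptions for illustration.

```python
from pathlib import Path
from typing import Set

# Naming patterns from the list above.
NAME_PATTERNS = (
    "*_STUB.md", "STUB_*.md",
    "TODO_*.md", "*_TODO.md",
    "*_DRAFT.md", "DRAFT_*.md",
)


def stub_files_by_name(docs_dir: str = "docs") -> Set[str]:
    """Collect files under docs_dir whose names match a stub pattern."""
    root = Path(docs_dir)
    found = set()
    for pattern in NAME_PATTERNS:
        # rglob searches the directory tree recursively.
        found.update(p.as_posix() for p in root.rglob(pattern) if p.is_file())
    return found
```

Using a set means a file that matches two patterns is still reported once.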
find_stub_files_by_content(docs_dir="docs")
Finds stub files by analyzing content for stub markers outside code blocks.
Parameters:
docs_dir(str): Directory to search for markdown files
Returns:
Dict[str, List[str]]: Dictionary mapping file paths to lists of found markers
Example:
content_stubs = find_stub_files_by_content()
# Returns {"docs/features/incomplete.md": ["STUB:", "TODO:"]}
Priority System
The script assigns priorities based on file location and content:
- Critical: Files in docs/development/ or containing CRITICAL:/URGENT:/IMPORTANT: markers
- Moderate: Files in docs/features/, docs/testing/, docs/troubleshooting/
- Minor: Other documentation files
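The priority rules above can be expressed as a small function. This is a sketch under the assumption that paths are relative to the repository root and that the critical check runs first. The name assign_priority is hypothetical.

```python
# Markers and directories taken from the priority rules above.
CRITICAL_MARKERS = ("CRITICAL:", "URGENT:", "IMPORTANT:")
MODERATE_DIRS = ("docs/features/", "docs/testing/", "docs/troubleshooting/")


def assign_priority(file_path: str, content: str) -> str:
    """Map a stub file to "critical", "moderate", or "minor"."""
    path = file_path.replace("\\", "/")  # normalize Windows separators
    if path.startswith("docs/development/") or any(
        marker in content for marker in CRITICAL_MARKERS
    ):
        return "critical"
    if path.startswith(MODERATE_DIRS):
        return "moderate"
    return "minor"
```

Checking the critical rule first lets a marker such as URGENT: raise the priority of a file in docs/features/.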
Usage
Command Line
cd scripts
python3 detect_stub_files.py
Output Format
The script outputs JSON analysis to stdout and progress messages to stderr:
{
"stub_files": [
{
"file": "docs/features/STUB_FEATURE.md",
"priority": "moderate",
"type": "feature",
"effort": "low",
"size_bytes": 245,
"line_count": 12,
"markers": "STUB:, TODO:"
}
],
"total_stubs": 1,
"critical_stubs": 0,
"moderate_stubs": 1,
"minor_stubs": 0,
"analysis_timestamp": "2025-06-09T16:23:00Z",
"summary": {
"message": "Found 1 stub files requiring completion",
"breakdown": "Critical: 0, Moderate: 1, Minor: 0",
"next_action": "Process stubs by priority"
}
}
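Because the JSON goes to stdout and progress messages go to stderr, the report can be captured and processed by another tool. The sketch below assumes the field names shown in the sample output. The helper stubs_by_priority is hypothetical and not part of the script.

```python
import json


def stubs_by_priority(report_json: str, priority: str) -> list:
    """Return the file paths in the report that have the given priority."""
    report = json.loads(report_json)
    return [
        entry["file"]
        for entry in report.get("stub_files", [])
        if entry.get("priority") == priority
    ]
```

For example, after redirecting stdout to a file with python3 detect_stub_files.py > report.json, the file's contents could be passed to stubs_by_priority(text, "critical").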
Integration with Workflows
This script is used by GitHub Actions workflows for:
- Automated documentation gap detection
- GitHub issue creation for incomplete documentation
- Documentation quality monitoring
- CI/CD documentation compliance checks
Advanced Features
Smart Code Block Detection
- Handles multiple code block formats
- Correctly identifies indented code blocks
- Processes nested and complex markdown structures
False Positive Prevention
- Excludes workflow documentation that contains example markers
- Ignores markers in code examples
- Filters meta-documentation about the stub system itself
Comprehensive Analysis
- Estimates completion effort based on file size
- Categorizes stubs by documentation type
- Provides actionable prioritization
- Generates workflow-ready JSON output
Error Handling
The script includes robust error handling:
- Gracefully handles file read errors
- Continues processing when individual files fail
- Provides meaningful error messages
- Handles encoding safely for various file types
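The behavior described above could be implemented along these lines. This is a sketch, not the script's actual code: the helper read_text_safely is hypothetical, and UTF-8 with replacement of undecodable bytes is an assumed encoding strategy.

```python
import sys
from typing import Optional


def read_text_safely(path: str) -> Optional[str]:
    """Read a file as text, returning None instead of raising on failure."""
    try:
        # errors="replace" substitutes undecodable bytes rather than failing.
        with open(path, encoding="utf-8", errors="replace") as handle:
            return handle.read()
    except OSError as exc:
        # Report on stderr so that JSON on stdout stays clean.
        print(f"Warning: could not read {path}: {exc}", file=sys.stderr)
        return None
```

A caller can skip files for which this returns None and continue with the rest.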
Dependencies
- Python 3.6+
- Standard library only (no external dependencies)
- pathlib for cross-platform file handling
- Regular expressions for pattern matching
Quick Start for Developers
This script is part of the automated documentation system. Most developers don’t need to run it directly; GitHub workflows execute it automatically.
When You Need This Information
- Understanding how stub files are detected
- Debugging stub file detection issues
- Contributing to the detection logic
- Creating custom stub file patterns
Common Developer Workflow
- Create stub files with markers like STUB:, TODO:, PLACEHOLDER:
- Push to repository - automation detects stubs within 24 hours
- Review generated issues - system creates detailed completion tasks
- Complete documentation - system automatically closes issues when done
For complete workflow information, see the Documentation Workflows Hub.
Related Documentation
Core Documentation System
- Documentation Workflows Hub - Central guide to all documentation automation
- GitHub Documentation Workflow - Issue creation and management system
- Auto Documentation Workflows - Quality monitoring and evolution tracking
Implementation Guides
- Development Documentation Index - All development documentation
- Documentation Organization Guidelines - File organization standards
- CLAUDE.md - Complete project development guidelines
Related Tools
- Tool Confirmation Usage - Frontend component documentation patterns