Visual Testing Infrastructure: Network Navigation Challenges¶

Problem Statement¶

Our visual testing infrastructure is experiencing consistent timeout and navigation failures when attempting to capture screenshots of documentation pages.

Observed Symptoms¶

Pages fail to load within the specified timeout (30-45 seconds)
Intermittent network-related errors
Inconsistent page capture behavior

Diagnostic Approach¶

Immediate Investigation Priorities¶

Local Server Diagnostics
Verify local server (localhost:8000) is running correctly
Check server response times
Validate server configuration
Network Environment Analysis
Examine network interface configurations
Investigate localhost resolution
Check for potential network interference
Playwright Configuration
Review browser context and page navigation settings
Analyze timeout and wait conditions
Investigate potential browser-specific issues

Proposed Investigation Workflow¶

Phase 1: Basic Diagnostics¶

Manually verify all pages load correctly in a web browser
Check local server startup process
Validate mkdocs build process
Run network diagnostics script

Phase 2: Detailed Network Analysis¶

Implement comprehensive network logging
Capture detailed request/response metadata
Analyze network request patterns
Identify potential bottlenecks

Phase 3: Browser and Playwright Configuration¶

Experiment with different wait conditions
Adjust timeout strategies
Test with different browser types (Chromium, Firefox, WebKit)
Implement more granular error handling

Timeout Strategy Investigation¶

Hypothesis Development¶

Minimal Timeout Hypothesis
Objective: Understand page load characteristics under strict time constraints
Proposed Timeout: 3 seconds
Key Questions:
- Which pages load within 3 seconds?
- What are the common characteristics of quickly loading pages?
- Are there systemic bottlenecks in page generation?

Diagnostic Information Collection¶

Page Load Metrics¶

Capture page load times
Record successful vs. failed page captures
Categorize pages by load time complexity

Detailed Metrics to Track¶

Page Path | Load Time | Status | Network Requests | Error Details
---------------------------------------------------------------------------
/index.html | 0.5s | Success | 12 requests | None
/tech/arch.html | 3.2s | Timeout | 45 requests | Network idle not reached

Experimental Parameters¶

Base Timeout: 3 seconds
Wait Condition: 'networkidle'
Capture Strategy:
Sequential captures
Detailed logging
Minimal retry mechanism

Investigative Workflow¶

Phase 1: Baseline Measurement¶

Run visual tests with 3-second timeout
Collect comprehensive logs
Analyze initial results
Identify percentage of pages loading
Categorize load time distributions

Phase 2: Comparative Analysis¶

Compare results with manual browser loading
Investigate discrepancies between automated and manual loading
Identify potential environmental factors

Potential Investigation Vectors¶

Network Layer¶

Verify local server configuration
Check network interface performance
Analyze DNS resolution
Investigate potential network bottlenecks

Page Generation¶

Examine mkdocs build process
Check for complex page generation logic
Identify pages with heavy dynamic content

Browser Automation¶

Test with different wait conditions
Experiment with alternative navigation strategies
Compare performance across browser types

Logging and Tracing Strategy¶

Enhanced Logging Configuration¶

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('visual_test_diagnostics.log'),
        logging.StreamHandler()
    ]
)

Expected Outcomes¶

Comprehensive understanding of page load characteristics
Identification of systemic loading issues
Data-driven approach to timeout and loading strategies

Risks and Limitations¶

3-second timeout is extremely aggressive
May not capture full page loading complexity
Potential for false negative results

Mitigation Strategies¶

Implement flexible timeout mechanisms
Develop adaptive loading strategies
Create fallback capture methods

Decision Tree¶

Is page loading within 3s?
│
├── Yes → Capture Screenshot
│
└── No  → Investigate
    │
    ├── Network Issues
    │   ├── DNS Resolution
    │   ├── Server Performance
    │   └── Network Latency
    │
    └── Page Generation
        ├── Complex Dynamic Content
        ├── Large Asset Dependencies
        └── Mkdocs Generation Overhead

Documentation Tracking¶

Investigation Start: 2024-12-22
Initial Timeout: 3 seconds
Primary Goal: Understand page load characteristics

MkDocs Site Generation and Path Mapping¶

Source to Build Path Transformation¶

MkDocs converts Markdown source files to HTML using a specific path transformation:

Path Mapping Examples¶

Source Path (Markdown)	Generated HTML Path
`docs/index.md`	`site/index.html`
`docs/notebook/index.md`	`site/notebook/index.html`
`docs/notebook/features/bulk-upload/index.md`	`site/notebook/features/bulk-upload/index.html`
`docs/tech/architecture.md`	`site/tech/architecture/index.html`

Key Transformation Rules¶

.md files are converted to index.html
Subdirectory structure is preserved
Root docs/ directory maps to site/
Markdown filenames become index.html

Capture Strategy Considerations¶

Always use index.html for directory paths
Preserve full subdirectory structure
Handle potential nested directory scenarios

Path Normalization Example¶

def normalize_page_path(page_path):
    """
    Convert source Markdown path to generated HTML path

    Args:
        page_path (str): Original Markdown file path

    Returns:
        str: Corresponding HTML file path in site directory
    """
    # Remove .md extension
    html_path = page_path.replace('.md', '')

    # Ensure index.html for directories
    if not html_path.endswith('/index'):
        html_path += '/index'

    return f"{html_path}.html"

Potential Capture Paths¶

notebook/features/bulk-upload/index.md → site/notebook/features/bulk-upload/index.html
tech/architecture.md → site/tech/architecture/index.html

Debugging Path Issues¶

Verify source Markdown file existence
Check MkDocs build configuration
Validate site directory structure
Use logging to track path transformations

MkDocs Path Mapping Guide

MkDocs Site Directory Structure¶

Directory Overview¶

The site/ directory is generated by MkDocs and contains the fully built static documentation site. Its structure mirrors the source documentation layout:

site/
├── index.html                  # Root landing page
├── 404.html                    # Custom 404 error page
├── assets/                     # Static assets (CSS, JS)
├── notebook/                   # Notebook section
│   ├── index.html
│   ├── ai-workspace/           # Specific workspace documentation
│   ├── features/               # Feature documentation
│   └── process/                # Process documentation
├── tech/                       # Technical documentation section
├── product/                    # Product-related pages
├── we-and-ai/                  # AI-related documentation
├── search/                     # Search index and functionality
└── stylesheets/                # Custom styling

Capture Strategy Considerations¶

Each HTML file represents a rendered documentation page
Subdirectories reflect the documentation's logical structure
Static site generation ensures consistent rendering

Filename Mapping¶

When capturing screenshots, we'll use a consistent mapping: - /notebook/ai-workspace/index.html → notebook_ai-workspace_index.png - /tech/architecture.html → tech_architecture.png

This approach ensures: - Unique filename generation - Easy traceability - Filesystem-friendly naming

Parallel Site Directory Screenshot Capture Strategy¶

Overview¶

The parallel site directory screenshot capture is a robust method for generating visual tests across a complex documentation site structure. It addresses the challenges of capturing screenshots from a multi-level, statically generated documentation site.

Key Design Principles¶

Parallel Processing: Utilize asynchronous processing to capture screenshots concurrently
Semaphore-based Concurrency: Limit simultaneous browser instances to prevent resource exhaustion
Preserve Directory Structure: Maintain the original site directory hierarchy in screenshot storage
Comprehensive Coverage: Capture screenshots for all HTML files recursively

Implementation Strategy¶

Directory Traversal¶

Start from the site/ root directory
Recursively walk through all subdirectories
Identify and process .html files
Skip non-HTML files and specific directories (e.g., assets/)

Capture Process¶

File Discovery
Use os.walk() to traverse directory tree
Filter for .html files
Preserve relative path information
Screenshot Generation
Use Playwright to render each HTML file
Capture full-page screenshots
Generate unique filenames based on original path
Concurrency Management
Implement asyncio Semaphore to control:
- Maximum concurrent browser instances
- Prevent system resource overload
- Ensure predictable performance

Potential Challenges and Mitigations¶

Large Directory Structures: Use configurable concurrency limits
Rendering Variations: Standardize viewport and browser settings
Performance Overhead: Implement intelligent filtering and caching mechanisms

Pseudocode Concept¶

async def capture_site_screenshots(site_dir, max_concurrent=5):
    screenshots = []
    semaphore = asyncio.Semaphore(max_concurrent)

    async def process_html_file(file_path):
        async with semaphore:
            # Capture screenshot logic
            pass

    for root, _, files in os.walk(site_dir):
        html_files = [f for f in files if f.endswith('.html')]
        tasks = [process_html_file(os.path.join(root, file)) for file in html_files]
        await asyncio.gather(*tasks)

Recommended Configuration¶

VisualTester(
    site_directory='site/',
    max_concurrent_captures=10,
    viewport_size=(1280, 720),
    output_directory='.cascade/visual-test'
)

Performance Considerations¶

Adjust max_concurrent_captures based on system capabilities
Monitor memory and CPU usage during large site captures
Implement logging and metrics collection

Future Enhancements¶

Intelligent file filtering
Incremental screenshot updates
Differential visual testing

Workaround Strategies¶

Timeout Modification¶

# Increased timeout with more detailed error handling
await page.goto(url, 
    wait_until='networkidle', 
    timeout=3000  # 3 seconds
)

Sequential Page Capture¶

Implement a sequential capture strategy instead of concurrent
Reduce parallel processing to minimize network contention

Retry Mechanism¶

async def capture_with_retry(page_path, max_retries=3):
    for attempt in range(max_retries):
        try:
            # Capture logic
            return result
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # Exponential backoff

Comprehensive Workaround Strategy¶

Workaround: Static HTML Snapshot Capture¶

Core Workaround Strategy¶

When network-based page loading fails, we'll implement a filesystem-driven screenshot approach that bypasses traditional network rendering.

Proposed Implementation¶

async def capture_page_snapshot(page_path):
    """
    Capture page screenshot using alternative methods

    Workaround Strategies:
    - Direct file reading
    - Static HTML rendering
    - Fallback screenshot mechanisms
    """
    try:
        # Strategy 1: Direct HTML File Reading
        html_path = os.path.join('site', page_path)
        if os.path.exists(html_path):
            # Use a headless browser to render static HTML
            async with async_playwright() as p:
                browser = await p.chromium.launch(headless=True)
                context = await browser.new_context()
                page = await context.new_page()

                # Load file directly from filesystem
                await page.goto(f'file://{os.path.abspath(html_path)}')

                # Immediate screenshot, bypass network loading
                screenshot_path = f'.cascade/visual-test/{page_path.replace("/", "_")}.png'
                await page.screenshot(path=screenshot_path)

                await browser.close()
                return screenshot_path

        # Strategy 2: Fallback Rendering
        console.print(f"⚠️ Could not capture {page_path}")
        return None

    except Exception as e:
        console.print(f"❌ Snapshot capture failed: {e}")
        return None

Workaround Objectives¶

Bypass Network Constraints
Render pages directly from filesystem
Eliminate network-related loading issues
Ensure consistent page capture
Robust Error Handling
Graceful failure modes
Comprehensive error logging
Minimal performance overhead

Implementation Advantages¶

Works with static site generators
Independent of network conditions
Consistent rendering environment
Low resource consumption

Potential Limitations¶

May not capture dynamically generated content
Requires pre-built site directory
Potential rendering differences from live site

Fallback Mechanism Flowchart¶

Start Page Capture
│
├── Filesystem HTML Exists?
│   ├── Yes → Render Static HTML
│   │         └── Take Screenshot
│   │
│   └── No  → Log Capture Failure
│             └── Continue to Next Page

Diagnostic Information¶

Capture Method: Filesystem-based static rendering
Rendering Engine: Playwright Chromium
Viewport: Consistent browser context
Error Handling: Comprehensive logging

Future Improvements¶

Add multiple rendering engine support
Implement more sophisticated fallback strategies
Create comprehensive capture reports

Adaptive Workaround for Documentation Screenshot Challenges

Reasoning and Decision Process¶

Root Cause Analysis¶

Network Loading Challenges
Consistent timeout issues during page capture
Unpredictable page rendering times
Complex interaction between mkdocs and browser automation
Current Limitations
Network-dependent screenshot generation
High variability in page load times
Lack of consistent capture mechanism

Analytical Reasoning¶

Traditional network-based rendering introduces:
Unpredictable latency
Complex dependency chains
Non-deterministic loading behaviors
Static site generators like mkdocs pre-generate HTML
HTML files already exist on filesystem
No need for dynamic network loading
Consistent, reproducible page structure

Decision Framework¶

Capture Strategy Evaluation
│
├── Network-Based Rendering
│   └── ❌ Unreliable
│
└── Filesystem-Based Rendering
    ├── ✅ Consistent
    ├── ✅ Predictable
    └── ✅ Low Overhead

Final Decision¶

Implement Filesystem-Based Static HTML Snapshot Capture

Rationale: - Bypasses network loading complexities - Leverages pre-generated HTML - Provides consistent, reproducible screenshots - Minimal performance overhead - Works independently of server state

Action Items¶

Immediate Implementation¶

~~Create .cascade/visual-test directory for screenshots~~ Directory already exists
Implement capture_page_snapshot() function
Use Playwright for rendering
Read directly from site/ directory
Generate consistent screenshots

Configuration Tasks¶

Update pyproject.toml with new dependencies

[tool.poetry.dependencies]
playwright = "^1.x.x"

Install Playwright browsers
```
playwright install chromium
```

Testing Strategy¶

Validate screenshot generation for 10 sample pages
Compare filesystem rendering with live site
Create comprehensive error logging
Implement fallback mechanisms

Monitoring and Improvement¶

Track screenshot success rate
Log any rendering discrepancies
Develop alternative rendering strategies

Documentation¶

Update visual testing documentation
Create runbook for screenshot generation
Document known limitations

Timeline¶

Start: Immediate
Initial Implementation: Within 1 day
First Review: 3 days after implementation

Strategic Pivot: From Network Dependency to Filesystem Rendering

Debugging Checklist¶

Verify local server configuration
Check network interfaces
Validate mkdocs build process
Test individual page captures
Monitor system resources during testing

Potential Root Causes¶

Network configuration issues
Local server performance limitations
Playwright/browser compatibility
Resource constraints
Mkdocs site generation complexity

Next Steps¶

Run comprehensive diagnostics
Collect detailed logs
Analyze and categorize failures
Implement targeted fixes
Develop robust error recovery mechanism

Recommended Tools¶

netstat for network connection analysis
lsof to check port usage
Browser developer tools
Playwright trace viewer

Logging Recommendations¶

# Enhanced logging strategy
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    filename='visual_test_diagnostics.log'
)

Conclusion¶

This is an iterative debugging process. Each investigation provides insights to refine our testing infrastructure.

Last Updated: 2024-12-22 Status: Active Investigation

Ongoing Investigation - Adaptive Strategies Required