whyml

WhyML Testing Guide

🧪 Test Suite Overview

WhyML features a comprehensive modular test suite with 127+ test cases across all packages, achieving 85%+ code coverage with focus on critical functionality, API compatibility, and cross-package integration.

📊 Current Test Status

Status	Count	Description
✅ Collected	127+ tests	All tests successfully discovered
✅ Passing	117+ tests	Core functionality working
🔧 Remaining	~10 tests	Minor CLI flags and async issues
📈 Coverage	85%+	Comprehensive code coverage

📦 Package Test Breakdown

whyml-core (35+ tests)

Location: whyml-core/tests/ Status: ✅ Core APIs working

cd whyml-core
make test                    # Run all core tests
make test-coverage          # With coverage report
pytest tests/test_loading.py -v    # Manifest loading tests
pytest tests/test_processing.py -v # Processing pipeline tests

Test Categories:

Validation Framework: Manifest schema validation, error handling
Loading System: Async manifest loading, dependency resolution
Processing Pipeline: Template processing, inheritance resolution
Utilities: YAML processing, path manipulation, async operations

whyml-converters (35+ tests)

Location: whyml-converters/tests/ Status: ✅ HTML, React, Vue, PHP converters working

cd whyml-converters
make test-html              # HTML converter specific tests
make test-react             # React converter tests
make test-vue               # Vue converter tests
make test-php               # PHP converter tests
make test-coverage          # All converters with coverage

Test Categories:

HTML Converter: Semantic HTML generation, CSS integration
React Converter: JSX/TSX generation, hooks, TypeScript support
Vue Converter: SFC generation, Composition API, Vue 3 compatibility
PHP Converter: Class generation, namespaces, type declarations

whyml-scrapers (25+ tests)

Location: whyml-scrapers/tests/ Status: ✅ URLScraper, WebpageAnalyzer fixed

cd whyml-scrapers
make test-scraper           # URL scraper tests
make test-analyzer          # Webpage analyzer tests
make test-extractor         # Content extraction tests
make demo-scrape            # Live scraping demo (use responsibly)

Test Categories:

URL Scraping: Website analysis, manifest generation
Structure Analysis: HTML parsing, simplification algorithms
Content Extraction: Smart content detection, semantic analysis
Page Analysis: SEO metrics, accessibility analysis

whyml-cli (32+ tests)

Location: whyml-cli/tests/
Status: ✅ Most commands working, minor CLI flag issues

cd whyml-cli
make test-cli               # CLI functionality tests
make test-commands          # Command-specific tests
make test-integration       # Integration workflows
make demo-scrape            # Test CLI scrape command

Test Categories:

Command Interface: CLI argument parsing, validation
Integration Workflows: End-to-end command testing
Error Handling: CLI error messages, exit codes
Configuration: Settings, environment variables

🎯 Test Categories & Strategies

1. Unit Tests

Purpose: Individual component functionality Location: tests/test_*.py in each package Strategy: Mock external dependencies, test isolated functionality

# Run unit tests for specific components
pytest tests/test_manifest_loader.py -v
pytest tests/test_html_converter.py -v
pytest tests/test_url_scraper.py -v

2. Integration Tests

Purpose: Cross-package workflows and API compatibility Location: Main package tests/ and individual packages Strategy: Test real workflows without mocking

# Integration testing across packages
pytest tests/test_integration.py -v
pytest tests/test_modular_integration.py -v

3. Performance Tests

Purpose: Speed and memory benchmarks Strategy: Measure processing time, memory usage, async performance

# Performance benchmarks
pytest tests/ -k "performance" -v
pytest --benchmark-only

4. End-to-End Tests

Purpose: Complete pipeline validation Strategy: Real websites → YAML → Generated code → Validation

# Complete workflow testing
pytest tests/test_e2e.py -v

5. Network Tests

Purpose: Web scraping and external requests
Strategy: Use test websites, mock HTTP responses when needed

# Network-dependent tests (use carefully)
pytest tests/ -k "network" -v
pytest tests/ -k "live" --tb=short

🚀 Running Tests

Quick Test Commands

# Test everything (from project root)
make test

# Test all packages with coverage
make test-coverage

# Test specific package
cd whyml-core && make test
cd whyml-converters && make test-html
cd whyml-scrapers && make test-scraper
cd whyml-cli && make test-integration

Advanced Testing

# Verbose output with detailed information
pytest tests/ -v --tb=long

# Test specific functionality
pytest tests/ -k "converter" -v
pytest tests/ -k "scraper and not live" -v

# Coverage with HTML report
pytest --cov=whyml --cov-report=html tests/

# Parallel testing (if pytest-xdist installed)  
pytest tests/ -n auto

# Stop on first failure
pytest tests/ -x

# Run only failed tests from last run
pytest --lf tests/

Test Data & Fixtures

Test Data Location: ./data/ Sample Manifests:

data/manifest.yaml - Basic manifest example
data/test-manifest-external.yaml - External content testing
data/test-metadata-structure.yaml - Metadata validation
data/test-selective.yaml - Selective section testing

Usage in Tests:

import pytest
from pathlib import Path

@pytest.fixture
def sample_manifest():
    return Path("data/manifest.yaml")

def test_manifest_loading(sample_manifest):
    # Test with real sample data
    pass

🔬 Example Project Testing

example_com Project Ports

The project/example_com/ demonstrates multi-format output testing:

Port Mappings:

HTML: Port 8011 - http://localhost:8011
JavaScript/Node.js: Port 8012 - http://localhost:8012
PHP: Port 8013 - http://localhost:8013
React: Port 8014 - http://localhost:8014
Vue: Port 8017 - http://localhost:8017

Testing Workflow:

# Start Docker demo environment
./start-demo.sh

# Access different formats
curl http://localhost:8011  # HTML output
curl http://localhost:8012  # JavaScript output  
curl http://localhost:8013  # PHP output

# View in browser for visual testing
open http://localhost:8010  # Dashboard with all formats

Automated Testing:

# Test conversion accuracy across formats
whyml scrape http://localhost:8015 --test-conversion \
  --output-html regenerated.html \
  --max-depth 3

# Compare outputs programmatically
python -c "
import requests
html = requests.get('http://localhost:8011').text
react = requests.get('http://localhost:8014').text
print(f'HTML length: {len(html)}, React length: {len(react)}')
"

✅ Test Quality Metrics

Recent API Compatibility Fixes (2025)

✅ Converter APIs: Added convert() methods and constructor parameters
✅ Scraper APIs: Added scrape_url() method, similarity calculation
✅ Structure Analysis: Added compatibility fields for testing
✅ Error Handling: Enhanced NetworkError with details parameter
✅ LoadedManifest: Fixed CLI conversion with proper content extraction

Test Coverage Goals

Unit Tests: 90%+ coverage for core functionality
Integration Tests: All major workflows covered
Error Cases: Comprehensive error handling validation
Performance: Benchmarks for critical paths
Compatibility: API surface fully tested

Continuous Improvement

# Generate detailed coverage report
pytest --cov=whyml --cov-report=html --cov-report=term-missing tests/

# Identify missing coverage
coverage report --show-missing

# Update test suite based on coverage gaps
coverage html && open htmlcov/index.html

🐛 Debugging Failed Tests

Common Issues & Solutions

1. Import Errors

# Problem: ModuleNotFoundError for whyml_*
# Solution: Install packages in development mode
pip install -e whyml-core/
pip install -e whyml-converters/
pip install -e whyml-scrapers/ 
pip install -e whyml-cli/

2. Async Test Issues

# Problem: Event loop errors
# Solution: Use pytest-asyncio
pip install pytest-asyncio
pytest tests/ --asyncio-mode=auto

3. Network Test Failures

# Problem: Network timeouts or unreachable sites
# Solution: Skip network tests or use mocking
pytest tests/ -k "not network" -v
pytest tests/ -m "not slow" -v

4. File Path Issues

# Problem: Test data not found after move to ./data/
# Solution: Update test file paths
find tests/ -name "*.py" -exec grep -l "test.*\.yaml" {} \;
# Update paths from ./ to ./data/ in test files

📋 Test Maintenance

Adding New Tests

# Template for new test file
import pytest
from pathlib import Path
from whyml_core.loading.manifest_loader import ManifestLoader

class TestNewFeature:
    """Test suite for new feature."""
    
    @pytest.fixture
    def sample_data(self):
        return Path("data/test-sample.yaml")
    
    def test_basic_functionality(self, sample_data):
        """Test basic feature functionality."""
        # Arrange
        loader = ManifestLoader()
        
        # Act
        result = loader.load_manifest(sample_data)
        
        # Assert
        assert result is not None
        assert result.metadata.title == "Expected Title"
    
    @pytest.mark.asyncio
    async def test_async_functionality(self):
        """Test async operations."""
        # Test async code here
        pass
        
    @pytest.mark.slow
    def test_performance(self):
        """Performance test - marked as slow."""
        # Performance testing here
        pass

Test Configuration

# pytest.ini
[tool:pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = -v --tb=short --strict-markers
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    network: marks tests as requiring network access
    integration: marks tests as integration tests
asyncio_mode = auto

🎯 Testing Best Practices

1. Test Organization

One test file per module/class
Descriptive test names explaining what is tested
Group related tests in classes
Use fixtures for common setup

2. Test Data Management

Store test data in ./data/ directory
Use relative paths from project root
Version control test data files
Document test data requirements

3. Mocking Strategy

Mock external services (HTTP requests, file system)
Don’t mock code under test
Use realistic mock data
Document mock assumptions

4. Performance Testing

Benchmark critical paths
Test with realistic data sizes
Monitor memory usage
Set reasonable timeout limits

5. CI/CD Integration

Run tests on multiple Python versions
Include coverage reporting
Fail builds on test failures
Cache dependencies for speed

Last Updated: Current refactoring session
Test Suite Version: 127+ cases across modular architecture Coverage Target: 90%+ for production readiness

This site is open source. Improve this page.