This example demonstrates the complete WhyML workflow: scraping a webpage, simplifying its structure, and regenerating it as HTML from a YAML manifest.
- README.md - This documentation
- scraped-manifest.yaml - YAML manifest generated from webpage scraping
- regenerated.html - HTML file regenerated from the YAML manifest

whyml scrape https://example.com --output scraped-manifest.yaml --simplify-structure --max-depth 5
whyml convert --from scraped-manifest.yaml --to regenerated.html --as html
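
The two commands above can also be driven from Python. The following is a minimal sketch, not part of WhyML itself: the helper names (`scrape_command`, `convert_command`, `run_workflow`) and the `dry_run` parameter are hypothetical, and only the flags shown in this README are used. It assumes the `whyml` executable is on your `PATH` when `dry_run` is false.

```python
import shlex
import subprocess


def scrape_command(url, manifest="scraped-manifest.yaml", max_depth=5):
    """Build the argv for the scrape step, using only flags shown in this README."""
    return [
        "whyml", "scrape", url,
        "--output", manifest,
        "--simplify-structure",
        "--max-depth", str(max_depth),
    ]


def convert_command(manifest="scraped-manifest.yaml", html="regenerated.html"):
    """Build the argv for the convert step (YAML manifest to HTML)."""
    return ["whyml", "convert", "--from", manifest, "--to", html, "--as", "html"]


def run_workflow(url, dry_run=False):
    """Print each command and, unless dry_run is set, execute it in order.

    subprocess.run(..., check=True) raises CalledProcessError if a step fails,
    so the convert step never runs against a missing manifest.
    """
    commands = [scrape_command(url), convert_command()]
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only prints the commands, so it is safe without whyml installed.
    run_workflow("https://example.com", dry_run=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the URL contains characters such as `&` or `?`.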
Instead of running commands manually, use our provided scripts:
# From WhyML root directory
./scripts/examples/run-example-1.sh
# From WhyML root directory
python3 scripts/examples/run-example-1.py
# From WhyML root directory
./scripts/run-all-examples.sh
These scripts will:
whyml scrape https://example.com --test-conversion --output-html regenerated.html
The regenerated HTML should preserve the essential structure and content of the original webpage, while the intermediate YAML manifest makes the page cleaner and easier to maintain.
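
One rough way to check that the content survived the round trip is to compare the visible text of the original page with that of `regenerated.html`. The sketch below uses only Python's standard `html.parser` module. It is an illustrative check, not a WhyML feature, and the names `TagCounter`, `summarize`, and `missing_text` are hypothetical. It treats every non-empty text node as content, so text inside `<script>` or `<style>` would also be counted.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Collect start-tag counts and non-empty text nodes from an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def summarize(html):
    """Return (tag counts, list of text nodes) for an HTML string."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.text


def missing_text(original_html, regenerated_html):
    """List text nodes present in the original but absent from the regenerated HTML."""
    _, original_text = summarize(original_html)
    _, regenerated_text = summarize(regenerated_html)
    kept = set(regenerated_text)
    return [t for t in original_text if t not in kept]


if __name__ == "__main__":
    original = "<html><body><div><div><h1>Title</h1><p>Hello</p></div></div></body></html>"
    regenerated = "<html><body><h1>Title</h1><p>Hello</p></body></html>"
    # Wrapper <div>s were simplified away, but no text was lost.
    print(missing_text(original, regenerated))
    print(summarize(original)[0]["div"], summarize(regenerated)[0]["div"])
```

An empty list from `missing_text` means every text node in the original also appears in the regenerated file, and a drop in the `div` count is expected when structure simplification is enabled.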