Comprehensive suite of tools for planning, executing, and monitoring website migrations. These applications help ensure SEO equity preservation and smooth transitions during website redesigns, platform changes, and domain migrations.
Advanced mapping system that automatically matches pages between old and new website versions using multiple similarity algorithms and machine learning.
- Streamlit Web App: User-friendly interface for non-technical users
- Python Script: Command-line tool for automated processing
- Google Colab: Cloud-based notebook for interactive analysis
- Multi-Algorithm Matching: URL, title, H1, and content-based matching
- Machine Learning: Advanced similarity scoring using embeddings
- Bulk Processing: Handle large site migrations efficiently
- Quality Assurance: Confidence scoring and manual review options
- Export Formats: Multiple output formats for different implementation needs
- Exact URL Matching: Direct path-to-path mapping
- Title Similarity: Match pages based on title tag similarity
- H1 Heading Analysis: Use heading structure for page matching
- Content Fingerprinting: Full content similarity analysis
- Semantic Matching: AI-powered content understanding
- Confidence Scoring: Numerical confidence for each mapping
- Multiple Suggestions: Alternative mapping options
- Manual Review Interface: Review and approve mappings
- Validation Rules: Custom criteria for acceptable matches
- Error Detection: Identify potential mapping issues
- Redirect Generation: Automatic .htaccess or nginx redirect rules
- Sitemap Updates: New XML sitemap generation
- 301 Redirect Lists: Ready-to-implement redirect instructions
- Testing Protocols: Validation and testing procedures
- CMS Changes: WordPress to Shopify, Drupal to custom builds
- Technology Upgrades: Legacy systems to modern frameworks
- eCommerce Platforms: Magento to WooCommerce, etc.
- Static to Dynamic: HTML sites to CMS-powered sites
- Brand Changes: New domain for rebranding
- Geographic Expansion: Country-specific domains
- TLD Changes: .com to .co or other extensions
- Subdomain Restructuring: www to non-www, etc.
- URL Structure Changes: New information architecture
- Content Reorganization: Page consolidation or splitting
- Navigation Updates: Menu and linking structure changes
- Content Management: Page merging and content updates
- Link Equity Protection: Maintain ranking power
- Traffic Continuity: Minimize organic traffic loss
- Index Preservation: Ensure proper page indexation
- Performance Monitoring: Track migration success metrics
cd streamlit-source
pip install -r requirements.txt
streamlit run website-migration.pycd python-script
pip install -r requirements.txt
python migration-mapper.py --old-site old_crawl.csv --new-site new_crawl.csv# Upload the Colab notebook to Google Drive
# Open: Automatic_Website_Migration_V5_by_LeeFoot_07_12_23.ipynb
# Follow the step-by-step instructions in the notebook- Source Site: Complete crawl of the old website
- Target Site: Complete crawl of the new website
- Required Columns: URL, Title, H1, Meta Description (minimum)
- Optional Data: Content text, word count, page type
- Screaming Frog: Primary recommendation for crawl data
- Sitebulb: Alternative SEO crawler
- Custom Scripts: Any tool outputting CSV with required columns
- Google Analytics: Historical page performance data (optional)
- CSV Files: UTF-8 encoded with proper headers
- URL Format: Complete URLs with protocol (https://)
- Clean Data: Removed duplicates and test pages
- Consistent Structure: Matching column formats
# Example data structure
old_site_data = {
'URL': 'https://old-site.com/page1',
'Title': 'Page Title',
'H1': 'Main Heading',
'Meta Description': 'Page description',
'Content': 'Full page content...'
}- URL Pattern Matching: Identify similar URL structures
- Content Similarity: Calculate text similarity scores
- Structural Analysis: Compare page elements and hierarchy
- Machine Learning: Use embeddings for semantic matching
- Primary Matches: High-confidence automatic mappings
- Secondary Options: Alternative mapping suggestions
- Orphaned Pages: Old pages without suitable matches
- New Pages: Target pages without source matches
- Confidence Thresholds: Filter by minimum confidence scores
- Manual Review: Interface for reviewing uncertain matches
- Batch Approval: Approve multiple mappings efficiently
- Custom Rules: Apply business logic to mapping decisions
# .htaccess format
Redirect 301 /old-page/ https://newsite.com/new-page/
Redirect 301 /another-old-page/ https://newsite.com/updated-page/- Mapping Summary: Complete list of URL mappings
- Confidence Analysis: Quality metrics for each mapping
- Orphaned Pages: Pages requiring manual attention
- Implementation Checklist: Step-by-step migration tasks
- Pre-migration Baseline: Traffic and ranking benchmarks
- Post-migration Tracking: Performance monitoring templates
- Error Detection: 404 and redirect chain identification
- Success Metrics: KPIs for measuring migration success
- Custom Model Training: Domain-specific similarity models
- Content Embeddings: Advanced semantic understanding
- User Behavior Integration: Historical traffic patterns
- Predictive Analytics: Forecast migration impact
- Scheduled Processing: Regular mapping updates
- API Integration: Connect with CMS and development tools
- Real-time Monitoring: Live migration status tracking
- Alert Systems: Notification of critical issues
- Multi-site Support: Handle complex multi-domain migrations
- Team Collaboration: Shared review and approval workflows
- Version Control: Track mapping changes over time
- Audit Trails: Complete migration history and decisions
- Comprehensive Crawling: Ensure complete site coverage
- Stakeholder Alignment: Get approval on mapping strategies
- Timeline Planning: Allow adequate time for review and testing
- Risk Assessment: Identify high-value pages requiring special attention
- Phased Rollout: Migrate in stages to minimize risk
- Testing Environment: Validate redirects before going live
- Monitoring Setup: Implement tracking before migration
- Rollback Plan: Prepare contingency procedures
- Performance Monitoring: Track traffic and ranking changes
- Error Resolution: Quickly address 404s and redirect issues
- Index Management: Submit updated sitemaps to search engines
- Stakeholder Reporting: Regular updates on migration success
- Redirect Chains: Multiple redirects reducing efficiency
- 404 Errors: Broken links after migration
- Indexation Problems: Search engines not finding new pages
- Performance Issues: Site speed impact from redirects
- Ranking Drops: Temporary or permanent ranking losses
- Traffic Decline: Organic traffic reduction
- Index Coverage: Pages not being indexed properly
- Link Equity Loss: PageRank distribution issues
- Comprehensive Testing: Thorough pre-launch validation
- Monitoring Systems: Real-time issue detection
- Quick Response: Rapid issue resolution processes
- Communication: Clear stakeholder updates and expectations
Detailed migration guides, troubleshooting information, and best practices are available in each tool's directory. For complex enterprise migrations or custom implementations, visit LeeFoot.com.
Lee Foot - eCommerce SEO Consultant
Part of the Search Solved Public SEO toolkit - Comprehensive website migration solutions for SEO preservation.