Skip to content

Latest commit

Β 

History

History
Β 
Β 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

README.md

Website Migration Tools

Comprehensive suite of tools for planning, executing, and monitoring website migrations. These applications help ensure SEO equity preservation and smooth transitions during website redesigns, platform changes, and domain migrations.

Tools Overview

πŸš€ Automatic Website Migration Tool

Advanced mapping system that automatically matches pages between old and new website versions using multiple similarity algorithms and machine learning.

Available Versions

  • Streamlit Web App: User-friendly interface for non-technical users
  • Python Script: Command-line tool for automated processing
  • Google Colab: Cloud-based notebook for interactive analysis

Key Features

  • Multi-Algorithm Matching: URL, title, H1, and content-based matching
  • Machine Learning: Advanced similarity scoring using embeddings
  • Bulk Processing: Handle large site migrations efficiently
  • Quality Assurance: Confidence scoring and manual review options
  • Export Formats: Multiple output formats for different implementation needs

Migration Methodologies

Content Matching Algorithms

  1. Exact URL Matching: Direct path-to-path mapping
  2. Title Similarity: Match pages based on title tag similarity
  3. H1 Heading Analysis: Use heading structure for page matching
  4. Content Fingerprinting: Full content similarity analysis
  5. Semantic Matching: AI-powered content understanding

Quality Assurance Features

  • Confidence Scoring: Numerical confidence for each mapping
  • Multiple Suggestions: Alternative mapping options
  • Manual Review Interface: Review and approve mappings
  • Validation Rules: Custom criteria for acceptable matches
  • Error Detection: Identify potential mapping issues

Implementation Support

  • Redirect Generation: Automatic .htaccess or nginx redirect rules
  • Sitemap Updates: New XML sitemap generation
  • 301 Redirect Lists: Ready-to-implement redirect instructions
  • Testing Protocols: Validation and testing procedures

Use Cases

πŸ”„ Platform Migrations

  • CMS Changes: WordPress to Shopify, Drupal to custom builds
  • Technology Upgrades: Legacy systems to modern frameworks
  • eCommerce Platforms: Magento to WooCommerce, etc.
  • Static to Dynamic: HTML sites to CMS-powered sites

🌐 Domain Migrations

  • Brand Changes: New domain for rebranding
  • Geographic Expansion: Country-specific domains
  • TLD Changes: .com to .co or other extensions
  • Subdomain Restructuring: www to non-www, etc.

🎨 Site Redesigns

  • URL Structure Changes: New information architecture
  • Content Reorganization: Page consolidation or splitting
  • Navigation Updates: Menu and linking structure changes
  • Content Management: Page merging and content updates

πŸ“Š SEO Preservation

  • Link Equity Protection: Maintain ranking power
  • Traffic Continuity: Minimize organic traffic loss
  • Index Preservation: Ensure proper page indexation
  • Performance Monitoring: Track migration success metrics

Quick Start

Streamlit Web Application

cd streamlit-source
pip install -r requirements.txt
streamlit run website-migration.py

Python Script

cd python-script
pip install -r requirements.txt
python migration-mapper.py --old-site old_crawl.csv --new-site new_crawl.csv

Google Colab

# Upload the Colab notebook to Google Drive
# Open: Automatic_Website_Migration_V5_by_LeeFoot_07_12_23.ipynb
# Follow the step-by-step instructions in the notebook

Input Requirements

Site Crawl Data

  • Source Site: Complete crawl of the old website
  • Target Site: Complete crawl of the new website
  • Required Columns: URL, Title, H1, Meta Description (minimum)
  • Optional Data: Content text, word count, page type

Crawl Tools Supported

  • Screaming Frog: Primary recommendation for crawl data
  • Sitebulb: Alternative SEO crawler
  • Custom Scripts: Any tool outputting CSV with required columns
  • Google Analytics: Historical page performance data (optional)

Data Format Requirements

  • CSV Files: UTF-8 encoded with proper headers
  • URL Format: Complete URLs with protocol (https://)
  • Clean Data: Removed duplicates and test pages
  • Consistent Structure: Matching column formats

Processing Workflow

1. Data Preparation

# Example data structure
old_site_data = {
    'URL': 'https://old-site.com/page1',
    'Title': 'Page Title',
    'H1': 'Main Heading',
    'Meta Description': 'Page description',
    'Content': 'Full page content...'
}

2. Similarity Analysis

  • URL Pattern Matching: Identify similar URL structures
  • Content Similarity: Calculate text similarity scores
  • Structural Analysis: Compare page elements and hierarchy
  • Machine Learning: Use embeddings for semantic matching

3. Mapping Generation

  • Primary Matches: High-confidence automatic mappings
  • Secondary Options: Alternative mapping suggestions
  • Orphaned Pages: Old pages without suitable matches
  • New Pages: Target pages without source matches

4. Quality Review

  • Confidence Thresholds: Filter by minimum confidence scores
  • Manual Review: Interface for reviewing uncertain matches
  • Batch Approval: Approve multiple mappings efficiently
  • Custom Rules: Apply business logic to mapping decisions

Output Formats

Redirect Files

# .htaccess format
Redirect 301 /old-page/ https://newsite.com/new-page/
Redirect 301 /another-old-page/ https://newsite.com/updated-page/

Implementation Reports

  • Mapping Summary: Complete list of URL mappings
  • Confidence Analysis: Quality metrics for each mapping
  • Orphaned Pages: Pages requiring manual attention
  • Implementation Checklist: Step-by-step migration tasks

Monitoring Tools

  • Pre-migration Baseline: Traffic and ranking benchmarks
  • Post-migration Tracking: Performance monitoring templates
  • Error Detection: 404 and redirect chain identification
  • Success Metrics: KPIs for measuring migration success

Advanced Features

Machine Learning Enhancements

  • Custom Model Training: Domain-specific similarity models
  • Content Embeddings: Advanced semantic understanding
  • User Behavior Integration: Historical traffic patterns
  • Predictive Analytics: Forecast migration impact

Automation Capabilities

  • Scheduled Processing: Regular mapping updates
  • API Integration: Connect with CMS and development tools
  • Real-time Monitoring: Live migration status tracking
  • Alert Systems: Notification of critical issues

Enterprise Features

  • Multi-site Support: Handle complex multi-domain migrations
  • Team Collaboration: Shared review and approval workflows
  • Version Control: Track mapping changes over time
  • Audit Trails: Complete migration history and decisions

Migration Best Practices

🎯 Planning Phase

  1. Comprehensive Crawling: Ensure complete site coverage
  2. Stakeholder Alignment: Get approval on mapping strategies
  3. Timeline Planning: Allow adequate time for review and testing
  4. Risk Assessment: Identify high-value pages requiring special attention

πŸ”§ Implementation Phase

  1. Phased Rollout: Migrate in stages to minimize risk
  2. Testing Environment: Validate redirects before going live
  3. Monitoring Setup: Implement tracking before migration
  4. Rollback Plan: Prepare contingency procedures

πŸ“Š Post-Migration Phase

  1. Performance Monitoring: Track traffic and ranking changes
  2. Error Resolution: Quickly address 404s and redirect issues
  3. Index Management: Submit updated sitemaps to search engines
  4. Stakeholder Reporting: Regular updates on migration success

Common Migration Challenges

Technical Issues

  • Redirect Chains: Multiple redirects reducing efficiency
  • 404 Errors: Broken links after migration
  • Indexation Problems: Search engines not finding new pages
  • Performance Issues: Site speed impact from redirects

SEO Concerns

  • Ranking Drops: Temporary or permanent ranking losses
  • Traffic Decline: Organic traffic reduction
  • Index Coverage: Pages not being indexed properly
  • Link Equity Loss: PageRank distribution issues

Solutions & Mitigation

  • Comprehensive Testing: Thorough pre-launch validation
  • Monitoring Systems: Real-time issue detection
  • Quick Response: Rapid issue resolution processes
  • Communication: Clear stakeholder updates and expectations

Support & Documentation

Detailed migration guides, troubleshooting information, and best practices are available in each tool's directory. For complex enterprise migrations or custom implementations, visit LeeFoot.com.

Author

Lee Foot - eCommerce SEO Consultant

Website LinkedIn Bluesky

Part of the Search Solved Public SEO toolkit - Comprehensive website migration solutions for SEO preservation.