# MxChat Migration Tool

A WordPress plugin add-on that enables seamless migration of knowledge base entries and actions when switching between embedding models or vector databases in MxChat.

## Features

- **Knowledge Base Migration**: Migrate all knowledge base entries between the WordPress database and Pinecone
- **Actions Migration**: Re-generate embeddings for all actions/intents with new embedding models
- **Smart Threshold Adjustment**: Automatically adjusts action similarity thresholds when switching embedding models
- **Multiple Embedding Models**: Support for OpenAI (ada-002, 3-small, 3-large), Voyage AI, and Google Gemini
- **Batch Processing**: Process migrations in batches to avoid timeouts and API rate limits
- **Real-time Progress Tracking**: Monitor migration progress with live updates in modal dialogs
- **Migration History**: View past migrations and their status
- **Granular Settings Control**: Choose which settings to update (embedding model, database, or both)
- **Safe Settings Handling**: After migration, choose whether to apply the new configuration to your MxChat settings or leave them unchanged (the data migration itself cannot be undone)
- **Pro License Required**: Requires active MxChat Pro license

## Installation

1. Upload the `mxchat-migration-tool` folder to `/wp-content/plugins/`
2. Activate the plugin through the 'Plugins' menu in WordPress
3. Navigate to **MxChat > Migration Tool** in the WordPress admin menu

## Requirements

- MxChat Basic plugin (activated)
- Active MxChat Pro license
- WordPress 5.0 or higher
- PHP 7.4 or higher
- Valid API keys for your chosen embedding model

## How It Works

### Migration Flow

1. **Configure Source**: The plugin automatically detects your current database (WordPress or Pinecone) and embedding model
2. **Configure Target**: Select your target database, embedding model, and enter necessary credentials
3. **Start Migration**: The plugin processes entries in batches:
   - Fetches content from source database
   - Generates new embeddings using target model
   - Stores in target database with metadata preserved
4. **Update Settings**: Optionally update your MxChat settings to use the new configuration
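
The batch loop in step 3 can be sketched as follows. This is a minimal Python illustration of the fetch, re-embed, and store cycle, not the plugin's own PHP code. The `embed` and `store` callables and the entry shape are hypothetical stand-ins for the target embedding API and the target database.

```python
from typing import Callable, Iterator

Entry = dict  # assumed shape: {"id": ..., "content": ..., "metadata": {...}}


def batches(items: list[Entry], size: int) -> Iterator[list[Entry]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def migrate(
    entries: list[Entry],
    embed: Callable[[str], list[float]],          # hypothetical target-model call
    store: Callable[[Entry, list[float]], None],  # hypothetical target-database write
    batch_size: int = 10,
) -> tuple[int, int]:
    """Re-embed and store every entry; return (processed, failed) counts."""
    processed = failed = 0
    for batch in batches(entries, batch_size):
        for entry in batch:
            try:
                vector = embed(entry["content"])
                store(entry, vector)  # metadata travels with the entry
                processed += 1
            except Exception:
                failed += 1  # one bad item does not abort the rest of the batch
    return processed, failed
```

Counting failures per item, rather than aborting, mirrors the `processed_items` and `failed_items` columns in the migration log.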

### Supported Migration Paths

#### Database Migration
- **WordPress → Pinecone**: Migrate from local WordPress database to Pinecone vector database
- **Pinecone → WordPress**: Migrate from Pinecone back to WordPress database
- **WordPress → WordPress**: Re-generate embeddings with new model (same database)
- **Pinecone → Pinecone**: Re-generate embeddings with new model (same database)

#### Embedding Models

**OpenAI**
- text-embedding-ada-002 (1536 dimensions)
- text-embedding-3-small (1536 dimensions)
- text-embedding-3-large (3072 dimensions)

**Voyage AI**
- voyage-2 (1024 dimensions)
- voyage-large-2 (1536 dimensions)
- voyage-3-large (2048 dimensions)

**Google Gemini**
- gemini-embedding-001 (1536 dimensions)
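
Because a Pinecone index is created with a fixed vector dimension, it can help to check the target model against the index before starting. The sketch below simply encodes the dimensions listed above in a lookup table. The `EMBEDDING_DIMENSIONS` name and the `index_is_compatible` helper are illustrative and not part of the plugin.

```python
# Dimensions copied from the list above; the table itself is illustrative.
EMBEDDING_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "voyage-2": 1024,
    "voyage-large-2": 1536,
    "voyage-3-large": 2048,
    "gemini-embedding-001": 1536,
}


def index_is_compatible(model: str, index_dimension: int) -> bool:
    """Return True if vectors from `model` fit an index of `index_dimension`."""
    if model not in EMBEDDING_DIMENSIONS:
        raise ValueError(f"Unknown embedding model: {model}")
    return EMBEDDING_DIMENSIONS[model] == index_dimension
```

For example, `index_is_compatible("text-embedding-3-large", 1536)` is false, so moving from `text-embedding-3-small` to `text-embedding-3-large` on Pinecone would need an index created with 3072 dimensions.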

## Usage Guide

### Step 1: Review Current Configuration

The admin page displays your current setup:
- Current vector database (WordPress or Pinecone)
- Current embedding model
- Number of knowledge base entries
- Number of actions/intents

### Step 2: Configure Migration

1. **What to Migrate**: Choose from:
   - Knowledge Base + Actions (recommended)
   - Knowledge Base Only
   - Actions Only

2. **Target Database**:
   - Select WordPress Database or Pinecone Vector Database
   - If Pinecone is selected, enter:
     - Pinecone API Key
     - Pinecone Host (e.g., `your-index-abc123.svc.pinecone.io` - with or without `https://`)
     - Pinecone Namespace (optional)

3. **Target Embedding Model**:
   - Select from available models
   - The appropriate API key field will appear

4. **API Keys**:
   - Enter the API key for your selected embedding provider
   - Keys from main MxChat settings are pre-filled

5. **Batch Size**:
   - Set between 1 and 50 items per batch
   - Lower values are more reliable but slower
   - Recommended: 10 for most cases
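
The two input rules above (a host accepted with or without `https://`, and a batch size between 1 and 50) can be sketched as small validation helpers. This is an illustration only: the function names are hypothetical, and normalizing to a bare hostname is an assumption, since the form the plugin stores internally is not documented here.

```python
def normalize_pinecone_host(host: str) -> str:
    """Strip an optional scheme and trailing slash (assumed canonical form)."""
    host = host.strip()
    for scheme in ("https://", "http://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
            break
    return host.rstrip("/")


def clamp_batch_size(value: int, low: int = 1, high: int = 50) -> int:
    """Force the batch size into the documented 1-50 range."""
    return max(low, min(high, value))
```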

### Step 3: Start Migration

1. Click **Start Migration**
2. Confirm the migration (this cannot be undone)
3. The target database will be **automatically cleaned** (except for WordPress → WordPress migrations)
4. Monitor real-time progress:
   - Progress bar shows overall completion
   - Log displays detailed processing information
   - Current stage (knowledge or actions) is shown

**Important**: Cross-database migrations (e.g., Pinecone → WordPress or WordPress → Pinecone) will **clear all existing data** in the target database before migrating. This ensures a clean slate with no duplicates or mixed data.
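
As a pre-flight check, the cleanup rule from step 3 can be written as a one-line predicate. This sketch follows the wording of step 3 literally (everything except WordPress → WordPress is cleared); the function name is hypothetical, and the behavior for Pinecone → Pinecone should be confirmed against your plugin version before relying on it.

```python
def target_is_cleared(source: str, target: str) -> bool:
    """Return True if the target is wiped before migration.

    Assumption: only WordPress -> WordPress is exempt, per step 3 above.
    """
    return not (source == "wordpress" and target == "wordpress")
```

If this returns true for your planned migration and the target already holds data you need, back it up first.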

### Step 4: Finalize

When migration completes:
1. Review the log for any errors
2. Choose whether to update MxChat settings:
   - **Yes**: Your main plugin will use the new configuration
   - **No**: Settings remain unchanged (useful for testing)

## Best Practices

### Before Migration

- **Back Up Your Database**: Always back up your database before major migrations
- **Test API Keys**: Verify all API keys work before starting
- **Check Quotas**: Ensure you have sufficient API quota for your content volume
- **Review Content**: Clean up unnecessary content to reduce migration time

### During Migration

- **Don't Close the Browser**: Keep the admin page open during migration
- **Monitor Progress**: Watch for error messages in the log
- **Be Patient**: Large migrations may take time (API rate limits apply)

### After Migration

- **Test the Chatbot**: Verify responses are still accurate
- **Check Knowledge Base**: Ensure all content was migrated
- **Test Actions**: Verify all intents still trigger correctly
- **Review Migration History**: Check the history table for details

## Troubleshooting

### Migration Fails to Start

- **Check API Keys**: Ensure keys are valid and have sufficient quota
- **Check Pinecone Config**: Verify host URL and API key are correct
- **Check Permissions**: Ensure you have admin privileges

### Migration Stalls or Times Out

- **Reduce Batch Size**: Lower the batch size to 5 or fewer
- **Check API Limits**: You may be hitting rate limits
- **Check Server Timeout**: Increase PHP `max_execution_time` if possible
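
If you script embedding calls yourself while diagnosing rate limits, exponential backoff is a common way to handle them. The sketch below shows the general technique only. How the plugin itself retries is not documented here, and `RateLimitError` is a hypothetical exception standing in for whatever your API client raises on HTTP 429.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error raised by a client when the API returns HTTP 429."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on rate limits, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits of 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the helper can be exercised without real delays.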

### Some Items Fail

- **Review Error Log**: Check the migration log for specific errors
- **API Quota**: You may have exceeded your API quota
- **Content Issues**: Some content may be too large or contain invalid characters

### Embeddings Don't Match

- **Different Models**: Different embedding models produce different vectors (this is expected)
- **Similarity Thresholds**: You may need to adjust action similarity thresholds after migration
- **Model Dimensions**: Ensure your target database supports the embedding dimensions
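
To see why thresholds and dimensions matter, consider how an action match is typically scored. The sketch below assumes cosine similarity against a per-action threshold, which is a common approach but is not confirmed here as MxChat's exact scoring method; the function names are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        # Vectors from models with different dimensions cannot be compared.
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / norm


def action_triggers(query: list[float], action: list[float], threshold: float) -> bool:
    """Return True when the query is similar enough to the action's embedding."""
    return cosine_similarity(query, action) >= threshold
```

Different models distribute similarity scores differently, so a threshold tuned for one model may be too strict or too loose for another even when the content is unchanged.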

## Database Schema

The plugin creates one table (shown here with WordPress's default `wp_` table prefix):

### wp_mxchat_migration_logs

Stores migration history and status:

| Column | Type | Description |
|--------|------|-------------|
| id | BIGINT(20) | Primary key |
| migration_id | VARCHAR(64) | Unique migration identifier |
| migration_type | VARCHAR(50) | Type: all, knowledge, actions |
| source_database | VARCHAR(50) | Source: wordpress or pinecone |
| target_database | VARCHAR(50) | Target: wordpress or pinecone |
| source_model | VARCHAR(100) | Original embedding model |
| target_model | VARCHAR(100) | New embedding model |
| total_items | INT(11) | Total items to migrate |
| processed_items | INT(11) | Successfully processed |
| failed_items | INT(11) | Failed items |
| status | VARCHAR(20) | pending, in_progress, completed, cancelled, failed |
| started_at | DATETIME | Migration start time |
| completed_at | DATETIME | Migration completion time |
| error_log | LONGTEXT | Error messages |
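
For scripts that read the history table, a row can be modeled as a small record type. The field names below mirror the columns above; the `MigrationLog` class and the `progress_percent` helper are illustrative, and treating failed items as "handled" for progress purposes is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class MigrationLog:
    id: int
    migration_id: str
    migration_type: str   # "all", "knowledge", or "actions"
    source_database: str  # "wordpress" or "pinecone"
    target_database: str  # "wordpress" or "pinecone"
    source_model: str
    target_model: str
    total_items: int
    processed_items: int = 0
    failed_items: int = 0
    status: str = "pending"
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    error_log: str = ""

    def progress_percent(self) -> float:
        """Share of items handled so far (processed plus failed), 0 to 100."""
        if self.total_items <= 0:
            return 0.0
        handled = self.processed_items + self.failed_items
        return min(100.0, 100.0 * handled / self.total_items)
```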

## API Reference

### AJAX Endpoints

- `mxchat_start_migration` - Initialize migration
- `mxchat_process_migration_batch` - Process next batch
- `mxchat_get_migration_status` - Get current status
- `mxchat_cancel_migration` - Cancel ongoing migration
- `mxchat_finalize_migration` - Finalize and update settings
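
The endpoints are meant to be called in sequence: start, process batches until done, then finalize. The sketch below shows that ordering with an injected `post` function, so no real HTTP client is assumed. WordPress AJAX actions are normally sent to `admin-ajax.php` with an `action` parameter; the request and response field names used here (`migration_id`, `done`) are assumptions, not the plugin's documented contract.

```python
from typing import Callable

# post(action, params) -> decoded JSON response; supplied by the caller.
Transport = Callable[[str, dict], dict]


def run_migration(post: Transport, config: dict, max_batches: int = 10_000) -> dict:
    """Drive the endpoints in order and return the finalize response."""
    started = post("mxchat_start_migration", config)
    migration_id = started["migration_id"]  # assumed response field
    for _ in range(max_batches):
        batch = post("mxchat_process_migration_batch", {"migration_id": migration_id})
        if batch.get("done"):  # assumed completion flag
            break
    else:
        # Guard against looping forever if the server never reports completion.
        post("mxchat_cancel_migration", {"migration_id": migration_id})
        raise RuntimeError("Migration did not finish within max_batches")
    return post("mxchat_finalize_migration", {"migration_id": migration_id})
```

A real transport would also need to send the nonce and an authenticated admin session with every request, since all endpoints are nonce-verified and admin-only.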

## Performance Considerations

### API Costs

Embedding generation incurs API costs:
- **OpenAI**: ~$0.0001 per 1K tokens (varies by model; check OpenAI's current pricing)
- **Voyage AI**: Varies by model
- **Gemini**: Check Google's pricing

**Example**: Migrating 1,000 knowledge base entries averaging 500 tokens each = 500K tokens, or ~$0.05 at $0.0001 per 1K tokens
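
The same arithmetic can be reused for your own content volume. This is a rough estimator only: the `estimate_cost` name is illustrative, and the price is an input you should take from your provider's current price list.

```python
def estimate_cost(entries: int, avg_tokens: int, price_per_1k_tokens: float) -> float:
    """Estimate embedding cost in dollars for a one-pass migration."""
    total_tokens = entries * avg_tokens
    return total_tokens / 1000 * price_per_1k_tokens


# 1,000 entries x 500 tokens at $0.0001 per 1K tokens
print(f"${estimate_cost(1000, 500, 0.0001):.2f}")
```

Actions are re-embedded as well, so include them in the entry count when migrating both.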

### Time Estimates

Migration time depends on:
- Number of items
- Batch size
- API rate limits
- Network speed

**Example**:
- 100 items at batch size 10 = ~2-3 minutes
- 1,000 items at batch size 10 = ~20-30 minutes
- 10,000 items at batch size 10 = ~3-5 hours
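
These figures scale linearly with the number of batches. Dividing the first example (100 items, 10 batches, 2-3 minutes) gives roughly 12-18 seconds per batch; that range is derived from the estimates above rather than measured, and the helper below is only a sketch built on it.

```python
import math


def estimate_minutes(
    items: int,
    batch_size: int = 10,
    seconds_per_batch: tuple[float, float] = (12.0, 18.0),  # derived, not measured
) -> tuple[float, float]:
    """Return a (low, high) estimate of migration time in minutes."""
    batch_count = math.ceil(items / batch_size)
    low, high = seconds_per_batch
    return batch_count * low / 60, batch_count * high / 60
```

For 10,000 items this gives 200 to 300 minutes, which matches the 3-5 hour estimate above.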

## Security

- API keys are sent with AJAX requests; serve the WordPress admin over HTTPS so they are encrypted in transit
- Nonce verification on all requests
- Admin-only access (`manage_options` capability)
- Keys are not stored permanently by the migration tool
- Sanitization and validation on all inputs

## Support

For issues or questions:
1. Check the migration history table for error details
2. Review the migration log for specific errors
3. Ensure all dependencies are up to date
4. Contact MxChat support with your migration ID for assistance

## Changelog

### Version 1.2.0
- **CRITICAL FIX**: Automatic target database cleanup before migration
- Cross-database migrations now start with a clean slate (no duplicates)
- WordPress → WordPress migrations preserve existing IDs (UPDATE behavior)
- Added cleanup confirmation message in migration log
- Prevents data mixing when migrating to an already-populated database

### Version 1.1.1
- Improved Pinecone host input handling
- Users can now enter host with or without `https://` protocol
- Automatic normalization ensures compatibility with core plugin
- Updated UI placeholders and help text for clarity

### Version 1.1.0
- Professional modal system (replaced browser alerts)
- Smart threshold adjustment with empirical model matrix
- Granular settings control (separate model/database updates)
- Pro license integration
- Automatic update checker
- Fixed Pinecone database toggle issues

### Version 1.0.0
- Initial release
- Knowledge base migration support
- Actions migration support
- Multi-provider embedding support
- Real-time progress tracking
- Migration history logging

## License

This plugin is licensed under the same terms as the MxChat Basic plugin.

## Credits

Developed for MxChat - AI Chatbot for WordPress
