Building a one-way CMS sync workflow that doesn't bite you later

Most CMS staging problems I’ve seen aren’t really technical failures — they’re workflow failures. Someone makes a change on staging, someone else edits content on production, and nobody is quite sure which database is authoritative anymore. By the time you need to push a major update, the environments have drifted enough that testing on staging means almost nothing.

This is the problem I sat down to fix. What follows is how I designed the workflow, what I got wrong, and what the leftover URL mess taught me about CMS databases.

The core design decision: one direction only

The first thing I had to settle was ownership: which environment is the source of truth?

For most small teams running a CMS, production is where content lives and where staff actually work. Staging only exists for risky changes — plugin updates, theme overhauls, custom development. That asymmetry matters. If you try to sync in both directions, or worse, treat staging as a parallel workspace, you end up with merge conflicts in a database, which is not something you want to debug under pressure.

So I landed on a simple rule: production flows to staging, never the other way. Staging gets a fresh copy of production before any major work begins. Code changes developed on staging get promoted by deploying the code, not by syncing the database back.

This also forced a useful conceptual split:

Code (theme files, plugins, custom scripts) belongs in version control.
Content (the database and uploads/) is treated as data, not as something you commit.

That means I explicitly decided not to version-control the full CMS install. The repo holds the custom code layer; the database and media are managed operationally. I’ve seen enough full-install repos with gigabyte upload folders to know it’s worth stating.
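In practice the split can be enforced with an allowlist-style .gitignore. The following is a sketch only: it assumes the repository root is the WordPress root with the stock wp-content/ layout, and my-theme and my-plugin are placeholder names for the custom code.

```shell
#!/bin/bash
# Sketch: write a .gitignore that keeps only the custom code layer in the repo.
# "my-theme" and "my-plugin" are hypothetical names, not taken from the real setup.
write_gitignore() {
  local repo_root=$1
  cat > "$repo_root/.gitignore" <<'EOF'
# Ignore everything by default...
/*
# ...then re-include only the custom code layer.
!/.gitignore
!/wp-content/
/wp-content/*
!/wp-content/themes/
/wp-content/themes/*
!/wp-content/themes/my-theme/
!/wp-content/plugins/
/wp-content/plugins/*
!/wp-content/plugins/my-plugin/
EOF
}

# Usage (commented out so sourcing the sketch has no side effects):
# write_gitignore /var/www/httpdocs
```

Because the default is to ignore, wp-config.php, core files, and wp-content/uploads/ stay out of the repo without being listed one by one.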

The sync workflow itself

The production environment runs under httpdocs/. Staging runs at staging.<domain> as a separate vhost. Both share the same server, which made the sync straightforward: no remote transfers, just local operations.

The refresh sequence:

Export the production database using WP-CLI’s wp db export.
Back up the current staging database before touching it.
Import the production dump into the staging database.
Run search-replace to rewrite all production URLs to staging URLs.

I chose CLI tools over raw SQL scripts for a specific reason: they read database credentials directly from wp-config.php, so I never have to hardcode a password in a script. That matters when you’re leaving a cron job running unattended — a hardcoded credential in a script is one file-permission mistake away from exposure.

The URL rewriting step is where most guides I found were too casual. A naive sed on the SQL file can corrupt serialized PHP data, which is how WordPress stores a lot of its option values. The right tool is the CLI’s built-in search-replace, which handles serialized data correctly and lets you skip GUID columns that should never be rewritten:

wp search-replace 'https://www.example.com' 'https://staging.example.com' \
  --all-tables \
  --skip-columns=guid \
  --path=/var/www/staging/httpdocs
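To make the serialization problem concrete, here is a minimal sketch of what a plain sed does to a single serialized string. The value is made up for illustration. PHP stores the byte length next to the string, and sed changes the string without touching the length:

```shell
#!/bin/bash
# A PHP-serialized string records its byte length: s:<length>:"<value>";
serialized='s:23:"https://www.example.com";'

# Naive rewrite, the way a sed pass over a SQL dump would do it.
rewritten=$(printf '%s' "$serialized" \
  | sed 's#https://www.example.com#https://staging.example.com#')

# Extract the declared length (between "s:" and the next colon).
declared=${rewritten#s:}
declared=${declared%%:*}

# Extract the value between the double quotes.
value=${rewritten#*\"}
value=${value%\"*}

echo "rewritten:       $rewritten"
echo "declared length: $declared"
echo "actual length:   ${#value}"
```

The declared length stays at 23 while the new value is 27 bytes long, and PHP's unserialize() rejects that mismatch. The search-replace command above sidesteps it by unserializing each value, replacing inside it, and serializing it again.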

After that, I added explicit updates for the two options that control where WordPress thinks it lives:

wp option update home 'https://staging.example.com' --path=/var/www/staging/httpdocs
wp option update siteurl 'https://staging.example.com' --path=/var/www/staging/httpdocs

For automation, I wrapped this in a bash script scheduled via cron to run roughly every two weeks — not a strict schedule, but often enough that staging never gets too stale before someone needs it. The script runs as the vhost system user. I added a flock-based lock file to prevent overlapping runs and a simple log so there’s an audit trail if something goes wrong at 3am.

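#!/bin/bash
# Run as vhost user. Reads DB credentials from wp-config.php.
# Logs to /tmp by default; point LOGFILE at /var/log/ only if the vhost user can write there.

# Stop at the first failed step so a broken export is never imported or logged as a success.
set -euo pipefail
trap 'echo "$(date): sync FAILED at line $LINENO" >> "$LOGFILE"' ERR
# Dumps contain the full database; keep new files readable by this user only.
umask 077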

LOCKFILE=/tmp/staging-sync.lock
LOGFILE=/tmp/staging-sync.log
STAGING_PATH=/var/www/staging/httpdocs
PROD_PATH=/var/www/httpdocs
DUMP=/tmp/prod-db-$(date +%Y%m%d%H%M%S).sql

exec 200>"$LOCKFILE"
flock -n 200 || { echo "$(date): sync already running" >> "$LOGFILE"; exit 1; }

echo "$(date): starting prod → staging sync" >> "$LOGFILE"

# Export production
wp db export "$DUMP" --path="$PROD_PATH" >> "$LOGFILE" 2>&1

# Back up staging
wp db export "/tmp/staging-backup-$(date +%Y%m%d%H%M%S).sql" --path="$STAGING_PATH" >> "$LOGFILE" 2>&1

# Import production into staging
wp db import "$DUMP" --path="$STAGING_PATH" >> "$LOGFILE" 2>&1

# Rewrite URLs
wp search-replace 'https://www.example.com' 'https://staging.example.com' \
  --all-tables --skip-columns=guid --path="$STAGING_PATH" >> "$LOGFILE" 2>&1

wp option update home 'https://staging.example.com' --path="$STAGING_PATH" >> "$LOGFILE" 2>&1
wp option update siteurl 'https://staging.example.com' --path="$STAGING_PATH" >> "$LOGFILE" 2>&1

# Cleanup
rm -f "$DUMP"
echo "$(date): sync complete" >> "$LOGFILE"

Every refresh overwrites the staging database completely. Anything someone changed directly in staging — test orders, draft content, plugin configuration — is gone. That’s by design, but it needs to be documented clearly so nobody is surprised.

The leftover URL problem I didn’t fully anticipate

After the first few syncs confirmed the site loaded correctly on staging, I thought I was done. Then I started poking at the wp_options table directly and ran a query to find any remaining references to the production domain:

SELECT option_name, option_value
FROM wp_options
WHERE option_value LIKE '%www.example.com%'
LIMIT 50;

There were a lot of them. Not because search-replace had failed — it had correctly rewritten what it could — but because a modern WordPress database is dense with plugin-owned data that doesn’t behave like regular content.

What I found fell into roughly three categories. The first was cache, transients, and logs: options like _transient_*, performance plugin caches, page-builder output caches — all of them had production URLs baked into cached HTML. The right move is not to rewrite them but to delete and regenerate them, since they’re derived data anyway.
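Deleting that derived data can be folded into the sync itself. A minimal sketch, assuming it runs after the URL rewrite; purge_derived_data is a hypothetical helper name, and plugin-specific page or builder caches usually need the plugin's own purge mechanism, which is not covered here:

```shell
#!/bin/bash
# Sketch: drop derived data on staging instead of rewriting it.
# purge_derived_data is a hypothetical helper, not part of the original script.
purge_derived_data() {
  local wp_path=$1
  # Remove every transient, expired or not, so it is rebuilt with staging URLs.
  wp transient delete --all --path="$wp_path" || return 1
  # Flush the object cache; this mostly matters when a persistent cache is configured.
  wp cache flush --path="$wp_path"
}

# Usage (commented out so sourcing the sketch has no side effects):
# purge_derived_data /var/www/staging/httpdocs
```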

The second category was the one that gave me pause: environment-specific configuration. SMTP credentials pointing at the production mail relay. Payment gateway API keys and webhook URLs registered against the live domain. CDN origin settings. These should not be blindly rewritten: a staging environment that can trigger live payment webhooks or send real email through the production relay is a genuine operational risk, not just a cosmetic one. Practically, that means setting WP_ENVIRONMENT_TYPE to 'staging' in wp-config.php on staging, using test-mode credentials for any payment gateway, and confirming that any plugin-registered webhook URLs point somewhere inert. Each one needs a manual review rather than an automated pass; I added a short checklist to the runbook for this, because I know I’ll forget otherwise.
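One item on that checklist can be turned into a guard. The sketch below tags the staging install once and then refuses to continue unless the target path reports itself as staging. mark_as_staging and assert_staging are hypothetical helper names, and the check assumes the constant lives in wp-config.php, which a database import does not overwrite.

```shell
#!/bin/bash
# Sketch: tag staging via wp-config.php and verify the tag before destructive steps.
# mark_as_staging and assert_staging are hypothetical helpers.

# One-time setup: define WP_ENVIRONMENT_TYPE as a constant in staging's wp-config.php.
mark_as_staging() {
  local wp_path=$1
  wp config set WP_ENVIRONMENT_TYPE staging --type=constant --path="$wp_path"
}

# Succeeds only if the install at wp_path declares itself as staging.
assert_staging() {
  local wp_path=$1 env_type
  env_type=$(wp config get WP_ENVIRONMENT_TYPE --path="$wp_path") || return 1
  [ "$env_type" = "staging" ]
}

# Usage inside a sync script (commented out here):
# assert_staging "$STAGING_PATH" || { echo "target is not staging, aborting"; exit 1; }
```

Running the check before the import step also protects against a mistyped path pointing the import at production.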

The third category was SEO and structural plugin settings: the canonical domain, sitemap URLs, and social metadata all referencing production. A search-replace gets most of it, but some of these options are nested in serialized arrays that need a second pass.
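The three buckets lend themselves to a quick triage pass over the option names returned by the query above. This is a rough sketch: triage_options is a hypothetical helper, and the name patterns are illustrative guesses, since every plugin picks its own option names.

```shell
#!/bin/bash
# Sketch: sort option names (one per line on stdin) into the three buckets.
# The glob patterns are illustrative guesses; adjust them to the plugins in use.
triage_options() {
  local name
  while IFS= read -r name; do
    case "$name" in
      _transient_*|_site_transient_*|*cache*)
        printf 'regenerate\t%s\n' "$name" ;;  # derived data: delete, do not rewrite
      *smtp*|*mail*|*payment*|*gateway*|*webhook*|*cdn*)
        printf 'review\t%s\n' "$name" ;;      # environment-specific: check by hand
      *)
        printf 'rewrite\t%s\n' "$name" ;;     # candidate for another search-replace pass
    esac
  done
}

# Example with made-up option names:
printf '%s\n' _transient_feed_abc my_smtp_settings seo_canonical_url | triage_options
```

The first matching pattern wins, so anything that looks like cache is never offered for rewriting, and anything that looks like credentials is flagged for a human before the catch-all bucket is considered.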

The practical conclusion: search-replace is necessary but not sufficient. After every sync, someone should do a quick audit — at minimum, check SMTP, payment settings, and any plugin that phones home or registers a webhook.

What I’d tell someone starting this from scratch

The workflow isn’t complicated, but it took longer to get right than I expected, mostly because CMS databases are messier than they look from the outside.

The two things I’d emphasize are the ones that aren’t obvious from the tooling. First, treat search-replace as one step in a post-import checklist rather than the whole solution: the URL rewrite handles content and most settings cleanly, but environment-specific credentials need deliberate attention every time — and the payment gateway case is the one where getting it wrong has real consequences. Second, document that staging gets clobbered, prominently and early. Anyone who knows the workflow will understand it, but anyone who doesn’t will eventually store something important in the staging database and lose it; the runbook header is the right place for that warning, not buried in a README section nobody reads.

On this site’s database, the whole setup takes about four minutes to run and gives us a staging environment that’s a reliable mirror of production. That’s really all I wanted.
