Fixing Garbled Characters (Mojibake) After a SQL Import
Your imported data shows Ã© instead of é, or â€™ instead of an apostrophe. The double-encoding problem explained, with a real fix.
After importing, your text is full of Ã©, â€™, and Ã¼ where accented letters and quotes should be. This is mojibake: a character-set mismatch somewhere between export and import, not corrupted data. Your bytes are usually fine; they're just being interpreted with the wrong charset.
How it happens
The classic case: UTF-8 data passed through a connection that was declared as latin1, so MySQL treated each UTF-8 byte as a separate latin1 character. When those characters are then converted to UTF-8 for storage, every original byte becomes its own multi-byte character, and you get double-encoded garbage. The data is still present; it has just been run through the wrong decoder.
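The mechanism can be reproduced outside MySQL. The sketch below is only an illustration, not part of any MySQL tooling: it assumes the standard iconv, od, and grep utilities are installed, and the helper names misread_as_latin1, to_hex, and count_suspect_lines are made up for this example. It uses CP1252 because MySQL's latin1 is in fact cp1252.

```shell
# Hypothetical helper: treat stdin as cp1252 (MySQL's "latin1") and
# re-encode it as UTF-8, mimicking a mis-declared connection.
misread_as_latin1() {
  iconv -f CP1252 -t UTF-8
}

# Hypothetical helper: print stdin as one lowercase hex string.
to_hex() {
  od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: count lines in a file that contain the bytes
# C3 83 C2, a common (not conclusive) signature of double-encoded
# lowercase accented letters.
count_suspect_lines() {
  LC_ALL=C grep -c -F "$(printf '\303\203\302')" "$1"
}

# "é" is the UTF-8 byte pair C3 A9, written in octal so this script
# does not depend on the encoding of the file it is saved in.
printf '\303\251' | to_hex; echo                      # c3a9
printf '\303\251' | misread_as_latin1 | to_hex; echo  # c383c2a9, the bytes of "Ã©"
```

Running count_suspect_lines against a dump file gives a rough idea of how many lines are affected. Treat it as a heuristic: legitimate text can occasionally contain the same byte sequence, and double-encoded punctuation such as â€™ does not match it.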
Prevention: set the charset on both ends
# Export
mysqldump --default-character-set=utf8mb4 db > dump.sql
# Import
mysql --default-character-set=utf8mb4 db < dump.sql
Make sure the dump itself declares the right charset near the top: look for SET NAMES utf8mb4 or /*!40101 SET NAMES utf8mb4 */.
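Checking the declaration by eye is easy to forget, so it can be scripted. The following is a minimal sketch; dump_charset is a hypothetical helper name, and the 100-line window is an arbitrary assumption about how close to the top the statement appears.

```shell
# Hypothetical helper: print the charset named by the first SET NAMES
# statement within the first 100 lines of a dump file (prints nothing
# if there is none).
dump_charset() {
  head -n 100 "$1" | sed -n 's/.*SET NAMES \([A-Za-z0-9_]*\).*/\1/p' | head -n 1
}

# Example (assumes a file named dump.sql exists):
#   dump_charset dump.sql
```

If this prints latin1, or nothing at all, the import relies entirely on the --default-character-set flag passed to mysql.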
Fixing data that's already mojibake
If the data is already imported and garbled, the reliable repair is a round-trip: dump it with the connection charset set to latin1, so the dump file ends up containing the original UTF-8 bytes, then re-import declaring utf8mb4.
# Dump the garbled DB with the connection charset set to latin1
mysqldump --default-character-set=latin1 --skip-set-charset \
  broken_db > recover.sql
# Rewrite any leftover charset declaration to utf8mb4
# (--skip-set-charset normally omits SET NAMES, so this is a safety net;
# -i as written is GNU sed, BSD/macOS sed needs -i '')
sed -i 's/SET NAMES latin1/SET NAMES utf8mb4/' recover.sql
# Re-import as utf8mb4
mysql --default-character-set=utf8mb4 fixed_db < recover.sql
Charset repairs can make things worse if the data is triple-encoded or mixed. Always work on a copy and verify a few rows before committing to the fixed database.
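To see why the round-trip works, and to sanity-check the recovered file before importing it, something like the sketch below may help. It assumes iconv and od are available; undo_double_encoding and is_valid_utf8 are hypothetical helper names, not MySQL tools.

```shell
# Hypothetical helper: reverse one layer of double-encoding by
# converting UTF-8 text back to cp1252 (MySQL's "latin1") bytes.
undo_double_encoding() {
  iconv -f UTF-8 -t CP1252
}

# Hypothetical helper: succeed only if the file is well-formed UTF-8.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# C3 83 C2 A9 is double-encoded "é"; one reversal yields C3 A9 again.
printf '\303\203\302\251' | undo_double_encoding | od -An -tx1
```

If is_valid_utf8 recover.sql fails, the database probably held a mix of correctly encoded and double-encoded rows, and importing the file as utf8mb4 would damage the rows that were fine. Binary columns dumped without --hex-blob can also trip the check, so a failure is a prompt to investigate rather than a verdict.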
Check what you actually have
SHOW VARIABLES LIKE 'character_set%';
SHOW CREATE TABLE your_table; -- check the table's declared charset
Aim for utf8mb4 everywhere: client, connection, results, database, and table. Plain utf8 in MySQL is an alias for utf8mb3, which stores at most 3 bytes per character and so can't hold emoji or any other character outside the Basic Multilingual Plane. That limit is its own source of corruption.
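The 3-byte limit is easy to make concrete by counting UTF-8 bytes. This is a small illustrative sketch; byte_len is a hypothetical helper, and the characters are written as octal escapes so the script does not depend on its own file encoding.

```shell
# Hypothetical helper: number of bytes in its argument.
byte_len() {
  printf '%s' "$1" | wc -c | tr -d ' '
}

byte_len "$(printf '\303\251')"          # e with acute accent (U+00E9): 2
byte_len "$(printf '\342\202\254')"      # euro sign (U+20AC): 3
byte_len "$(printf '\360\237\230\200')"  # grinning face emoji (U+1F600): 4
```

Anything that reports 4 needs utf8mb4; the first two fit in either charset.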
Editing a large dump's charset?
Split it first so your SET NAMES find-and-replace runs on smaller, faster files.
Frequently Asked Questions
Why does my imported data show Ã© instead of é?
It's mojibake — a character-set mismatch. UTF-8 bytes were interpreted as latin1 somewhere between export and import, causing double-encoding. Set --default-character-set=utf8mb4 on both mysqldump and mysql to prevent it.
How do I fix already-garbled characters in MySQL?
Do a charset round-trip: dump the broken database with --default-character-set=latin1 --skip-set-charset so the dump contains the original UTF-8 bytes, change any remaining SET NAMES latin1 to utf8mb4 in the dump, then re-import as utf8mb4. Always work on a backup copy.
Should I use utf8 or utf8mb4 in MySQL?
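Always utf8mb4. MySQL's 'utf8' is a legacy encoding (an alias for utf8mb3) limited to 3 bytes per character, so it can't store emoji or any other character outside the Basic Multilingual Plane. utf8mb4 is true UTF-8 with up to 4 bytes per character and avoids a whole class of corruption.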