Fixing encoding issues with mb_convert_encoding()
One of our customers sent me a database dump that contained wrongly encoded characters, such as "Ã¤" and "Ã¼" where "ä" and "ü" should have been. Could I fix this on my own, or should I ask the customer for a properly encoded dump?
While searching Stack Overflow for advice, I came across a comment mentioning the PHP function mb_convert_encoding(). According to the documentation, the function converts a string from one character encoding to another. That sounded like exactly what I needed.
And indeed, it worked perfectly for my use case. I imported the original database dump into my database and ran a script to extract and "fix" the data I needed:
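<?php
// Connection settings are intentionally left blank; fill in your own.
$dsn = "";
$user = "";
$pass = "";
$pdo = new PDO($dsn, $user, $pass);

$sql = "SELECT name, description FROM category";
foreach ($pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC) as $row) {
    // Signature: mb_convert_encoding($string, $to_encoding, $from_encoding),
    // so both calls convert from UTF-8 to Windows-1252.
    $name = mb_convert_encoding($row['name'], 'Windows-1252', 'UTF-8');
    $description = mb_convert_encoding($row['description'], 'Windows-1252', 'UTF-8');
}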
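It may not be obvious why converting from UTF-8 to Windows-1252 repairs anything, so here is a small sketch that reproduces the damage in isolation and then reverses it. It assumes the dump was double-encoded (UTF-8 bytes misread as Windows-1252 and then encoded as UTF-8 again), which I have not verified beyond the fact that the conversion worked. The helper name repairDoubleEncoded() is made up for this example, and the validity check is only a heuristic for skipping values that were never broken.

```php
<?php
// Hypothetical helper (not part of the original script): tries to undo one
// round of double encoding, assuming the intended text is UTF-8.
function repairDoubleEncoded(string $value): string
{
    // Map each character back to its single Windows-1252 byte.
    $candidate = mb_convert_encoding($value, 'Windows-1252', 'UTF-8');

    // Heuristic: for double-encoded input, those bytes form valid UTF-8 again.
    // For text that was fine all along, they usually do not, so keep the input.
    return mb_check_encoding($candidate, 'UTF-8') ? $candidate : $value;
}

// "ä" as UTF-8 bytes, written with escapes so the file encoding does not matter.
$original = "\xC3\xA4";

// Simulate the damage: read the UTF-8 bytes as Windows-1252, then encode as UTF-8.
$broken = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

echo bin2hex($broken), PHP_EOL;                      // c383c2a4 (shows up as "Ã¤")
echo bin2hex(repairDoubleEncoded($broken)), PHP_EOL; // c3a4 (shows up as "ä")
```

Plain ASCII passes through unchanged, and a correctly encoded "ä" is left alone because its Windows-1252 form (the single byte E4) is not valid UTF-8. The check is not foolproof, so spot-check the results before writing anything back to the database.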