We've recently been working on some PHP coding to generate JSON output drawn from a variety of MySQL database tables and hit the following error: json_encode(): Invalid UTF-8 sequence in argument.
Solving this was, thankfully, relatively quick and easy...
It's all in the character... encodings
This issue appeared to be caused by a character, or set of characters, that weren't being recognised as UTF-8 compliant upon being retrieved from the database.
We're still not entirely sure why this is the case as we explicitly declared the following:
- VARCHAR and TEXT field types set to utf8_unicode_ci collation
- Table collation set to utf8_unicode_ci
- Database collation set to utf8_unicode_ci
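The failure can be reproduced without a database. The sketch below is illustrative only: it assumes the offending value arrived as a single Latin-1 byte (the actual characters were never identified), that the mbstring extension is available, and that a current PHP version is in use, where json_encode() signals the problem by returning false and setting JSON_ERROR_UTF8.

```php
<?php
// Hypothetical value: "café" with the é stored as the single Latin-1 byte 0xE9.
$latin1 = "caf\xE9";

// A lone 0xE9 byte is not a valid UTF-8 sequence.
$isValidUtf8 = mb_check_encoding($latin1, 'UTF-8');

// json_encode() refuses the string and records JSON_ERROR_UTF8.
$encoded   = json_encode(array('name' => $latin1));
$errorCode = json_last_error();

// The same text as genuine UTF-8 (0xC3 0xA9 for é) encodes without complaint.
$utf8        = "caf\xC3\xA9";
$encodedUtf8 = json_encode(array('name' => $utf8));
```

If bytes like these reach PHP despite UTF-8 tables, the mismatch is somewhere between the stored data and the script, which is where the connection character set comes in.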
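To overcome the JSON error we set the character set of the connection itself, using the mysql_set_charset() function from the PHP mysql extension. The table and column settings above govern how the data is stored, whereas the character set of the client connection is configured separately. The call needs to be made immediately after the MySQL connection has been established: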
$conn = mysql_connect(DB_HOST, DB_USER, DB_PASS);
mysql_set_charset('utf8');
Be aware, though, that the mysql extension, and this function along with it, was officially deprecated as of PHP 5.5.0 and completely removed in PHP 7.0.0. It is suggested that you use either the PDO_MySQL extension or the MySQLi extension instead.
With the MySQLi extension we could accomplish this like so:
$conn = mysqli_connect(DB_HOST, DB_USER, DB_PASS, DB_NAME);
mysqli_set_charset($conn, "utf8");
Or, if you prefer to use the PHP Data Objects interface instead:
// For PHP 5.3.6+
$pdo = new PDO("mysql:host=hostname;dbname=database;charset=utf8", "user", "password");

// For PHP versions before 5.3.6
$pdo = new PDO("mysql:host=hostname;dbname=database", "user", "password");
$pdo->exec("set names utf8");
Whichever database extension you find yourself using, you should find the problem resolved and your JSON generated as expected.
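One way to confirm the fix is to make the encoding step fail loudly instead of quietly returning false. This is a minimal sketch, assuming PHP 7.3 or later for JSON_THROW_ON_ERROR; encodeRows() is a hypothetical helper name, and the sample rows stand in for real query results.

```php
<?php
// Hypothetical helper: encode rows, throwing instead of silently returning false.
function encodeRows(array $rows)
{
    return json_encode($rows, JSON_THROW_ON_ERROR);
}

$goodRows = array(array('id' => 1, 'name' => "caf\xC3\xA9")); // valid UTF-8
$badRows  = array(array('id' => 2, 'name' => "caf\xE9"));     // stray Latin-1 byte

$json = encodeRows($goodRows);

try {
    encodeRows($badRows);
    $caught = null;
} catch (JsonException $e) {
    // The exception code mirrors json_last_error(), here JSON_ERROR_UTF8.
    $caught = $e->getCode();
}
```

With the connection character set in place, real rows should take the first path; anything that still lands in the catch block points to data that was stored incorrectly in the first place.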