DB Character set change in phorum 5.2.8
Posted by chris
DB Character set change in phorum 5.2.8 September 01, 2008 08:15AM |
Registered: 16 years ago Posts: 38 |
Hello,
Having upgraded a Phorum from 5.1 to 5.2.8, I noticed a small change in the mysqli code that seems to be a problem for me.
Please correct me if I'm wrong.
In 5.1 a single 'set names utf8' was issued (if that was the configured charset; it was for me, I always use utf8).
In 5.2 there is an additional statement, "set character set utf8", which ruins multilingual content for me unless I comment out this command!
I tested it some more; here is how MySQL behaves.
Language: SQL
mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql> SET CHARACTER SET utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
As you can see, issuing "set character set utf8" changes character_set_connection back to latin1. This means that all characters outside latin1 will be saved as question marks (????).
According to the MySQL docs:
Quote
SET CHARACTER SET is similar to SET NAMES but sets character_set_connection and collation_connection to character_set_database and collation_database.
Obviously the default db collation and charset in this case is latin1. Now I'm not a MySQL guru, so if this is resolved through some Phorum option, I'm sorry, do let me know. But as far as I can see I need to comment out this line.
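The effect of the two statements on the session variables can be sketched as a small model. This is only an illustration of the documented behaviour quoted above, not Phorum's or MySQL's actual code: the function names are made up for this example, the starting values of the session are assumed to be latin1, and nothing here talks to a real server.

```python
# Toy model of the session variables shown in the console output above.
# All function names are hypothetical; no database connection is made.

def set_names(session, charset):
    """SET NAMES x: client, connection and results all become x."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_connection"] = charset
    updated["character_set_results"] = charset
    return updated


def set_character_set(session, charset):
    """SET CHARACTER SET x: client and results become x, but the
    connection charset is reset to character_set_database."""
    updated = dict(session)
    updated["character_set_client"] = charset
    updated["character_set_results"] = charset
    updated["character_set_connection"] = session["character_set_database"]
    return updated


def names_then_character_set(database_charset, charset="utf8"):
    """Replay the order used in the console session above."""
    session = {
        # Assumed starting point: a server that defaults to latin1.
        "character_set_client": "latin1",
        "character_set_connection": "latin1",
        "character_set_results": "latin1",
        "character_set_database": database_charset,
    }
    session = set_names(session, charset)
    return set_character_set(session, charset)


if __name__ == "__main__":
    for db_charset in ("latin1", "utf8"):
        result = names_then_character_set(db_charset)
        print(db_charset, "->", result["character_set_connection"])
```

With a latin1 database the connection charset ends up as latin1 even though 'set names utf8' ran first; with a utf8 database the second statement changes nothing.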
Chris
Re: DB Character set change in phorum 5.2.8 September 02, 2008 05:32AM |
Admin Registered: 21 years ago Posts: 9,240 |
Re: DB Character set change in phorum 5.2.8 September 02, 2008 08:04AM |
Registered: 16 years ago Posts: 38 |
Hello Thomas,
In 5.2, $PHORUM['DBCONFIG']['charset'] is set to 'utf8'.
In 5.1 there was no such option, I think, but Phorum always used a 'set names utf8' command (great) :)
And of course $PHORUM["DATA"]["CHARSET"] = utf-8 as well (in both Phorum versions).
It seems to me that the "set character set utf8" command issued by Phorum in the latest version would
not work as expected in a db running with latin1 as its default encoding. Then again, I find it strange that I'm
the only one with this problem (?).
Let me illustrate with another example. Consider a test table exactly like phorum_messages but with fewer fields,
just for testing this out:
Language: SQL
mysql> CREATE TABLE `phorum_messages_test` (
         `message_id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
         `body` mediumtext NOT NULL,
         PRIMARY KEY (`message_id`)
       ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.03 sec)

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO phorum_messages_test (body) VALUES ('non-latin1: ελληνικά');
Query OK, 1 row affected (0.02 sec)

mysql> SET CHARACTER SET utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO phorum_messages_test (body) VALUES ('non-latin1: ελληνικά');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM phorum_messages_test;
+------------+----------------------+
| message_id | body                 |
+------------+----------------------+
|          1 | non-latin1: ελληνικά |
|          2 | non-latin1: ???????? |
+------------+----------------------+
2 rows in set (0.00 sec)
This is exactly what happens in Phorum 5.2.8 on my system with the added "set character set utf8" command. Older posts are in proper utf8; newer ones are not saved properly.
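Where the question marks in row 2 come from can be shown without a database. A minimal sketch, assuming Python's "latin-1" codec as a stand-in for MySQL's latin1 (the two are not byte-for-byte identical, but neither contains Greek letters); the helper name is made up for this example:

```python
import unicodedata


def to_latin1_lossy(text):
    """Convert text to latin1, replacing every character that latin1
    cannot represent with '?', as a lossy charset conversion would."""
    # Normalise first so an accented letter is one code point, not two.
    normalised = unicodedata.normalize("NFC", text)
    return normalised.encode("latin-1", errors="replace").decode("latin-1")


if __name__ == "__main__":
    print(to_latin1_lossy("non-latin1: ελληνικά"))
```

The ASCII prefix survives and each Greek letter becomes one question mark, which matches the second row of the test table. Once stored that way the original text cannot be recovered from the row.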
Edited 1 time(s). Last edit at 09/02/2008 08:07AM by chris.
Re: DB Character set change in phorum 5.2.8 September 02, 2008 09:02AM |
Admin Registered: 21 years ago Posts: 9,240 |
Re: DB Character set change in phorum 5.2.8 September 02, 2008 09:58AM |
Registered: 16 years ago Posts: 38 |
Thomas,
There is nothing wrong with my data as saved in the db, I assure you. All characters are displayed normally both on the web and on the console. If there were any problem I would be spending my time correcting all my data, and it's a lot of data. :)
Obviously if there was no 'set names utf8' before, I added it; I don't remember. Maybe I added it for a good reason, like you said you added proper utf8 support in 5.2. At worst it was simply redundant. :)
But! Let's totally forget about the old version, shall we?
You made me try a brand new installation of Phorum 5.2.8.
Same box, same MySQL server, completely new database for a fresh installation.
config.php:
$PHORUM['DBCONFIG']['charset'] => 'utf8'
english.php:
$PHORUM["DATA"]['CHARSET']="UTF-8";
Installed OK, made a test post with non-latin1 characters -> data saved wrong! (question marks).
Commented out the line:
// mysqli_query( $conn,"SET CHARACTER SET {$PHORUM['DBCONFIG']['charset']}");
Made 2nd post -> Data saved correctly in utf8 and displayed properly.
See attached screenshot.
Chris
Edited 2 time(s). Last edit at 09/02/2008 10:03AM by chris.
Re: DB Character set change in phorum 5.2.8 September 02, 2008 10:06AM |
Admin Registered: 19 years ago Posts: 8,532 |
What is the character set for your database?
Edit: you suggested above that it was latin1. I agree that there might be a problem with that setup, based on the piece of documentation that you pasted in your first message.
There were probably also good reasons to include both SET CHARACTER SET and SET NAMES in the connection code, so I don't really like the idea of simply throwing that away. To the other devs: do you remember the specifics?
Maurice Makaay
Phorum Development Team
Edited 1 time(s). Last edit at 09/02/2008 10:26AM by Maurice Makaay.
Re: DB Character set change in phorum 5.2.8 September 02, 2008 10:26AM |
Registered: 16 years ago Posts: 38 |
Re: DB Character set change in phorum 5.2.8 September 02, 2008 11:00AM |
Registered: 16 years ago Posts: 38 |
The character set combinations in MySQL can get quite tricky, but they're very versatile. Personally I don't see any reason for using the character set command, and it's the first time I've seen this problem. Assuming a full utf8 environment (db and client), 'set names' ensures all the relevant vars are utf8, and that's all I would ever use.
Cheers
Re: DB Character set change in phorum 5.2.8 September 02, 2008 12:48PM |
Admin Registered: 21 years ago Posts: 9,240 |
Re: DB Character set change in phorum 5.2.8 September 02, 2008 01:42PM |
Registered: 16 years ago Posts: 38 |
Thomas,
Yes, naturally that resolves the problem. The only question is what if someone on a shared host (yikes) can't control this?
Language: SQL
mysql> ALTER DATABASE phorum52 DEFAULT CHARACTER SET utf8;
Query OK, 1 row affected (0.00 sec)

mysql> SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SET CHARACTER SET utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
Also in this case, set character set utf8 behaves exactly like set names utf8. So my question is: why try to change the
MySQL character set vars twice?
Then another question is what happens to existing installations with various different settings if set character set were to be removed.
Questions, questions. :)
Chris
Re: DB Character set change in phorum 5.2.8 September 02, 2008 01:45PM |
Admin Registered: 21 years ago Posts: 9,240 |
as you need to create the database, you can surely change the character set of the database.
if you can't, then you can get in contact with your host.
most are running utf8 by default now anyway.
we are not going to remove the "set character set". the more that is defined for the connection, the less can go wrong.
Thomas Seifert
Re: DB Character set change in phorum 5.2.8 September 02, 2008 02:01PM |
Registered: 16 years ago Posts: 38 |
I stand by the opinion that set names is the proper way to go, however.
Set character set breaks this if the db is in a different charset, and you can't rely on it anyway. Apart from needing the permissions to alter the db default, there may be other apps/tables coexisting that may be affected. And I see no good reason to depend on it.
set names = works in all cases flawlessly.
set character set = works but depends on the default db charset.
Your call :)
Cheers
Edited 1 time(s). Last edit at 09/02/2008 02:08PM by chris.