Firefox PHP

phpBB conversion

Posted by zenon 
phpBB conversion
August 12, 2007 02:09AM
I need to migrate a phpBB board with more than five years of active history and some 35.000 postings in its database. That´s not something you can throw away and start afresh; either it moves along or it doesn't move at all.

After two days at it, I have neither succeeded with the migration nor know for sure whether it can be done at all. So I figure I might as well publish what I've done so far, although it's incomplete and inconclusive, and see what others think of it and who can spot what I have missed.

So if you're looking for a phpbb2phorum migration HOWTO, this is not it. With some luck, it could be a beginning. With no luck, it's a dead end.
UPDATE: It's finished and it works.

____________________________________________________________________


Install phorum. Dump the phpBB database with --opt. Insert the dump in the same database as the newly installed phorum. The table prefix in my installation is "ph_"; the default is "phorum_".

phpBB's default "Anonymous" user has to go; he has user_id -1 and phorum uses an unsigned integer for user_id.
delete from phpbb_users where user_id = '-1';

If phorum autoincrements user_id and users have been deleted in the phpBB database, subsequent users will be assigned a different user_id, which in turn will cause postings to change author. We can't have that, so autoincrementation must be turned off before the migration and then back on again.
alter table ph_users change user_id user_id int(10) unsigned not null;

Nutty as this might be, phpBB permits duplicate usernames. Phorum does not. Thus, the phpBB users cannot be imported unless the dupes are renamed first. I do have some of these.
create temporary table dupes select user_id, username from phpbb_users group by username having ( count(username) > 1);

select username, user_id, concat(cast(username as char),'.',cast(user_id as char)) as newname from dupes;
Then, unless your sql is better than mine, fix them by hand, one by one:
update phpbb_users set username = 'user.123' where user_id = '123';
When done, go to your original phpBB database and make username unique so the bloody bots have to make a bit more of an effort to register.
alter table phpbb_users change username username varchar(25) not null unique;
Now get the users (oops, we have a formatting problem here; long code lines break right out through the containing table, you have to scroll horizontally out of the page).
insert into ph_users (user_id, username, password, email, active, posts, date_added, date_last_active, email_notify, pm_email_notify, tz_offset) select phpbb_users.user_id, phpbb_users.username, phpbb_users.user_password, phpbb_users.user_email, phpbb_users.user_active, phpbb_users.user_posts, phpbb_users.user_regdate, phpbb_users.user_lastvisit, phpbb_users.user_notify, phpbb_users.user_notify_pm, phpbb_users.user_timezone from phpbb_users order by phpbb_users.user_id asc;
And then put back auto incrementation on ph_users.user_id.
alter table ph_users change user_id user_id int(10) unsigned not null auto_increment;
Get the fora. Most settings have to be fixed by hand, but I only have a dozen fora, so some editing in the browser is easier than trying to figure phpbb->phorum equivalences and refine the sql.
insert into ph_forums (forum_id, name, description) select phpbb_forums.forum_id, 
phpbb_forums.forum_name, phpbb_forums.forum_desc from phpbb_forums order by 
phpbb_forums.forum_id asc;
ph_messages.message_id is autoincremented. There's a risk the phpBB and phorum message_ids get out of sync and that would screw everything. Turn off autoincrementation.
alter table ph_messages change message_id message_id int(10) unsigned not null;
Get the postings. This leaves out the body and mangles the subject and the author, but that's as far as my sql goes.
insert into ph_messages (message_id, forum_id, thread, author, subject, ip, 
modifystamp, user_id, datestamp) select phpbb_posts.post_id, phpbb_posts.forum_id, 
phpbb_posts.topic_id, phpbb_posts.poster_id, phpbb_posts.topic_id, phpbb_posts.poster_ip, 
phpbb_posts.post_edit_time, phpbb_posts.poster_id, phpbb_posts.post_time from phpbb_posts 
order by phpbb_posts.post_id asc;
I wasn't really sure the above worked as advertised, so I split it in chunks that are easier to check:
insert into ph_messages (message_id) select phpbb_posts.post_id from phpbb_posts order by phpbb_posts.post_id asc;

update ph_messages set forum_id = (select phpbb_posts.forum_id from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id); 

update ph_messages set thread = (select phpbb_posts.topic_id from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);

update ph_messages set author = (select phpbb_posts.poster_id from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);

update ph_messages set subject = (select phpbb_posts.topic_id from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);

update ph_messages set ip = (select phpbb_posts.poster_ip from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);

update ph_messages set datestamp = (select phpbb_posts.post_time from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);

update ph_messages set modifystamp = (select phpbb_posts.post_edit_time from phpbb_posts where phpbb_posts.post_id =ph_messages.message_id);

update ph_messages set user_id = (select phpbb_posts.poster_id from phpbb_posts where phpbb_posts.post_id = ph_messages.message_id);
Put back autoincrementation where it should be.
alter table ph_messages change message_id message_id int(10) unsigned not null auto_increment;
Fix the author.
update ph_messages set author = (select username from ph_users where user_id = ph_messages.author);
Fix the subject.
update ph_messages set subject = (select topic_title from phpbb_topics where topic_id = ph_messages.subject);
Add the e-mail address. Should it really be here?
UPDATE: It should not, except for anonymous users; see this other thread.
update ph_messages set email = (select email from ph_users where ph_messages.author = ph_users.username);
And now the IP addresses. This is a bloody mess. Most addresses are stored in hex, but some are stored in dotted decimal and some of the latter aren't even complete, e.g. "192.168.", with the last one or two octets missing.
#!/usr/bin/perl -w

use DBI;

$host = '127.0.0.1';
$db = 'myforum';
$user = 'dbuser';
$passwd = 'password';
$count = 0;
%done = ();

$dbh = DBI->connect ("DBI:mysql:database=$db:host=$host",$user,$passwd) || die "$DBI::errstr\n";

$hexips = $dbh->prepare("select ip from ph_messages");
$hexips->execute || die "$hexips->errstr\n";
while (@addr = $hexips->fetchrow_array()) {
   foreach $rawaddr (@addr) {
      $count++;
      if (defined ($done{"$rawaddr"})) {
         print "$count  skipping $rawaddr\n";
      } else {
         $done{$rawaddr} = $count;
         # This assumes that any ip field with a dot in it is a dotted 
         # decimal address. It is wrong to assume this, but I don't know 
         # what I'd do with a recipe for mushroom soup if it was found 
         # in the ip field, so I can just as well skip anyway.
         if ($rawaddr =~ m/\./) {
            print "$count  skipping $rawaddr\n";
            next;
         } else {
            $rawaddr =~ m/^(..)(..)(..)(..)$/;
            ($oct1, $oct2, $oct3, $oct4) = (hex($1), hex($2), hex($3), hex($4));
            $daddr = "$oct1.$oct2.$oct3.$oct4";
         }
      print "$count  updating $rawaddr\n";
      $dbh->do("update ph_messages set ip = '$daddr' where ip = '$rawaddr'");
      }
   }
}
$hexips->finish;
$dbh->disconnect();
Fix the message count (repeat for all fora).
update ph_forums set message_count = (select count(message_id) from ph_messages where forum_id = '3') where forum_id = '3';
Fix the thread count (and repeat).
update ph_forums set thread_count = (select count(*) from (select distinct thread from ph_messages where forum_id = '3') as count) where forum_id = '3';
Insert the bodies. Better do this late in the process; the bodies are big, so they make command line queries very hard to read.
update ph_messages set body = (select post_text from phpbb_posts_text where post_id = ph_messages.message_id);

There's a related discussion over at [www.phorum.org]

Z



Edited 4 time(s). Last edit at 10/13/2007 06:17AM by zenon.
Re: phpBB conversion
August 13, 2007 10:13PM
Here are a few more steps:

Activate the fora. phpBB uses phpbb_forums.forum_status 0 where phorum expects phorum_forums.active 1.
update ph_forums set active = '1';
Fix the per-thread thread count:
create temporary table tmp select distinct thread as mythread, (select count(message_id) 
from ph_messages where thread = mythread) as count from ph_messages where thread = thread;

update ph_messages set thread_count = (select count from tmp where mythread = ph_messages.thread);

drop temporary table tmp;
Following the advise I got at [www.phorum.org] , the thread IDs have to be changed. In phpBB, thread_id and message_id are completely unrelated, so we have existing threads with the same number as the message_id of thread starters in other threads.
create temporary table tmp select thread, message_id, message_id as newthread from ph_messages;

update tmp set newthread = (select message_id from ph_messages where 
thread = tmp.thread order by message_id limit 1);

update ph_messages set thread = (select newthread from tmp where ph_messages.message_id = tmp.message_id);
That last statement is extremely inefficient ("Query OK, 35056 rows affected (42 min 32.46 sec)" on a P4 2.6 GHz with 1,5 GB RAM). Someone with better sql skills than mine could perhaps do something about it. The same goes about the next one.

Set parent_id = thread and then dump the temporary table.
update ph_messages set parent_id = (select newthread from tmp where ph_messages.message_id = tmp.message_id);

drop table tmp;
And then set parent_id = 0 on the first posting in each thread.
update ph_messages set parent_id = '0' where thread = message_id;

UPDATE: Having done all this, run scripts/update_postcount.php, scripts/rebuild_search_table.php and scripts/rebuild_meta_data_mysql.php as explained in this posting. Also fix the last post dates with the script here after you add the missing quote noted here.



Edited 2 time(s). Last edit at 10/13/2007 07:36AM by zenon.
Re: phpBB conversion
August 16, 2007 02:39AM
One more installment, and by now the conversion is almost complete.

Fix modify times. phpBB uses 0 for "not edited", phorum uses modifytimestanp = datestamp.
update ph_messages set modifystamp = datestamp where modifystamp = '0';
ph_messages.email should contain the e-mail address of non-registered users and be empty for registered users. Since I earlier put all users' addresses here, I now have to delete those who are registered and leave the rest, if any.
update ph_messages set email = '' where exists (select * from ph_users where ph_users.username = ph_messages.author);
Import users' time zones. phpBB keeps them as hours.minutes while phorum keeps them as an integer in hours. Weird, since phorum actually offers time zone choices like "(GMT -3:30 hours) Newfoundland", only to then truncate the half hour and save "-3" in the database.
update ph_users set tz_offset = (SELECT substring(user_timezone from 1 for length(user_timezone)-3 ) 
as a from phpbb_users where phpbb_users.user_id = ph_users.user_id);
phpBB can't adjust for DST, phorum can. Most countries use it, so we arbitrarily set is_dst = 1 for everyone and let the few users who shouldn't have it remove it themselves.
UPDATE: Don't do this; see Maurice's message next in this thread.
update ph_users set is_dst = '1';
Get their signatures.
update ph_users set signature = (select user_sig from phpbb_users where ph_users.user_id = phpbb_users.user_id);
This may vary from one board to another, but on the board that I am currently converting, 99% of all users who have a sig at all, are spammers. For legacy boards with thousands of registered users, this is a very good place to search for deletables.

Get notification settings.
update ph_users set email_notify = (select user_notify from phpbb_users where ph_users.user_id = phpbb_users.user_id);

update ph_users set pm_email_notify = (select user_notify_pm from phpbb_users where ph_users.user_id = phpbb_users.user_id);
Hide e-mail addresses by default. Those who enjoy getting spammed can change it back themselves.
update ph_users set hide_email = '1';

alter table ph_users alter column hide_email set default '1';
Import banned literal e-mail addresses.
insert into ph_banlists (string,type,pcre) select ban_email,'3','0' from phpbb_banlist where ban_email != '';
Insert banned user_ids as e-mail addresses.
create temporary table tmp (userid int, mail varchar(128));

insert into tmp (userid) select ban_userid from phpbb_banlist where ban_userid !='';

update tmp set mail = (select email from ph_users where ph_users.user_id = tmp.userid);

insert into ph_banlists (string, type) select mail, '3' from tmp;

drop table tmp;
Because of the prevalence of dynamic IP addresses, banning by IP is usually a very bad idea. I am not importing those.

Next installment: private messages.



Edited 1 time(s). Last edit at 10/07/2007 04:41AM by zenon.
Re: phpBB conversion
August 16, 2007 08:40AM
Quote

phpBB can't adjust for DST, phorum can. Most countries use it, so we arbitrarily set is_dst = 1 for everyone and let the few users who shouldn't have it remove it themselves.

is_dst does not tell you if a user is in a place where dst is used. It tells that Phorum should apply DST to the time. This only is used for users, who configured a specific timezone (because they are not in the same timezone as the Phorum server and need the timezone tweaking to get the times right). Since there's no good way to work with DST in PHP (future is bright though, but nothing we can already use), Phorum reverted to letting the users switch on and off DST by hand.


Maurice Makaay
Phorum Development Team
my blog linkedin profile secret sauce
Re: phpBB conversion
October 07, 2007 05:00AM
Long overdue last installment: private messages.

To begin with, import the list of private messages:
insert into ph_pm_messages (pm_message_id, from_user_id, subject,datestamp) select 
phpbb_privmsgs.privmsgs_id, phpbb_privmsgs.privmsgs_from_userid, 
phpbb_privmsgs.privmsgs_subject, phpbb_privmsgs.privmsgs_date from  phpbb_privmsgs 
order by phpbb_privmsgs.privmsgs_id asc;

Fetch the usernames that correspond to the user_ids:
update ph_pm_messages set from_username = (select username from ph_users 
where ph_users.user_id = ph_pm_messages.from_user_id);

Get the message bodies:
update ph_pm_messages set message = (select privmsgs_text from phpbb_privmsgs_text 
where ph_pm_messages.pm_message_id = phpbb_privmsgs_text.privmsgs_text_id);

Start building the reference table by putting sent messages in the users' outbox:
insert into ph_pm_xref (user_id, pm_message_id) select from_user_id, pm_message_id 
from ph_pm_messages;

update ph_pm_xref set special_folder = 'outbox';

update ph_pm_xref set read_flag = 1;

Now repeat to create the receiving users' inboxes:
insert into ph_pm_xref (user_id, pm_message_id) select privmsgs_to_userid, privmsgs_id 
from phpbb_privmsgs;

update ph_pm_xref set special_folder = 'inbox' where special_folder is null;

This probably completes the phpBB2phorum conversion. The scripts mentioned in this other thread have to be run too. rebuild_meta_data_mysql.php does not create metadata for private messages (phorum team: RFE!), but converted PMs are accessible to the users anyway.
Re: phpBB conversion
October 14, 2007 10:25PM
Cosmetics:

The quotes didn't come out too well. phpBB puts a hexadecimal code after the BBcode "quote" and then the username that's being quoted within quotes, while phorum just uses "quote username" inside the square brackets. sed fixed this:
$ mysqldump -p --opt mydatabase > database.sql.
$ sed 's/\([[\/]*quote\):[a-f0-9]*/\1 /g' database.sql |sed 's/\([quote \)=\\"\([^\\]*\)\\"\]/\1\2\]/g' |\
           sed 's/[\/quote \]/[\/quote\]/g' |sed 's/[quote \]/[quote\]/g' >dbfixed.sql
$ mysqladmin -p drop mydatabase
$ mysqladmin -p create mydatabase
$ mysql -p mydatabase < dbfixed.sql
Z



Edited 1 time(s). Last edit at 10/14/2007 10:26PM by zenon.
Re: phpBB conversion
November 29, 2011 04:37AM
Awesome, thank you very much for posting this!!!

Going to try on Phorum 5.2 and PHP 5.3.3 from a phpBB 3.0 install

___________
F4 Systems



Edited 1 time(s). Last edit at 11/29/2011 05:04AM by Skye.
Re: phpBB conversion
November 29, 2011 05:05AM
Related script for conversions....

[www.phorum.org]

___________
F4 Systems
Sorry, only registered users may post in this forum.

Click here to login