November 4, 2008:
Splitting WordPress Export / Import file
As part of testing the pre-release WordPress v2.7, I set up temporary free web hosting, installed the WordPress v2.7 beta there, and then wanted to move my GBMINI.net WordPress database over to it.
Unfortunately, the MySQL database backup file is about 5MB, and I couldn’t figure out a way to get that uploaded to the free hosting via phpMyAdmin … so then I tried WordPress’s built-in Export function. Of course, that file was also huge (about 8MB), and WordPress can’t Import more than 1MB.
Searching the web, it seems you might be able to edit various PHP / .htaccess settings – but that didn’t seem to help on my free hosting either!
Manually editing the WordPress export file to split it into sections of less than 1MB is quite tricky – there are various “header” lines that must start each file, and you need to split on “whole entries”, so you can’t arbitrarily cut the file into pieces!
So finally, I quickly wrote a WordPress Splitter program. It’s written in Visual Basic 6 (because I have it handy), so it isn’t especially fast, but it can split my 8MB WordPress export file into 8 files of less than 1MB each in about 20s, much faster than doing it by hand! I was then able to upload each file successfully, so I finally had the website content moved over to the free hosting in WordPress v2.7. And I’m quite impressed how fast the free hosting actually runs!
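For anyone curious how the two rules above (repeat the header in every file, never cut an entry in half) fit together, here is a minimal Python sketch. It is not the WPsplitter code, which is VB6; the function name `split_export` and the `max_bytes` parameter are made up for illustration, and it assumes each entry in the export sits between plain `<item>` and `</item>` tags.

```python
import re

# Assumption: every entry is wrapped in <item> ... </item> with no attributes.
ITEM_RE = re.compile(r"<item>.*?</item>\s*", re.DOTALL)


def split_export(text, max_bytes=1_000_000):
    """Split an export string into chunks of at most max_bytes each.

    Every chunk repeats the header (everything before the first entry)
    and the footer (everything after the last entry), and entries are
    only ever moved whole.
    """
    items = list(ITEM_RE.finditer(text))
    if not items:
        return [text]

    header = text[:items[0].start()]
    footer = text[items[-1].end():]
    overhead = len((header + footer).encode("utf-8"))

    chunks, current, size = [], [], overhead
    for match in items:
        item = match.group(0)
        item_size = len(item.encode("utf-8"))
        if overhead + item_size > max_bytes:
            # A single entry that cannot fit in any chunk.
            raise ValueError("one entry is larger than max_bytes")
        if current and size + item_size > max_bytes:
            # Current chunk is full: close it and start a new one.
            chunks.append(header + "".join(current) + footer)
            current, size = [], overhead
        current.append(item)
        size += item_size
    chunks.append(header + "".join(current) + footer)
    return chunks
```

Each returned string could then be written to its own numbered .xml file and imported one at a time. Sizes are measured on the UTF-8 encoded text, so non-Latin posts are counted by their real upload size.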
If you need to split a WordPress xml export file, try my WPsplitter program – but remember you might also need to download the VB6 runtime if you’ve never before run a VB6 program.
UPDATE:
In an attempt (as yet unconfirmed) to fix the occasional failures people have getting WPsplitter to run, I created this alternate install, which adds a couple of VB6 support files that some computers are perhaps missing.
[Be careful using Export / Import, it doesn't get everything correct - I noticed that "sub-categories" weren't transferred under their parent category; maybe there are other things not transferred correctly, too]
RB (2008/11/05 @ 10:48 am)
So here’s a thought…. You have a Mac now and I’m sure it has iWeb why not give it a try so you can make it work in the real world and so I can make Twisty even better so you can stop bitch’n about it to me……. he, he, he! ;-]
BTW your badges go out tomorrow or even later today.
db (2008/11/05 @ 12:06 pm)
I just changed domains of one of my sites and did exactly what you needed to do.
I exported from mySQL to my desktop. From there, I ran an SQL query in mySQL on the new domain, using the SQL file on my desktop to run it.
I did trim the db down a bit and only exported the core WP tables. Even doing that I still had an sql file as large as yours.
BTW, if you ever wanted to do this on your Mac, google MAMP. Even though the Mac has Apache, this makes it much much simpler. I was able to do a complete test on the site before actually bringing it live.
GBMINI (2008/11/05 @ 3:10 pm)
Thanks for that, DB!
I know that mySQL limitations could have been worked round – but I also Googled and found other folks who had got stuck using the WordPress Export / Import because of file size, so I put the program together for them too
GBMINI (2008/11/19 @ 7:00 pm)
NOTE:
The WPsplitter program can’t handle “excessively large” weblogs!
Someone who had tried it contacted me because it had failed for them – they sent their xml file, and it turned out that one weblog entry, with its thousands of SPAM comments, was 8MB of data!
My WPsplitter program splits into less than 1MB sub-files, so expects any one weblog entry to be less than 1MB!
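If you suspect one bloated entry is the problem, a quick check like the following can list the offenders before you try to split. This is only a sketch, separate from WPsplitter: the name `oversized_entries` and its `limit_bytes` parameter are invented here, and it assumes entries are wrapped in plain `<item>` tags with the post title in the first `<title>` inside each one.

```python
import re

# Assumptions: entries are <item> ... </item>, and the first <title>
# inside an entry is the post title.
ITEM_RE = re.compile(r"<item>.*?</item>", re.DOTALL)
TITLE_RE = re.compile(r"<title>(.*?)</title>", re.DOTALL)


def oversized_entries(text, limit_bytes=1_000_000):
    """Return (title, size_in_bytes) for entries above limit_bytes, largest first."""
    found = []
    for match in ITEM_RE.finditer(text):
        item = match.group(0)
        size = len(item.encode("utf-8"))
        if size > limit_bytes:
            title = TITLE_RE.search(item)
            found.append((title.group(1) if title else "(untitled)", size))
    return sorted(found, key=lambda pair: pair[1], reverse=True)
```

Any entry it reports is a candidate for a spam-comment clean-up on the original site before exporting again.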
Davide (2009/01/01 @ 10:20 am)
thank you for this little but very useful application
i used it for my blog and it worked without any problem
Prabhakar Kasi (2009/01/29 @ 7:15 pm)
Dude, your utility is awesome, but it would be great if the file size were configurable. My xml file is 16MB, and uploading 18 pieces is really bugging me.
GBMINI (2009/01/30 @ 9:30 am)
I made a change which allows you to specify the maximum file size (with a little effort).
Baris Unver (2009/02/10 @ 9:11 am)
Dude, it’s awesome! You should send it to Matt Mullenweg and let him release this tool as an official tool of WordPress.
Dan (2009/03/13 @ 7:33 am)
Thank you ! Thank you!
Saved me a lot of work.
Voytec (2009/03/25 @ 3:24 pm)
Fantastic tool, just saved me a lot of pain. Thanks!
Deepak (2009/03/28 @ 4:12 am)
Thanks a lot for saving me man… this thing was bugging me since the morning on 2 of my blogs… you are a life saver bro… keep up the good work… many thanks for sharing this…
JB (2009/04/17 @ 9:05 am)
Just wanted to add a thank you on here. Downloaded your program and it worked very smoothly splitting my 15MB XML file. I was able to upload the files and also track down one post that was 2.83MB large full of spam comments.
Pradeep (2009/05/09 @ 11:39 pm)
thanks for your handy program. bless you.
Frank (2009/09/07 @ 3:00 pm)
Awesome! Thanks so much for this! Especially the ability to change the file output sizes!
This should be part of wordpress!
Flick (2009/11/02 @ 2:31 am)
Hello! Firstly, I wanted to say a big thanks for creating this nifty program. My file is but a mere 3MB so it’s just over the 2MB limit. The file splitter works really well, but I am not sure how to ensure that non Latin characters (specifically Japanese) in posts are split correctly as well.
Thanks
jessica (2009/11/17 @ 2:58 pm)
OK, I am running Windows 7 and keep getting an error saying I am missing a file. I am seriously in tears because I cannot move my blog from one to the other because the file is 16 MB. I am begging you to please help me.
GBMINI (2009/11/17 @ 3:25 pm)
I wonder what file is reported missing … I’ve not used Win7 so cannot be much help.
MommyGeek (2009/11/23 @ 2:16 pm)
Getting the same error -it says:
component ‘comdlg32.ocx’ or one of its dependencies not correctly registered: missing or invalid.
Gonna try running it in windows xp mode, see if that works.
GBMINI (2009/11/23 @ 2:25 pm)
I’ve uploaded a Microsoft-install version of WPsplitter here; it might help with Windows 7 issues.
You’ll need to download it, unzip the files, then run the SETUP.EXE included within.
Robert Worstell (2010/03/27 @ 12:30 pm)
Thanks for this great little utility. It did the job and saved my bacon! Have to blog about this one to let people know. I know I’m gushing, but that XML import file is a headache to try to edit – and you saved me hours of work. Thanks again!
EERac (2010/03/27 @ 6:19 pm)
If anyone is interested, here is a Python script that can also split up a WordPress xml file. When run via the command line, the first argument specifies the file to split up, and the second specifies how many chunks it should be split into (the default is 2 if no number is given).
————–
#!/usr/bin/env python3
# This script takes a WordPress xml export file and splits it into some number of
# chunks (2 by default). The size of each chunk is determined by counting the lines
# that open a new item and breaking up the file so that each chunk has an equal
# number of items. The appropriate header and footer are added to each chunk.
#
# NOTE: the blog stripped the markup and indentation from this listing when it was
# posted, so the item tag, the footer, and the argument handling below are
# reconstructed from the usual WordPress export layout. Treat them as assumptions.
import math
import os
import sys

# first argument specifies the WordPress .xml file to split up
if len(sys.argv) < 2:
    sys.exit('Usage: %s export.xml [number_of_chunks]' % sys.argv[0])
input_file_string = sys.argv[1]
input_file_name, input_file_extension = os.path.splitext(input_file_string)

# second argument specifies the number of chunks (2 if no number is given)
number_of_chunks = int(sys.argv[2]) if len(sys.argv) > 2 else 2

with open(input_file_string, encoding='utf-8') as input_file:
    lines = input_file.readlines()

# a line holding only this tag starts a new item (assumed; whitespace is ignored)
item_tag = '<item>'
delimiter_count = sum(1 for line in lines if line.strip() == item_tag)

print()
print('File "%s" contains %s items' % (input_file_string, delimiter_count))
delimiters_per_chunk = int(math.ceil(delimiter_count / number_of_chunks))
print('Creating %s files with at most %s items each:' % (number_of_chunks, delimiters_per_chunk))

header = ''
footer = '</channel>\n</rss>\n'  # assumed closing tags of the export file
chunk_number = 1
output_file_name = '%s_%s%s' % (input_file_name, chunk_number, input_file_extension)
output_file = open(output_file_name, 'w', encoding='utf-8')
print('  Writing chunk %s to file %s...' % (chunk_number, output_file_name))

delimiter_count = 0
for line in lines:
    if line.strip() == item_tag:
        delimiter_count += 1
    # everything before the first item is the header repeated in every chunk
    if chunk_number == 1 and delimiter_count == 0:
        header += line
    if delimiter_count > delimiters_per_chunk:
        # this chunk is full: close it and start the next one with the header
        output_file.write(footer)
        output_file.close()
        chunk_number += 1
        delimiter_count = 1
        output_file_name = '%s_%s%s' % (input_file_name, chunk_number, input_file_extension)
        output_file = open(output_file_name, 'w', encoding='utf-8')
        print('  Writing chunk %s to file %s...' % (chunk_number, output_file_name))
        output_file.write(header)
    output_file.write(line)
output_file.close()
print('Done!\n')
MathieuB (2010/07/10 @ 10:51 am)
I used your alternative install and successfully split and imported an 8.9MB WordPress export. Thank you so much for this nifty useful tool, you’ve made my life so much easier =D
xavi (2010/11/03 @ 3:40 pm)
Thanks,
Great application, it has saved my blog and my life.
Greetings from Spain
EERac (2010/11/27 @ 10:04 pm)
Just realized that the python code I posted 6 months ago got horribly mangled (whitespace went away, greater than symbols turned things into malformed html, quotes turned into unicode characters). Here is an untainted version:
http://wordpress.pastebin.ca/2004312
EERac (2010/11/27 @ 10:20 pm)
Whoops, just realized the above link to the python script expires in a month, here’s a link to my reply on wordpress.org that contains the actual code:
http://wordpress.org/support/topic/wxr-file-splitter?replies=7#post-1809736
Mike (2011/10/18 @ 10:33 pm)
I installed the original install (http://www.gbmini.net/downloads/WPsplitter.zip) and run it, selected the xml file, it started doing something and then showed this error message:
“Too many lines in “NoFollow – Some Links Don’t Count @24051”
Any idea how to get around this?
Thanks a lot!
GBMINI (2011/10/19 @ 8:27 am)
The problem seems to be that your website has a massive number of comments – perhaps spam comments. The program reads article by article and the error is saying that one article has blown the 20,000 line section limit.
No idea just how big your sections are, but 20,000 is already a huge limit.
Your best bet would be to try clearing out some of the junk from the website.
Mike (2011/10/19 @ 9:07 pm)
@GBMINI: thanks, you were exactly right. I deleted junk comments, tried again and it worked! Awesome, thanks again for the fast reply and helping random strangers out just like that
Rachel Ramey (2012/09/19 @ 7:51 am)
Thank you!!