As I switched from IBM Connections to WordPress for my blog, I started thinking about my existing content. Was there a way to move it all over without having to manually copy, paste and recreate all 268 entries?
Well, there is, and this is how I did it, using just a few tools. First I used Wget to retrieve my old blog. This put all the posts in one folder (entries) and all images in another (resource). It was then a simple task to write a LotusScript agent that processed each file in the entries folder, read the content, and parsed out the title, the original posting date and the HTML for the blog post itself. I put that data into separate Notes documents after performing some cleanup and string replacement.
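To give an idea of what such a read-and-parse agent can look like, here is a minimal sketch. It is not the agent I actually ran. The folder path, the form name "BlogEntry", the `<title>` markers and the helper names `ReadFile` and `TextBetween` are all assumptions made for illustration, and the markers in your downloaded pages will almost certainly differ.

```lotusscript
Option Public
Option Declare

Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim doc As NotesDocument
    Dim rtitem As NotesRichTextItem
    Dim folder As String
    Dim filename As String
    Dim html As String

    folder = "d:\blog\entries\"    ' assumed location of the files Wget downloaded
    Set db = session.CurrentDatabase

    filename = Dir$(folder + "*.*", 0)
    Do While filename <> ""
        html = ReadFile(folder + filename)
        Set doc = New NotesDocument(db)
        Call doc.ReplaceItemValue("Form", "BlogEntry")    ' hypothetical form name
        ' Assumed markers; adjust to whatever surrounds the title in your pages
        Call doc.ReplaceItemValue("Title", TextBetween(html, "<title>", "</title>"))
        Set rtitem = doc.CreateRichTextItem("Content")
        Call rtitem.AppendText(html)
        Call doc.Save(True, False)
        filename = Dir$()    ' next file in the folder
    Loop
End Sub

' Hypothetical helper: reads a whole text file into one string
Function ReadFile(filename As String) As String
    Dim fileNum As Integer
    Dim txt As String
    Dim content As String
    fileNum = FreeFile()
    Open filename For Input As fileNum
    Do Until EOF(fileNum)
        Line Input #fileNum, txt
        content = content + txt + Chr$(10)
    Loop
    Close fileNum
    ReadFile = content
End Function

' Hypothetical helper: returns the text between two markers, or "" if either is missing
Function TextBetween(html As String, startMarker As String, endMarker As String) As String
    TextBetween = StrLeft(StrRight(html, startMarker), endMarker)
End Function
```

The same `TextBetween` call can be repeated with other marker pairs to pull out the posting date and the body of the post, once you have looked at the downloaded HTML and know what surrounds them.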
I had already moved all images to a folder on my primary web server, so I replaced the image URLs in the HTML to make every image point to its new location. I also had to fix some special characters and replace them with the corresponding HTML entities.
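The two cleanup steps can be sketched roughly like this. The function names and parameters are made up for this example, and the entity step uses numeric entities (such as `&#233;`) for every non-ASCII character, which is a simplification rather than a description of exactly what my agent did.

```lotusscript
' Hypothetical helper: swaps the old image base URL for the new one
Function FixImageUrls(html As String, oldBase As String, newBase As String) As String
    FixImageUrls = Replace(html, oldBase, newBase)
End Function

' Hypothetical helper: replaces every character above ASCII 127
' with a numeric HTML entity, leaving plain ASCII untouched
Function EncodeEntities(html As String) As String
    Dim result As String
    Dim ch As String
    Dim code As Long
    Dim i As Long
    For i = 1 To Len(html)
        ch = Mid$(html, i, 1)
        code = Uni(ch)    ' Unicode code of the character
        If code > 127 Then
            result = result + "&#" + CStr(code) + ";"
        Else
            result = result + ch
        End If
    Next
    EncodeEntities = result
End Function
```

You would call these on the HTML before storing it, for example `html = EncodeEntities(FixImageUrls(html, oldBase, newBase))`, where `oldBase` and `newBase` are whatever URL prefixes apply to your old and new image locations. Characters outside the Basic Multilingual Plane are not handled correctly by this simple loop.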
Now that I had all the data, I just wrote another agent to export it again as a CSV file. I then installed a CSV importer in my WordPress blog and used it to import the file I had just created.
After a few tweaks I performed a successful import. Later I realized I had missed a few special characters, so I had to fix those entries, but we are talking about 4 or 5, out of 268 entries.
If there is interest, I might clean up the code a little and create a nicer UI (right now many of the values, like path and URL, are hard-coded) and then release the code in case anyone else is planning to go through the same exercise. Below is the existing code to read the blog entries into a simple Notes database.
And here is the agent to export the documents to a CSV file that can be imported into a WordPress blog using the CSV import plugin.
Option Public
Option Declare

Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim view As NotesView
    Dim doc As NotesDocument
    Dim filename As String

    filename = "d:\bleedyellow.csv"
    Open filename For Output As #1
    Print #1, |"csv_post_title","csv_post_post",| + _
    |"csv_post_type","csv_post_excerpt",| + _
    |"csv_post_categories","csv_post_tags",| + _
    |"csv_post_date","custom_field_1","custom_field_2"|

    Set db = session.Currentdatabase
    Set view = db.GetView("By Title")
    Set doc = view.GetFirstDocument
    Do Until doc Is Nothing
        Print #1, GetCSV(doc)
        Set doc = view.GetNextDocument(doc)
    Loop
    Close #1
End Sub

Function GetCSV(doc As NotesDocument) As String
    Dim rtitem As NotesRichTextItem
    Dim tmp As String
    Dim content As String

    Set rtitem = doc.Getfirstitem("Content")
    content = Replace(FullTrim(rtitem.GetUnformattedText()),|"|,|""|)
    tmp = |"| + Replace(doc.GetItemValue("Title")(0),|"|,|""|) + |",|
    tmp = tmp + |"| + content + |",|
    tmp = tmp + ",,"
    tmp = tmp +|"| + "Old Blog Post" + |",|
    tmp = tmp +|"| + doc.GetItemValue("Tags")(0) + |",|
    tmp = tmp +|"| + doc.GetItemValue("PostedDate")(0) + |",,,|
    GetCSV = tmp
End Function