Creating 301 redirects between many pairs of URLs

I recently upgraded a website. The website consisted of static HTML documents and some simple PHP scripts. I transformed it into a WordPress blog. In the process every URL changed. In order to keep the PageRank and not break any inbound links, I created 301 redirects from the old URLs to the new ones.

If there had been an obvious pattern between the old and new URLs, this would have been a simple task.
For example, let's say URLs of the pattern: [pagename].html

should be changed into: [pagename]/

Then I could just have used this in my .htaccess file on website1:

RedirectMatch 301 ^/(.*)\.html$ /$1/
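
To see what such a pattern rule does before touching the server, the same regular expression can be tried out in Python. This is only a rough sketch: the function name new_path and the sample paths are made up for illustration, and the "/" + name + "/" target is an assumption about the new URL layout.

```python
import re

# Same pattern as in the RedirectMatch rule; the capture group is the page name.
OLD_PATTERN = re.compile(r"^/(.*)\.html$")

def new_path(old_path):
    """Return the redirect target for old_path, or None if the rule does not apply."""
    match = OLD_PATTERN.match(old_path)
    if match is None:
        return None
    # Assumed target layout: /[pagename]/
    return "/" + match.group(1) + "/"

# Hypothetical paths, only for illustration.
for path in ["/about.html", "/docs/install.html", "/contact.php"]:
    print(path, "->", new_path(path))
```

Paths that do not end in .html fall through untouched, which is also how the server would treat them.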

But in this case there were no obvious patterns. Some of the old pages were converted to WordPress pages, some to WordPress posts and some to a WordPress category with the old text as the description for the category. All this resulted in a wide variety of new URLs.

I solved the situation like this:
First I used a sitemap generator to get a list of all the URLs that I needed to rewrite. If you google for “sitemap generator” you will find one. After your sitemap is generated, choose the alternative “Download Sitemap in Text Format” and you will get a txt file with the URLs, one per line.


Now, after each URL, add the corresponding new URL you want to redirect to, separated by a single space, so that every line holds an old URL followed by its new URL.
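
Since this file is edited by hand, it can be worth checking that every line really holds exactly two URLs before generating any rules. A minimal sketch of such a check follows. The function name parse_mapping and the sample lines are invented for illustration and are not from the original site.

```python
def parse_mapping(text):
    """Split mapping text into (old_url, new_url) pairs, rejecting malformed lines."""
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 2 URLs, found {len(parts)}")
        pairs.append((parts[0], parts[1]))
    return pairs

# Hypothetical sample content for urls.txt.
sample = "about.html https://example.com/about/\nshow.php?id=3 https://example.com/category/news/\n"
print(parse_mapping(sample))
```

A line with a missing or extra field raises a ValueError naming the line number, so typos in the list surface before they end up in .htaccess.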


Next I created this python3 script to generate the content for my .htaccess file:

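# Read the old-URL/new-URL pairs, one pair per line, separated by a space.
with open('urls.txt', 'r') as fh:
	lines = fh.read().split("\n")
print('RewriteEngine on')
for line in lines:
	parts = line.split(" ")
	if len(parts) < 2:
		continue  # skip blank or incomplete lines
	from_url = parts[0]
	to_url = parts[1]
	# A trailing '?' on the target makes mod_rewrite drop the old query string.
	if to_url.count('?') == 0:
		to_url += '?'
	if from_url.count('?'):
		# RewriteRule never sees the query string, so match it with a RewriteCond.
		print("RewriteCond %{QUERY_STRING} ^" + from_url.split('?')[1] + "$")
		print("RewriteRule ^" + from_url.split('?')[0].replace('.', '\\.') + "$ " + to_url + " [R=301,L]")
	else:
		print("RewriteRule ^" + from_url.replace('.', '\\.') + "$ " + to_url + " [R=301,L]")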
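
To get a feeling for the generated rules, here is a variation of the same idea written as a function and run on two made-up pairs. It is a sketch under a few assumptions: the old URLs are given as paths relative to the directory holding the .htaccess file, the new URLs on example.com are hypothetical, and re.escape is swapped in for the dot replacement so that other regex metacharacters in old URLs are escaped too.

```python
import re

def build_rules(pairs):
    """Return .htaccess lines for a list of (old_url, new_url) pairs."""
    rules = ["RewriteEngine on"]
    for old_url, new_url in pairs:
        # Split the old URL into its path and its (possibly empty) query string.
        path, _, query = old_url.partition("?")
        if "?" not in new_url:
            new_url += "?"  # a trailing "?" drops the old query string
        if query:
            # The query string has to be matched with a RewriteCond.
            rules.append("RewriteCond %{QUERY_STRING} ^" + re.escape(query) + "$")
        rules.append("RewriteRule ^" + re.escape(path) + "$ " + new_url + " [R=301,L]")
    return rules

# Hypothetical pairs, only for illustration.
pairs = [
    ("about.html", "https://example.com/about/"),
    ("show.php?id=3", "https://example.com/category/news/"),
]
print("\n".join(build_rules(pairs)))
```

Each old URL without a query string becomes a single RewriteRule line, while an old URL with a query string becomes a RewriteCond line followed by its RewriteRule.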