Wednesday, 27 February 2013

Blogger in a subdirectory of my domain

As CTO I tend to field a mixture of technical SEO questions and general tech questions, and the latest one was something I had often wondered about myself.

I know that Blogger allows you to host a blog on your own domain or a subdomain of it (e.g. www.mattstannard.com or blog.mattstannard.com) using CNAMEs, but what if you wanted it in a subdirectory of an existing domain, such as www.mattstannard.com/blog? Is this possible? You can try it out by clicking http://panda.4psmarketing.com/blog/

Much of the research I read suggested that this wasn't possible. However, after a bit of "playing" in IIS I was able to do it, and it wasn't quite as hard as I had thought, so I thought I'd share the solution. It basically uses PHP to act as a proxy to Blogger.

Firstly, the solution below worked for me, but it may need adapting for you. I used Microsoft IIS (Internet Information Services) 7.5, but I am sure the same can be achieved with Apache.

Step 1 : Ensure IIS has necessary Modules
The nice thing about newer versions of IIS is the ease with which additional modules can be added. As my solution was going to use PHP (PHP: Hypertext Preprocessor), I installed this from http://php.iis.net/

As well as PHP, I needed to ensure that IIS could rewrite URLs in the same way as the mod_rewrite module in Apache, so I installed the URL Rewrite module from http://www.iis.net/learn/extensions/url-rewrite-module/using-the-url-rewrite-module

Step 2 : Create a Blog Folder
Obviously, you need to create a folder where the blog is going to sit. We are going to put some PHP files in this folder too. I chose to use the folder name blog.

Step 3 : Create the redirect.php file

The redirect.php file contains some global parameters such as the Blogger URL and the URL of where I want to host the blog.

It also defines a PHP function which uses cURL to return the contents of a URL.

<?php
// URL of the Blogger blog (no http:// or trailing /)
$_BLOGGER_URL = "matt-stannard.blogspot.co.uk";
// Where the Blogger blog should appear (no http:// or trailing /)
$_REDIRECT_URL = "panda.4psmarketing.com/blog";
//
// Function : get_data
//
// Arguments: $url - Request URL
//
// Returns: HTML from website
//
// Description
// Take a URL, connect using cURL and then return the data as a string
//
function get_data($url)
{
  $ch = curl_init();
  $timeout = 5;
  curl_setopt($ch, CURLOPT_URL, $url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
  $data = curl_exec($ch);
  curl_close($ch);
  return $data;
}
?>
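
The get_data function above returns whatever cURL produces, including false when the request fails. A minimal sketch of a more defensive variant is shown below. The name get_data_checked, the extra timeout and the decision to follow redirects are assumptions for illustration, not part of the original files.

```php
<?php
// Hypothetical variant of get_data() that returns false on any failure,
// so the calling page can decide what to show instead of empty output.
function get_data_checked($url, $timeout = 5)
{
    $ch = curl_init();
    curl_setopt_array($ch, array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,         // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,         // assumption: follow any redirects
        CURLOPT_CONNECTTIMEOUT => $timeout,     // seconds allowed to connect
        CURLOPT_TIMEOUT        => $timeout * 2, // seconds allowed for the whole request
    ));
    $data   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope at the end of the function.

    // Treat transport errors and HTTP error statuses as failures
    if ($data === false || $status >= 400) {
        return false;
    }
    return $data;
}
?>
```

A calling page could check for false and send an error status (for example with http_response_code(502)) rather than echoing an empty page.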

Step 4 : Create the index.php file

We need to create an index.php file in the blog directory. This is the file which will do the work. It takes the page the user is requesting and requests it from Blogger, after stripping the blog folder name from the original HTTP request.

When Blogger responds, we use a regular expression to rewrite any hyperlinks that point at the Blogger domain so that they point at our domain, before sending the output to the user.

<?php
include_once('redirect.php');
// If we are a re-written URL this Server Variable will be set
// Otherwise we default to the homepage
if (isset($_SERVER["HTTP_X_ORIGINAL_URL"]))
{
 $strURL = $_SERVER["HTTP_X_ORIGINAL_URL"];
 $strURL = str_replace("/blog/","http://" . $_BLOGGER_URL . "/",$strURL);
}
else
{
 $strURL = "http://" . $_BLOGGER_URL . "/";
}
// Get the HTML and replace the links, we need to make sure we do this
// or traffic goes back to the Blogger site
$html = get_data($strURL);
$pattern = '/<a([^>]+)href=\'http:\/\/' . preg_quote($_BLOGGER_URL, '/') . '\//i';
$replace = '<a$1href=\'http://' . $_REDIRECT_URL . '/';
$html = preg_replace($pattern,$replace, $html);
// The pattern above only matches single-quoted href values, so this
// double-quoted link is not picked up and is replaced separately
$html = str_replace("<a href=\"http://" . $_BLOGGER_URL . "/\">Show all posts</a>","<a href=\"http://" . $_REDIRECT_URL . "/\">Show all posts</a>",$html);
// Write out the HTML
echo $html;

?>
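
The str_replace call in index.php swaps every /blog/ it finds in the request, and does nothing for a request to /blog without the trailing slash. Below is a minimal sketch of a stricter mapping, assuming the proxy folder is /blog. The helper map_to_blogger is hypothetical and not part of the original files.

```php
<?php
// Hypothetical helper: map a request path under the proxy folder to the
// matching Blogger URL, replacing only the leading folder prefix.
function map_to_blogger($originalUrl, $bloggerUrl, $prefix = '/blog')
{
    $len = strlen($prefix);
    // Strip the prefix only when it is the whole first path segment
    if (strncmp($originalUrl, $prefix, $len) === 0
        && (strlen($originalUrl) === $len
            || in_array($originalUrl[$len], array('/', '?'), true))) {
        $originalUrl = substr($originalUrl, $len);
    }
    // Join host and remaining path with exactly one slash
    return 'http://' . $bloggerUrl . '/' . ltrim((string) $originalUrl, '/');
}

echo map_to_blogger('/blog/2013/02/my-post.html', 'matt-stannard.blogspot.co.uk'), "\n";
echo map_to_blogger('/blog', 'matt-stannard.blogspot.co.uk'), "\n";
?>
```

In index.php the first argument would be the value of $_SERVER["HTTP_X_ORIGINAL_URL"].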

Step 5 : Create a Search folder

I noticed that Blogger handles search requests slightly differently, so for ease of use I created a separate search folder with its own index.php file, listed below. It essentially does the same as the file above.

<?php
include_once('../redirect.php');
$strURL = "http://" . $_BLOGGER_URL . "/search?" . $_SERVER["QUERY_STRING"];
// Get the HTML and replace the links, we need to make sure we do this
// or traffic goes back to the Blogger site
$html = get_data($strURL);
$pattern = '/<a([^>]+)href=\'http:\/\/' . preg_quote($_BLOGGER_URL, '/') . '\//i';
$replace = '<a$1href=\'http://' . $_REDIRECT_URL . '/';
// Point the Blogger links at our own domain
$html = preg_replace($pattern,$replace, $html);

echo $html;

?>
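
Both index.php files use a pattern that only matches single-quoted href values, so links written with double quotes are left pointing at Blogger. A sketch of a helper that handles both quote styles, and https as well as http, could look like this. The name rewrite_links is hypothetical and the sample HTML is made up for illustration.

```php
<?php
// Hypothetical helper: rewrite anchors that point at the Blogger host so
// they point at the proxied location, for single- or double-quoted hrefs.
function rewrite_links($html, $bloggerUrl, $redirectUrl)
{
    // $1 = everything up to "href=", $2 = the opening quote character
    $pattern = '/(<a\b[^>]*?\bhref=)(["\'])https?:\/\/'
             . preg_quote($bloggerUrl, '/') . '\//i';
    $replace = '$1$2http://' . $redirectUrl . '/';
    return preg_replace($pattern, $replace, $html);
}

// Made-up sample containing one double-quoted and one single-quoted link
$sample = '<a href="http://matt-stannard.blogspot.co.uk/">Show all posts</a>'
        . "<a class='x' href='http://matt-stannard.blogspot.co.uk/2013/02/post.html'>Post</a>";

echo rewrite_links($sample, 'matt-stannard.blogspot.co.uk', 'panda.4psmarketing.com/blog');
?>
```

With a helper like this, the separate str_replace for the "Show all posts" link should no longer be needed.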


Step 6 : Set up rewrites

Now we've created the proxy code, all that remains is for us to set up our rewrites. The IIS rewrite module is a simple and easy-to-use tool.

We want to create the same rewrite for the blog and the blog/search folders as follows:

For the /blog directory:
 
For the /blog/search directory:
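
As a rough sketch of what such a rule might look like in text form, the web.config below could sit in the blog folder. The rule name BloggerProxy, the match pattern and the file and directory checks are assumptions rather than the exact settings used here. The idea is to rewrite any request that is not a real file or folder to index.php. When the URL Rewrite module rewrites a request it keeps the original URL in the HTTP_X_ORIGINAL_URL server variable, which is what index.php reads.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <!-- Hypothetical rule: send anything that is not a real file or
             folder to index.php in this directory -->
        <rule name="BloggerProxy" stopProcessing="true">
          <match url="^(.+)$" />
          <conditions>
            <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
            <add input="{REQUEST_FILENAME}" matchType="IsDirectory" negate="true" />
          </conditions>
          <action type="Rewrite" url="index.php" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
```

The blog/search folder would need an equivalent rule of its own, bearing in mind that IIS also applies a parent folder's rules to its subfolders.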



Step 7 : You're Done

Hopefully the above is useful and works for you. If you do get stuck, need some help, or want to add to this solution, please do let me know!

Comments:

  1. Hello Matt, it is a very interesting hack. I noticed that the article is more than two years old. Does it still work? I would love to try it but I am not comfortable with PHP yet.

  2. Curious if others have tried this and if it does in fact work on Apache? I have several blogger blogs - most of them hosted on their own domains and I am hoping to make them all subdirectories of my main website for SEO purposes and ease of maintenance and sanity on my part. Matt - any new insight around this?