Wednesday, April 2, 2014

How did I make this Blogger site HTTPS?

UPDATE: Blogger now supports HTTPS, but only for *.blogspot.com domains (not third-party domains such as blog.jonathanmarcil.ca). I've removed the setup described below and put a 301 redirect from my own domain to jonathanmarcil.blogspot.com.

As you may have noticed, and provided Google hasn't taken down my little trick yet, my blog is now fully HTTPS.

At first I looked for a way to do it with Google Apps or Blogger itself. I found nothing. On top of that, one of the problems is that I use my own domain name, and there's no option to switch it to SSL.

Looking around, I found that CloudFlare offers "Flexible SSL", which is SSL terminated on their front-end caching and protection service. When designing an architecture, it's not uncommon to handle HTTPS on the front servers, say nginx acting as a caching reverse proxy in front of backend servers.

This gave me the idea to do the same using my Linux server. The only downside to it is that if my server is down, the Blogger site is unavailable. So I won't be using the power of the Google Cloud(tm), but I'd prefer to promote SSL over having a near 100% uptime.

I got a free certificate from StartSSL and started playing with Pound. It was relatively easy and worked as expected with the following config:

ListenHTTPS
        Address 1.1.1.8
        Port    443
        Cert "blog.pem"

        ## allow PUT and DELETE also (by default only GET, POST and HEAD)?:
        xHTTP           0

        Service
                BackEnd
                        Address ghs.l.google.com
                        Port    80
                End
        End
End

ListenHTTP
        Address 1.1.1.8
        Port 80
        xHTTP   0

        Service
                HeadRequire "Host: blog.jonathanmarcil.ca"
                Redirect "https://blog.jonathanmarcil.ca"
        End
End


However, I found out that Blogger serves two images over plain HTTP, which gave me the infamous mixed content error message. So my next idea was to replace all http:// links with https://. Unfortunately, Pound doesn't allow that. I took a look at Varnish, which I already had installed on my server, and found that it doesn't support that natively. A little VMOD called vmod_rewrite did what I wanted, but it required the Varnish sources (I use Debian packages) and carried the note "not production-ready". Since my Varnish is used on a real production website, I looked elsewhere for a solution.

One idea that came up was to use nginx. That way I could do everything with it, from SSL to rewriting. It even has a module named ngx_http_sub_module that does the trick. After setting up the proxy and SSL, and adding a simple find-and-replace directive to rewrite all links to https, I saw that the website was now fully HTTPS, with the green lock!

But something was missing: the Blogger bar at the top of the site was gone, and Google returned a 503 error. The problem was simple: the rewrite rule tampered with parameters, producing values that Google didn't recognize. I then wrote a regex that uses a negative lookbehind to skip them:
(?<!(homepageUrl|searchRoot)\=)http://
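
To illustrate what that lookbehind is meant to do, here is a minimal Python sketch. The sample string and the function name are made up for illustration. Since Python's re module only accepts fixed-width lookbehinds, the alternation is split into two separate assertions, so this is a close cousin of the pattern above rather than the exact same one.

```python
import re

# Two fixed-width negative lookbehinds stand in for the single
# (homepageUrl|searchRoot) alternation, which Python's re rejects
# because the two names have different lengths.
PATTERN = re.compile(r"(?<!homepageUrl=)(?<!searchRoot=)http://")


def upgrade_links(html: str) -> str:
    """Rewrite http:// to https:// except right after the two parameters."""
    return PATTERN.sub("https://", html)


# Hypothetical sample, not actual Blogger output.
sample = (
    '<a href="http://example.com/">link</a> '
    "homepageUrl=http://blog.example/ searchRoot=http://blog.example/search"
)
print(upgrade_links(sample))
```

Only the link in the href gets upgraded; the two parameter values keep their original scheme.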

But the problem was that the default nginx module doesn't support regexes! What a shame! The next best thing is another module that isn't part of the normal nginx distribution, so I had the same problem as with Varnish: I would have to recompile.

I tried putting several sub_filter directives one after the other, but no, nginx doesn't allow it.

I was starting to get a little bored with the project, and then I asked myself: if nginx allows only one such directive per proxy, maybe I could double-proxy? That's what I did, and it worked! Here's the config:
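server {
        # Inner proxy: fetches the blog from Google and upgrades every link.
        listen 127.0.0.1:8080;
        root /var/www/nginx;

        location / {
                proxy_pass http://ghs.l.google.com;
                proxy_set_header Host   $host;
                # Ask for an uncompressed response so sub_filter can see the text.
                proxy_set_header Accept-Encoding "";
                # Replace every occurrence, not just the first one.
                sub_filter_once off;
                sub_filter "http://" "https://";
        }
}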


server {
        # Outer proxy: terminates SSL and repairs what the inner proxy broke.
        listen 1.1.1.8:443 ssl;
        server_name blog.jonathanmarcil.ca;
        root /var/www/nginx;
        ssl_certificate blog.crt;
        ssl_certificate_key blog.pem;

        location / {
                proxy_pass http://127.0.0.1:8080;
                proxy_set_header Host   $host;
                proxy_set_header Accept-Encoding "";
                sub_filter_once off;
                # Put back http where the inner proxy rewrote a parameter value.
                sub_filter "75https" "75http";
                # Rewrite Location headers in upstream redirects to https.
                proxy_redirect http://blog.jonathanmarcil.ca https://blog.jonathanmarcil.ca;
        }
}

The trick was to change the strings that had been wrongly rewritten back to http.
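
The two passes can be modelled as plain string replacements. This is only a sketch in Python: the function names and the sample input are hypothetical, and the reading of "75" as the tail of an escaped "=" (\75) in front of the parameter value is my assumption, not something the config states.

```python
def inner_pass(body: str) -> str:
    # Inner proxy: blanket upgrade of every http:// link.
    return body.replace("http://", "https://")


def outer_pass(body: str) -> str:
    # Outer proxy: undo the upgrade wherever "75" immediately precedes it.
    return body.replace("75https", "75http")


def double_proxy(body: str) -> str:
    # The response flows through the inner proxy first, then the outer one.
    return outer_pass(inner_pass(body))


# Hypothetical sample; the \75 escape is an assumption about the page source.
sample = r'<img src="http://img.example/a.png"> homepageUrl\75http://blog.example/'
print(double_proxy(sample))
```

Each pass needs only a single fixed-string replacement, which is what lets one sub_filter per server block do the job of the regex.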

On top of that config, I also added a redirect and a way to avoid serving anything other than my blog from that server:
server {
        listen 1.1.1.8:80;
        root /var/www/nginx;
}

server {
        listen 1.1.1.8:80;
        server_name blog.jonathanmarcil.ca;
        root /var/www/nginx;
        return 301 https://$host$request_uri;
}
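
As a rough model of how those two port-80 blocks split the traffic, here is a small Python sketch. It is a simplification with a hypothetical function name: it ignores ports in the Host header and other details of nginx's server selection.

```python
from typing import Optional

BLOG_HOST = "blog.jonathanmarcil.ca"


def redirect_target(host: str, request_uri: str) -> Optional[str]:
    """Return the 301 target for a port-80 request, or None if there is none."""
    if host == BLOG_HOST:
        # Named block: permanent redirect to the same URI over HTTPS.
        return f"https://{host}{request_uri}"
    # Any other Host lands in the first block, which acts as the default
    # server for that address and only serves files from its own root.
    return None


print(redirect_target("blog.jonathanmarcil.ca", "/2014/04/post.html"))
print(redirect_target("other.example", "/"))
```

The first block has no server_name, so it catches every request whose Host doesn't match the blog, which keeps the proxy from being used for other sites.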


The last step was to change the DNS record for blog.jonathanmarcil.ca from ghs.l.google.com to 1.1.1.8, the IP address nginx listens on, and done!

Now let's wait and see if the Google search engine plays well with my bag of tricks. Worst-case scenario, I get banned from the search results. And if the search results still give http links, nginx will redirect them to https.



1 comment:

  1. Google has changed gapi.iframes.getContext().openChild(), and Chrome now seems to check the intent of posting into an iframe (if a form posts in plain text while the page is HTTPS, you get a warning). That broke HTTPS on the page. I'm replacing the iframe include on Blogger with an image from an HTTPS site to avoid that. I hope this is temporary, because now there's no indication that this site is really on Blogger.
