jack reed

The case for serving your IIIF content over HTTPS

TLDR: IIIF content hosted over HTTP is not fully usable by HTTPS hosted webpages.

In writing this blog post, I realized that I can’t fully understand what all of the barriers are for IIIF adopters in moving to HTTPS. To that end, I would like to know more about this so we can focus the community to provide more useful resources. Would you mind completing this short (4 questions, 3 are multiple choice) survey about your HTTPS adoption?

https://goo.gl/forms/6pvcGUG67yFzPTDD3

Interoperability is a characteristic of a product or system, whose interfaces are completely understood, to work with other products or systems, present or future, in either implementation or access, without any restrictions.

Definition of Interoperability [1]

For several years, organizations have been pushing to migrate websites to HTTPS [2, 3, 4, 5]. This post explains, for IIIF content users and providers, why serving IIIF content over HTTPS is just as important and how to do it.

There are many reasons why you, as a IIIF content provider, would want to serve your content using only HTTPS. The best reason first:

By serving your content over HTTPS exclusively, your image resources gain interoperability.

But don’t worry: this problem is not exclusive to IIIF. It is a larger issue with content on the Web and the way browsers handle security.

What is HTTPS?

Hypertext Transfer Protocol Secure (HTTPS) is the secure version of HTTP, the protocol used to transfer information on the World Wide Web. HTTPS adds a layer of encryption using SSL/TLS. While originally adopted on secure websites (e.g. financial institutions), it is now the preferred [6] way to serve content on the Web.

Why and how does your content become more interoperable with HTTPS?

TLDR: IIIF content hosted over HTTP is not fully usable by HTTPS hosted webpages.

Your gain in interoperability has nothing to do with how the IIIF specifications are written; it comes from how web browsers implement security policies. For the purposes of this discussion I will talk primarily about the IIIF Image API and the IIIF Presentation API, but other IIIF specifications are also affected.

As websites move to HTTPS only, content hosted over HTTP starts to become unusable.

The problem boils down to something called mixed content [7]. Mixed content describes a scenario in which a user visits a site hosted over HTTPS and that page then requests content hosted over HTTP. Browsers specifically block mixed active content [8], which causes problems for most browser-based IIIF clients. Browser security models prohibit displaying secure content (a web page hosted on HTTPS) together with some types of insecure content (IIIF content hosted over HTTP).

How do browsers use IIIF?

IIIF clients implemented in browsers usually request JSON or JSON-LD as a precursor to requesting images. These JSON responses tell the client how to display the images.

IIIF request/response cycle

This request/response cycle becomes problematic when the webpage requesting HTTP resources is hosted over HTTPS. Browser content security specifically blocks mixed active content [8], which includes the JSON responses that IIIF clients need and usually request via XMLHttpRequest. For many browser-based IIIF clients hosted over HTTPS, these security restrictions essentially make HTTP resources unusable. :(
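To make the two-step cycle concrete, here is a minimal sketch in Python (not browser code) of how a client might turn an info.json response into an image request. It assumes an Image API 2.x-style response whose base URI lives under "@id" (falling back to "id"), and the example.org URLs are made up for illustration. The point is that the scheme of the info.json base URI carries straight through to the image request, and that if the first JSON request is blocked, the client never gets this far.

```python
import json


def image_url_from_info(info_json: str, region: str = "full", size: str = "full",
                        rotation: str = "0", quality: str = "default",
                        fmt: str = "jpg") -> str:
    """Build an image request URL from the text of an info.json response.

    Assumption: the base URI is stored under "@id" (2.x style) or "id".
    """
    info = json.loads(info_json)
    base = info.get("@id") or info.get("id")
    if not base:
        raise ValueError("info.json has no '@id' or 'id' to build a URL from")
    # Image request pattern: {base}/{region}/{size}/{rotation}/{quality}.{format}
    return f"{base.rstrip('/')}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical info.json served over plain HTTP.
example_info = '{"@id": "http://images.example.org/iiif/abc", "width": 6000, "height": 4000}'
print(image_url_from_info(example_info))
```

In a real browser-based client the first step is the XMLHttpRequest for info.json, which is exactly the request that gets blocked on an HTTPS page.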

Additional considerations

At the moment, only mixed active content is blocked by the browser’s security model. Mixed passive/display content is not blocked, and this includes <img> resources. This means that a browser-based IIIF client that displays content using only <img> elements should be OK.
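The active/passive distinction can be summed up as a small decision function. This is a simplified model of the behaviour described in this post, not the logic of any particular browser; the function name, the request kinds, and the return labels are all my own inventions for illustration.

```python
from urllib.parse import urlsplit

# Request kinds in this simplified model (hypothetical labels, not a browser API).
ACTIVE_KINDS = {"xhr", "fetch", "script", "iframe"}
PASSIVE_KINDS = {"img", "audio", "video"}


def classify_request(page_url: str, resource_url: str, kind: str) -> str:
    """Return "ok", "blocked", or "allowed-with-warning" for a resource request."""
    if kind not in ACTIVE_KINDS | PASSIVE_KINDS:
        raise ValueError(f"unknown request kind: {kind!r}")

    page_scheme = urlsplit(page_url).scheme.lower()
    resource_scheme = urlsplit(resource_url).scheme.lower()

    # Mixed content only arises when a secure page loads an insecure resource.
    if page_scheme != "https" or resource_scheme != "http":
        return "ok"
    # Mixed active content (e.g. the JSON a IIIF client fetches) is blocked.
    if kind in ACTIVE_KINDS:
        return "blocked"
    # Mixed passive/display content (e.g. <img>) still loads in this model.
    return "allowed-with-warning"


print(classify_request("https://viewer.example.org/",
                       "http://images.example.org/iiif/abc/info.json", "xhr"))
print(classify_request("https://viewer.example.org/",
                       "http://images.example.org/iiif/abc/full/full/0/default.jpg", "img"))
```

Browser policies change over time, so treat the passive case as a description of the situation at the time of writing rather than a guarantee.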

So what should I do?

Host all of your content over HTTPS. No exceptions.

Why else should I host my content over HTTPS?

Not only is it important for interoperability, there are other really good reasons to serve everything over HTTPS by default [9, 10, 11, 6].

Trust

By serving content over HTTPS you can guarantee to your users that they are receiving the content that they requested and nothing else. It also gives the user/browser proof that it is talking to the server that was requested. Internet Service Providers can inject content into pages [12, 13]; using HTTPS prevents them from being able to do this. Serving content over HTTPS using a trusted certificate can also prevent man-in-the-middle (MITM) attacks [14].

Privacy

When you use HTTPS, all of the traffic between a user and the server is encrypted. This encryption layer gives your users a level of privacy, ensuring that traffic between your server and your users is not exposed to bad actors. Only the server and the browser can read the data that is transmitted between them.

Search engine optimization

Google started using HTTPS as a “ranking signal” for its search results [15] back in 2014. This “signal” seems to rank HTTPS websites as delivering high-quality content. By serving your content in a secured way you may increase your ranking in search results.

Browsers will start marking HTTP as insecure

Google Chrome has decided that it will eventually start marking HTTP webpages as insecure.

Chrome’s eventual treatment of all HTTP pages [4]

Chrome has already started to remove functionality such as the Geolocation API from HTTP-hosted sites [16], and more changes will be coming.

Google also offers a great resource on this topic, “Mythbusting HTTPS”.

Mythbusting HTTPS

How do I host my IIIF content using HTTPS?

I hope you are now convinced that all of your IIIF content should be hosted over HTTPS. Oftentimes, the largest hurdle here is organizational buy-in, yet the technical considerations are not trivial either. Migrating legacy services from HTTP to HTTPS can take a bit of time, and the details are really specific to your technical infrastructure. The good news is that as more and more websites move to HTTPS, there are more resources than ever to help you get started. I won’t try and cover how to

Implementing HTTPS: first things first, getting a trusted certificate

The first thing you need to implement HTTPS is a trusted certificate. Traditionally these are purchased through a trusted certificate provider and can vary in cost. Oftentimes large organizations have the ability to purchase these through a central IT department that controls DNS.

Some cheap/free options

I wanted to outline a few cheap/free options for obtaining these certificates. SSLMate is an option that provides certificates for $15.95 / year for single hosts, and you can obtain a wildcard certificate for $149.95 / year [17].

A newer option is now available that allows you to obtain certificates for free!

Let’s Encrypt is a free, automated, and open certificate authority (CA), run for the public’s benefit. It is a service provided by the Internet Security Research Group (ISRG).

Many hosting providers integrate with the service to make installation easier. If you run your own servers, I would recommend taking a look at DigitalOcean’s technical tutorials on installing Let’s Encrypt certificates [18]. There are tutorials for many different popular applications, languages and platforms.

Setting up Let’s Encrypt might seem complicated, but the EFF has made it even easier with a new software project, Certbot. In its own words: “Automatically enable HTTPS on your website with EFF’s Certbot, deploying Let’s Encrypt certificates” [19]. This can take out some of the headache of having to renew your certificate.

I chose to host this blog over HTTPS using Netlify and it was straightforward to set up. You can read about my experience in this post.

Moving all services to HTTPS

Because IIIF potentially relies on many different services, you may need to be intentional about when and how you move them. Because of the mixed active content problem, you will likely need to migrate Image API services first, before moving Presentation API services.
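When planning that order, it can help to see which URIs inside a Presentation API manifest still point at HTTP. The following is a rough sketch, assuming that URIs of interest sit under "@id" or "id" keys; the function name and the sample manifest are hypothetical. It deliberately ignores "@context" and similar values, and not every URI it reports is necessarily dereferenced by clients, so treat the output as a list to review rather than a list of failures.

```python
from typing import Any, Iterator, Tuple

# Keys assumed to hold resource URIs ("@id" in older manifests, "id" in newer ones).
ID_KEYS = ("@id", "id")


def find_http_ids(node: Any, path: str = "$") -> Iterator[Tuple[str, str]]:
    """Walk parsed manifest JSON and yield (path, uri) for plain-HTTP identifiers."""
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            if key in ID_KEYS and isinstance(value, str) and value.startswith("http://"):
                yield child_path, value
            else:
                yield from find_http_ids(value, child_path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_http_ids(item, f"{path}[{index}]")


# Hypothetical manifest: already served over HTTPS, but its images are not.
sample_manifest = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/book1/manifest",
    "sequences": [{"canvases": [{"images": [{"resource": {
        "@id": "http://images.example.org/iiif/p1/full/full/0/default.jpg",
        "service": {"@id": "http://images.example.org/iiif/p1"},
    }}]}]}],
}

for where, uri in find_http_ids(sample_manifest):
    print(where, uri)
```

A manifest like this one is served securely itself, but a client on an HTTPS page would still be unable to fetch the info.json behind the HTTP image service, which is why the image services are worth moving first.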

IIIF community-specific problems

In writing this blog post, I realized that I can’t imagine what all of the barriers are for IIIF adopters in moving to HTTPS. To that end, I would like to know more about this so we can focus the community to provide more useful resources. Would you mind completing this short (4 questions, 3 are multiple choice) survey about your HTTPS adoption?

https://goo.gl/forms/6pvcGUG67yFzPTDD3

Thanks and I look forward to continuing the conversation!

Special thanks to Mark Matienzo for reviewing this post for me before I published and Sheila Rabun for helping with the survey.