Representational State Transfer, REST, is not so much a technology as an architecture for building web applications. It is a more formal and structured extension to the basic URL-based web service idea. The idea centres around using clear, highly-descriptive URLs to represent each real-world entity that our web application needs to deal with (e.g. a song, a list of all songs by a given artist, a flight, a biography of an actor, etc.). For example, we could have these URLs:
http://www.hittastic.com/artist/Oasis
http://www.hittastic.com/song/Snow_Patrol/Run
http://www.hittastic.com/biography/Madonna
http://www.solentairways.com/flight/SA101
http://www.solentairways.com/flights/June/1/Southampton/New_York
In REST, these URLs are called resources. The key principles of REST are described in turn below.
A key principle of REST, illustrated by the examples above, is that of clean, unchanging URLs. Why is this useful? URLs which show the real location on the server, or the server-side technology used (e.g. the fact that it's a PHP script) are prone to continuous change, for example, if the script is moved to a different folder or we switch server-side scripting technology. This causes problems in bookmarking and linking to such pages, and also, if the URLs represent web services, means that developers of client applications have to update their client code to point to the new URL.
With REST, we hide the implementation details with a simple, clean and easily-remembered URL, and define how this URL is mapped to the real, underlying location of the script on the server. For example, rather than
http://www.hittastic.com/track.php?title=Wonderwall&artist=Oasis
we could use:
http://www.hittastic.com/Oasis/Wonderwall
If we changed the underlying URL, i.e. the location of the actual server side script on the server, all we'd need to do is change the mapping of our clean, easily remembered, publicly-visible, "REST-style" URL to the real underlying URL, and clients of the web service could continue to use our web service unchanged with the same publicly-visible URL as before; they wouldn't have to alter their code to reflect the new underlying URL. We could even change the server-side implementation technology (e.g. PHP to ASP) without having to change the publicly-visible URL: once again we would only have to change the mapping from the publicly-visible URL to the underlying URL.
Furthermore, this allows us to easily swap between dynamically and statically generated data. Imagine the URL below points at a static (i.e. not dynamically generated) XML file on the server representing all Oasis hits.
http://www.hittastic.com/artist/Oasis
By changing the server configuration, we could easily change this URL to point to a server side script which dynamically generates the data from a database. So in summary, REST style URLs provide a clean and unchanging interface to data supplied by our server and there is no need to change the URL depending on how the data associated with that URL is generated.
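The static-to-dynamic swap described above can be illustrated with a minimal Python sketch. It is not how a real web server is configured: the routes table, the two handler functions and the sample song data are all hypothetical, and are used only to show that the public URL stays the same while the implementation behind it changes.

```python
# Hypothetical in-memory "database" and static document, for illustration only.
SONGS = {"Oasis": ["Wonderwall", "Live Forever"]}
STATIC_XML = "<songs><song>Wonderwall</song><song>Live Forever</song></songs>"

def static_oasis():
    """Serve a pre-written XML document."""
    return STATIC_XML

def dynamic_oasis():
    """Generate the same XML from the data store on each request."""
    items = "".join(f"<song>{title}</song>" for title in SONGS["Oasis"])
    return f"<songs>{items}</songs>"

# The public URL path is the key; only the value changes when the
# implementation behind it does.
routes = {"/artist/Oasis": static_oasis}
before = routes["/artist/Oasis"]()

routes["/artist/Oasis"] = dynamic_oasis  # server-side change, invisible to clients
after = routes["/artist/Oasis"]()
```

A client requesting /artist/Oasis receives the same document before and after the change, and never needs to know which handler produced it.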
The practical details of how to actually set up REST-style URLs to point to given scripts will be discussed towards the end of this week's notes.
With REST, we send different types of messages to the same URL to make it do different things, e.g. retrieve data or change the state of the item represented by the URL. For example if we had the URL:
http://www.solentairways.com/flight/SA101
we could send one type of message (let's call it a "get" message) to the URL to retrieve the details about flight SA101, another type of message (let's call it a "put" message) to update the details (e.g. departure time) of flight SA101, and a third type (let's call it a "delete" message) to delete flight SA101.
But what form do these messages take? We could use query string parameters to inform the script of the message type. However, it so happens that HTTP, the protocol used for communication between web clients and servers, already allows us to send different request types to a URL. You've already seen GET and POST requests - but it just so happens that there are PUT and DELETE requests in HTTP too!
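As a small illustration of sending different request types to one URL, the Python sketch below builds (but does not send) GET, PUT and DELETE requests for the flight URL using the standard library. The body sent with the PUT request is an assumed format chosen for the example; a real web service would define its own.

```python
from urllib.request import Request

url = "http://www.solentairways.com/flight/SA101"

# Same URL, different HTTP methods. Nothing is sent over the network here;
# the requests are only constructed.
get_request = Request(url, method="GET")
put_request = Request(url, data=b"departure=10:30", method="PUT")  # assumed body format
delete_request = Request(url, method="DELETE")

for request in (get_request, put_request, delete_request):
    print(request.get_method(), request.full_url)
```

Each request targets exactly the same resource; only the method, and for PUT the accompanying data, differs.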
Recall from last year that HTTP is a set of instructions which allow clients and servers to communicate with each other.
As mentioned above, HTTP comes with a set of standard request types, or methods, to retrieve and manipulate URLs, which are specified in the HTTP request header. The two you've probably met are GET and POST.
REST takes the view that HTTP methods and status codes are under-used and can be exploited in web services. As mentioned above, the idea is that one single web resource (URL) can be used for retrieving, adding, and deleting data associated with a particular item, e.g. a particular song in the HitTastic! database. We can do different things with the song depending on the type of HTTP method we use to communicate with the URL. In general, we use GET to retrieve an item, POST to add a new item, PUT to update an item, and DELETE to delete an item.
REST's association with HTTP does not stop there. REST makes use of different HTTP status codes to communicate the success or otherwise of each operation. Recall that the first line of any HTTP response contains a status code which indicates whether the request was successful or not. There are a large number of HTTP status codes, including 200 OK, 400 Bad Request, 401 Unauthorized, 404 Not Found and 500 Internal Server Error.
Imagine we had a URL to look up a given song:
http://hittastic/song/1009
With REST, the URL could return "404 Not Found" to the client if the song with that ID was not on the HitTastic! database, or, if an invalid ID (0 or less) was supplied, the URL could return "400 Bad Request", another standard HTTP error code. Note that this use of error codes differs from the normal usage, in which "404 Not Found" indicates that the requested page itself does not exist on the server: here the script exists, but the item it was asked for does not.
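The song lookup just described might be sketched in Python as follows. This is only an illustration of choosing a status code: the SONGS dictionary is a hypothetical stand-in for the HitTastic! database, and the title given for song 1009 is an assumption.

```python
from http import HTTPStatus

# Hypothetical stand-in for the HitTastic! database: song ID -> title.
SONGS = {1009: "Wonderwall"}

def lookup_song(song_id):
    """Return (status, body) for a request to /song/<song_id>."""
    if song_id <= 0:
        # The ID itself is invalid, so the request is malformed.
        return HTTPStatus.BAD_REQUEST, "Invalid song ID"
    if song_id not in SONGS:
        # The ID is valid but no such song exists.
        return HTTPStatus.NOT_FOUND, "No song with that ID"
    return HTTPStatus.OK, SONGS[song_id]
```

A client can then decide what to do by inspecting the status code alone, without having to parse an error message out of the response body.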
A number of examples are shown below.
Imagine we have the URL:
http://hittastic.com/song/1009
to represent the song with the ID of 1009 in the HitTastic! database. (Note - see "Clean and unchanging URLs", above, for more details on why we use a URL like this, rather than something like
http://hittastic.com/song.php?id=1009
)
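We could send a GET request to retrieve the details of song 1009, a PUT request to update them, or a DELETE request to delete the song.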
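The REST web service would also make use of HTTP status codes to indicate whether our transaction was successful. For instance, it could return "200 OK" if the operation succeeded, or "404 Not Found" if there is no song with that ID.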
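A server-side handler for the song URL could, as a rough sketch, dispatch on the HTTP method like this. The handle_song function and the in-memory songs dictionary are hypothetical, and the behaviour is simplified (for example, PUT here only updates an existing song, and the "405 Method Not Allowed" response for unsupported methods is an assumption about how such a service might behave).

```python
# Hypothetical in-memory stand-in for the HitTastic! database.
songs = {1009: {"title": "Wonderwall", "artist": "Oasis"}}

def handle_song(method, song_id, data=None):
    """Dispatch on the HTTP method for /song/<song_id>; returns (status, body)."""
    if song_id not in songs:
        return 404, None  # Not Found
    if method == "GET":
        return 200, songs[song_id]
    if method == "PUT":
        songs[song_id].update(data or {})
        return 200, songs[song_id]
    if method == "DELETE":
        del songs[song_id]
        return 200, None
    return 405, None  # Method Not Allowed
```

For example, handle_song("GET", 1009) returns status 200 with the song's details, while the same call made after handle_song("DELETE", 1009) returns status 404.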
Another example: our URL could represent a particular artist in the HitTastic database e.g.
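http://hittastic/artist/Oasis
We could send GET, PUT or DELETE requests to this URL in the same way, to retrieve, update or delete the data held for Oasis.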
Our URL could represent a particular song and artist e.g.
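http://hittastic/track/Oasis/Wonderwall
We could send GET, PUT or DELETE requests to this URL in the same way, to retrieve, update or delete the details of Wonderwall by Oasis.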
Another feature of REST is that we don't transfer state between client and server. In other words, REST web applications do not use session variables server side, and do not transfer session IDs via cookies between server and client and back again. The idea is that the URL, or the content, contains all the information required to maintain state. This allows the development of more loosely-coupled web services and applications which can be connected to any client side application: they do not need to rely on a session variable having been set in some earlier server-side script.
So how is information passed from page to page? A common way is within the URLs themselves. For instance, if a user selects London as their origin and Denver as their destination in a flight booking system, the origin and destination can be written into the URLs of each successive stage of the booking system.
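As an illustration of carrying state in the URL, the Python sketch below builds the URL for each stage of a booking from the user's choices. The /booking/... layout and the stage names are hypothetical; they are not taken from any real booking system.

```python
from urllib.parse import quote

def stage_url(stage, origin, destination):
    """Build the URL for one booking stage, carrying the state in the path."""
    # quote() escapes characters such as spaces that are not allowed in a URL.
    return f"/booking/{quote(stage)}/{quote(origin)}/{quote(destination)}"

print(stage_url("dates", "London", "Denver"))
print(stage_url("seats", "London", "New York"))
```

Because the origin and destination are part of every URL, any stage can be served without the server having to remember what the user chose earlier.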
What about more sensitive information such as usernames and passwords though? It is obviously not a good idea to put those in the URL. Luckily, HTTP comes with its own authentication system known as HTTP authentication. With HTTP authentication, we embed the username and password in the HTTP header. This is relatively easily done with PHP. A server-side script can then interpret the HTTP header, extract the username and password from it and use them to authenticate. While this does have the advantage of allowing us to develop more loosely-coupled web applications as described above (we are not relying on the existence of some "gatekeeper" session variable indicating whether a user is logged in or not), it does have the disadvantage that we have to authenticate with the database every time a script which requires a user to be logged in is accessed, with possible performance issues. This raises an important point: REST does have a number of disadvantages. These are elaborated on below.
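To make the idea of embedding credentials in the header more concrete, here is a minimal Python sketch of the "Basic" scheme of HTTP authentication, in which the username and password are joined with a colon and base64-encoded. The two function names are hypothetical. Note that base64 is an encoding, not encryption, so this only illustrates the header format.

```python
import base64

def basic_auth_header(username, password):
    """Client side: build the value of an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

def parse_basic_auth(header_value):
    """Server side: recover (username, password) from the header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Split on the first colon only, so the password may itself contain colons.
    username, _, password = decoded.partition(":")
    return username, password
```

The server-side script would then check the recovered username and password against its user database on every request, which is the per-request cost mentioned above.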
Apache comes with a module called mod_rewrite. mod_rewrite allows you to map one URL to another, e.g. a REST-style URL to the real URL. For example, you can tell mod_rewrite to map the REST-style URL (where T is the title and A is the artist)
http://www.hittastic.com/T/A
to the real URL:
http://www.hittastic.com/search.php?title=T&artist=A
Examples of how to use mod_rewrite are given below.
You put mod_rewrite commands in a file called .htaccess and upload it to your public_html directory. mod_rewrite syntax is a big topic, but here is an example based on the mod_rewrite tutorial.
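RewriteEngine on
RewriteRule ^home\.html$ index.html
This example will rewrite requests for home.html so that they load index.html. (The dot is escaped with a backslash because an unescaped dot in a regular expression matches any character.) For example, if you type in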
http://hittastic.com/home.html
the page
http://hittastic.com/index.html
will be loaded instead. Note that the ^ and $ match the start and end of the input (requested) URL respectively. In other words, it would only match an exact request for home.html, not myhome.html for example.
Thus we can see that the general syntax is:
RewriteRule RequestedURL RewrittenURL
Here is a more complex example:
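RewriteEngine on
RewriteRule ^page/([0-9]+)/?$ index.php?page=$1
This example shows the use of a simple regular expression. Regular expressions are specifications that match certain text patterns in the input text, and are a whole topic in themselves. The regular expression here, ^page/([0-9]+)/?$, means the following: the input must start (^) with the text page/, followed by one or more digits ([0-9]+, placed in brackets so that the matched digits can be referred to later), followed by an optional trailing slash (/?), at which point the input must end ($). So any of
page/123/
page/456
page/139823/
would match the expression.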
After the input URL, the rewritten URL is specified. Here, it is:
index.php?page=$1
The $1 means "substitute this with the results of the first bracketed expression in the input". The bracketed expression here, if you remember, was the sequence of numerical characters. So in other words, a request such as:
page/456
will be rewritten as:
index.php?page=456
A final example shows the use of two bracketed expressions.
RewriteEngine on
RewriteRule ^page/([0-9]+)/([0-9]+)/?$ index.php?page=$1&section=$2
Hopefully you can see that this would match two sequences of digits in the input, separated by a slash. So a URL such as:
page/456/2
would be matched, which would be translated to:
index.php?page=456&section=2
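If you want to experiment with these patterns outside Apache, the matching and substitution can be imitated in Python, whose regular expressions behave the same way for these simple cases. This is only an emulation for trying out patterns, not how mod_rewrite itself works; the RULES list and rewrite function are hypothetical names, and Python writes the bracketed groups as \1 and \2 rather than $1 and $2.

```python
import re

# Each rule pairs a pattern with a substitution. \1 and \2 in Python play the
# role of $1 and $2 in mod_rewrite.
RULES = [
    (r"^page/([0-9]+)/([0-9]+)/?$", r"index.php?page=\1&section=\2"),
    (r"^page/([0-9]+)/?$", r"index.php?page=\1"),
]

def rewrite(path):
    """Return the rewritten path, or the original path if no rule matches."""
    for pattern, substitution in RULES:
        if re.match(pattern, path):
            return re.sub(pattern, substitution, path)
    return path

print(rewrite("page/456"))
print(rewrite("page/456/2"))
print(rewrite("page/abc"))
```

The first two calls are rewritten by the one-group and two-group rules respectively, while page/abc is returned unchanged because it contains no digits for ([0-9]+) to match.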