Statistics Beta launched – fun with SPARQL

This week we launched the beta version of our linked open data platform for testing. More information is available on our blog and we’re really keen to receive feedback.

Confused by SPARQL

I’m trying to learn more about linked open data so I thought I’d attempt to write a SPARQL query. Our platform lets you query all of the data held on it using SPARQL.

At first I struggled, but then I remembered a wise piece of advice: think about the structure of the data. Once I considered how all the data is stored as RDF triples, it started to make sense. A triple is made up of a subject, a predicate (property) and an object. You can express a statement like “roses are red” as a triple: the subject is roses, the predicate is colour and the object is red. I just needed to work out which triples I wanted.
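To make the idea of a triple concrete, here is a minimal Python sketch. It is not how the platform stores its data: the list of tuples, the `match` helper and the extra example statements are all made up for illustration. It only shows that a query amounts to filling in the parts of a triple you know and asking for the parts you don't.

```python
# Each statement is a (subject, predicate, object) tuple.
# Only "roses are red" comes from the text; the other rows are invented.
triples = [
    ("roses", "colour", "red"),
    ("violets", "colour", "blue"),
    ("roses", "type", "flower"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# "Which things have the colour red?" Leave the subject open, fix the rest.
red_things = [s for (s, _, _) in match(triples, predicate="colour", obj="red")]
print(red_things)  # ['roses']
```

A SPARQL triple pattern works in a similar spirit: the variables (the names starting with `?`) are the positions left open.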

Getting there

I decided to write a query to extract a simple dataset: annual electricity usage in the Edinburgh local authority area. I built up the query slowly. First I selected all triples that were in the Energy consumption dataset. Then I selected the rest of the information I’d need: year, energy type (gas or electricity), customer type (all, domestic or commercial) and the amount consumed. Finally I filtered the data to only include electricity used in Edinburgh. The query I used is at the bottom of this page. I’m sure that it could be written in a much simpler and more efficient way, but it does the trick.
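The three steps above can be sketched in plain Python over an in-memory list of triples. This is only an illustration of the reasoning, not the real query or the real data: the predicate names, the observation identifiers, the "Elsewhere" area and every number are placeholders. Filtering on the "all" customer type is also an assumption about which customer type the final query keeps.

```python
# Placeholder triples: every name and number here is invented for illustration.
triples = []

def add_observation(obs, area, energy, customer, year, amount):
    """Store one observation as a handful of (subject, predicate, object) triples."""
    triples.extend([
        (obs, "dataset", "energy-consumption"),
        (obs, "refArea", area),
        (obs, "energyType", energy),
        (obs, "customerType", customer),
        (obs, "refPeriod", "year/" + year),
        (obs, "consumption", amount),
    ])
    label = ("year/" + year, "label", year)
    if label not in triples:
        triples.append(label)

add_observation("obs1", "Edinburgh", "electricity", "all", "2010", 1)
add_observation("obs2", "Edinburgh", "gas", "all", "2010", 2)
add_observation("obs3", "Elsewhere", "electricity", "all", "2010", 3)
add_observation("obs4", "Edinburgh", "electricity", "all", "2011", 4)

def value(subject, predicate):
    """Return the single object stored for this subject and predicate."""
    return next(o for (s, p, o) in triples if s == subject and p == predicate)

# Step 1: every observation in the energy consumption dataset.
observations = [s for (s, p, o) in triples
                if p == "dataset" and o == "energy-consumption"]

# Step 2: pull out the other values needed for each observation.
rows = [
    {
        "year": value(value(obs, "refPeriod"), "label"),
        "area": value(obs, "refArea"),
        "energy": value(obs, "energyType"),
        "customer": value(obs, "customerType"),
        "consumption": value(obs, "consumption"),
    }
    for obs in observations
]

# Step 3: keep only electricity used in Edinburgh (all customers).
result = [
    (row["year"], row["consumption"])
    for row in rows
    if row["area"] == "Edinburgh"
    and row["energy"] == "electricity"
    and row["customer"] == "all"
]
print(result)  # [('2010', 1), ('2011', 4)]
```

The year is looked up in two hops (observation to year URI, then year URI to label), which mirrors the `?yearURI rdfs:label ?year` pattern in the query.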

So what?

Well, you can start to do cool things like automatically importing the results into a Google spreadsheet. The spreadsheet will always show the latest data, and you can use it to power a chart like the one below. Pretty cool!
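One possible way to wire this up, sketched in Python, is to URL-encode the query, attach it to the endpoint address and hand the resulting URL to the Google Sheets `IMPORTDATA` function. The endpoint address below is hypothetical, as is the assumption that the endpoint returns CSV when asked at a `.csv` address, so check how your own endpoint serves CSV results before relying on it.

```python
from urllib.parse import urlencode

# Hypothetical endpoint address, assumed to return query results as CSV.
ENDPOINT = "https://example.org/sparql.csv"

# A stand-in query; the real one would go here.
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

# SPARQL endpoints conventionally accept the query text in a "query" parameter.
url = ENDPOINT + "?" + urlencode({"query": query})

# Formula to paste into a spreadsheet cell.
formula = '=IMPORTDATA("' + url + '")'
print(formula)
```

Because the spreadsheet fetches the URL itself, the sheet picks up new data when it refreshes the import, with no need to run the query by hand.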

And here is the query:

PREFIX dcterms: <>
PREFIX owl: <>
PREFIX qb: <>
PREFIX rdf: <>
PREFIX rdfs: <>
PREFIX skos: <>
PREFIX xsd: <>

SELECT ?year ?consumption
WHERE {
?observation <> <>.
?observation <> ?yearURI.
?observation <> ?consumption.
?observation <> ?refArea.
?observation <> ?EnergyType.
?observation <> ?CustomerType.
?yearURI rdfs:label ?year .

FILTER (?refArea = <>)
FILTER (?EnergyType = <>)
FILTER (?CustomerType = <>)
}