Alasdair J G Gray

Connecting the dots in the World's data

SPARQL Lab

License: CC-BY

Overview

This lab exercise aims to introduce the SPARQL query language.

Discussions can be held on the Discussion Forum on Vision.

Task 1: Run Lecture Queries

Load the two datasets available from Vision and make sure that you understand the queries presented in the lecture. There are notes in the ‘Additional course resources’ section on Vision for running an instance of Apache Jena with Fuseki.

When running the queries, try altering them to return different information.
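
Queries can also be sent to Fuseki from a script rather than through its web interface, which makes it easy to rerun them after each alteration. The following is a minimal Python sketch under assumptions that are not part of the lab notes: the server listens on Fuseki's default port 3030, and the dataset is called `movies` (a hypothetical name; substitute whatever you chose when loading the data). It follows the SPARQL 1.1 Protocol by passing the query text in the `query` parameter of a GET request and asking for JSON results.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Fuseki's default port with a hypothetical dataset name.
ENDPOINT = "http://localhost:3030/movies/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request without sending it."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the decoded JSON result document."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return json.load(response)


# Example call (needs a running server), fetching five arbitrary triples:
# results = run_query("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
```

Separating `build_request` from `run_query` lets you inspect the encoded URL before anything is sent, which helps when a query is rejected by the server.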

Writing Queries

We recommend keeping a copy of your queries in a code editor such as Atom, gedit, or Visual Studio. These editors can provide syntax highlighting, which makes queries easier to develop.

Syntax Highlighting in Atom Editor

You need to install two extensions in Atom. On the lab machines, go to Edit -> Preferences to open the settings window. On the left-hand side you will see Install. Click on this and type ‘rdf’ into the search box, then click the Install button for both ‘language-rdf’ and ‘language-sparql’.

The highlighting should start automatically based on the file extension. If it doesn’t, you can set the language manually using the selector in the bottom right of the window.

Task 2: Movie Queries

Using the Linked Movie Database that you loaded in Task 1, answer the following questions.

Return the names of all directors of movies, returning each name only once
Write the same query using a property path
Find the 20 most recent films by date (dcterms:date) or initial release date (movie:initial_release_date). Return the name and date.
- Hint: Property path using |
Find all films released in 2007. Return the name and date, with the films sorted by title.
Return the names of actors in Ghostbusters
Find the films that have 35 or more actors
- Hint: Use GROUP BY and HAVING
Return the names of actors in Ghostbusters, with one result row per film and the actor names comma-separated.
- Hint: Use group_concat
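
As a syntax reminder for the features named in the hints, the sketch below holds three small queries as Python strings so they can be pasted into Fuseki or sent from a script. They deliberately use a made-up `ex:` vocabulary about books (every predicate in it is hypothetical), so they illustrate the constructs without answering the questions above.

```python
# Syntax reminders on a made-up vocabulary (ex:), not the movie dataset.
PREFIXES = "PREFIX ex: <http://example.org/>\n"

# Alternative property path: "|" matches either predicate.
ALTERNATIVE_PATH = PREFIXES + """
SELECT ?book ?when
WHERE { ?book ex:published|ex:firstPrinted ?when }
ORDER BY DESC(?when)
LIMIT 5
"""

# GROUP BY with HAVING: keep only groups that pass an aggregate test.
GROUP_HAVING = PREFIXES + """
SELECT ?author (COUNT(?book) AS ?books)
WHERE { ?book ex:author ?author }
GROUP BY ?author
HAVING (COUNT(?book) >= 3)
"""

# GROUP_CONCAT: collapse each group's values into one delimited string.
CONCATENATED = PREFIXES + """
SELECT ?author (GROUP_CONCAT(?title; SEPARATOR=", ") AS ?titles)
WHERE { ?book ex:author ?author ; ex:title ?title }
GROUP BY ?author
"""

for name, query in [("path", ALTERNATIVE_PATH),
                    ("having", GROUP_HAVING),
                    ("concat", CONCATENATED)]:
    print("#", name)
    print(query)
```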

Task 3: Querying DBpedia

DBpedia is a dataset derived from the infoboxes on Wikipedia. It covers a wide range of topics and has developed an ontology to model this data. The 2016-10 version contains 13 billion triples, but you do not need to load it yourself: the data is available through the DBpedia SPARQL endpoint (http://dbpedia.org/sparql).

Work through the exercise sheet available on Vision for querying DBpedia.
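
If you query the endpoint from a script and request `application/sparql-results+json`, the response follows the SPARQL 1.1 Query Results JSON Format. The sketch below shows one way to flatten such a response into plain rows. To stay runnable offline it parses a hand-written sample rather than a real DBpedia response; the variable names `city` and `name` and the helper `to_rows` are illustrative assumptions.

```python
import json

# A hand-written sample in the SPARQL 1.1 JSON results format,
# standing in for a real endpoint response so the example runs offline.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["city", "name"]},
  "results": {"bindings": [
    {"city": {"type": "uri", "value": "http://dbpedia.org/resource/Edinburgh"},
     "name": {"type": "literal", "xml:lang": "en", "value": "Edinburgh"}},
    {"city": {"type": "uri", "value": "http://dbpedia.org/resource/Glasgow"}}
  ]}
}
"""


def to_rows(result_document):
    """Flatten a SELECT result into a list of {variable: value} dicts.

    A variable left unbound in a solution (for example by OPTIONAL) is
    absent from that binding object, so it is mapped to None here.
    """
    variables = result_document["head"]["vars"]
    rows = []
    for binding in result_document["results"]["bindings"]:
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


rows = to_rows(json.loads(SAMPLE_RESPONSE))
for row in rows:
    print(row)
```

Reading the column names from `head.vars` instead of from the first binding keeps the columns stable even when some solutions leave a variable unbound.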

About Me

I'm an Associate Professor in Computer Science at Heriot-Watt University. My research focuses on linking datasets.
