Alasdair J G Gray

Connecting the dots in the World's data

New Workshop Paper: Creating Topical Subsets over Wikidata

Wikidata is an amazing source of data. However, its query service is limited to relatively straightforward queries due to fair usage timeouts, and its size makes it impractical for most people to use locally. To evaluate his PhD work on references, Seyed will need to process complex queries, so we need a mechanism to construct representative subsets of the whole. This paper explores our initial ideas on creating topical subsets from Wikidata.
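To give a flavour of what "topical subset" means here, the following is a toy sketch only. It is not the formal definition from the paper and not how WDumper works. It assumes items are small in-memory dictionaries and that a topic can be expressed as "has a given value for a given property". The item ids and the `topical_subset` helper are made up for illustration; P31 (instance of), Q7187 (gene) and Q5 (human) are real Wikidata identifiers.

```python
from typing import List

# Toy stand-in for Wikidata items: an id plus a map from property id to value ids.
# The item ids are invented for this example.
ITEMS = [
    {"id": "toy-gene-1", "claims": {"P31": ["Q7187"]}},
    {"id": "toy-person-1", "claims": {"P31": ["Q5"]}},
    {"id": "toy-gene-2", "claims": {"P31": ["Q7187"]}},
    {"id": "toy-untyped", "claims": {}},
]


def topical_subset(items: List[dict], prop: str, value: str) -> List[dict]:
    """Keep only the items that state `value` for property `prop` (hypothetical helper)."""
    return [item for item in items if value in item["claims"].get(prop, [])]


# Select everything that is an instance of (P31) gene (Q7187).
genes = topical_subset(ITEMS, "P31", "Q7187")
print([item["id"] for item in genes])  # prints ['toy-gene-1', 'toy-gene-2']
```

On the real data the same idea has to run over the full dump rather than a Python list, which is where a dedicated tool comes in.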

Experiences of Using WDumper to Create Topical Subsets from Wikidata

Abstract: Wikidata is a general-purpose knowledge graph covering a wide variety of topics with content being crowd-sourced through an open wiki. There are now over 90M interrelated data items in Wikidata which are accessible through a public query endpoint and data dumps. However, execution timeout limits and the size of data dumps make it difficult to use the data. The creation of arbitrary topical subsets of Wikidata, where only the relevant data is kept, would enable reuse of that data with the benefits of cost reduction, ease of access, and flexibility. In this paper, we provide a formal definition of topical subsets over the Wikidata Knowledge Graph and evaluate a third-party tool (WDumper) to extract these topical subsets from Wikidata.

Seyed Amir Hosseini Beghaeiraveri, Alasdair J. G. Gray, and Fiona McNeill

In Proceedings of the 2nd International Workshop on Knowledge Graph Construction, co-located with the 18th Extended Semantic Web Conference (ESWC 2021), CEUR Workshop Proceedings, volume 2873, CEUR-WS.org, 2021
```
@inproceedings{HosseiniBeghaeiraveri2021:TopicalSubsets:KGConstruction2021,
  title = {Experiences of Using WDumper to Create Topical Subsets from Wikidata},
  booktitle = {Proceedings of the 2nd International Workshop on Knowledge Graph Construction, co-located with the 18th Extended Semantic Web Conference (ESWC 2021)},
  keywords = {Wikidata, Topical subset, WikiProject, Gene Wiki},
  author = {{Hosseini Beghaeiraveri}, {Seyed Amir} and Gray, {Alasdair J. G.} and McNeill, Fiona},
  year = {2021},
  month = jun,
  series = {CEUR Workshop Proceedings},
  publisher = {CEUR Workshop Proceedings (CEUR-WS.org)},
  volume = {2873},
  url = {http://ceur-ws.org/Vol-2873/paper13.pdf}
}
```

About Me

I'm an Associate Professor in Computer Science at Heriot-Watt University. My research focuses on linking datasets.
