
Lovelace vs Turing

Toward comparison table generation

Resources

Versus Source Code

Versus is implemented in Java. By default, it uses the Jena library to query the public SPARQL endpoint of Wikidata. Versus was run in an environment configured as follows:

GitHub repository

Comparison Feature Benchmark (CFB)

Because comparison table generation is a new problem, we developed a benchmark, the Comparison Feature Benchmark (CFB), to assess the quality of comparison features. We briefly explain how the candidate comparison features were selected from Wikidata and how their relevance was judged manually.

Phase 1: Construction of potential comparison features

This phase constructs a set of candidate features for comparison tables:

  1. We randomly draw 1k types from Wikidata (each with between 1k and 10k instances). This random sample is meant to cover a wide variety of entities (persons, places, objects, events, and so on) so as to best reflect the diversity of Wikidata.
  2. We select the two most popular entities of each type (based on in-degree ranking).
  3. We consider all the direct properties of Wikidata for these entities as candidate comparison features.
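
The three steps above can be sketched in Java as follows. This is only an illustrative sketch, not the Versus code: it works on a hypothetical in-memory map from type to instances instead of querying Wikidata, and the names WikidataEntity, CandidateFeatureSketch, sampleTypes, topTwoByInDegree and candidateFeatures are assumptions introduced for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Hypothetical in-memory stand-in for a Wikidata entity (not part of Versus). */
final class WikidataEntity {
    final String id;                    // entity identifier
    final int inDegree;                 // number of incoming links, used as popularity
    final Set<String> directProperties; // identifiers of the entity's direct properties

    WikidataEntity(String id, int inDegree, Set<String> directProperties) {
        this.id = id;
        this.inDegree = inDegree;
        this.directProperties = directProperties;
    }
}

final class CandidateFeatureSketch {

    /** Step 1: randomly draw up to sampleSize types whose instance count is in [minInstances, maxInstances]. */
    static List<String> sampleTypes(Map<String, List<WikidataEntity>> instancesByType,
                                    int minInstances, int maxInstances,
                                    int sampleSize, Random rng) {
        List<String> eligible = new ArrayList<>();
        for (Map.Entry<String, List<WikidataEntity>> entry : instancesByType.entrySet()) {
            int count = entry.getValue().size();
            if (count >= minInstances && count <= maxInstances) {
                eligible.add(entry.getKey());
            }
        }
        Collections.sort(eligible);         // fixed order so a seeded shuffle is reproducible
        Collections.shuffle(eligible, rng); // random draw without replacement
        return new ArrayList<>(eligible.subList(0, Math.min(sampleSize, eligible.size())));
    }

    /** Step 2: keep the two entities with the highest in-degree. */
    static List<WikidataEntity> topTwoByInDegree(List<WikidataEntity> instances) {
        List<WikidataEntity> sorted = new ArrayList<>(instances);
        sorted.sort((a, b) -> Integer.compare(b.inDegree, a.inDegree)); // descending
        return new ArrayList<>(sorted.subList(0, Math.min(2, sorted.size())));
    }

    /** Step 3: the union of the direct properties of the selected entities. */
    static Set<String> candidateFeatures(List<WikidataEntity> selected) {
        Set<String> features = new LinkedHashSet<>();
        for (WikidataEntity entity : selected) {
            features.addAll(entity.directProperties);
        }
        return features;
    }
}
```

With the figures given above, sampleTypes would be called with minInstances = 1000, maxInstances = 10000 and sampleSize = 1000, and the other two methods would then be applied to the instances of each drawn type.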

Phase 2: Manual evaluation of potential features

This phase assesses whether a human evaluator finds the above candidate features relevant. We repeat the following steps 1,275 times:

  1. We randomly draw a candidate feature.
  2. An evaluator decides whether the comparison feature is interesting or not (using this tool).
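
The evaluation loop above can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not the actual evaluation tool: the human evaluator is modelled as a Predicate, features are drawn uniformly with replacement (the procedure above does not say whether a feature can be drawn twice), and the names ManualEvaluationSketch, Judgment and evaluate are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

final class ManualEvaluationSketch {

    /** One recorded judgment: the drawn feature and the evaluator's verdict. */
    static final class Judgment {
        final String feature;
        final boolean interesting;

        Judgment(String feature, boolean interesting) {
            this.feature = feature;
            this.interesting = interesting;
        }
    }

    /**
     * Repeats the two steps `rounds` times.
     * Assumption: features are drawn uniformly with replacement.
     */
    static List<Judgment> evaluate(List<String> candidates,
                                   Predicate<String> evaluator,
                                   int rounds, Random rng) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidate features to draw from");
        }
        List<Judgment> judgments = new ArrayList<>(rounds);
        for (int i = 0; i < rounds; i++) {
            // Step 1: randomly draw a candidate feature.
            String feature = candidates.get(rng.nextInt(candidates.size()));
            // Step 2: the evaluator decides whether it is interesting.
            judgments.add(new Judgment(feature, evaluator.test(feature)));
        }
        return judgments;
    }
}
```

Calling evaluate with rounds = 1275 corresponds to the number of repetitions stated above; the returned list can then be used to count how many drawn features were judged interesting.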