Use qEndpoint to index a dataset
To index a dataset, first start the endpoint. For example, download the jar of the latest qEndpoint release and run it with these commands:
wget https://github.com/the-qa-company/qEndpoint/releases/latest/download/qendpoint.jar
java -Xmx10G -jar qendpoint.jar
The -Xmx10G option sets the maximum Java heap size to 10 GB. 10G is the minimum for the wikidata-all dataset; 6G is enough for the wikidata-truthy dataset (see the difference).
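The heap sizes above can be wrapped in a small helper so the right value is picked per dataset. This is a minimal sketch, not part of qEndpoint: the function names `heap_for_dataset` and `start_endpoint` and the dataset labels `all` and `truthy` are hypothetical, and it assumes qendpoint.jar sits in the current directory.

```shell
#!/bin/sh
# Map a dataset label to the heap size quoted in this guide.
# The labels "all" and "truthy" are illustrative, not qEndpoint options.
heap_for_dataset() {
  case "$1" in
    all)    echo "10G" ;;
    truthy) echo "6G" ;;
    *)      echo "unknown dataset: $1" >&2; return 1 ;;
  esac
}

# Start the endpoint with the heap size chosen for the given dataset label.
start_endpoint() {
  heap="$(heap_for_dataset "$1")" || return 1
  java "-Xmx${heap}" -jar qendpoint.jar
}
```

For example, `start_endpoint truthy` would launch the endpoint with `-Xmx6G`.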
Once the endpoint is started, you need to send it your dataset. We will use the Wikidata-latest dataset (all) as an example, but you can also use the truthy dataset if you don't want all the statements.
To send your dataset, you can use this command:
curl "http://127.0.0.1:1234/api/endpoint/load" -F "file=@latest-all.nt.gz"
qEndpoint supports many RDF file formats, with gz, bz and xz compression. Here, latest-all.nt.gz is our file and 127.0.0.1:1234 is the endpoint location.
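The upload step can be scripted so that a missing file or an HTTP error is caught early. The sketch below is illustrative only: the helper names `load_url` and `upload_dataset` and the `ENDPOINT` variable are hypothetical, and the multipart field name `file` is an assumption that should be checked against your qEndpoint version.

```shell
#!/bin/sh
# Base URL of the running endpoint; override with the ENDPOINT environment variable.
ENDPOINT="${ENDPOINT:-http://127.0.0.1:1234}"

# Build the load URL from a base URL, dropping one trailing slash if present.
load_url() {
  printf '%s/api/endpoint/load' "${1%/}"
}

# Upload a dataset file; return non-zero if the file is missing or the upload fails.
upload_dataset() {
  dataset="$1"
  if [ ! -f "$dataset" ]; then
    echo "dataset not found: $dataset" >&2
    return 1
  fi
  # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
  # The form field name "file" is an assumption.
  curl --fail "$(load_url "$ENDPOINT")" -F "file=@${dataset}"
}
```

With the endpoint running, `upload_dataset latest-all.nt.gz` sends the file and reports failure through its exit status.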
Wait for your dataset to be indexed. Once indexing completes, you have a fully functional SPARQL endpoint.
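To check that the endpoint answers queries, you can send it a small SPARQL query. This is a hedged sketch: the query path `/api/endpoint/sparql` is an assumption modeled on the load path used in this guide, and the helper names `sparql_url` and `run_query` are hypothetical. Verify the path against your qEndpoint version before relying on it.

```shell
#!/bin/sh
# Base URL of the running endpoint; override with the ENDPOINT environment variable.
ENDPOINT="${ENDPOINT:-http://127.0.0.1:1234}"

# Build the query URL. The /api/endpoint/sparql path is an assumption.
sparql_url() {
  printf '%s/api/endpoint/sparql' "${1%/}"
}

# Send a SPARQL query as a URL-encoded form field and ask for JSON results.
run_query() {
  curl --fail -H 'Accept: application/sparql-results+json' \
    "$(sparql_url "$ENDPOINT")" --data-urlencode "query=$1"
}
```

For example, `run_query 'SELECT * WHERE { ?s ?p ?o } LIMIT 10'` should return ten triples as JSON once indexing has finished.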