Skip to content

Querying VCF files with SPARQL

License

Apache-2.0, Unknown licenses found

Licenses found

Apache-2.0
LICENSE
Unknown
LICENSE.md
Notifications You must be signed in to change notification settings

simonjupp/sparql-vcf

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

33 Commits
 
 
 
 
 
 
 
 

Repository files navigation

sparql-vcf

Some code written at biohackathon 2014 to explore methods for querying data in a VCF file using SPARQL. These approaches are based on a methodology for creating triples "on-the-fly" using dedictated indexes to optimise execution times.

Implementation 1. is based on Jerven's sparql-bed (https://github.com/JervenBolleman/sparql-bed) code that uses Sesame API.

Implementation 2. is something I wrote with Jena to find out how property functions work. The idea here is that for a specific use-case, like querying for variants by chromomse coordiantes in a VCF file, we can introduce a special property that executes some code to run the query. The idea would be a hybrid approach to "on-the-fly" where you have most of the triples in a regular triple store, but the special predicates is used to optimize queries similar to how existing triple stores support lucene search. Not entirely sure this is the best approach, but it was fun playing wih it.

Example query get all variants on chromsome Y in range 2600000:2700000


PREFIX otf:<http://onthefly.com/>
SELECT * WHERE {
 ?feature otf:vcf-chromo-query \"Y,2600000,2700000\" .
}

About

Querying VCF files with SPARQL

Resources

License

Apache-2.0, Unknown licenses found

Licenses found

Apache-2.0
LICENSE
Unknown
LICENSE.md

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published