Currently the signature of variant_iter directly reflects the structure of the bcftools querying language:
def variant_iter(
    vcz,
    *,
    fields: list[str] | None = None,
    regions: str | None = None,
    targets: str | None = None,
    include: str | None = None,
    exclude: str | None = None,
    samples: list[str] | str | None = None,
):
However, we can easily imagine a future in which we also want to support other querying structures, say, something like SQL. Here, we might do something like
query = sql_query("""
    SELECT variant_position, call_genotype FROM 1kgp3.vcz
    WHERE variant_contig = 'chr1' AND sample_population = 'YRI'
""")
for var in variant_iter(query):
    # Do stuff with var
and also standard bcftools stuff like
# This is unfortunately confusing with "bcftools query", but you get the idea
query = bcftools_query(
    "1kgp3.vcz",
    fields=["variant_position", "call_genotype"],
    regions="chr1",
    samples=[...],  # list of YRI samples
)
for var in variant_iter(query):
    # Do stuff with var
We don't need to implement the SQL stuff now, but I think it would be a shame to restrict the API to the bcftools way of doing things, which is quite limiting in many ways. It would be good to keep the door open to other query styles in the future.