varona.utils.split

Splits a VCF file into chunks, specifically for use with Varona and Nextflow.

split_vcf(vcf_path: Path, out_dir: Path, assembly: Assembly | None = None, chunk_size: int | None = None, n_chunks: int | None = None, compress: bool = True) → list[Path]

Split a VCF file into smaller chunks.

Parameters:
  • vcf_path – The path to the VCF file.

  • out_dir – The directory to save the chunks.

  • assembly – The genome assembly for the VCF file.

  • chunk_size – The number of records per chunk.

  • n_chunks – The number of chunks to split the VCF into. If set, chunk_size is ignored.

  • compress – Whether to compress the output files.

Returns:

The list of paths to the split VCF files.
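
Example:

The following is a minimal sketch of the chunking behaviour described above, not Varona's actual implementation. The function name split_vcf_sketch, the output file naming, the repetition of header lines in every chunk, and the rounding used for n_chunks are all assumptions made for illustration. The assembly and compress parameters are omitted, and the sketch writes uncompressed text only.

```python
from __future__ import annotations

import math
from pathlib import Path


def split_vcf_sketch(
    vcf_path: Path,
    out_dir: Path,
    chunk_size: int | None = None,
    n_chunks: int | None = None,
) -> list[Path]:
    """Illustrative stand-in; NOT varona.utils.split.split_vcf."""
    lines = vcf_path.read_text().splitlines(keepends=True)
    # VCF meta-information and column header lines start with '#'.
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    if n_chunks is not None:
        # As documented, n_chunks takes precedence over chunk_size.
        # Assumption: round up, so fewer than n_chunks files may be written.
        chunk_size = max(1, math.ceil(len(records) / n_chunks))
    elif chunk_size is None or chunk_size < 1:
        raise ValueError("provide a positive chunk_size or n_chunks")

    out_dir.mkdir(parents=True, exist_ok=True)
    paths: list[Path] = []
    for i, start in enumerate(range(0, len(records), chunk_size)):
        # Hypothetical naming scheme for the chunk files.
        path = out_dir / f"{vcf_path.stem}.chunk{i:04d}.vcf"
        # Assumption: every chunk repeats the full header so it parses alone.
        path.write_text("".join(header + records[start:start + chunk_size]))
        paths.append(path)
    return paths
```

Repeating the header in each chunk keeps every output file a self-contained VCF, which matters when each chunk is handed to a separate pipeline task.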