divbrowse.lib.genotype_data
Module Contents
Classes
GenotypeData | Class for managing all genotype data related data structures and methods
- class divbrowse.lib.genotype_data.GenotypeData(config)
Class for managing all genotype data related data structures and methods
- _load_data()
- get_vcf_header()
- _setup_sample_id_mapping()
- _create_chrom_indices()
- _create_list_of_chromosomes()
- _free_mem()
- sample_ids_to_mask(sample_ids: list) → numpy.ndarray
Creates a boolean mask over the samples array of the Zarr storage, marking the entries that match the input sample IDs
- Parameters
sample_ids (list) – List with sample IDs
- Returns
Boolean mask, True for found sample IDs
- Return type
numpy.ndarray
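The following is a minimal sketch of how such a mask can be built, not the library's actual implementation. It assumes the samples array is available as a NumPy array of strings; the function name `sample_ids_to_mask_sketch` and the sample IDs are hypothetical.

```python
import numpy as np


def sample_ids_to_mask_sketch(stored_samples, sample_ids):
    """Hypothetical stand-in: True where a stored sample ID is in sample_ids."""
    # np.isin tests each element of stored_samples for membership in sample_ids.
    return np.isin(stored_samples, sample_ids)


# Assumed stand-in for the samples array of the Zarr storage.
stored_samples = np.array(["S1", "S2", "S3", "S4"])

# "S9" is not stored, so it has no effect on the mask.
mask = sample_ids_to_mask_sketch(stored_samples, ["S2", "S4", "S9"])
print(mask)
```

The mask has one entry per stored sample, so it can be used directly to select the matching columns of a genotype array.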
- map_input_sample_ids_to_vcf_sample_ids(sample_ids: list) → list
Map input sample IDs to VCF sample IDs according to the configured mapping table
- Parameters
sample_ids (list) – List with sample IDs
- Returns
List of mapped sample IDs
- Return type
list
- map_vcf_sample_ids_to_input_sample_ids(sample_ids: list) → list
Map VCF sample IDs to input sample IDs according to the configured mapping table
- Parameters
sample_ids (list) – List with sample IDs
- Returns
List of mapped sample IDs
- Return type
list
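The two mapping methods are inverses of each other. The sketch below illustrates the idea with a plain dictionary; the mapping table contents, the helper `map_ids`, and the choice to pass unmapped IDs through unchanged are all assumptions made for illustration, not documented behaviour of the library.

```python
def map_ids(sample_ids, mapping):
    """Hypothetical helper: translate each ID via mapping.

    Assumption: IDs missing from the mapping table are returned unchanged.
    """
    return [mapping.get(sample_id, sample_id) for sample_id in sample_ids]


# Assumed mapping table from input sample IDs to VCF sample IDs.
input_to_vcf = {"acc_001": "SAMPLE_A", "acc_002": "SAMPLE_B"}

# The reverse direction is the inverted dictionary.
vcf_to_input = {vcf_id: input_id for input_id, vcf_id in input_to_vcf.items()}

mapped = map_ids(["acc_001", "acc_002"], input_to_vcf)
restored = map_ids(mapped, vcf_to_input)
print(mapped, restored)
```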
- get_samples_mask(sample_ids)
Returns a tuple consisting of a boolean mask for found sample IDs and a list of mapped sample IDs
- Parameters
sample_ids (list) – List with sample IDs
- Returns
Tuple of the boolean mask (True for found sample IDs) and the list of mapped sample IDs
- Return type
Tuple[numpy.ndarray, list]
- get_posidx_by_genome_coordinate(chrom, pos, method='nearest') → Tuple[int, str]
Returns the array coordinate for a given physical position on a given chromosome
- Parameters
chrom (str) – ID of the chromosome
pos (int) – Physical position on the chromosome
- Returns
lookup (int): Array coordinate of the found physical position on the chromosome
lookup_type (str): Type of the lookup, either ‘direct_lookup’ or ‘nearest_lookup’
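A direct-or-nearest lookup of this kind can be sketched with a binary search over a sorted list of positions. This is an illustration under stated assumptions, not the library's code: the function `posidx_sketch` is hypothetical, it works on a single chromosome's sorted positions, and breaking ties towards the lower position is an arbitrary choice.

```python
from bisect import bisect_left


def posidx_sketch(positions, pos):
    """Hypothetical sketch: index of pos in sorted positions, or of its nearest neighbour."""
    if not positions:
        raise ValueError("positions must not be empty")
    i = bisect_left(positions, pos)
    # Exact hit: the position itself is stored.
    if i < len(positions) and positions[i] == pos:
        return i, "direct_lookup"
    # pos lies before the first or after the last stored position.
    if i == 0:
        return 0, "nearest_lookup"
    if i == len(positions):
        return len(positions) - 1, "nearest_lookup"
    # pos lies between two stored positions: pick the closer one (ties go to the lower).
    before, after = positions[i - 1], positions[i]
    nearest = i - 1 if pos - before <= after - pos else i
    return nearest, "nearest_lookup"


# Assumed variant positions on one chromosome, sorted ascending.
positions = [100, 250, 400, 900]
print(posidx_sketch(positions, 250))
print(posidx_sketch(positions, 380))
```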
- get_posidx_by_genome_coordinates(chrom, positions, method='nearest') → Tuple[int, str]
- count_variants_in_window(chrom, startpos, endpos) → int
Counts number of variants in a genomic region
- Parameters
chrom (str) – The chromosome of the genomic region.
startpos (int) – The first position of the genomic region.
endpos (int) – The last position of the genomic region.
- Returns
Number of variants in the genomic region
- Return type
int
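Counting variants in a window reduces to two binary searches when the positions are sorted. The sketch below assumes both `startpos` and `endpos` are inclusive, as the parameter descriptions suggest; the function `count_in_window` and the example positions are hypothetical and do not reflect the library's internals.

```python
from bisect import bisect_left, bisect_right


def count_in_window(positions, startpos, endpos):
    """Hypothetical sketch: number of sorted positions p with startpos <= p <= endpos."""
    # bisect_left finds the first index with position >= startpos,
    # bisect_right finds the first index with position > endpos.
    return bisect_right(positions, endpos) - bisect_left(positions, startpos)


# Assumed variant positions on one chromosome, sorted ascending.
positions = [100, 250, 400, 900]
print(count_in_window(positions, 250, 400))
```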
- get_slice_of_variant_calls(chrom, startpos=None, endpos=None, positions=None, count=None, samples=None, variant_filter_settings=None, with_call_metadata=False, calc_summary_stats=False, flanking_region_include=False, flanking_region_length=1500, flanking_region_direction='both') → divbrowse.lib.variant_calls_slice.VariantCallsSlice