VariantKey 5.4.1
Numerical Encoding for Human Genetic Variants
Functions to read VariantKey-rsID binary files.
Data Structures | |
| struct | rsidvar_cols_t |
Typedefs | |
| typedef struct rsidvar_cols_t | rsidvar_cols_t |
Functions | |
| static void | mmap_vkrs_file (const char *file, mmfile_t *mf, rsidvar_cols_t *cvr) |
| static void | mmap_rsvk_file (const char *file, mmfile_t *mf, rsidvar_cols_t *crv) |
| static uint64_t | find_rv_variantkey_by_rsid (rsidvar_cols_t crv, uint64_t *first, uint64_t last, uint32_t rsid) |
| static uint64_t | get_next_rv_variantkey_by_rsid (rsidvar_cols_t crv, uint64_t *pos, uint64_t last, uint32_t rsid) |
| static uint32_t | find_vr_rsid_by_variantkey (rsidvar_cols_t cvr, uint64_t *first, uint64_t last, uint64_t vk) |
| static uint32_t | get_next_vr_rsid_by_variantkey (rsidvar_cols_t cvr, uint64_t *pos, uint64_t last, uint64_t vk) |
| static uint32_t | find_vr_chrompos_range (rsidvar_cols_t cvr, uint64_t *first, uint64_t *last, uint8_t chrom, uint32_t pos_min, uint32_t pos_max) |
The functions provided here allow fast lookup of rsID and VariantKey values in binary files made of adjacent constant-length binary blocks sorted in ascending order.
rsvk.bin: Lookup table to retrieve VariantKey from rsID. This binary file can be generated from a TSV file by the `resources/tools/rsvk.sh` script. The file can alternatively be in Apache Arrow File format with a single RecordBatch, or in Feather format. The first column must contain the rsID values sorted in ascending order.
vkrs.bin: Lookup table to retrieve rsID from VariantKey. This binary file can be generated from a TSV file by the `resources/tools/vkrs.sh` script. The file can alternatively be in Apache Arrow File format with a single RecordBatch, or in Feather format. The first column must contain the VariantKey values sorted in ascending order.
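Because every search relies on the first column being in ascending order, it can be worth checking that property before trusting a lookup table. The following is a minimal sketch, not part of the library: `example_is_sorted_u32` is a hypothetical helper that works on an rsID column already loaded in memory. Repeated values are accepted, since one rsID may map to several VariantKeys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a library function): returns 1 when the column
   is in non-decreasing order, 0 otherwise. Equal neighbours are allowed. */
static int example_is_sorted_u32(const uint32_t *col, size_t n)
{
    assert(col != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
    {
        if (col[i - 1] > col[i])
        {
            return 0; /* out-of-order pair found */
        }
    }
    return 1;
}
```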
| typedef struct rsidvar_cols_t rsidvar_cols_t |
Struct containing the RSVK or VKRS memory mapped file column info.
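| static uint64_t | find_rv_variantkey_by_rsid (rsidvar_cols_t crv, uint64_t *first, uint64_t last, uint32_t rsid) | inline, static |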
Searches for the specified rsID and returns the first matching VariantKey in the RV file.
| crv | Structure containing the pointers to the RSVK memory mapped file columns (rsvk.bin). |
| first | Pointer to the first element of the range to search (min value = 0). This will hold the position of the first record found. |
| last | Index at which the search ends, exclusive (max value = nitems). |
| rsid | rsID to search. |
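The first-occurrence lookup described above can be illustrated with a lower-bound binary search over two parallel columns. This is a sketch under stated assumptions rather than the library's implementation: `example_rv_cols_t` is a hypothetical stand-in for `rsidvar_cols_t`, and returning 0 with `*first` left unchanged when nothing matches is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rsidvar_cols_t: two parallel columns. */
typedef struct example_rv_cols_t
{
    const uint64_t *vk; /* VariantKey column */
    const uint32_t *rs; /* rsID column, sorted ascending */
    uint64_t nrows;     /* number of rows in both columns */
} example_rv_cols_t;

/* Lower-bound binary search for rsid in rows [*first, last).
   On a match, stores the row index in *first and returns the VariantKey.
   On a miss, returns 0 and leaves *first unchanged (example convention). */
static uint64_t example_find_vk_by_rsid(example_rv_cols_t c, uint64_t *first,
                                        uint64_t last, uint32_t rsid)
{
    assert(first != NULL && *first <= last && last <= c.nrows);
    uint64_t lo = *first;
    uint64_t hi = last;
    while (lo < hi)
    {
        uint64_t mid = lo + ((hi - lo) / 2);
        if (c.rs[mid] < rsid)
        {
            lo = mid + 1;
        }
        else
        {
            hi = mid;
        }
    }
    if ((lo < last) && (c.rs[lo] == rsid))
    {
        *first = lo;
        return c.vk[lo];
    }
    return 0;
}
```

Because the search always converges on the lowest matching index, rows that repeat the same rsID are found in order by scanning forward from `*first`.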
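| static uint32_t | find_vr_chrompos_range (rsidvar_cols_t cvr, uint64_t *first, uint64_t *last, uint8_t chrom, uint32_t pos_min, uint32_t pos_max) | inline, static |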
Searches for the specified CHROM-POS range and returns the first matching rsID in the VR file.
| cvr | Structure containing the pointers to the VKRS memory mapped file columns (vkrs.bin). |
| first | Pointer to the first element of the range to search (min value = 0). |
| last | Pointer to the index at which the search ends, exclusive (max value = nitems). |
| chrom | Chromosome encoded number. |
| pos_min | Start reference position, with the first base having position 0. |
| pos_max | End reference position, with the first base having position 0. |
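A range query of this kind is possible because a VariantKey stores CHROM in its 5 most significant bits, POS in the next 28 bits, and the REF+ALT code in the low 31 bits, so keys sorted numerically are also sorted by chromosome and position. The sketch below is illustrative only: the function names are hypothetical, `pos_max` is treated as inclusive, and the narrowed range is reported as half-open `[*first, *last)`, which may differ from the library's convention.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest key for a given CHROM and POS (REF+ALT bits all zero). */
static uint64_t example_chrompos_key(uint8_t chrom, uint32_t pos)
{
    assert(chrom < 32 && pos < (UINT32_C(1) << 28));
    return ((uint64_t)chrom << 59) | ((uint64_t)pos << 31);
}

/* Narrows [*first, *last) to the rows whose key falls on chrom with
   pos_min <= POS <= pos_max. Returns the number of rows in the result. */
static uint64_t example_chrompos_range(const uint64_t *vk, uint64_t *first,
                                       uint64_t *last, uint8_t chrom,
                                       uint32_t pos_min, uint32_t pos_max)
{
    assert(vk != 0 && first != 0 && last != 0 && *first <= *last);
    const uint64_t kmin = example_chrompos_key(chrom, pos_min);
    /* Largest key at pos_max: set all 31 REF+ALT bits. */
    const uint64_t kmax = example_chrompos_key(chrom, pos_max) | UINT64_C(0x7FFFFFFF);
    uint64_t lo = *first;
    uint64_t hi = *last;
    while (lo < hi) /* first row with key >= kmin */
    {
        uint64_t mid = lo + ((hi - lo) / 2);
        if (vk[mid] < kmin) { lo = mid + 1; } else { hi = mid; }
    }
    const uint64_t begin = lo;
    hi = *last;
    while (lo < hi) /* first row with key > kmax */
    {
        uint64_t mid = lo + ((hi - lo) / 2);
        if (vk[mid] <= kmax) { lo = mid + 1; } else { hi = mid; }
    }
    *first = begin;
    *last = lo;
    return lo - begin;
}
```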
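| static uint32_t | find_vr_rsid_by_variantkey (rsidvar_cols_t cvr, uint64_t *first, uint64_t last, uint64_t vk) | inline, static |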
Searches for the specified VariantKey and returns the first matching rsID in the VR file.
| cvr | Structure containing the pointers to the VKRS memory mapped file columns (vkrs.bin). |
| first | Pointer to the first element of the range to search (min value = 0). This will hold the position of the first record found. |
| last | Index at which the search ends, exclusive (max value = nitems). |
| vk | VariantKey. |
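| static uint64_t | get_next_rv_variantkey_by_rsid (rsidvar_cols_t crv, uint64_t *pos, uint64_t last, uint32_t rsid) | inline, static |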
Get the next VariantKey for the specified rsID in the RV file. This function should be used after find_rv_variantkey_by_rsid. This function can be called in a loop to get all VariantKeys that are associated with the same rsID (if any).
| crv | Structure containing the pointers to the RSVK memory mapped file columns (rsvk.bin). |
| pos | Pointer to the current item. This will hold the position of the next record. |
| last | Index at which the search ends, exclusive (max value = nitems). |
| rsid | rsID to search. |
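The find-then-iterate pattern described above can be sketched as a single helper that locates the first row for an rsID and then walks forward while the rsID stays the same. `example_collect_vk_by_rsid` is a hypothetical function written for this illustration; it works on plain arrays and does not call the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies every VariantKey paired with rsid into out
   (at most out_cap entries) and returns how many were copied.
   rs must be sorted ascending; rs and vk are parallel columns of nrows rows. */
static size_t example_collect_vk_by_rsid(const uint32_t *rs, const uint64_t *vk,
                                         uint64_t nrows, uint32_t rsid,
                                         uint64_t *out, size_t out_cap)
{
    assert(rs != NULL && vk != NULL && out != NULL);
    uint64_t lo = 0;
    uint64_t hi = nrows;
    while (lo < hi) /* lower bound: first row with rs >= rsid */
    {
        uint64_t mid = lo + ((hi - lo) / 2);
        if (rs[mid] < rsid) { lo = mid + 1; } else { hi = mid; }
    }
    size_t n = 0;
    /* Step through consecutive rows that still carry the same rsID. */
    for (uint64_t pos = lo; (pos < nrows) && (rs[pos] == rsid) && (n < out_cap); pos++)
    {
        out[n++] = vk[pos];
    }
    return n;
}
```

The loop stops at the first row whose rsID differs, which is sufficient because equal rsIDs are adjacent in a sorted column.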
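| static uint32_t | get_next_vr_rsid_by_variantkey (rsidvar_cols_t cvr, uint64_t *pos, uint64_t last, uint64_t vk) | inline, static |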
Get the next rsID for the specified VariantKey in the VR file. This function should be used after find_vr_rsid_by_variantkey. This function can be called in a loop to get all rsIDs that are associated with the same VariantKey (if any).
| cvr | Structure containing the pointers to the VKRS memory mapped file columns (vkrs.bin). |
| pos | Pointer to the current item. This will hold the position of the next record. |
| last | Index at which the search ends, exclusive (max value = nitems). |
| vk | VariantKey. |
| static void | mmap_rsvk_file (const char *file, mmfile_t *mf, rsidvar_cols_t *crv) | inline, static |
Memory map the RSVK binary file.
| file | Path to the file to map. |
| mf | Structure containing the memory mapped file. |
| crv | Structure containing the pointers to the RSVK memory mapped file columns. |
| static void | mmap_vkrs_file (const char *file, mmfile_t *mf, rsidvar_cols_t *cvr) | inline, static |
Memory map the VKRS binary file.
| file | Path to the file to map. |
| mf | Structure containing the memory mapped file. |
| cvr | Structure containing the pointers to the VKRS memory mapped file columns. |
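For readers unfamiliar with memory mapping, the general technique behind both mapping functions can be sketched with the POSIX `mmap` call. This is an assumption-laden illustration, not the library's code: `example_mapped_t`, `example_map_file` and `example_unmap_file` are hypothetical names, the real `mmfile_t` layout is not reproduced, and no file header or column offsets are parsed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical descriptor of a read-only mapping (not the library's mmfile_t). */
typedef struct example_mapped_t
{
    void *src;   /* start of the mapped bytes, or NULL when unmapped */
    size_t size; /* length of the mapping in bytes */
} example_mapped_t;

/* Maps the whole file read-only. Returns 0 on success, -1 on failure. */
static int example_map_file(const char *file, example_mapped_t *m)
{
    assert(file != NULL && m != NULL);
    m->src = NULL;
    m->size = 0;
    int fd = open(file, O_RDONLY);
    if (fd < 0)
    {
        return -1;
    }
    struct stat st;
    if ((fstat(fd, &st) != 0) || (st.st_size <= 0))
    {
        close(fd);
        return -1;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
    {
        return -1;
    }
    m->src = p;
    m->size = (size_t)st.st_size;
    return 0;
}

/* Releases the mapping. Returns 0 on success, -1 on failure. */
static int example_unmap_file(example_mapped_t *m)
{
    assert(m != NULL);
    if (m->src == NULL)
    {
        return 0; /* nothing mapped */
    }
    int rc = munmap(m->src, m->size);
    m->src = NULL;
    m->size = 0;
    return rc;
}
```

Once a file is mapped, column pointers can refer directly into the mapped region, so lookups read the table in place without copying it into separately allocated memory.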