VariantKey
5.4.1
Numerical Encoding for Human Genetic Variants
Functions to retrieve REF and ALT values by VariantKey from a binary data file.
Data Structures | |
struct | variantkey_rev_t |
struct | nrvk_cols_t |
Macros | |
#define | ALLELE_MAXSIZE 256 |
Maximum allele length. | |
Typedefs | |
typedef struct variantkey_rev_t | variantkey_rev_t |
typedef struct nrvk_cols_t | nrvk_cols_t |
Functions | |
static void | mmap_nrvk_file (const char *file, mmfile_t *mf, nrvk_cols_t *nvc) |
static size_t | get_nrvk_ref_alt_by_pos (nrvk_cols_t nvc, uint64_t pos, char *ref, size_t *sizeref, char *alt, size_t *sizealt) |
static size_t | find_ref_alt_by_variantkey (nrvk_cols_t nvc, uint64_t vk, char *ref, size_t *sizeref, char *alt, size_t *sizealt) |
static size_t | reverse_variantkey (nrvk_cols_t nvc, uint64_t vk, variantkey_rev_t *rev) |
static size_t | get_variantkey_ref_length (nrvk_cols_t nvc, uint64_t vk) |
static uint32_t | get_variantkey_endpos (nrvk_cols_t nvc, uint64_t vk) |
static uint64_t | get_variantkey_chrom_startpos (uint64_t vk) |
Get the CHROM + START POS encoding from VariantKey. | |
static uint64_t | get_variantkey_chrom_endpos (nrvk_cols_t nvc, uint64_t vk) |
Get the CHROM + END POS encoding from VariantKey. | |
static size_t | nrvk_bin_to_tsv (nrvk_cols_t nvc, const char *tsvfile) |
The functions provided here retrieve the REF and ALT strings for a given VariantKey from a binary file.
The input binary files can be generated from a normalized VCF file using the resources/tools/vkhexbin.sh script. The VCF file can be normalized using the resources/tools/vcfnorm.sh script.
The binary file can be generated by the resources/tools/nrvk.sh script from a TSV file with the following format:
[16 BYTE VARIANTKEY HEX]\t[REF STRING]\t[ALT STRING]\n...
for example:
b800c35bbcece603	AAAAAAAAGG	AG
1800c351f61f65d3	A	AAGAAAGAAAG
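As an illustration of the TSV layout described above, the following is a minimal sketch of parsing one record into a 64-bit key plus REF and ALT strings. The `example_nrvk_record_t` type and the `example_parse_nrvk_tsv_line` function are hypothetical names introduced here and are not part of the library; the sketch also assumes that each allele is at most `ALLELE_MAXSIZE` characters long.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#ifndef ALLELE_MAXSIZE
#define ALLELE_MAXSIZE 256 /* mirrors the macro documented on this page */
#endif

/* Hypothetical record type for this sketch; not part of the library. */
typedef struct example_nrvk_record_t
{
    uint64_t vk;                  /* VariantKey parsed from 16 hex digits */
    char ref[ALLELE_MAXSIZE + 1]; /* REF allele, null-terminated */
    char alt[ALLELE_MAXSIZE + 1]; /* ALT allele, null-terminated */
} example_nrvk_record_t;

/* Parse one "HEX\tREF\tALT" line. Returns 0 on success, -1 on malformed input. */
static int example_parse_nrvk_tsv_line(const char *line, example_nrvk_record_t *rec)
{
    const char *tab1 = strchr(line, '\t');
    if ((tab1 == NULL) || ((tab1 - line) != 16))
    {
        return -1; /* the key column must be exactly 16 characters */
    }
    for (int i = 0; i < 16; i++)
    {
        if (!isxdigit((unsigned char)line[i]))
        {
            return -1; /* non-hexadecimal character in the key */
        }
    }
    const char *tab2 = strchr(tab1 + 1, '\t');
    if (tab2 == NULL)
    {
        return -1; /* missing ALT column */
    }
    char hex[17];
    memcpy(hex, line, 16);
    hex[16] = '\0';
    rec->vk = (uint64_t)strtoull(hex, NULL, 16);
    size_t reflen = (size_t)(tab2 - (tab1 + 1));
    size_t altlen = strcspn(tab2 + 1, "\r\n"); /* stop at the line terminator */
    if ((reflen == 0) || (altlen == 0) || (reflen > ALLELE_MAXSIZE) || (altlen > ALLELE_MAXSIZE))
    {
        return -1; /* empty or oversized allele */
    }
    memcpy(rec->ref, tab1 + 1, reflen);
    rec->ref[reflen] = '\0';
    memcpy(rec->alt, tab2 + 1, altlen);
    rec->alt[altlen] = '\0';
    return 0;
}
```

Passing the first example record to this function would fill `rec.vk` with `0xb800c35bbcece603` and copy `AAAAAAAAGG` and `AG` into the two allele buffers.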
#define ALLELE_MAXSIZE 256 |
typedef struct nrvk_cols_t nrvk_cols_t |
Struct containing the NRVK memory mapped file column info.
typedef struct variantkey_rev_t variantkey_rev_t |
VariantKey decoded struct
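static size_t find_ref_alt_by_variantkey (nrvk_cols_t nvc, uint64_t vk, char *ref, size_t *sizeref, char *alt, size_t *sizealt) | inline static |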
Retrieve the REF and ALT strings for the specified VariantKey.
nvc | Structure containing the pointers to the memory mapped file columns. |
vk | VariantKey to search. |
ref | REF string buffer to be returned. |
sizeref | Pointer to the size of the ref buffer, excluding the terminating null byte. This will contain the final ref size. |
alt | ALT string buffer to be returned. |
sizealt | Pointer to the size of the alt buffer, excluding the terminating null byte. This will contain the final alt size. |
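The in/out size convention of `sizeref` and `sizealt` can be sketched as follows. This is not the library's implementation: it assumes the keys are stored in ascending order and can be binary searched, it replaces the memory-mapped columns with a hypothetical in-memory `example_table_t`, and it assumes a return value equal to the combined REF and ALT length, with 0 meaning "not found". Each caller buffer must hold at least `*size + 1` bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory table standing in for the memory-mapped columns. */
typedef struct example_table_t
{
    const uint64_t *vk;     /* VariantKeys sorted in ascending order (assumption) */
    const char *const *ref; /* REF string for each row */
    const char *const *alt; /* ALT string for each row */
    size_t nrows;           /* number of rows */
} example_table_t;

/* Copy src into dst, truncating to *size characters.
 * On return *size holds the number of characters actually copied. */
static void example_copy_allele(char *dst, size_t *size, const char *src)
{
    size_t len = strlen(src);
    if (len > *size)
    {
        len = *size;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    *size = len;
}

/* Binary search for vk; returns REF+ALT length, or 0 if vk is not in the table. */
static size_t example_find_ref_alt(const example_table_t *t, uint64_t vk,
                                   char *ref, size_t *sizeref, char *alt, size_t *sizealt)
{
    size_t lo = 0;
    size_t hi = t->nrows;
    while (lo < hi)
    {
        size_t mid = lo + ((hi - lo) / 2);
        if (t->vk[mid] < vk)
        {
            lo = mid + 1;
        }
        else
        {
            hi = mid;
        }
    }
    if ((lo == t->nrows) || (t->vk[lo] != vk))
    {
        return 0; /* not found */
    }
    example_copy_allele(ref, sizeref, t->ref[lo]);
    example_copy_allele(alt, sizealt, t->alt[lo]);
    return (*sizeref + *sizealt);
}
```

A caller would typically set both sizes to the buffer capacity minus one before each call, because the function overwrites them with the final string lengths.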
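static size_t get_nrvk_ref_alt_by_pos (nrvk_cols_t nvc, uint64_t pos, char *ref, size_t *sizeref, char *alt, size_t *sizealt) | inline static |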
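static uint64_t get_variantkey_chrom_endpos (nrvk_cols_t nvc, uint64_t vk) | inline static |
Get the CHROM + END POS encoding from VariantKey.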
nvc | Structure containing the pointers to the memory mapped file columns. |
vk | VariantKey code. |
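static uint64_t get_variantkey_chrom_startpos (uint64_t vk) | inline static |
Get the CHROM + START POS encoding from VariantKey.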
vk | VariantKey code. |
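To make the CHROM + START POS encoding listed in the function summary more concrete, here is a small sketch. It assumes the usual VariantKey bit layout of 5 bits for CHROM, 28 bits for POS and 31 bits for REF+ALT, which this page does not state. The `example_` functions are hypothetical helpers, not library functions.

```c
#include <stdint.h>

/* Assumed layout (most significant bit first):
 * [5-bit CHROM][28-bit POS][31-bit REF+ALT] */
#define EXAMPLE_SHIFT_POS 31
#define EXAMPLE_SHIFT_CHROM 59
#define EXAMPLE_MASK_POS28 0x0FFFFFFFULL

/* Drop the 31 REF+ALT bits, leaving CHROM and POS as one 33-bit number. */
static uint64_t example_chrom_startpos(uint64_t vk)
{
    return (vk >> EXAMPLE_SHIFT_POS);
}

/* Numeric chromosome code stored in the top 5 bits. */
static uint8_t example_chrom(uint64_t vk)
{
    return (uint8_t)(vk >> EXAMPLE_SHIFT_CHROM);
}

/* Start position stored in the 28 bits below CHROM. */
static uint32_t example_startpos(uint64_t vk)
{
    return (uint32_t)((vk >> EXAMPLE_SHIFT_POS) & EXAMPLE_MASK_POS28);
}
```

Under this assumed layout the combined value sorts by chromosome first and by position second, which is convenient for range comparisons.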
static uint32_t get_variantkey_endpos (nrvk_cols_t nvc, uint64_t vk) | inline static |
Get the VariantKey end position (POS + REF length).
nvc | Structure containing the pointers to the memory mapped file columns. |
vk | VariantKey. |
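The "POS + REF length" arithmetic can be sketched as below. The REF length is passed in directly instead of being looked up in the binary file, and the 5/28/31-bit layout of the key is an assumption not stated on this page. Both `example_` functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* End position = start position (assumed to be bits 31..58 of the key)
 * plus the length of the REF allele. */
static uint32_t example_endpos(uint64_t vk, size_t ref_len)
{
    uint32_t pos = (uint32_t)((vk >> 31) & 0x0FFFFFFFULL);
    return (pos + (uint32_t)ref_len);
}

/* CHROM + END POS packed in the same way as CHROM + START POS:
 * the 5-bit chromosome code sits above a 28-bit position field. */
static uint64_t example_chrom_endpos(uint64_t vk, size_t ref_len)
{
    uint64_t chrom = (vk >> 59);
    return ((chrom << 28) | (uint64_t)example_endpos(vk, ref_len));
}
```

For the example record `b800c35bbcece603` with REF `AAAAAAAAGG` (10 bases), this sketch yields a start position of 100023 and an end position of 100033.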
static size_t get_variantkey_ref_length (nrvk_cols_t nvc, uint64_t vk) | inline static |
Retrieve the REF length for the specified VariantKey.
nvc | Structure containing the pointers to the memory mapped file columns. |
vk | VariantKey. |
static void mmap_nrvk_file (const char *file, mmfile_t *mf, nrvk_cols_t *nvc) | inline static |
Memory map the NRVK binary file.
file | Path to the file to map. |
mf | Structure containing the memory mapped file. |
nvc | Structure containing the pointers to the memory mapped file columns. |
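For readers unfamiliar with memory mapping, the following POSIX sketch shows the general technique of mapping a file read-only. It does not reproduce `mmap_nrvk_file`: the `example_mapped_file_t` type and both functions are hypothetical, and locating the individual columns inside the mapping is left out because the binary layout is not described on this page.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for the library's mapped-file structure. */
typedef struct example_mapped_file_t
{
    const uint8_t *src; /* start of the mapped bytes */
    size_t size;        /* file size in bytes */
    int fd;             /* file descriptor kept open while mapped */
} example_mapped_file_t;

/* Map the whole file read-only. Returns 0 on success, -1 on failure. */
static int example_map_file(const char *path, example_mapped_file_t *mf)
{
    mf->src = NULL;
    mf->size = 0;
    mf->fd = open(path, O_RDONLY);
    if (mf->fd < 0)
    {
        return -1;
    }
    struct stat st;
    if ((fstat(mf->fd, &st) != 0) || (st.st_size <= 0))
    {
        close(mf->fd);
        mf->fd = -1;
        return -1; /* cannot stat, or empty file */
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, mf->fd, 0);
    if (p == MAP_FAILED)
    {
        close(mf->fd);
        mf->fd = -1;
        return -1;
    }
    mf->src = (const uint8_t *)p;
    mf->size = (size_t)st.st_size;
    return 0;
}

/* Release the mapping and close the descriptor. Returns 0 on success. */
static int example_unmap_file(example_mapped_file_t *mf)
{
    int err = munmap((void *)mf->src, mf->size);
    err |= close(mf->fd);
    mf->src = NULL;
    mf->size = 0;
    mf->fd = -1;
    return (err == 0) ? 0 : -1;
}
```

Mapping read-only means lookups touch only the pages they need, so a large file does not have to be loaded into memory up front.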
static size_t nrvk_bin_to_tsv (nrvk_cols_t nvc, const char *tsvfile) | inline static |
Convert a vrnr.bin file to a simple TSV. For the reverse operation see the resources/tools/nrvk.sh script.
nvc | Structure containing the pointers to the memory mapped file columns. |
tsvfile | Output tsv file name. NOTE: existing files will be replaced. |
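The output side of the conversion can be illustrated by formatting a single record as a TSV line. This is only a sketch of the line format described for the input TSV (16 hexadecimal digits, REF, ALT, separated by tabs); `example_format_tsv_line` is a hypothetical helper and says nothing about how the library iterates over the binary file.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write "HEX\tREF\tALT\n" into buf.
 * Returns the number of characters written (excluding the null byte),
 * or -1 if the buffer is too small or formatting fails. */
static int example_format_tsv_line(char *buf, size_t bufsize, uint64_t vk,
                                   const char *ref, const char *alt)
{
    /* %016 PRIx64 pads the key with leading zeros to 16 lowercase hex digits */
    int n = snprintf(buf, bufsize, "%016" PRIx64 "\t%s\t%s\n", vk, ref, alt);
    if ((n < 0) || ((size_t)n >= bufsize))
    {
        return -1;
    }
    return n;
}
```

Zero-padding matters here: a key such as `1800c351f61f65d3` already fills 16 digits, but keys with a small chromosome code would otherwise lose their leading zeros and no longer match the fixed-width format.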
static size_t reverse_variantkey (nrvk_cols_t nvc, uint64_t vk, variantkey_rev_t *rev) | inline static |
Reverse a VariantKey code and return the normalized components as a variantkey_rev_t structure.
nvc | Structure containing the pointers to the memory mapped file columns. |
vk | VariantKey code. |
rev | Structure containing the return values. |