
RaPID-Query for Fast Identity by Descent Search and Genealogical Analysis

Abstract: Genetic databases have grown large enough that genetic genealogical search, the process of inferring familial relatedness by identifying DNA matches, has become a viable way to help individuals find missing family members or law enforcement agencies locate suspects. However, a fast and accurate method is needed to search an out-of-database individual against the millions of individuals in such databases. Most existing approaches offer only all-vs-all matching within a panel. Some prototype algorithms offer 1-vs-all queries for an out-of-panel individual, but they do not tolerate errors. A new method, the random projection-based identical-by-descent (IBD) detection (RaPID) query, referred to as RaPID-Query, is introduced to make fast genealogical search possible. RaPID-Query identifies IBD segments between a query haplotype and a panel of haplotypes. By integrating matches over multiple positional Burrows-Wheeler transform (PBWT) indexes, RaPID-Query quickly locates IBD segments that meet a given length cutoff while allowing mismatched sites within those segments. A single query against all UK Biobank autosomal chromosomes can be completed within 2.76 seconds of CPU time on average, with a minimum IBD segment length of 7 cM and a minimum of 700 markers. Using the same criteria, RaPID-Query simultaneously achieves a false negative rate of 0.099 and a false positive rate of 0.017 on a chromosome 20 sequencing panel with 92,296 sites, which is comparable to the state-of-the-art IBD detection method Hap-IBD. In the relatedness degree separation experiments, RaPID-Query distinguishes up to the fourth degree of familial relatedness for a given pair of individuals, with area under the receiver operating characteristic curve values of at least 97.28%. It is anticipated that RaPID-Query will make genealogical search convenient and effective, potentially with the integration of complex inference models.
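
The abstract's core idea, combining matches from several random projections so that a few mismatched sites do not break a long match, can be illustrated with a small sketch. This is not the authors' implementation. It assumes binary haplotypes stored as Python lists, and it replaces the PBWT indexes with a brute-force scan of each panel haplotype, which is far slower but easier to follow. All names and parameters (`query_panel`, `window`, `min_windows`, `runs`, `min_votes`) are hypothetical, and segment length is counted in windows of sites rather than in cM.

```python
import random

def project(hap, picks):
    """Keep one allele per window; picks[k] is the site chosen for window k."""
    return [hap[i] for i in picks]

def long_runs(a, b, min_len):
    """Return (start, end) ranges, end exclusive, where a and b agree
    on at least min_len consecutive positions."""
    runs, start = [], None
    for k in range(len(a) + 1):
        same = k < len(a) and a[k] == b[k]
        if same and start is None:
            start = k
        elif not same and start is not None:
            if k - start >= min_len:
                runs.append((start, k))
            start = None
    return runs

def query_panel(query, panel, window=4, min_windows=5, runs=10,
                min_votes=3, seed=0):
    """Return (panel index, first site, end site) for candidate segments.
    Brute-force stand-in for PBWT-based matching (an assumption of this sketch)."""
    rng = random.Random(seed)
    n_windows = len(query) // window
    votes = [[0] * n_windows for _ in panel]
    for _ in range(runs):
        # Random projection: sample one site from every window.
        picks = [w * window + rng.randrange(window) for w in range(n_windows)]
        q = project(query, picks)
        for h, hap in enumerate(panel):
            # Exact matching on the projected haplotypes.
            for s, e in long_runs(q, project(hap, picks), min_windows):
                for k in range(s, e):
                    votes[h][k] += 1
    hits = []
    for h, v in enumerate(votes):
        # A window counts as matched if enough projections agree on it.
        supported = [c >= min_votes for c in v]
        for s, e in long_runs(supported, [True] * n_windows, min_windows):
            hits.append((h, s * window, e * window))
    return hits

# Toy data: 80 sites, three panel haplotypes.
query = [i % 2 for i in range(80)]
unrelated = [1 - a for a in query]   # differs from the query at every site
near = list(query)
near[10] ^= 1                        # identical except for one mismatched site
panel = [unrelated, near, list(query)]
hits = query_panel(query, panel)
print(hits)
```

In the toy panel, the haplotype with a single mismatched site is still reported because a projection only sees the mismatch when it happens to sample that site. A projection that does sample it splits the match there, so the reported segment may start just after the mismatched window depending on the random draws. Raising `min_votes` makes the merged result stricter, and lowering it makes it more tolerant.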
