Nature Biotechnology

Primers

Designing proteins with language models

Protein language models learn from diverse sequences spanning the evolutionary tree and have proven to be powerful tools for sequence design, variant effect prediction and structure prediction. What are the foundations of protein language models, and how are they applied in protein engineering?

Jeffrey A. Ruffolo
Ali Madani
Primer, 15 Feb 2024
Generative models for protein structures and sequences

Models like ChatGPT and DALL-E2 generate text and images in response to a text prompt. Despite different data and goals, how can generative models be useful for protein engineering?

Chloe Hsu
Clara Fannjiang
Jennifer Listgarten
Primer, 15 Feb 2024
How to apply de Bruijn graphs to genome assembly

A mathematical concept known as a de Bruijn graph turns the formidable challenge of assembling a contiguous genome from billions of short sequencing reads into a tractable computational problem.

Phillip E C Compeau
Pavel A Pevzner
Glenn Tesler
Primer, 08 Nov 2011
Analyzing 'omics data using hierarchical models

Hierarchical models provide reliable statistical estimates for data sets from high-throughput experiments where measurements vastly outnumber experimental samples.

Hongkai Ji
X Shirley Liu
Primer, 01 Apr 2010
What is flux balance analysis?

Flux balance analysis is a mathematical approach for analyzing the flow of metabolites through a metabolic network. This primer covers the theoretical basis of the approach, several practical examples and a software toolbox for performing the calculations.

Jeffrey D Orth
Ines Thiele
Bernhard Ø Palsson
Primer, 01 Mar 2010
How does multiple testing correction work?

When prioritizing hits from a high-throughput experiment, it is important to correct for random events that falsely appear significant. How is this done and what methods should be used?

William S Noble
Primer, 01 Dec 2009
How to visually interpret biological data using networks

Networks in biology can appear complex and difficult to decipher. Merico et al. illustrate how to interpret biological networks with the help of frequently used visualization and analysis patterns.

Daniele Merico
David Gfeller
Gary D Bader
Primer, 01 Oct 2009
How to map billions of short reads onto genomes

Mapping the vast quantities of short sequence fragments produced by next-generation sequencing platforms is a challenge. What programs are available and how do they work?

Cole Trapnell
Steven L Salzberg
Primer, 01 May 2009
SNP imputation in association studies

Only a subset of single-nucleotide polymorphisms (SNPs) can be genotyped in genome-wide association studies. Imputation methods can infer the alleles of 'hidden' variants and use those inferences to test the hidden variants for association.

Eran Halperin
Dietrich A Stephan
Primer, 01 Apr 2009
Maximizing power in association studies

Only a subset of genetic variants can be examined in genome-wide surveys for genetic risk factors. How can a fixed set of markers account for the entire genome by acting as proxies for neighboring associations?

Eran Halperin
Dietrich A Stephan
Primer, 01 Mar 2009
Understanding genome browsing

How can genome browsers help researchers to infer biological knowledge from data that might be misleading?

Melissa S Cline
W James Kent
Primer, 01 Feb 2009
What are decision trees?

Decision trees have been applied to problems such as assigning protein function and predicting splice sites. How do these classifiers work, what types of problems can they solve and what are their advantages over alternatives?

Carl Kingsford
Steven L Salzberg
Primer, 01 Sept 2008
What is the expectation maximization algorithm?

The expectation maximization algorithm arises in many computational biology applications that involve probabilistic models. What is it good for, and how does it work?

Chuong B Do
Serafim Batzoglou
Primer, 01 Aug 2008
What is principal component analysis?

Principal component analysis is often incorporated into genome-wide expression studies, but what is it and how can it be used to explore high-dimensional data?

Markus Ringnér
Primer, 01 Mar 2008
What are artificial neural networks?

Artificial neural networks have been applied to problems ranging from speech recognition to prediction of protein secondary structure, classification of cancers and gene prediction. How do they work and what might they be good for?

Anders Krogh
Primer, 01 Feb 2008
How does eukaryotic gene prediction work?

Computational prediction of gene structure is crucial for interpreting genomic sequences. But how do the algorithms involved work and how accurate are they?

Michael R Brent
Primer, 01 Aug 2007
How do shotgun proteomics algorithms identify proteins?

Instrumentation aside, algorithms for matching mass spectra to proteins are at the heart of shotgun proteomics. How do these algorithms work, what can we expect of them and why is it so difficult to find protein modifications?

Edward M Marcotte
Primer, 01 Jul 2007
What is a support vector machine?

Support vector machines (SVMs) are becoming popular in a wide variety of biological applications. But what exactly are SVMs and how do they work? And what are their most promising applications in the life sciences?

William S Noble
Primer, 01 Dec 2006
How does DNA sequence motif discovery work?

How can we computationally extract an unknown motif from a set of target sequences? What are the principles behind the major motif discovery algorithms? Which of these should we use, and how do we know we've found a 'real' motif?

Patrik D'haeseleer
Primer, 01 Aug 2006
What are DNA sequence motifs?

Sequence motifs are becoming increasingly important in the analysis of gene regulation. How do we define sequence motifs, and why should we use sequence logos instead of consensus sequences to represent them? Do they have any relation with binding affinity? How do we search for new instances of a motif in this sea of DNA?

Patrik D'haeseleer
Primer, 01 Apr 2006
