Cambridge University Press, 1997. — 550 p.
Algorithms on text (strings) have long been studied in computer science, and computation on molecular sequence data (strings) is at the heart of computational molecular biology. Existing and emerging algorithms for string computation provide a significant intersection between computer science and molecular biology. This book is a general, rigorous treatment of algorithms that operate on character strings and sequences. It covers a wide spectrum of string algorithms from classical computer science to modern molecular biology and, when possible, integrates those two fields.
In addition to explaining current algorithms, the book emphasizes fundamental ideas and techniques that are central in today's applications and should lead to new techniques, in the future. The book contains new approaches developed for complex material, simplifying methods that have been previously for the specialist alone. Biological problems are discussed in detail to explain the reasons that many biological questions have been productively cast as string problems.
The book is written for graduate or advanced undergraduate students in computer science, or computational biology, or bio-informatics. It can be used as a main text for courses on string algorithms, or for computer science oriented courses on computational biology and is also a reference for professionals. The book contains over 400 exercises to reinforce presented material, and to develop additional topics.
Exact String Matching: The Fundamental String Problem
Exact Matching: Fundamental Preprocessing and First Algorithms
Exact Matching: Classical Comparison-Based Methods
Exact Matching: A Deeper Look at Classical Methods
Seminumerical String Matching
Suffix Trees and Their Uses
Introduction to Suffix Trees
Linear-Time Construction of Suffix Trees
First Applications of Suffix Trees
Constant-Time Lowest Common Ancestor Retrieval
More Applications of Suffix Trees
Inexact Matching, Sequence Alignment, Dynamic Programming
The Importance of (Sub)sequence Comparison in Molecular Biology
Core String Edits, Alignments, and Dynamic Programming
Refining Core String Edits and Alignments
Extending the Core Problems
Multiple String Comparison - The Holy Grail
Sequence Databases and Their Uses - The Mother Lode
Currents, Cousins, and Cameos
Maps, Mapping, Sequencing, and Superstrings
Strings and Evolutionary Trees
Three Short Topics
Models of Genome-Level Mutations
Epilogue - where next?