Birkhäuser, 2003. — 118 p.
These notes had their origin in a postgraduate lecture series I gave at the Eidgenössiche Technische Hochschule (ETH) in Zürich in the Spring of 2000. I am very grateful to my hosts, the Forschungsinstitut für Mathematik at ETH, for providing the ideal opportunity to develop and present this material in what I hope is a reasonably coherent manner, and also for encouraging and assisting me to record the proceedings in these lecture notes.
The subject of the lecture series was counting (of combinatorial structures) and related topics, viewed from a computational perspective. As we shall see, "related topics" include sampling combinatorial structures (being computationally equivalent to approximate counting via efficient reductions), evaluating partition functions (being weighted counting) and calculating the volume of bodies (being counting in the limit).
We shall be inhabiting a different world to the one conjured up by books with titles like Combinatorial Enumeration or Graphical Enumeration. There, the problems are usually parameterised on a single integer parameter n, and the required solutions are closed form or asymptotic estimates obtained using very refined and precise analytical tools. An example problem might be "how many distinct unlabelled graphs are there on n vertices?"
Instead, we shall be working in a somewhat messier world of problems, ultimately inspired by practical applications, in which an instance (say, an undirected graph) is presented and one is asked to count certain structures (say perfect matchings) within it. (The "practical application" in this instance might be in the domain of statistical physics, where perfect matchings are configurations of the "dimer model".) A solution in this world is an efficient algorithm which takes as input the problem instance, and outputs the number of structures. As a first cut, "efficient" is taken to mean "having a running time that scales as a polynomial in some natural measure of the instance size". Later, particularly if the problem seems reasonably close to practical application, one might well be concerned to reduce the degree of the polynomial as far as possible.
These notes open with a couple of examples of combinatorial structures that can be counted exactly polynomial time. Although the examples are classical, I decided to include them because (i) they perfectly demonstrate the ideal we are aiming for, and (ii) the ideas on which they are based are beautiful and worth re- fleeting on. It transpires that these two examples are the exceptions that prove the empirical rule that exact counting is hard. The theory of #P-completeness, which we review briefly, provides evidence for the latter assertion. To make progress, we lower our sights by seeking solutions (i.e., algorithms) that are approximate only, but which nevertheless come with rigorously derived guarantees on the relative error of the results they produce. The notion of efficient approximation algorithm is formalised in the definition of "FPRAS".
Of the various approaches to designing efficient approximation algorithms for counting problems, by far the most fruitful has been "Markov chain Monte Carlo" (MCMC). The idea is to accumulate information on a set of combinatorial structures by performing a random walk (i.e., simulating a Markov chain) on those structures. The running time of an MCMC algorithm depends on the rate of convergence to equilibrium of this Markov chain, as formalised in the notion of "mixing time" of the Markov chain. To my mind, the most exciting developments in the area in recent years have been the introduction and refinement of various analytical tools for bounding the mixing time of combinatorially-defined Markov chains. A survey of these tools forms the primary theme of these notes. Geometric, canonical path and coupling arguments are all covered, together with illustrating examples.
Not every counting problem can be solved efficiently, even in the approximate FPRAS sense, and the notes close with some remarks on intractable problems. Despite the relative youthfulness of the topic, any treatment of this length must necessarily miss out more than it includes. In particular, with the intention of getting close to the coal-face of current research, I have concentrated on the MCMC approach to the exclusion of others. Among the sacrifices this decision entailed is Karp and Luby's proposal for estimating the size of a union of sets [40]. Moreover, I have said nothing at all about the sampling and counting of unlabelled structures, despite a reasonably extensive, if somewhat provisional body of work on the subject (see surveys by Jerrum [34] and Goldberg [31]). Even within the MCMC theme my coverage is certainly not exhaustive. Among several authors who have a right to be offended by the omission of their work are Feder and Mihail [27], with their surprising and elegant inductive approach to counting and sampling matroid bases. The comparison method of Diaconis and Saloff-Coste [19] also escapes mention, as does the decomposition approach of Madras and Randall [48].
So much for intention; now for the means. It is a pleasure to record my gratitude to various research students and associates at the Institut fur Theoretische Informatik at ETH for capturing so well the content of the lectures in written form, and ultimately as TEX source files. They are: Christoph Ambiihl, Alex Be- low, Bernd Gartner, Michael Hoffmann, Zsuzsanna Liptak, Samuele Pedroni and Uli Wagner. These files, together with fragments of earlier surveys, formed the raw material for this volume. I also gratefully acknowledge the contribution of Alistair Sinclair, who collaborated on a significant proportion of the work described in these pages.
But as well as thanking the note takers, I must also apologise to them. For the raw material they provided has undergone a series of reworkings, accumulating layers of extra material until an archaeologist would be required to uncover the ur-text. First, I realised that extra explanation would be required to make the material comprehensible in the absence of the spoken and somewhat interactive presentations. Then there was the matter of imposing some overall consistency of notation and terminology where it had been lacking in the lectures, and attempting some limited moves in the direction of a uniform style. Then there were the things I wish I had done slightly differently, and where the temptation to rewrite history was simply too great. Finally, I abandoned any attempt to mirror the division of the course into lectures in the structure of this volume. Anyway, the fact is that without their efforts there would be no book (and certainly no fancy illustrations to enliven the text).
Two good counting algorithms
#P-completeness
Sampling and counting
Coupling and colourings
Canonical paths and matchings
Volume of a convex body
Inapproximability.