The authors have declared that no competing interests exist.

Understanding biological networks is a fundamental issue in computational biology. When analyzing topological properties of networks, one often replaces the term “network” with “graph” or uses both terms interchangeably. From a mathematical perspective, this is often not fully correct, because many functional relationships in biological networks are more complicated than what can be represented in graphs.

In general, graphs are combinatorial models for representing relationships (edges) between certain objects (nodes). In biology, the nodes typically describe proteins, metabolites, genes, or other biological entities, whereas the edges represent functional relationships or interactions between the nodes such as “binds to”, “catalyzes”, or “is converted to”. A key property of graphs is that every edge connects two nodes. Many biological processes, however, are characterized by more than two participating partners and are thus not bilateral. A metabolic reaction such as A+B→C+D (involving four species), or a protein complex consisting of more than two proteins, are typical examples. Hence, such multilateral relationships are not compatible with graph edges. As illustrated below, transformation to a graph representation is usually possible but may imply a loss of information that can lead to wrong interpretations afterward.

Hypergraphs offer a framework that helps to overcome such conceptual limitations. As the name indicates, hypergraphs generalize graphs by allowing edges to connect more than two nodes, which may facilitate a more precise representation of biological knowledge. Surprisingly, although hypergraphs occur ubiquitously when dealing with cellular networks, the concept is far less widely known than that of graphs, and hypergraphs are sometimes used without explicit mention.
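The loss of information mentioned above can be made concrete with a small sketch. It assumes an undirected hypergraph stored as a list of node sets and a pairwise projection (often called a clique expansion) onto a graph; the function name `clique_expansion` and the toy complexes are hypothetical and serve only as illustration.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Project an undirected hypergraph onto a graph.

    Every pair of nodes that shares a hyperedge becomes one graph edge.
    """
    edges = set()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(hyperedge), 2):
            edges.add((u, v))
    return edges


# Hypothetical toy data: one complex of three proteins ...
one_complex = [frozenset({"A", "B", "C"})]
# ... versus three separate complexes of two proteins each.
three_pairs = [
    frozenset({"A", "B"}),
    frozenset({"B", "C"}),
    frozenset({"A", "C"}),
]

# Both hypergraphs collapse to the same triangle graph.
same_graph = clique_expansion(one_complex) == clique_expansion(three_pairs)
print(same_graph)
```

Under this projection the two different biological situations become indistinguishable, whereas the hyperedge lists themselves still differ.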

This contribution by no means questions the importance and wide applicability of
graph theory for modeling biological processes. Numerous studies have shown that
meaningful biological properties can be extracted from graph models (for a review
see

Like graphs, hypergraphs may be classified by distinguishing between undirected and directed hypergraphs, and, accordingly, we divide the introduction to hypergraphs given below into two major parts.

An

Undirected

Detailed explanations are given in the text.

Another application of undirected hypergraphs is

Hypergraphs are also closely related to the concept of

Given how frequently greedy-type algorithms on hypergraphs are applied as heuristics
in practice, it appears important to study the deviation of the hypergraph under
consideration from being a matroid

The definition of

Typical examples are (bio)chemical reactions, which are often bimolecular, such as
the example A+B→C+D. The tail
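As a minimal sketch of the reaction example, a directed hyperedge can be stored as a pair of node sets. The sketch assumes the common convention that the tail holds the start nodes (here the educts A and B) and the head holds the end nodes (here the products C and D); the class name `DirectedHyperedge` and the helper `to_graph_arcs` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectedHyperedge:
    tail: frozenset  # start nodes (assumed here: the educts)
    head: frozenset  # end nodes (assumed here: the products)


def to_graph_arcs(edge):
    """Replace one directed hyperedge by ordinary arcs tail node -> head node."""
    return {(t, h) for t in edge.tail for h in edge.head}


# The reaction A+B -> C+D as a single directed hyperedge.
reaction = DirectedHyperedge(frozenset({"A", "B"}), frozenset({"C", "D"}))

# Four hypothetical one-to-one conversions involving the same species.
conversions = [
    DirectedHyperedge(frozenset({t}), frozenset({h}))
    for t in ("A", "B")
    for h in ("C", "D")
]

arcs_reaction = to_graph_arcs(reaction)
arcs_conversions = set().union(*(to_graph_arcs(e) for e in conversions))
print(sorted(arcs_reaction))
print(arcs_reaction == arcs_conversions)
```

The graph projection of the single bimolecular reaction coincides with that of four independent conversions, so the information that A and B are required jointly is only retained in the hyperedge itself.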

Directed hypergraphs can be drawn as shown in the example in

Another application of directed hypergraphs in computational biology is the
representation of logical relationships in signaling and regulatory networks.
Interaction graphs (signed directed graphs) are commonly used topological models for
causal relationships and signal flows in cellular networks. For example, in

The concept of hypergraphs provides such a rich modeling framework that algorithms
will necessarily be problem-specific and will differ in complexity from similar
algorithms for graphs. Clearly, since graphs are special cases of hypergraphs,
problems on hypergraphs are at least as hard as their specialized counterparts
in the graph case. Generally, when discussing algorithms in graphs and hypergraphs,
one has to distinguish between two types of problems. The first type encompasses
algorithms determining a

The second type of problem is

With the increasing availability of large-scale molecular interaction graphs such as
PPI or gene regulatory networks, more and more researchers have begun asking not
only for single specific elements of a graph but instead for its statistical
properties or significant building blocks. Examples are the neural network of

The degree

The natural next step in defining hypergraph statistics is to correlate vertex and
hyperedge connectivity, a major ingredient for determining, e.g., the small-world
property known from the graph case

In order to test for significance of certain structures, e.g., network motifs

What could be potential biological applications of hypergraph statistics? Given the
fact that in gene regulatory networks statistical properties are decisive

Choose two distinct hyperedges and, in each of them, one vertex that is not
contained in the other hyperedge. Then swap the two vertices between the
hyperedges. Clearly this operation keeps both degree
distributions fixed. After a sufficient number of iterations, the
Markov chain generated in this way produces independent samples of the underlying
random hypergraph with given degree distributions. The figure illustrates
this using the bipartite representation, which is simpler to visualize in this case.
The gray double arrows indicate the edges to be swapped. None of the three
swaps, (A,H_{2})–(C,H_{3}),
(B,H_{1})–(E,H_{3}), and
(B,H_{3})–(D,H_{1}), changes the
vertex or hyperedge degrees. Significance analysis of the CORUM protein complex
hypergraph was done in
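The swap procedure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used for the CORUM analysis: hyperedges are stored as Python sets, a swap is only attempted between vertices that each belong to exactly one of the two chosen hyperedges, and the function names and the toy hypergraph are hypothetical.

```python
import random


def vertex_degrees(hyperedges):
    """Number of hyperedges each vertex belongs to."""
    degrees = {}
    for hyperedge in hyperedges:
        for vertex in hyperedge:
            degrees[vertex] = degrees.get(vertex, 0) + 1
    return degrees


def swap_step(hyperedges, rng):
    """Attempt one vertex swap in place; return True if a swap was made."""
    i, j = rng.sample(range(len(hyperedges)), 2)
    # Candidates: vertices contained in one of the two hyperedges only.
    only_i = sorted(hyperedges[i] - hyperedges[j])
    only_j = sorted(hyperedges[j] - hyperedges[i])
    if not only_i or not only_j:
        return False  # no admissible pair, skip this step
    u = rng.choice(only_i)
    v = rng.choice(only_j)
    hyperedges[i] = (hyperedges[i] - {u}) | {v}
    hyperedges[j] = (hyperedges[j] - {v}) | {u}
    return True


def randomize(hyperedges, n_steps, seed=0):
    """Return a randomized copy after n_steps attempted swaps."""
    rng = random.Random(seed)
    result = [set(h) for h in hyperedges]
    for _ in range(n_steps):
        swap_step(result, rng)
    return result


# Hypothetical toy hypergraph (not the one shown in the figure).
original = [{"A", "B", "D"}, {"A", "C"}, {"B", "C", "E"}]
shuffled = randomize(original, n_steps=200, seed=1)

print(vertex_degrees(shuffled) == vertex_degrees(original))
print([len(h) for h in shuffled] == [len(h) for h in original])
```

Each swap moves one vertex out of and one vertex into each of the two hyperedges, so hyperedge sizes and vertex degrees are unchanged. The sketch does not reject swaps that produce two identical hyperedges, and how many steps are needed for well-mixed samples is left open.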

To summarize, hypergraphs generalize graphs by allowing for multilateral
relationships between the nodes, which often results in a more precise description
of biological processes. Hypergraphs thus provide an important approach for
representing biological networks, whose potential has not been fully exploited yet.
We therefore expect that applications of hypergraph theory

FT thanks Florian Blöchl and SK is grateful to Regina Samaga and Axel von Kamp for helpful comments during the preparation of the manuscript.