Conceived and designed the experiments: RDK GEL TH SB. Performed the experiments: RDK GEL MH JMW CJP TH SB. Analyzed the data: RDK GEL TH SB. Contributed reagents/materials/analysis tools: MH JMW CJP. Wrote the paper: RDK GEL SB.

The authors have declared that no competing interests exist.

Although fitness landscapes are central to evolutionary theory, so far no biologically realistic examples for large-scale fitness landscapes have been described. Most currently available biological examples are restricted to very few loci or alleles and therefore do not capture the high dimensionality characteristic of real fitness landscapes. Here we analyze large-scale fitness landscapes that are based on predictive models for in vitro replicative fitness of HIV-1. We find that these landscapes are characterized by large correlation lengths, considerable neutrality, and high ruggedness and that these properties depend only weakly on whether fitness is measured in the absence or presence of different antiretrovirals. Accordingly, adaptive processes on these landscapes depend sensitively on the initial conditions. While the relative extent to which mutations affect fitness on their own (main effects) or in combination with other mutations (epistasis) is a strong determinant of these properties, the fitness landscape of HIV-1 is considerably less rugged, less neutral, and more correlated than expected from the distribution of main effects and epistatic interactions alone. Overall this study confirms theoretical conjectures about the complexity of biological fitness landscapes and the importance of the high dimensionality of the genetic space in which adaptation takes place.

Evolutionary adaptation can be understood as populations moving uphill on landscapes, in which height corresponds to evolutionary fitness. Although such fitness landscapes are central to evolutionary theory, there is currently a lack of biologically realistic examples. Here we analyze large-scale fitness landscapes derived from in vitro fitness measurements of HIV-1. We find that these landscapes are very rugged and that, accordingly, adaptive processes on these landscapes depend sensitively on the initial conditions. Moreover, the landscapes contain large networks along which fitness changes only minimally. While the relative extent to which mutations affect fitness on their own or in combination with other mutations is a strong determinant of these properties, the fitness landscape of HIV-1 is considerably less rugged than expected from the individual and pair-wise effects of mutations. Overall this study confirms theoretical conjectures about the complexity of biological fitness landscapes and the importance of the high dimensionality of the genetic space in which adaptation takes place.

The fitness landscape is one of the central concepts in evolutionary biology. Ever since Sewall Wright

The centrality of the concept of fitness landscapes for evolutionary biology, combined with the absence of good biological examples, has necessitated the study of theoretically conceived and idealized fitness landscapes, often tailored to the particular question studied. The so-called NK landscapes are an example of a broad class of theoretical fitness landscapes

Recent progress in high throughput data generation now allows measuring both fitness and genotype for a large number of mutants

The fitness landscapes analyzed here are based on statistical models that are based on extensive measurements of ^{1800}≅10^{600} fitness values. Clearly, it is impossible to generate all these values, even though the predictive model would in principle allow the fitness of any sequence to be computed. Therefore, we describe the properties of the fitness landscapes using summary statistics based on different types of random or directed walks on these landscapes. Specifically, we use such walks to compute three measures that characterize different properties of the landscapes: ruggedness, correlation length and neutrality (see

Ruggedness refers to the number of local fitness optima, i.e. genotypes whose fitness exceeds that of each of their neighbors. We determine ruggedness as the number of different local optima reached by adaptive walks that climb the fitness landscape by steepest ascent from random positions on the landscape. These adaptive walks always move to the neighboring sequence with the highest fitness. Local optima act as attractors for such steepest-ascent walks: if a walk is started within the “attraction domain” of an optimum, it will converge to this optimum. Depending on the structure of the landscape, such walks need not end up in the same optimum, even if they are started from similar initial conditions. Conversely, walks that end up in the same optimum need not originate from similar areas of the fitness landscape. We use such simple hill-climbing walks here as tools to analyze structural properties of the underlying fitness landscape, such as ruggedness or the attraction domains of the local optima. As a description of how populations adapt on these fitness landscapes, such hill-climbing walks have limited validity and may oversimplify more complex aspects of evolution.

The correlation length quantifies to what extent proximity in sequence space translates into similarity in fitness. To measure correlation length, we perform random walks, which start at a random genotype in the landscape and then move in each step to a randomly chosen neighboring genotype. Recording the fitness values along such a random walk, we then determine correlation length as the characteristic distance over which the autocorrelation of fitness decays.

Neutrality measures to what extent populations can move on the landscape without changing their fitness. To measure neutrality, we perform quasi-neutral walks, in which random steps to neighboring genotypes are only accepted if they do not change fitness by more than a defined small threshold value. We determine neutrality as the maximal distance from the starting genotype that is attained by such a neutral walk.
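The correlation-length measurement described above can be illustrated on a toy landscape. The following is a minimal sketch, not the authors' implementation: it assumes binary genotypes, a purely additive toy log-fitness (`make_additive_landscape` is a hypothetical stand-in for the HIV-1 model), an autocorrelation computed across walks between step 0 and step d, and a least-squares line fitted to the logarithm of that autocorrelation. All function names are illustrative.

```python
import math
import random


def make_additive_landscape(n_loci, rng):
    """Toy log-fitness with one random effect per locus (assumption, not the HIV-1 model)."""
    effects = [rng.gauss(0.0, 1.0) for _ in range(n_loci)]

    def log_fitness(genotype):
        return sum(e for e, allele in zip(effects, genotype) if allele)

    return log_fitness


def random_walk_trace(log_fitness, n_loci, n_steps, rng):
    """Log-fitness along a random walk that flips one random locus per step."""
    genotype = [rng.randint(0, 1) for _ in range(n_loci)]
    trace = [log_fitness(genotype)]
    for _ in range(n_steps):
        locus = rng.randrange(n_loci)
        genotype[locus] = 1 - genotype[locus]
        trace.append(log_fitness(genotype))
    return trace


def pearson(x, y):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_length(traces, max_lag):
    """Return -1/slope of a least-squares line through log(autocorrelation) vs. lag."""
    start = [t[0] for t in traces]
    lags = list(range(1, max_lag + 1))
    log_rho = [math.log(pearson(start, [t[d] for t in traces])) for d in lags]
    mean_lag = sum(lags) / len(lags)
    mean_log = sum(log_rho) / len(log_rho)
    slope = (sum((d - mean_lag) * (r - mean_log) for d, r in zip(lags, log_rho))
             / sum((d - mean_lag) ** 2 for d in lags))
    return -1.0 / slope
```

For this additive toy landscape with n binary loci, the autocorrelation after d steps is (1 − 2/n)^d, so the estimate should be close to −1/ln(1 − 2/n), i.e. roughly n/2. Epistatic landscapes would be expected to yield shorter correlation lengths.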

We first explore these measures for a reference landscape (RL), which is based on the model that best predicts replicative capacity in the drug-free environment (see

The RL is characterized by a large number of optima, a large correlation length and considerable neutrality (^{5} different starting points. For the 10^{5} starting points tested in

(A) Number of different optima attained from steepest-ascent hill-climbing walks starting from random genotypes plotted as a function of the number of starting genotypes. (B) Distribution of attraction domains of steepest-ascent hill-climbing walks: Starting genotypes are chosen in the neighborhoods of 500 randomly chosen reference genotypes. Of each reference genotype, 100 random single, double, triple, fourfold, and fivefold mutants are considered as starting genotypes. Each dot corresponds to a local optimum. Coordinates indicate from how many unique neighborhoods (y-axis) and from what fraction of starting-genotypes in these neighborhoods the optimum is reached (x axis). Thus the y- and x-axis correspond to the global and local density of the attraction domain respectively. (C) Autocorrelation of log-fitness along random walks as a function of the number of steps. The red line corresponds to the linear least square fit of the autocorrelation and the correlation length is given by −1/(slope of the line). (D) Range explored by quasi-neutral walks for different discrete values of the maximal fitness-effect ^{5} walks of length 1000. 95%-confidence-intervals of the mean (inferred through 1000 bootstrap samples) are smaller than point size.
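The bootstrap confidence intervals mentioned in the caption can be sketched as follows. This is a minimal example under stated assumptions: it uses a percentile bootstrap (the text does not specify which bootstrap variant was used), and the function name and parameters are illustrative.

```python
import random


def bootstrap_ci_of_mean(values, n_boot=1000, level=0.95, rng=None):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = rng or random.Random()
    n = len(values)
    means = []
    for _ in range(n_boot):
        # Resample the data with replacement and record the resampled mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    lower = means[int((1.0 - level) / 2.0 * n_boot)]
    upper = means[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper
```

With many walks per data point (e.g. 10^{5}), such intervals become very narrow, which is consistent with the caption's remark that they are smaller than the plotted points.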

In the landscapes considered here, mutations may have extremely small effects, but they are never completely neutral. To define a sensible concept of neutrality, we therefore need to define a threshold for the maximal fitness effect that a mutation is allowed to have to be considered neutral. The exploration range of the resulting quasi-neutral walks depends strongly on the magnitude of this threshold. For thresholds of 10^{−4} or lower, the exploration range is very small, with a maximal distance of 5–10 mutations. For thresholds of 10^{−3} or higher, on the other hand, neutral walks can reach considerable distances of 100 mutations or more. Thus, although there are no fully neutral mutations in the RL, the landscape is characterized by large networks over which fitness changes only minimally.

Comparing the RL to the corresponding best-fit landscapes for 15 different environments each characterized by the presence of a different antiretroviral drug (see

(A) Ruggedness (i.e. number of different optima reached from 1000 steepest-ascent hill-climbing walks) for the no-drug and 15 single-drug environments. X-axis labels indicate the antiretroviral drug characterizing each environment (see ^{4} random walks of length 50 starting from random initial conditions. Points correspond to the mean over 100 such measurements of correlation length. 95%-confidence-intervals of the mean (inferred through 1000 bootstrap samples) are smaller than point size. (C) Range explored by quasi-neutral walks (threshold ε = 0.001) for different environments. Points correspond to the mean over 10^{5} walks of length 1000. Error-bars correspond to the 95% confidence-interval of the mean, inferred through 1000 bootstrap samples.

To assess the impact of the strength of epistasis relative to that of main effects, we consider alternative landscapes in which fitness interactions between mutations are weaker. We chose the RL as a reference because it has the highest predictive power (see _{ε}) decreases for small _{ε} landscapes allows us to study the effect of the relative strength of epistasis on ruggedness, correlation length and neutrality.

We find that ruggedness and neutrality consistently increase with the magnitude of epistatic effects (_{ε} gradually shift from a single-peaked smooth landscape without neutral networks (_{ε} depends only weakly on epistasis. All three measures continue to exhibit the same type of dependence on epistasis when switching from the HL to the more epistatic RL (

Ruggedness (A), correlation length (B), and neutrality (C) as a function of the magnitude of epistasis in HL_{ε}. For all panels, the 95% confidence interval of the mean (inferred through 1000 bootstrap samples) is smaller than the size of the data point symbol.

An intuition for the impact of the strength of epistasis on ruggedness can be obtained as follows: If main effects dominate, a given mutation is always either beneficial or deleterious, independent of its background. However, if epistatic interactions dominate, a change in the genetic background can turn a beneficial mutation into a deleterious one and vice versa. Thus the landscape only has one peak if main effects dominate, but may have multiple peaks if epistatic effects dominate. Note that epistasis need not necessarily increase ruggedness. For example, this would not be the case if most epistatic interactions were of the same sign (as has often been assumed

The strong impact of the relative strength of main effects and epistasis raises the question whether the properties of fitness landscapes also depend on the detailed correlation structure between different epistatic effects and main effects or whether they are only determined by the distributions of these effects. In order to address this question, we use three different schemes to randomize the main and epistatic effects underlying the RL (

Distribution of ruggedness (A), correlation length (B), and neutrality (C), for different randomizations of the reference landscape. The following randomization schemes are used: In scheme 1 we draw main effects randomly with replacement from the distribution of main effects underlying the RL, whereas epistatic effects are kept as they are in the RL. This destroys any correlation between epistasis and main effects. In scheme 2 we additionally shuffle the non-zero epistasis values. This retains the information of which loci interact epistatically, but shuffles the value of any such interaction. Finally, in scheme 3, we fully shuffle all epistasis and main effect values, and thus destroy all correlations between effects. Each measure is inferred for 100 randomizations of each randomization type and the interpolation of the resulting distribution is plotted. Color code: no randomization (i.e. the 100 realizations are done on the same landscape; black), scheme 1 (red), scheme 2 (blue), scheme 3 (green). For the latter two cases it should be noted that main effects and epistatic effects are shuffled separately, i.e. main effects remain main effects and epistatic effects remain epistatic effects.
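The three randomization schemes can be sketched in a few lines. The representation is an assumption made for illustration: main effects are held in a list `main`, and non-zero epistatic effects in a dictionary `epi` keyed by locus pairs `(i, j)` with `i < j`. In particular, reading scheme 3 as a permutation of epistasis values over all possible pairs (including the zero entries) is an interpretation of the caption, not a confirmed detail of the original analysis.

```python
import itertools
import random


def scheme1(main, epi, rng):
    """Resample main effects with replacement; keep epistatic effects unchanged."""
    new_main = [rng.choice(main) for _ in main]
    return new_main, dict(epi)


def scheme2(main, epi, rng):
    """As scheme 1, and additionally shuffle the non-zero epistasis values
    among the pairs that already interact."""
    new_main, _ = scheme1(main, epi, rng)
    pairs = list(epi)
    values = list(epi.values())
    rng.shuffle(values)
    return new_main, dict(zip(pairs, values))


def scheme3(main, epi, rng):
    """Shuffle main effects and, separately, all epistasis values (zeros included)."""
    new_main = list(main)
    rng.shuffle(new_main)
    all_pairs = list(itertools.combinations(range(len(main)), 2))
    values = [epi.get(pair, 0.0) for pair in all_pairs]
    rng.shuffle(values)
    new_epi = {pair: v for pair, v in zip(all_pairs, values) if v != 0.0}
    return new_main, new_epi
```

Schemes 1 and 2 preserve which pairs of loci interact, whereas scheme 3 preserves only the overall distributions of the two kinds of effects.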

It should be noted that the structure of the fitness landscapes discussed here might be affected by selection biases in the data used for the development of the fitness-prediction model. The viral isolates have been obtained from HIV-infected individuals and therefore the mutations found in these isolates do not represent a random sample of all possible mutations. On the one hand, because all isolates harbour replication-competent viruses, the sample is biased against lethal or highly deleterious mutations. On the other hand, most viral isolates carry drug resistance mutations. These resistance mutations are beneficial in the presence of drugs, but typically detrimental in their absence. Hence, in the drug-free environment (or in an environment containing drugs to which a given mutation does not confer resistance) the isolates may be enriched in deleterious mutations. In any event, the mutations found in the isolates represent the standing variation of mutations that are present at the level of the host population. Nevertheless, because of the observation bias against lethal mutants, the complete fitness landscape of HIV likely contains many more fitness holes or troughs than the landscapes described here.

Comparing the fitness landscape of HIV with various theoretical landscapes that have been used to study evolutionary processes

The fitness landscapes analyzed here are based on models that predict the fitness of HIV from amino acid sequences. Fitness is measured as the replicative capacity (RC) of HIV-derived amplicons (representing all of Protease (PR) and most of Reverse Transcriptase (RT)) inserted into a constant backbone of a resistance test vector. The models are then trained to predict this fitness from the amino acid sequence of the amplicons. Although the fitness predicted by these models is an in vitro RC, we could show in

In essence, the predictor is based on fitting the data consisting of amino acid sequences _{ij}_{ij}_{ij}_{ij}_{ij}_{ij;kl}

Note that equation (M1) can also be written as a second order cluster expansion _{j}_{i}_{i}_{j}_{i}/S_{i}'_{k}_{k'}

The different landscapes are all based on model M1, but differ with respect to the relative weight that is given to epistasis and main effects:

The reference landscape RL is obtained by fitting the full model M1 to the data. Thus main effects and epistatic effects are fitted simultaneously. As main and epistatic effects are given the same weight in model fitting, while epistatic effects greatly outnumber main effects, this approach will explain the variance in RCs using mainly epistasis. Therefore this approach tends to overestimate the role of epistasis relative to that of main effects.

The hierarchic landscape (HL) avoids this overestimation of epistatic effects by first fitting model M1 only with the main effects (i.e. the _{ij;kl}_{ij}

The HL_{ε} are derived from the HL by scaling the epistatic effects by a factor ε, i.e. every epistatic coefficient of the HL (indexed by ij;kl) is multiplied by ε.
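To make the effect of this scaling concrete, the sketch below builds a toy pairwise model and counts its local optima by exhaustive enumeration. It is an illustration under stated assumptions, not the fitted HIV-1 model: genotypes are binary tuples, log-fitness is a sum of main effects plus ε times the sum of pairwise epistatic effects, and the names `make_scaled_landscape` and `count_local_optima` are hypothetical. Exhaustive enumeration is only feasible for a handful of loci, which is why the study itself relies on walks.

```python
import itertools


def make_scaled_landscape(main, epi, eps):
    """Toy log-fitness: main effects plus eps-scaled pairwise epistasis.

    main: list of per-locus effects; epi: dict {(i, j): effect} for i < j.
    """
    def log_fitness(genotype):
        total = sum(m for m, allele in zip(main, genotype) if allele)
        total += eps * sum(e for (i, j), e in epi.items()
                           if genotype[i] and genotype[j])
        return total

    return log_fitness


def count_local_optima(log_fitness, n_loci):
    """Count genotypes that are strictly fitter than all single-mutant neighbors."""
    count = 0
    for g in itertools.product((0, 1), repeat=n_loci):
        f = log_fitness(g)
        neighbors = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(n_loci))
        if all(f > log_fitness(nb) for nb in neighbors):
            count += 1
    return count


# Two loci, each deleterious alone but beneficial together (sign epistasis).
main_effects = [-1.0, -1.0]
epistasis = {(0, 1): 3.0}
peaks_without_epistasis = count_local_optima(
    make_scaled_landscape(main_effects, epistasis, eps=0.0), 2)
peaks_with_epistasis = count_local_optima(
    make_scaled_landscape(main_effects, epistasis, eps=1.0), 2)
```

In this two-locus example the landscape has a single peak for ε = 0 and two peaks for ε = 1, because the double mutant becomes a second optimum separated from the wild type by a fitness valley.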

If not stated otherwise, the RC values underlying the fitness-landscapes RL, HL and HL_{ε} are measured in the absence of drugs. In addition we consider 15 alternative versions of the RL based on RC values measured in the presence of 15 different single drugs. The drugs used here are the protease inhibitors amprenavir (AMP), indinavir (IDV), lopinavir (LPV), nelfinavir (NFV), ritonavir (RTV), and saquinavir (SQV), the 6 nucleoside reverse transcriptase inhibitors abacavir (ABC), didanosine (ddI), lamivudine (3TC), stavudine (d4T), zidovudine (ZDV), and tenofovir (TFV) and the non-nucleoside reverse transcriptase inhibitors delavirdine (DLV), efavirenz (EFV), and nevirapine (NVP). For each drug, the replicative capacity of a virus on drugs was given by the interpolated value measured at the drug concentration at which the NL4-3 based control virus has 10% of its replicative capacity in the absence of drug (i.e. the IC90 for NL4-3 is used as the reference drug concentration for every subsequent measurement)

The landscapes are characterized by adaptive, neutral and random walks. Each walk consists of a succession of genotypes s^{0}→s^{1}→s^{2}→s^{3}→…, where s^{k+1} is obtained from s^{k} according to the rules of the respective walk type.

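The ruggedness of the fitness landscapes is measured as the number of different end-points reached from a pre-specified number of steepest-ascent hill-climbing walks (SAHCW) starting from different, random start genotypes. In each step s^{k}→s^{k+1}, all neighbors of s^{k} are evaluated and the fittest neighbor s^{max} is determined. If no neighbor is fitter than s^{k}, the walk ends at s^{k}; otherwise it moves to s^{max} (s^{k+1} = s^{max}).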
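The steepest-ascent procedure and the resulting ruggedness count can be sketched as follows. This is a minimal illustration on binary genotypes, where a neighbor is a single-locus flip; in the study, neighbors are defined by amino acid substitutions. The toy function `two_peak_log_fitness` is an assumption introduced only so that the example is self-contained.

```python
import random


def two_peak_log_fitness(genotype):
    """Toy two-locus landscape with sign epistasis (assumption, for illustration)."""
    a, b = genotype
    return -1.0 * a - 1.0 * b + 3.0 * a * b


def steepest_ascent(log_fitness, start):
    """Follow the fittest single-mutant neighbor until no neighbor is fitter."""
    current = tuple(start)
    f_current = log_fitness(current)
    while True:
        neighbors = [current[:i] + (1 - current[i],) + current[i + 1:]
                     for i in range(len(current))]
        best = max(neighbors, key=log_fitness)
        f_best = log_fitness(best)
        if f_best <= f_current:
            return current  # local optimum reached
        current, f_current = best, f_best


def ruggedness(log_fitness, n_loci, n_walks, rng):
    """Number of distinct end-points reached from random start genotypes."""
    optima = set()
    for _ in range(n_walks):
        start = tuple(rng.randint(0, 1) for _ in range(n_loci))
        optima.add(steepest_ascent(log_fitness, start))
    return len(optima)
```

Because only a finite number of walks is started, this count is a lower bound on the true number of local optima, which is why the number of optima found is reported as a function of the number of starting genotypes.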

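The neutrality of fitness landscapes is measured as the range explored by quasi-neutral walks (QNW) of a pre-specified length L. In each step s^{k}→s^{k+1}, a random neighbor s^{k}' of s^{k} is drawn. If the fitness of s^{k}' differs from that of s^{k} by no more than the threshold, the walk moves to s^{k}' (s^{k+1} = s^{k}'); otherwise a new neighbor s^{k}' of s^{k} is drawn. If, after 10^{4} trials, no quasi-neutral mutation has been found, the QNW stays at s^{k} (s^{k+1} = s^{k}). The range explored is the maximal distance of the genotypes s^{1}…s^{L} from the start genotype s^{0}.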
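A quasi-neutral walk of this kind can be sketched as follows. The example makes several assumptions for illustration: genotypes are binary tuples, distance is the Hamming distance to the start genotype, and the threshold is applied to the absolute difference of whatever value `fitness` returns (the text does not state here whether fitness or log-fitness is compared). All names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def quasi_neutral_walk(fitness, start, length, threshold, rng, max_trials=10**4):
    """Return the maximal distance from `start` reached by a quasi-neutral walk."""
    start = tuple(start)
    current = start
    max_distance = 0
    for _step in range(length):
        f_current = fitness(current)
        for _trial in range(max_trials):
            locus = rng.randrange(len(current))
            candidate = current[:locus] + (1 - current[locus],) + current[locus + 1:]
            if abs(fitness(candidate) - f_current) <= threshold:
                current = candidate  # accept the quasi-neutral step
                break
        # If no trial was accepted, the walk stays at the current genotype.
        max_distance = max(max_distance, hamming(current, start))
    return max_distance
```

On a perfectly flat landscape every step is accepted and the walk behaves like an unconstrained random walk, whereas on a landscape in which every mutation changes fitness by more than the threshold the walk never leaves its start genotype.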

The correlation length of a fitness landscape is measured as the inverse decay rate of the autocorrelation of the log-fitness along random walks (RW). Specifically, a pre-specified number (typically 10^{5}) of random walks are initiated, each from a different random start genotype. In each step of a given RW a single randomly chosen amino acid substitution is performed. The autocorrelation after

Predictive power (measured as the fraction of the deviance explained) of the fitness models underlying the fitness landscapes considered. The dashed line corresponds to the RL. Points correspond to the HL_{ε} for different values of ε. See Hinkley et al.
