The authors have declared that no competing interests exist.
We determine an optimal protocol for temozolomide using population variability and dynamic optimization techniques inspired by artificial intelligence. We use a pharmacokinetics/pharmacodynamics (PK/PD) model based on Faivre and coauthors (Faivre et al., 2013) for the pharmacokinetics of temozolomide, as well as the pharmacodynamics of its efficacy. For toxicity, which is measured by the nadir of the normalized absolute neutrophil count, we formalize the myelosuppression effect of temozolomide with the physiological model of Panetta and coauthors (Panetta et al., 2003). We apply the model to a population with variability as given in Panetta and coauthors (Panetta et al., 2003). Our optimization algorithm is a variant in the class of Monte Carlo tree search algorithms. We do not impose any periodicity constraint on our solution. We set the objective of tumor size minimization while not allowing more severe toxicity levels than the standard Maximum Tolerated Dose (MTD) regimen. The protocol we propose achieves higher efficacy in the sense that, compared to the usual MTD regimen, it divides the tumor size by approximately 7.66 after 336 days (95% confidence interval: [7.36–7.97]). The toxicity is similar to that of MTD. Overall, our protocol, obtained with a very flexible method, gives significant results for the present case of temozolomide and calls for further research combining operational research or artificial intelligence with clinical research in oncology.
One of the salient features of treatments in oncology is the persistent gap prevailing between standard drug regimens, corresponding to the official recommendation, and the actual drug regimens that are applied at bedside. For instance, Atkinson et al. [
This paper belongs to this trend and also relies on a PK/PD model to determine an optimal chemotherapy regimen. We investigate the case of temozolomide, used in the treatment of some brain cancers, notably in children. Our optimization exercise is innovative along two dimensions. First, we fully relax the schedule constraint. We determine the optimal protocol over a 336-day period, but we do not impose any cycle or weekly pattern. The period length of 336 days corresponds to a multiple of the cycle length of the standard Maximum Tolerated Dose (MTD) protocol for temozolomide. More precisely, on every day of the simulation period, we determine which treatment dose, including no dose, is optimal. Giving up cycles enables us to quantify the possible gains from opting for a fully unconstrained approach. Even though the existence of cycles is often considered an important feature of clinical trials, we believe that our computational approach offers a good opportunity to assess the benefits of removing cycle constraints. The second innovation is that the optimal protocol is designed not only for a “median” patient, but for a heterogeneous population. Indeed, we take into account individual patient specificities through heterogeneity in the population pharmacokinetics. We rely on the data of Panetta and coworkers [
In our
What does our optimal protocol look like? First, our protocol exhibits a pseudoperiodicity. Every 5 weeks approximately, the protocol features several consecutive days of treatment –typically, three to five. Each of these periods of consecutive treatment days is followed by a period lasting approximately 4 weeks, during which few –three to five– treatment days take place. Even though we do not impose
Even
The PK/PD model of temozolomide we rely on borrows from two sources. First, the pharmacokinetics of temozolomide, and the pharmacodynamics of efficacy come from Faivre and colleagues [
We now provide a brief description of the PK/PD model. First, pharmacokinetics follows the original paper of Panetta et al. [
Finally, the pharmacodynamics of toxicity relies on a physiological model of hematopoiesis, describing the myelosuppressive effect of temozolomide. The model was originally proposed by Panetta and coworkers [
We simulate the PK/PD model over a time length of 336 days, which corresponds to 12 full cycles of the standard MTD protocol. All computations are implemented in C++. For each protocol, we assess its efficacy and toxicity for a given patient as follows.
Since we use a population model for the pharmacokinetics, drug absorption is not constant throughout the population and, consequently, the plasma concentration of temozolomide for a given protocol also varies across patients. Therefore, even though the pharmacodynamics for both toxicity and efficacy is constant in the population, the actual efficacy and toxicity of a given protocol, which depend on the drug plasma concentration, vary across patients. A given protocol is consequently not characterized by a unique pair of efficacy and toxicity values, but by a population distribution of efficacy and toxicity values.
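The idea of a population distribution of outcomes can be illustrated with a minimal sketch. It is not the PK/PD model used in this paper: it assumes a hypothetical linear model in which each patient's clearance is drawn from a log-normal distribution and exposure is computed as dose divided by clearance. The function names and all numerical values (`typical`, `omega`, the dose of 200 units) are illustrative assumptions. The summary statistics (median, 5th and 95th percentiles) mirror those reported in the figures and tables.

```python
import math
import random
import statistics


def sample_clearances(n, typical=10.0, omega=0.3, seed=0):
    """Draw n hypothetical clearances from a log-normal distribution (assumption)."""
    rng = random.Random(seed)
    return [typical * math.exp(rng.gauss(0.0, omega)) for _ in range(n)]


def exposure(dose, clearance):
    """Exposure (AUC) in a linear model with full bioavailability: dose / clearance."""
    return dose / clearance


def percentile(values, q):
    """Empirical nearest-rank percentile, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(values):
    """Return (5th percentile, median, 95th percentile) of a sample."""
    return percentile(values, 5), statistics.median(values), percentile(values, 95)


# The same dose yields a distribution of exposures, not a single value.
clearances = sample_clearances(3200)
aucs = [exposure(200.0, cl) for cl in clearances]
p5, med, p95 = summarize(aucs)
print(f"Exposure: median {med:.1f}, 5th percentile {p5:.1f}, 95th percentile {p95:.1f}")
```

Because the downstream efficacy and toxicity depend on exposure, any spread in exposure propagates to a spread in both outcomes, even when the pharmacodynamic parameters are shared by all patients.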
We illustrate these aspects in panel A of
Panel A: Population variability. Solid line: median, dashed lines: 5th and 95th percentiles. Panel B: No variability.
From
We provide a detailed version of the pseudocode in Algorithm 1. All statements following the sign ‘//’ are comments. The algorithm relies on the PK/PD model for temozolomide described above, which we do not make explicit here for the sake of conciseness. In the algorithm, a patient is characterized by a set of particular values for the pharmacokinetic parameters, which are fixed over time, and by a pair of efficacy and toxicity values that evolves over time, reflecting the administered protocol and the dynamics imposed by the PK/PD model. A population is a collection of patients and is characterized on every day by the distribution of efficacy and toxicity values.
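The patient and population objects described above could be represented as follows. This is a sketch under assumptions: the class names, field names, and the two summary methods are hypothetical and are not taken from the authors' C++ implementation.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Patient:
    pk_params: Dict[str, float]  # pharmacokinetic parameters, fixed over time
    tumor_mass: float            # efficacy value (g), evolves over time
    anc_nadir: float             # toxicity value (% of baseline ANC), evolves over time

    def update(self, tumor_mass: float, anc: float) -> None:
        """Store the new tumor mass and keep the lowest normalized ANC seen so far."""
        self.tumor_mass = tumor_mass
        self.anc_nadir = min(self.anc_nadir, anc)


@dataclass
class Population:
    patients: List[Patient]

    def median_tumor_mass(self) -> float:
        """Median efficacy value across the population."""
        return median(p.tumor_mass for p in self.patients)

    def share_below(self, threshold: float) -> float:
        """Fraction of patients whose ANC nadir is below the given threshold."""
        below = sum(p.anc_nadir < threshold for p in self.patients)
        return below / len(self.patients)


# Two illustrative patients; all values are made up.
pop = Population([
    Patient({"clearance": 10.0}, tumor_mass=40.0, anc_nadir=100.0),
    Patient({"clearance": 12.0}, tumor_mass=35.0, anc_nadir=100.0),
])
pop.patients[0].update(tumor_mass=38.0, anc=6.0)
print(pop.median_tumor_mass(), pop.share_below(7.0))
```

Keeping the fixed parameters and the evolving state in the same object makes it straightforward to advance the whole population by one day and then recompute the distribution of outcomes.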
The algorithm consists of two main parts. The first part is the procedure P
The core of the function is to determine, at a given day
Algorithm 1 (35 numbered lines; only the following fragments survive)
// Update the characteristics of the population
// Determine the optimal protocol for the patient population
7: OptiP ← empty vector of length
28: OptiP(
31: OptiP(
In order to determine which of the 200 mg/m^{2} or no dose is optimal, function O
The issue in the previous operation is that assessing future outcomes for toxicity and efficacy relies on future dose administrations that are unknown by construction. We therefore need to make assumptions. We will suppose that the assessment of future outcomes relies on future protocols that are
Finally, the dose administered at day
In the remainder, we will refer to this optimal protocol as the heuristic –or H– protocol.
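The day-by-day decision between a dose of 200 mg/m^{2} and no dose can be illustrated with a simplified rollout heuristic. This sketch is not Algorithm 1. It considers a single toy patient rather than a population, uses made-up one-day dynamics in place of the PK/PD model, and assumes that future outcomes are assessed over random completions of the protocol. The threshold value, the horizon, and all rate constants are hypothetical.

```python
import random

DOSE = 200.0      # mg/m^2, the only non-zero dose considered in the text
THRESHOLD = 7.0   # hypothetical acceptability threshold for normalized ANC (%)


def step(tumor, anc, dose):
    """Toy one-day dynamics (assumption): growth, drug kill, ANC drop and recovery."""
    tumor *= 1.03                      # daily tumor growth
    if dose > 0:
        tumor *= 0.85                  # drug-induced tumor kill
        anc *= 0.6                     # myelosuppression
    else:
        anc += 0.15 * (100.0 - anc)    # recovery toward baseline
    return tumor, anc


def rollout_score(tumor, anc, first_dose, horizon, n_rollouts, rng):
    """Best final tumor mass over random completions that stay above THRESHOLD."""
    best = float("inf")
    for _ in range(n_rollouts):
        t, a = step(tumor, anc, first_dose)
        feasible = a >= THRESHOLD
        for _ in range(horizon - 1):
            if not feasible:
                break
            t, a = step(t, a, rng.choice((0.0, DOSE)))
            feasible = a >= THRESHOLD
        if feasible:
            best = min(best, t)
    return best


def heuristic_protocol(days=56, horizon=14, n_rollouts=50, seed=0):
    """Choose, day by day, the option with the better rollout score."""
    rng = random.Random(seed)
    tumor, anc, nadir = 40.0, 100.0, 100.0
    protocol = []
    for _ in range(days):
        score_dose = rollout_score(tumor, anc, DOSE, horizon, n_rollouts, rng)
        score_rest = rollout_score(tumor, anc, 0.0, horizon, n_rollouts, rng)
        dose = DOSE if score_dose < score_rest else 0.0  # ties favor no dose
        tumor, anc = step(tumor, anc, dose)
        nadir = min(nadir, anc)
        protocol.append(dose)
    return protocol, tumor, nadir


protocol, final_tumor, nadir = heuristic_protocol()
print(sum(d > 0 for d in protocol), "treatment days; ANC nadir", round(nadir, 2))
```

In this toy version a dose is only given when at least one simulated completion keeps the ANC above the threshold, so the realized nadir never falls below it. The population version described in the text replaces this single-patient check with a condition on the population distribution of toxicity.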
As a benchmark, we implement our optimization algorithm in the absence of variability. The pharmacokinetics is identical for all patients, as in panel B of
Our results are summarized in
Protocol      Normalized ANC nadir (%)   Tumor mass (g)
MTD           7.00                       38.15
H protocol    7.00                       6.51
We now turn to the output of the H protocol in the presence of pharmacokinetic variability across patients. To observe its efficacy and toxicity, the H protocol is administered to a population of 3,200 patients drawn from the pharmacokinetics distribution. We compare the H protocol to MTD, administered to the same population of 3,200 patients. We summarize our results in
Median values and in square brackets, the 5th and 95th percentiles.
Protocol      Normalized ANC nadir (%)   Tumor mass (g)
MTD           6.74 […; 10.76]            32.99 [0.72; 111.40]
H protocol    4.17 [2.74; 6.2]           1.80 [0.60; 33.55]
Our results are unambiguous. The H protocol delivers much better efficacy than MTD. The median tumor mass is 1.80 grams, compared to 32.99 grams with MTD. The differences, though still substantial, are slightly smaller for the 5th and 95th percentiles. On average, the H protocol yields a tumor mass approximately 7.66 times smaller than MTD. The 95% confidence interval for the size factor is [7.36–7.97]. Furthermore, this smaller average value comes with a smaller dispersion of the tumor mass across patients. While with MTD the tumor masses between the 5th and 95th percentiles range from 0.72 grams to 111.40 grams, the same range with the H protocol only covers the interval between 0.60 grams and 33.55 grams. In other words, the H protocol offers a better efficacy in terms of average
This better efficacy does not come at the cost of greater toxicity. Indeed, the population share experiencing a normalized ANC nadir below the acceptability threshold is smaller with the H protocol than with MTD. More precisely, the 5th percentile of toxicity with the H protocol corresponds to a normalized ANC nadir of 2.74%, which is very close to, and slightly above, the 5th percentile in the MTD case. However, we can observe that, with no impact on our objective measure, the dispersion of the normalized ANC nadir in the population is much smaller with the H protocol. Indeed, with the H protocol, 95% of the population experiences a normalized ANC nadir below 6.2%, while with MTD this 95th percentile reaches 10.76%. Population toxicity is therefore more concentrated around the acceptability threshold with the H protocol than with MTD. This better control of toxicity with the H protocol may be an important factor in explaining its better efficacy in terms of average and dispersion.
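The text does not state how the confidence interval for the average size factor was obtained. One plausible approach, shown here purely as an assumption, is a percentile bootstrap on the per-patient ratio of tumor masses (MTD over H). The data below are synthetic and serve only to make the sketch runnable; they are not the study's results.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of a sample."""
    rng = random.Random(seed)
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = means[round((1 - level) / 2 * n_boot)]
    upper = means[round((1 + level) / 2 * n_boot) - 1]
    return statistics.fmean(values), lower, upper


# Synthetic per-patient tumor masses (illustrative only).
data_rng = random.Random(1)
mtd_mass = [data_rng.lognormvariate(3.0, 1.0) for _ in range(500)]
h_mass = [m / data_rng.uniform(5.0, 10.0) for m in mtd_mass]

# Size factor per patient: how many times smaller the tumor is under H.
ratios = [m / h for m, h in zip(mtd_mass, h_mass)]
mean_ratio, ci_low, ci_high = bootstrap_mean_ci(ratios)
print(f"Mean size factor {mean_ratio:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Working with per-patient ratios is possible here because both protocols are administered to the same simulated population, so each patient provides a paired observation.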
Elements of
Panel A: H protocol. Panel B: MTD protocol.
We can also compare the two protocols more precisely patient by patient, since the populations to which the MTD and H protocols have been administered are identical. First, regarding toxicity, each patient experiencing a normalized ANC nadir below the acceptability threshold with the H protocol also experiences a below-threshold ANC nadir with MTD. In other words, if the toxicity level for a given patient is too high with the H protocol, switching to MTD will not restore an acceptable toxicity level. Second, patient-wise efficacy comparisons are also unambiguous. For each of the 3,200 patients in the population, the H protocol yields a strictly smaller tumor size than MTD. Not only does the H protocol have better efficacy than MTD in terms of average and dispersion, but it also offers strictly better efficacy for each and every patient, with no toxicity aggravation.
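The two patient-wise checks described above can be expressed compactly. The following sketch assumes that the outcomes of each protocol are stored as a list of (ANC nadir, tumor mass) pairs in the same patient order; the function name, the threshold of 7.0, and the four example patients are hypothetical.

```python
def compare_patientwise(mtd, h, threshold):
    """Compare two protocols on the same patients.

    mtd, h: lists of (anc_nadir, tumor_mass) pairs, in the same patient order.
    """
    below_h = {i for i, (nadir, _) in enumerate(h) if nadir < threshold}
    below_mtd = {i for i, (nadir, _) in enumerate(mtd) if nadir < threshold}
    return {
        # Every patient below threshold under H is also below threshold under MTD.
        "h_toxic_subset_of_mtd_toxic": below_h <= below_mtd,
        # Tumor mass is strictly smaller under H for every patient.
        "h_strictly_better_for_all": all(
            mass_h < mass_mtd for (_, mass_mtd), (_, mass_h) in zip(mtd, h)
        ),
        "share_below_h": len(below_h) / len(h),
        "share_below_mtd": len(below_mtd) / len(mtd),
    }


# Four made-up patients: (normalized ANC nadir in %, tumor mass in g).
mtd_outcomes = [(6.0, 30.0), (8.0, 20.0), (5.0, 50.0), (9.0, 10.0)]
h_outcomes = [(6.5, 3.0), (7.5, 2.0), (7.1, 5.0), (7.2, 1.0)]
result = compare_patientwise(mtd_outcomes, h_outcomes, threshold=7.0)
print(result)
```

The subset test captures the first claim (switching to MTD would not restore acceptable toxicity), while the `all(...)` test captures the second (strictly better efficacy for each patient).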
Finally, we report in
Left panel: H protocol. Right panel: MTD protocol.
We can draw several lessons from
Finally, regarding the patterns of treatment and rest periods, we can observe a pseudoperiodicity for the H protocol. This pseudoperiodicity is reminiscent of the MTD protocol cycles. Even though we do not impose any cycle, a pseudocycle naturally emerges in the H protocol. However, despite the resemblance with MTD, the periodicity of the H protocol is not as exact as for MTD, hence the term pseudoperiodicity. Periods of consecutive treatment days do not always last exactly 5 days, and the interval between those periods does not always amount to exactly 23 days. Finally, and more substantially, the interval between the blocks of consecutive treatment days is never a full rest period but always contains a handful of treatment days (from 2 to 4). These interim treatment days seem to have a significant impact on the efficacy of the protocol, by preventing the tumor from recovering too much between treatment periods. They also influence the normalized ANC, which is, as discussed above, overall lower with H than with MTD. These interim treatment days also connect the H protocol to metronomic chemotherapy regimens, which involve low doses on a frequent schedule and without prolonged treatment-free periods.
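The exact periodicity of MTD, against which the pseudoperiodicity of the H protocol is contrasted, can be made concrete with a short sketch. It builds a 336-day schedule of 12 cycles with 5 treatment days followed by 23 rest days, and extracts the blocks of consecutive treatment days. The daily dose of 200 mg/m^{2} for MTD is assumed from context, and the helper names are hypothetical.

```python
from itertools import groupby


def mtd_schedule(n_cycles=12, on_days=5, off_days=23, dose=200.0):
    """Daily doses for a cyclic schedule: on_days of treatment, then off_days of rest."""
    return ([dose] * on_days + [0.0] * off_days) * n_cycles


def treatment_blocks(schedule):
    """Return (start_day, length) for each run of consecutive treatment days."""
    blocks, day = [], 0
    for treated, run in groupby(schedule, key=lambda d: d > 0):
        length = len(list(run))
        if treated:
            blocks.append((day, length))
        day += length
    return blocks


def rest_gaps(blocks):
    """Number of untreated days between successive treatment blocks."""
    return [
        start - (prev_start + prev_len)
        for (prev_start, prev_len), (start, _) in zip(blocks, blocks[1:])
    ]


schedule = mtd_schedule()
blocks = treatment_blocks(schedule)
print(len(schedule), "days;", len(blocks), "blocks; gaps:", set(rest_gaps(blocks)))
```

Applied to a pseudoperiodic schedule, the same two helpers would return blocks of varying length and shorter gaps, since isolated interim treatment days count as blocks of their own.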
Since the curse of dimensionality prevents an actual optimization from being conducted in this setup, there is no obvious protocol to which we can compare the H protocol. For this reason, we have chosen to compare the outcomes of our optimal protocol to those of a large family of protocols generalizing MTD. More precisely, we will consider the set of protocols {
We report the results in
Median values and in square brackets, the 5th and 95th percentiles.
Protocol      Tumor mass (g)   Normalized ANC nadir (%)
H protocol    1.80             4.17
…             301.83           42.33
…             180.65           22.68
…             113.104          14.17
…             67.7481          9.61
…             32.99            6.74
…             2.76             3.54
…             1.56             1.27
…             1.26             0.71
…             1.11             0.46
…             1.02             0.32
We have proposed a novel algorithm for the optimization of temozolomide protocols that takes into account a multi-objective criterion. Our H protocol features much better efficacy than the standard MTD. In terms of both average value and dispersion, efficacy is unambiguously in favor of the H protocol. This better efficacy can partly be explained by better management of toxicity. On the one hand, a smaller share of the population experiences a toxicity below the acceptability threshold; on the other hand, the toxicity for all patients is overall closer to the acceptability threshold. It is noteworthy that our algorithm is very flexible. In particular, the algorithm is able, with no added complexity, to handle a multidimensional nonlinear objective and to address population variability.
Our article can also be seen as a first and successful step toward the introduction of methods borrowed from operational research and artificial intelligence into the realm of protocol design in oncology.
We describe the equations for the temozolomide PK/PD model and its calibration.
(PDF)
We present the efficacy and toxicity results of optimal protocols for several calibrations of parameter
(PDF)
We provide the detailed results of two other algorithm calibrations, which respectively correspond to a 0% and a 7% target population share. We also provide the complete results for protocols {
(PDF)
We are grateful to Sébastien Benzekry for his valuable comments on an earlier version of this paper.