On closed-form tight bounds and approximations for the median of a gamma distribution
Richard F. Lyon* (https://orcid.org/0000-0003-2348-811X)
Google Research, Google Inc., Mountain View, California, United States of America
Editor: Ivan Kryven, Utrecht University, NETHERLANDS
PLOS ONE (ISSN 1932-6203), Public Library of Science, San Francisco, CA USA. Research Article PONE-D-21-00917. DOI: 10.1371/journal.pone.0251626
Author RFL is employed by Google. This does not alter RFL’s adherence to PLOS ONE policies on sharing data and materials. Google has no restrictions on this work.
* E-mail: dicklyon@acm.org
Received 10 January 2021; accepted 29 April 2021; published 13 May 2021. PLOS ONE 16(5): e0251626.
© 2021 Richard F. Lyon. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The median of a gamma distribution, as a function of its shape parameter k, has no known representation in terms of elementary functions. In this work we use numerical simulations and asymptotic analyses to bound the median, finding bounds of the form 2^{−1/k}(A + Bk), including an upper bound that is tight for low k and a lower bound that is tight for high k. These bounds have closed-form expressions for the constant parameters A and B, and are valid over the entire range of k > 0, staying between 48 and 55 percentile. Furthermore, an interpolation between these bounds yields closed-form expressions that more tightly bound the median, with absolute and relative margins to both upper and lower bounds approaching zero at both low and high values of k. These bound results are not supported with analytical proofs, and hence should be regarded as conjectures. Simple approximation expressions between the bounds are also found, including one in closed form that is exact at k = 1 and stays between 49.97 and 50.03 percentile.
Funding: The author(s) received no specific funding for this work. Author RFL is employed and partially funded by Google. The funder provided support in the form of salary for RFL and the publication fee for this article, but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Data Availability: All relevant data are within the manuscript.
Introduction
The gamma distribution PDF is x^{k−1}e^{−x/θ} / (Γ(k)θ^{k}), but we’ll use θ = 1 because both the mean and median simply scale with this parameter. Thus we use this PDF with just the shape parameter k, with k > 0 and x ≥ 0:
$$p(k,x) = \frac{1}{\Gamma(k)}\, x^{k-1} e^{-x}.$$
The mean of this distribution, μ, is well known to be μ(k) = k. The median ν(k) is the value of x at which the CDF equals one-half:
$$\frac{1}{2} = \int_0^{\nu(k)} p(k,x)\,dx = \int_0^{\nu(k)} \frac{x^{k-1}}{\Gamma(k)}\, e^{-x}\,dx.$$
This equation has no easy solution, but the median is well known to be a bit below the mean, bounded by [1]
$$k - \frac{1}{3} < \nu(k) < k \quad\text{and}\quad 0 < \nu(k).$$
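As a minimal sketch of how the defining equation can be solved numerically, the following Python uses only the standard library. The helper names `gamma_cdf` and `gamma_median` are hypothetical, and the power series plus bisection is an assumed method chosen for self-containment, not the computation used in the paper. The bisection bracket 0 < ν(k) < k relies on the known bounds quoted above.

```python
import math

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k          # n = 0 term of sum_n x^n / (k (k+1) ... (k+n))
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection, using the bracket 0 < median < k."""
    lo, hi = 0.0, k
    for _ in range(200):    # far more halvings than double precision needs
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for k in (0.1, 0.5, 1.0, 2.0, 10.0):
    nu = gamma_median(k)
    print(f"k = {k:5.1f}   k - 1/3 = {k - 1/3:8.5f}   median = {nu:.6g}   k = {k:.1f}")
```

At k = 1 the result can be compared with the exact value log 2; for small k the lower bound k − 1/3 is negative, so only ν(k) > 0 is informative there.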
Bounds that are tighter in some part of the shape parameter range can be obtained from the known Laurent series partial sums [2, 3], or from the low-k asymptote and bounds of Berg and Pedersen [3].
The Chen and Rubin bounds k − 1/3 < ν(k) < k are tight [1]: the upper bound in the low-k limit and the lower bound in the high-k limit. Recently, Gaunt and Merkle [4] proved that the line of slope 1 that intersects ν(k) at the known value ν(1) = log 2 is an upper bound for k ≥ 1; that is, ν(k) < k − 1 + log 2, much tighter than the ν(k) < k bound, leveraging the prior result at integers by Choi [2] and the result by Berg and Pedersen that the slope of the median ν′(k) is everywhere less than 1 [3]. As shown in Fig 1, this new upper bound can be combined with a chord for 0 ≤ k ≤ 1, based on convexity shown by Berg and Pedersen [5]. Convexity also implies that any tangent line is a lower bound, and we show later how to find the slope ν′(1) at the point where the value ν(1) = log 2 is known; that new linear lower bound is included in Fig 1 with the prior linear and piecewise-linear bounds.
Fig 1. Linear and piecewise-linear bounds.
The bounds k − 1/3 < ν(k) < k and ν(k) > 0 (solid lines) are shown along with the true value (solid curve), the piecewise-linear bound that combines the recent linear bound for k > 1 [4] with a chord segment (dashed lines), and the linear lower bound that is tangent at k = 1 (dash-dot line). The region k < 1 is not very usefully bounded.
A Laurent series for ν(k) with rational coefficients has been discovered, with deep connections to some math by Ramanujan. Choi [2] applied Ramanujan’s work to this particular question, providing 4 coefficients (through the k^{−3} term). Berg and Pedersen [3], based on work by Marsaglia [6], extended this to 10 coefficients. Neither commented on the radius of convergence, which appears to be in the neighborhood of k = 1. For large enough k and N, the series yields excellent approximations, but for k < 1 it is useless; convergence near k = 1 is very slow.
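$$\nu(k) \approx k + \sum_{j=0}^{N} a_j k^{-j}$$
with coefficients
$$a_j = \left\{ -\frac{1}{3},\ \frac{2^3}{3^4 \cdot 5},\ \frac{2^3 \cdot 23}{3^6 \cdot 5 \cdot 7},\ \frac{2^3 \cdot 281}{3^9 \cdot 5^2 \cdot 7},\ -\frac{2^3 \cdot 17 \cdot 139753}{3^{13} \cdot 5^3 \cdot 7 \cdot 11},\ -\frac{2^3 \cdot 708494947}{3^{15} \cdot 5^3 \cdot 7^2 \cdot 11 \cdot 13},\ \ldots \right\}.$$
Thus (where Choi [2] had 144 instead of the correct 2^{3} ⋅ 23 = 184):
$$\nu(k) = k - \frac{1}{3} + \frac{8}{405k} + \frac{184}{25515k^2} + O\left(\frac{1}{k^3}\right).$$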
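The behavior of the partial sums can be illustrated with a short sketch that compares them against a numerically computed median. The first four coefficients are entered as exact fractions from the list above; the function names and the series-plus-bisection median solver are assumptions made for this illustration only.

```python
import math
from fractions import Fraction

# First four Laurent coefficients a_0..a_3, in the factored form quoted above.
A_COEFFS = [
    Fraction(-1, 3),
    Fraction(2**3, 3**4 * 5),
    Fraction(2**3 * 23, 3**6 * 5 * 7),
    Fraction(2**3 * 281, 3**9 * 5**2 * 7),
]

def laurent_partial_sum(k, n_terms):
    """k + sum_{j=0}^{n_terms-1} a_j k^(-j)."""
    return k + sum(float(a) * k ** (-j) for j, a in enumerate(A_COEFFS[:n_terms]))

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection on 0 < x < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for k in (1.0, 2.0, 10.0, 50.0):
    exact = gamma_median(k)
    errors = [laurent_partial_sum(k, n) - exact for n in (1, 2, 3, 4)]
    print(f"k = {k:5.1f}  errors of partial sums:", ["%.2e" % e for e in errors])
```

The printed errors shrink quickly with the number of terms at k = 10 and k = 50, and only slowly at k = 1.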
Partial sums of a Laurent series are not generally bounds, but the first two (k and k − 1/3) are upper and lower bounds [1], respectively, and the sums ending with −3 and −5 powers of k also appear (numerically) to be upper and lower bounds, respectively.
Berg and Pedersen [3] also derived an asymptote for small k, which we call ν_{0}(k):
$$\nu(k) \sim \nu_0(k) = e^{-\gamma}\, 2^{-1/k},$$
where γ ≈ 0.577216 is the Euler–Mascheroni constant. This asymptote is a lower bound, as we will show. Berg and Pedersen [3] also provide an upper bound ν(k) < e^{−1/(3k)}k that is just above k − 1/3, and a lower bound ν(k) > 2^{−1/k}k.
These previous known bounds are illustrated in Fig 2, where upper and lower bounds are distinguished by different line styles. The factor 2^{−1/k} is the key to good approximations and bounds, and can be divided out to reduce the dynamic range of values on later plots of ν versus k. For the range 0.001 < k < 100, this reduces the dynamic range we need to work with by about 300 orders of magnitude.
Fig 2. Previously known bounds.
Previously published bounds (lower bounds solid, upper bounds dashed) for the median of a gamma distribution (heavy dotted), are good at high k or low k, but not both. At the left, at k = 0.01, the median is near 10^{−30}. Only ν_{0}(k) is close at low k, and it was published as an asymptote, but not a bound; it is strictly less than our new lower bound ν_{Γ}(k) (dotted).
Others have shown good bounds and approximations where k is an integer, that is, for the Erlang distribution [2, 7–9]; these do not help us for 0 < k < 1. The Wilson and Hilferty cube-root transformation [10] of chi-square leads to an approximation at half-integers that is apparently an upper bound: ν(k) < k(1 − 1/(9k))^{3} for k ≥ 1/2; if it is, then dropping the final negative term would make it an upper bound for all k: ν(k) < k − 1/3 + 1/(27k), which is a bit tighter than the upper bound ν(k) < k − 1/3 + 1/(18k) that Berg and Pedersen [3] proved from their upper bound ν(k) < ke^{−1/(3k)}.
We seek upper and lower bounds that are tighter, especially in the middle part of the k range (near k = 1), than are previously known. Further, we seek simple approximation formulae for the median, leveraging these bounds via interpolation between them. Main results are summarized in two tables in sections to follow.
The approach of approximating functions by interpolating between upper and lower bounds has been discussed by Barry [11], who used minimax optimization to find numeric parameters in an interpolation-between-bounds approximation to the exponential integral. We are not aware of an interpolation approach being used to find improved bounds in closed form, as we propose here.
A family of asymptotes and bounds
The Berg and Pedersen lower bounds 2^{−1/k}e^{−γ} and 2^{−1/k}k are tight within their families 2^{−1/k}A and 2^{−1/k}Bk, but not as tight as possible within the wider two-parameter family 2^{−1/k}(A + Bk). We find (via numeric and asymptotic evidence, but without proof) that the sum of these two is the uniquely tight upper bound in that family, and that there is a range of tight lower bounds in the family.
To improve on the lower bounds, and to motivate the family that we consider further, first solve for ν(k) in a simple approximation to the distribution’s integral, using e^{−x} < 1 for x > 0:
$$\frac{1}{2} < \int_0^{\nu(k)} \frac{x^{k-1}}{\Gamma(k)}\,dx$$
$$\nu_\Gamma(k) = 2^{-1/k}\,\Gamma(k+1)^{1/k} < \nu(k)$$
which is a tight lower bound, and is a good approximation for k < 0.1, but not so great at high k—and not what we consider a closed form, due to the gamma function. The factor 2^{−1/k}, in common to this bound and to Berg and Pedersen’s lower bound and asymptote, from a focus on low k instead of the usual high k, is key to our approach.
As an aside, since ν_{Γ}(k) is a lower bound, we can prove that Berg and Pedersen’s asymptote ν_{0}(k) = e^{−γ}2^{−1/k} is a lower bound by showing that e^{−γ} < Γ(k + 1)^{1/k}. That lim_{k → 0} Γ(k + 1)^{1/k} = e^{−γ} follows from the Taylor series about 0 of Γ(k + 1), which is 1 − γk + O(k^{2}). That Γ(k + 1)^{1/k} increases monotonically from there, even though Γ(k + 1) is decreasing, is proved by showing that its derivative is everywhere positive. Differentiating, in terms of the digamma function ψ^{(0)}, the logarithmic derivative of the gamma function, we find:
$$\frac{d}{dk}\Gamma(k+1)^{1/k} = \Gamma(k+1)^{1/k}\,\frac{k\psi^{(0)}(k+1) - \log\Gamma(k+1)}{k^2}$$
The only factor here that is not obviously positive for k > 0 is kψ^{(0)}(k + 1) − log Γ(k + 1), which at k = 0 is equal to 0, and has a surprisingly simple derivative:
$$\frac{d}{dk}\left(k\psi^{(0)}(k+1) - \log\Gamma(k+1)\right) = k\psi^{(1)}(k+1)$$
Here ψ^{(1)} is the trigamma function, the derivative of the digamma function. This derivative is positive since the trigamma function, a special case of the Hurwitz zeta function, is positive for real arguments, because it has a series expansion with all positive terms:
$$\psi^{(1)}(z) = \sum_{n=0}^{\infty} \frac{1}{(z+n)^2}$$
Therefore Γ(k + 1)^{1/k} is increasing, as was also obvious numerically, so the Berg and Pedersen asymptote is a lower bound.
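The monotonicity argument can be spot-checked numerically. The sketch below, with the hypothetical helper name `gamma_root`, evaluates Γ(k + 1)^{1/k} through the log-gamma function on a geometric grid and compares its small-k value with e^{−γ}; it is a sanity check, not a substitute for the proof.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gamma_root(k):
    """Gamma(k + 1)^(1/k), evaluated via log-gamma to avoid overflow."""
    return math.exp(math.lgamma(k + 1.0) / k)

# Geometric grid from 1e-3 to 100, ten points per decade.
ks = [10.0 ** (e / 10.0) for e in range(-30, 21)]
values = [gamma_root(k) for k in ks]

print("limit e^-gamma      :", math.exp(-EULER_GAMMA))
print("Gamma(k+1)^(1/k) at k = 1e-3, 1, 100:",
      gamma_root(1e-3), gamma_root(1.0), gamma_root(100.0))
print("increasing on grid  :", all(b > a for a, b in zip(values, values[1:])))
```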
The ν_{Γ}(k) expression resembles the “quantile mechanics” boundary condition for the gamma distribution from Steinbrecher and Shaw [12]. More generally, their work implies that (uΓ(k + 1))^{1/k} is a lower bound for the u quantile of the gamma distribution, if the coefficients of their power-series recurrence are all positive, for all 0 < u < 1 and all k > 0, as they appear to be. Nevertheless, since they don’t claim it as a bound, we say that ν_{Γ}(k) is new. If their coefficients are all positive, then partial sums of their power series form a family of lower bounds.
The lower bound ν_{Γ}(k) converges with the Berg and Pedersen asymptote at low k. We can improve Berg and Pedersen’s asymptote in closed form by utilizing another term. A symbolic calculus system finds for us the next Taylor series terms about k = 0 for the power of the gamma function:
$$\Gamma(k+1)^{1/k} \approx e^{-\gamma} + \frac{e^{-\gamma}\pi^2}{12}\,k - 0.035\,k^2.$$
With the additional term, we have an improved approximation, which has much less relative error than Berg and Pedersen’s at low k, and has a high-k behavior nearly proportional to k (but is not a lower bound because we made it larger by ignoring a next negative term):
$$\nu_1(k) = 2^{-1/k}\left(e^{-\gamma} + \frac{e^{-\gamma}\pi^2}{12}\,k\right).$$
Inspired by this asymptotic approximation, we consider members of this family of functions, with coefficients A and B, and analyze which ones are bounds:
$$\tilde{\nu}(k) = 2^{-1/k}(A + Bk).$$
Bounds and asymptotic approximations in this family, described in the next section, and a few others are summarized in Table 1. Some of these are the basis for later sections on interpolation between bounds.
Table 1. Summary of simple bounds and asymptotes.

| Version | Description |
|---|---|
| ν(k) | True median of gamma distribution |
| e^{−1/(3k)}k | B&P’s upper bound, high-k asymptote |
| k + log (2) − 1 | G&M’s upper bound for k ≥ 1 |
| k log (2) | New chord upper bound for k ≤ 1 |
| log 2 + (k − 1)ν′(1) | New linear tangent lower bound |
| ν_{Γ}(k) = 2^{−1/k} Γ(k + 1)^{1/k} | New lower bound, low-k asymptote |

| Version | A | B | Description (parameters for the form 2^{−1/k}(A + Bk)) |
|---|---|---|---|
| 2^{−1/k}k | 0 | 1 | B&P’s lower bound |
| ν_{0}(k) | e^{−γ} | 0 | B&P’s low-k asymptote, a lower bound |
| ν_{1}(k) | e^{−γ} | e^{−γ}π^{2}/12 | Improved low-k asymptote; not a bound |
| ν_{U}(k) | e^{−γ} | 1 | New uniquely tight upper bound* |
| ν_{L0}(k) | e^{−γ} | 0.4596507 | New tight lower bound*, best at low k |
| ν_{L1}(k) | 0.4111107 | 0.9751836 | New tight lower bound*, tangent at k = 1 |
| ν_{L∞}(k) | log 2 − 1/3 | 1 | New tight lower bound*, best at high k |

Comparison of several median bounds and asymptotes. B&P refers to Berg and Pedersen [3], and G&M refers to Gaunt and Merkle [4]. Conjectured bounds that have not been proved, but are supported by asymptotic and numerical results, are marked with an asterisk (*).
Tight upper and lower bounds
A graphical characterization of this family is most informative. Given some values of k and corresponding numerical ν(k), we can find the lines A + Bk = ν(k)2^{1/k} in A–B space, and plot them—see Fig 3. Regions full of lines are not bounds, and regions without lines are where bounds are found (including some of Berg and Pedersen’s bounds); we’re interested in the boundaries between these regions, where tight bounds are to be found.
Fig 3. Parameter space.
The A–B parameter space is shaded with dash-dot lines where ν(k)2^{1/k} = A + Bk, for a set of very small to very large k values in geometric progression (using numerically computed ν(k) values). Key values of A and B are indicated. Points outside (or on the edge) of the shaded region represent bounds, while points inside the shaded region represent functions that cross the median function. There is an obvious uniquely tight upper bound ν_{U}, and a curved locus of tight lower bounds from ν_{L0}, which is tightest near k = 0, to ν_{L∞}, which is tightest for high k. One point (pentagram) on the curved locus represents a lower bound ν_{L1} that is tight at k = 1, for which A + B = 2 log 2 (which is the equation of the dashed line). The point ν_{1} represents a good asymptotic approximation close to ν_{L0}, but not a bound; see the next figure. The dotted line from ν_{U} to ν_{L∞} at B = 1 intersects the lines for all k in monotonic order, with A decreasing while k increases.
The improved asymptote ν_{1} is great at low k, but is neither an upper nor a lower bound, as shown in Fig 4. We can modify it to approach k − 1/3 at high k by a few adjustments, via this asymptotic approximation that we get from a symbolic calculus system:
$$k\,2^{-1/k} = k - \log 2 + O(k^{-1}).$$
Fig 4. Parameter space detail.
Zooming in to the parameter neighborhood of ν_{1} and ν_{L0}, note that the point with B = e^{−γ}π^{2}/12, which we got from the Taylor series of the power of the gamma function, is actually inside the shaded area, so does not represent a bound; but a point at slightly lower B = 0.45965 is on the edge, so represents a lower bound. These points give zero error at approximately k = 0.1003 and k = 0.0708, respectively (see their errors, or margins, in Fig 5). We do not have analytic formulations for these numeric and graphical observations.
Thus we find this approximation for high k, which appears to be a lower bound as illustrated in Fig 3:
$$\nu_{L\infty}(k) = 2^{-1/k}\left(\log 2 - \frac{1}{3} + k\right).$$
A compromise approximation for low k mixes these two, differing from the high-k approximation in only the A coefficient, leaving a result consistent to the same order as Berg and Pedersen’s asymptote at low k, and forming an upper bound as illustrated in Fig 3:
$$\nu_U(k) = 2^{-1/k}\left(e^{-\gamma} + k\right).$$
This mixed approximation has absolute and relative errors approaching zero at low k, and relative error approaching zero at high k; but the absolute error remains high, near e^{−γ} − (log 2 − 1/3) ≈ 0.20, at high k. These approximations and their errors are illustrated in Fig 5.
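The two closed-form members of the family can be compared against a numerically computed median on a grid of k values. This is a sketch under stated assumptions: the helper names are hypothetical, the median is obtained with a standard-library power series and bisection rather than the paper's own tooling, and a finite grid can only support the conjectured bounds, not prove them.

```python
import math

EULER_GAMMA = 0.5772156649015329
A_U = math.exp(-EULER_GAMMA)            # A parameter of the upper bound
A_LINF = math.log(2.0) - 1.0 / 3.0      # A parameter of the high-k lower bound

def family(k, a, b):
    """Member 2^(-1/k) (A + B k) of the two-parameter family."""
    return 2.0 ** (-1.0 / k) * (a + b * k)

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection on 0 < x < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def geometric_grid(lo, hi, n):
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]

for k in geometric_grid(0.05, 100.0, 12):
    nu = gamma_median(k)
    scale = 2.0 ** (1.0 / k)            # premultiply margins as in the figures
    lower_margin = scale * (nu - family(k, A_LINF, 1.0))
    upper_margin = scale * (family(k, A_U, 1.0) - nu)
    print(f"k = {k:8.3f}  lower margin = {lower_margin:.5f}  upper margin = {upper_margin:.5f}")
```

On this grid the premultiplied upper margin grows from near zero toward about 0.20 as k increases, while the lower margin shrinks from about 0.20 toward zero.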
Fig 5. New bounds and their margins, premultiplied by 2^{1/k}.
The lower bounds ν_{Γ}, ν_{L0}, ν_{L1}, and ν_{L∞} (solid), and upper bound ν_{U} (dashed) are shown in red over the ideal median (black heavy dots), with their absolute errors in blue, all premultiplied by 2^{1/k} to reduce the required plot range. The approximation ν_{1}, which is not a bound, is also shown; note that its error curve changes from solid to dashed at the cusp, while the margins for ν_{L0} and ν_{L1} have cusps (on the log scale) where the margins graze zero but do not change sign. The k parameters at these cusps correspond to the sloped black lines indicated in the previous two figures.
These observations are consistent with what Fig 3 suggests: that A ≥ e^{−γ} ∧ B ≥ 1 is a necessary and sufficient condition for 2^{−1/k}(A + Bk) to be an upper bound to the median, with equality for the tightest upper bound. And for lower bounds, the condition A ≤ e^{−γ} ∧ B ≤ 1 is necessary, but not sufficient.
To support the graphical/numerical observation that ν_{L∞}(k) and ν_{U}(k) are lower and upper bounds, respectively, of the true median (ν_{L∞}(k) < ν(k) < ν_{U}(k)), we examine their asymptotic behaviors in more detail. At low k, it is easy to see, using log 2 − 1/3 ≈ 0.359814 < e^{−γ} ≈ 0.561459, and e^{−γ}π^{2}/12 ≈ 0.461781 < 1, that these differences are positive, for k → 0:
$$2^{1/k}\left(\nu(k) - \nu_{L\infty}(k)\right) = e^{-\gamma} - \left(\log 2 - \frac{1}{3}\right) + O(k) > 0.$$
$$2^{1/k}\left(\nu_U(k) - \nu(k)\right) = k\left(1 - \frac{e^{-\gamma}\pi^2}{12}\right) + O(k^2) > 0.$$
At high k, a symbolic calculus system gives us for ν_{L∞}(k):
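$$2^{-1/k}\,k = k - \log 2 + \frac{\log^2 2}{2}\,k^{-1} + O(k^{-2})$$
which we can use to construct comparisons to the Laurent series terms for ν(k) [2, 3]. Again we find positive differences, with (log 2)/3 − (log 2)^{2}/2 ≈ −0.009177 < 8/405, and log 2 − e^{−γ} ≈ 0.131688 < 1/3, for k → +∞:
$$\nu(k) - \nu_{L\infty}(k) = \left(\frac{8}{405} - \left(\frac{\log 2}{3} - \frac{\log^2 2}{2}\right)\right)k^{-1} + O(k^{-2}) > 0;$$
$$\nu_U(k) - \nu(k) = e^{-\gamma} - \log 2 + \frac{1}{3} + O(k^{-1}) > 0.$$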
In addition to these high-k and low-k asymptotic results, we can show the inequalities also hold at k = 1, where (1/2)(log 2 − 1/3 + 1) < log 2 < (1/2)(e^{−γ} + 1), but otherwise we’re relying on the graphical and numerical results. Although there is considerable margin in the asymptotes, and the median is well behaved (unique, monotonic, and smooth, with positive second derivative for all k > 0 [3, 5]), we do not have a proof that these are bounds. If they are bounds, however, they are tight, in the sense that the positivity constraints would not all hold, since higher-order terms would not cancel, if A_{L∞} or B_{L∞} were any higher, or if A_{U} or B_{U} were any lower.
For the lower bound ν_{L1}(k) that is tight at k = 1, both the value and the slope need to match the true median. The value is the median of the exponential distribution, ν(1) = log 2. The slope ν′(k) is somewhat more troublesome to work out, but is tractable at the special point k = 1, where the CDF P(k, x) (the lower incomplete gamma function) and PDF p(k, x) are both exponential functions.
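$$P(k,x) = \int_0^x p(k,t)\,dt = \int_0^x \frac{t^{k-1}}{\Gamma(k)}\, e^{-t}\,dt.$$
At the point where P(k, x) = 1/2, where x = ν(k), the slope is:
$$\nu'(k) = \frac{d\nu}{dk} = -\frac{\partial P(k,x)}{\partial k} \Big/ \frac{\partial P(k,x)}{\partial x}$$
The derivative with respect to x is easy, except that we only have a closed-form relation between x and k at k = 1, where we know x = ν(1) = log 2 and p(k, x) = e^{−log 2} = 1/2, so the derivative is 1/2 there. The derivative with respect to k is messier:
$$\frac{\partial P(k,x)}{\partial k} = -\Gamma(k)^{-2}\,\frac{d\Gamma(k)}{dk}\int_0^x t^{k-1} e^{-t}\,dt + \Gamma(k)^{-1}\int_0^x \frac{d\,t^{k-1}}{dk}\, e^{-t}\,dt$$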
At k = 1, using Γ(k) = 1 and dΓ(k)/dk = −γ, this derivative evaluates to
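$$\left.\frac{\partial P(k,x)}{\partial k}\right|_{k=1} = \frac{\gamma}{2} + \int_0^x \log t\; e^{-t}\,dt = \frac{\gamma}{2} + \mathrm{Ei}(-\log 2) - \gamma - \frac{1}{2}\log\log 2$$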
where Ei(− log 2) = −0.3786710 is the exponential integral (integration and evaluation assisted by Wolfram Alpha). Putting these results together we get the slope of the median:
$$\left.\nu'(k)\right|_{k=1} = \gamma - 2\,\mathrm{Ei}(-\log 2) + \log\log 2 \approx 0.9680448$$
which is a mathematical expression with a definite value, but is not a closed form due to the exponential integral, so still requires a numerical approach to evaluate it. Therefore, we describe this bound with approximate numerical parameters instead of closed-form analytic expressions.
$$B = 2\left(\left.\nu'(k)\right|_{k=1} - \log^2 2\right) \approx 0.9751836, \qquad A = 2\log 2 - B \approx 0.4111107.$$
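The slope and the two parameters can be reproduced with a short sketch. It evaluates the partial derivative expression given above at k = 1, using an assumed convergent series for the exponential integral, and cross-checks the slope with a central finite difference of a numerically computed median. All helper names are hypothetical, and the tangency relations in the comments follow from differentiating 2^{−1/k}(A + Bk) at k = 1.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2 = math.log(2.0)

def exp_integral_e1(x, terms=60):
    """E1(x) = -Ei(-x) for x > 0: -gamma - log x - sum_{n>=1} (-x)^n / (n n!)."""
    total = 0.0
    power_over_factorial = 1.0
    for n in range(1, terms):
        power_over_factorial *= -x / n      # now equals (-x)^n / n!
        total += power_over_factorial / n
    return -EULER_GAMMA - math.log(x) - total

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection on 0 < x < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Slope from the partial derivatives at k = 1, x = log 2.
ei = -exp_integral_e1(LOG2)                               # Ei(-log 2)
dP_dk = EULER_GAMMA / 2 + ei - EULER_GAMMA - 0.5 * math.log(LOG2)
dP_dx = 0.5                                               # p(1, log 2)
slope = -dP_dk / dP_dx

# Independent check: central finite difference of the numerical median.
h = 1e-4
slope_fd = (gamma_median(1.0 + h) - gamma_median(1.0 - h)) / (2.0 * h)

# Tangency at k = 1: value (A + B)/2 = log 2, slope (log 2)^2 + B/2 = nu'(1).
B = 2.0 * (slope - LOG2 ** 2)
A = 2.0 * LOG2 - B
print(f"Ei(-log 2) = {ei:.7f}, slope = {slope:.7f}, finite difference = {slope_fd:.7f}")
print(f"A = {A:.7f}, B = {B:.7f}")
```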
Unlike the straight-line lower bound ν(k) > log 2 + (k − 1)ν′(1), which easily follows by convexity of ν(k), this new function ν_{L1}(k) that is tangent at the same point is not proved to be a bound.
The conjectured tight lower bound ν_{L0}(k) is in worse shape, as we have to search for the k value that gives the lowest B value with A = e^{−γ}. So its B parameter has no concise mathematical expression, but can be computed to high precision: B ≈ 0.4596507.
The conjectured bounds and asymptotic approximations discussed here are summarized in Table 1. In subsequent sections we focus primarily on the new upper and lower bounds with closed-form coefficients, ν_{U}(k) and ν_{L∞}(k), as a basis for even tighter closed-form bounds. Fig 6 shows the percentile values achieved by these bounds and approximations, compared to the ideal 50% that defines the median: ν_{L∞}(k) always comes in between 48% and 50% and ν_{U}(k) always between 50% and 55% (percentiles are calculated in Matlab as 100*gammainc(x, k), using the normalized lower incomplete gamma function that is the CDF for our PDF).
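The percentile calculation can be mirrored outside Matlab. The sketch below is an assumed standard-library Python equivalent of 100*gammainc(x, k), with the regularized lower incomplete gamma function evaluated by its power series; the helper names are hypothetical. It prints the percentiles reached by ν_{L∞}(k) and ν_{U}(k) on a sample grid.

```python
import math

EULER_GAMMA = 0.5772156649015329
A_U = math.exp(-EULER_GAMMA)
A_LINF = math.log(2.0) - 1.0 / 3.0

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def percentile(k, x):
    """Percentile of x under the gamma distribution with shape k, scale 1."""
    return 100.0 * gamma_cdf(k, x)

def nu_bound(k, a):
    """2^(-1/k) (A + k), the B = 1 members of the family."""
    return 2.0 ** (-1.0 / k) * (a + k)

def geometric_grid(lo, hi, n):
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]

for k in geometric_grid(0.05, 100.0, 12):
    p_low = percentile(k, nu_bound(k, A_LINF))
    p_high = percentile(k, nu_bound(k, A_U))
    print(f"k = {k:8.3f}  lower bound at {p_low:6.3f}%  upper bound at {p_high:6.3f}%")
```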
Fig 6. The percentiles achieved by four conjectured median bounds of the form 2^{−1/k}(A + Bk) (solid curves) are plotted, along with the straight-line bounds (dotted), upper and lower bounds from Berg and Pedersen [3] (dash-dot), and a pair of closer bounds formed by interpolation between ν_{U}(k) and ν_{L∞}(k) using a one-parameter rational function (dashed). The bounds 2^{−1/k}k, ν_{U}(k), ν_{L∞}(k), and the interpolated bounds converge on 50th percentile at both low and high k, while the other eight do not. Both the upper and lower interpolated bounds are close to ν_{U}(k) at low k and close to ν_{L∞}(k) at high k; tighter such interpolated bounds, developed in a later section, would crowd the center of the graph.
The coefficient of k^{−1} for ν_{L∞}(k) is negative, so the lower bound ν_{L∞}(k) at high k is less than the k − 1/3 lower bound, in spite of having asymptotically zero absolute margin. That is, for k > 3.021, it’s a looser lower bound and a worse approximation than k − 1/3, even though it is the tightest lower bound of the form we’re considering. On the other hand, ν_{U}(k) is a much tighter upper bound than k is, for all k, with an asymptotic margin near 1/3 − (log 2 − e^{−γ}) ≈ 0.202; for k ≥ 1, the recent straight-line bound k − 1 + log 2 [4] is tighter still, with an asymptotic margin 1/3 − (1 − log 2) ≈ 0.026.
Formulae for tighter bounds
Letting A and B be functions of k, rather than constants, allows tighter bounding expressions (and potentially exact expressions) for ν(k), but leaves too little structure to work with. Allowing only one of them to vary, and tying the other to values used in the tight bounds above, gives a more constrained space of bounds.
With B_{L∞} = B_{U} = 1, we can express the median exactly as ν(k) = 2^{−1/k}(A(k) + k), for some smooth positive real function A(k) that runs from a limit of A_{U} = e^{−γ} as k → 0 to A_{L∞} = log 2 − 1/3 as k → +∞; it is apparently monotonic. Alternatively, using A_{L0} = A_{U} = e^{−γ}, the formula ν(k) = 2^{−1/k}(e^{−γ} + B(k)k) has a smooth positive but non-monotonic function B(k) that runs between limits e^{−γ}π^{2}/12 and 1, but drops a little below its low-k limit before increasing.
This approach converts the problem of finding tighter bounds to the median to the problem of finding closed-form expressions to bound these more well-behaved functions. Calculating A(k) and B(k) numerically to high precision is easy when the median can be calculated; see Fig 7. For the rest of this paper, we focus on A(k), since it is monotonic and more nearly symmetric on a log k axis, and because it corresponds to interpolation between closed-form bounds.
Fig 7. Ideal parameters as functions of k.
Functions A(k) and B(k), either of which can solve ν(k) = 2^{−1/k}(A + Bk), with the other constant at the limiting values indicated by the circles, the parameters of ν_{U}. Modeling either of these curves can lead to better bounds or approximations.
Toward a proof
The conjectured inequalities ν_{L∞}(k) < ν(k) < ν_{U}(k) are equivalent to log 2 − 1/3 < A(k) < e^{−γ}; that is, that A(k) stays above its high-k limit and below its low-k limit. We know that the asymptotic slopes of A(k) are negative at both ends (and in the middle at k = 1), so it will be sufficient to show that the slope is negative everywhere; or that the function A(k) is convex. It is not quite enough that A(k) comes from monotonic and convex parts 2^{1/k} and ν(k). The proof of convexity of ν(k) [5] was complicated, and similar techniques might be needed here.
Numerically, we can see that A(k) is mostly sufficiently confined between other known bounds in different parts of the k range, but bounding it with other bounds could be a hard way to construct a proof. First, we’d need a better low-k upper bound, which we might get using 1 − x < e^{−x} as we used e^{−x} < 1 in finding ν_{Γ}(k). And on the lower side we could attempt to prove that the quantile mechanics [12] partial sums are lower bounds that are tighter, at low k, and that the Laurent series partial sum through the k^{−5} term is a tighter lower bound at high k.
Lacking proof of our main result, we cannot even start to prove the tighter bounds found in coming sections, based on interpolation, but again we have good confidence from asymptotic and numerical results. In some cases, asymptotic analysis yields closed-form expressions for the tightest bounds of the families, while numerical methods support the conjecture that they are bounds.
Interpolators
The function A(k) introduced above can be represented in terms of an interpolation function g(k) that runs monotonically from a low-k limit of 0 to a high-k limit of 1:
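$$A(k) = g(k)\,A_{L\infty} + (1 - g(k))\,A_U = A_U - g(k)\,(A_U - A_{L\infty}) = e^{-\gamma} - g(k)\left(e^{-\gamma} - \log 2 + \frac{1}{3}\right).$$
And g(k) is therefore also the function that interpolates between the bounds, allowing us to write the median in these convenient ways:
$$\nu(k) = g(k)\,\nu_{L\infty}(k) + (1 - g(k))\,\nu_U(k)$$
$$\nu(k) = 2^{-1/k}\left(e^{-\gamma} - g(k)\left(e^{-\gamma} - \log 2 + \frac{1}{3}\right) + k\right).$$
The ideal interpolator can be computed numerically from A(k) or from ν(k):
$$g(k) = \frac{A_U - A(k)}{A_U - A_{L\infty}} = \frac{\nu_U(k) - \nu(k)}{\nu_U(k) - \nu_{L\infty}(k)}.$$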
It can be interesting to bound or otherwise approximate g(k). In approximating the ideal with an interpolator g̃(k), we achieve absolute and relative error of the median estimate approaching zero at low k if g̃(k) = 0 + O(k), and at high k if g̃(k) = 1 − O(k^{−1}). But we might want to do better, matching the asymptotic slopes of the ideal interpolator to match the median to a higher order; or we might want to match the exact known value ν(1) = log 2. So we analyze these properties of the ideal interpolator, and give them names. At low k:
$$P_0 = \frac{dg}{dk} = \frac{1 - \frac{e^{-\gamma}\pi^2}{12}}{e^{-\gamma} - \log 2 + \frac{1}{3}} \approx 2.66913.$$
At high k:
$$P_\infty = -\frac{dg}{d\frac{1}{k}} = \frac{\frac{8}{405} + e^{-\gamma}\log 2 - \frac{\log^2 2}{2}}{e^{-\gamma} - \log 2 + \frac{1}{3}} - \log 2 \approx 0.143472.$$
And at k = 1:
$$P_1 = g(1) = \frac{1 + e^{-\gamma} - 2\log 2}{e^{-\gamma} - \log 2 + \frac{1}{3}} \approx 0.868678.$$
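These three constants, and the ideal interpolator itself, are easy to evaluate numerically. The following sketch computes the closed forms and compares them with g(k) obtained from a numerically computed median at small, unit, and large k. Names such as `ideal_interpolator` are hypothetical, and the series-plus-bisection median solver is an assumed stand-in for whatever high-precision method is preferred.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2 = math.log(2.0)
A_U = math.exp(-EULER_GAMMA)
A_LINF = LOG2 - 1.0 / 3.0
D = A_U - A_LINF                      # e^-gamma - log 2 + 1/3

P0 = (1.0 - A_U * math.pi ** 2 / 12.0) / D
P_INF = (8.0 / 405.0 + A_U * LOG2 - LOG2 ** 2 / 2.0) / D - LOG2
P1 = (1.0 + A_U - 2.0 * LOG2) / D

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection on 0 < x < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ideal_interpolator(k):
    """g(k) = (A_U - A(k)) / (A_U - A_Linf), with A(k) = 2^(1/k) nu(k) - k."""
    a_k = 2.0 ** (1.0 / k) * gamma_median(k) - k
    return (A_U - a_k) / D

print(f"P0 = {P0:.6f}, P_inf = {P_INF:.6f}, P1 = {P1:.6f}")
print("g(0.01) / 0.01      =", ideal_interpolator(0.01) / 0.01)   # close to P0
print("g(1)                =", ideal_interpolator(1.0))           # equals P1
print("100 * (1 - g(100))  =", 100.0 * (1.0 - ideal_interpolator(100.0)))  # close to P_inf
```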
How such improved-approximation goals relate to bounds is not immediately clear. In the next two sections, we construct some examples, with results summarized in Table 2. Most of the interpolated approximations and bounds listed are closed-form analytic expressions, but a few others are numeric and approximate.
Table 2. Comparison of interpolations.

Rational-function interpolator g̃_{1}(k) = k/(k + b_{0}):

| Version | Symbolic b_{0} | Numeric b_{0} | |
|---|---|---|---|
| best at low k | (e^{−γ} − log 2 + 1/3) / (1 − e^{−γ}π^{2}/12) | 0.374654 | U* |
| exact at k = 1 | (e^{−γ} − log 2 + 1/3) / (1 + e^{−γ} − 2 log 2) − 1 | 0.151175 | – |
| best at high k | (8/405 + e^{−γ} log 2 − (log 2)^{2}/2) / (e^{−γ} − log 2 + 1/3) − log 2 | 0.143472 | L* |

Arctan interpolator g̃_{a}(k) = (2/π) tan^{−1}(k/b):

| Version | Symbolic b | Numeric b | |
|---|---|---|---|
| best at low k | (24/π) (e^{−γ} − log 2 + 1/3) / (12 − e^{−γ}π^{2}) | 0.238512 | U* |
| best at high k | (π/2) ((8/405 + e^{−γ} log 2 − (log 2)^{2}/2) / (e^{−γ} − log 2 + 1/3) − log 2) | 0.225366 | – |
| minimax relative error | argmin_{r} max \|(ν(k) − ν̃(k))/ν(k)\| | 0.21639 | – |
| minimax absolute error | argmin_{r} max \|ν(k) − ν̃(k)\| | 0.21008 | – |
| exact at k = 1 | cot((π/2) · (1 + e^{−γ} − 2 log 2) / (e^{−γ} − log 2 + 1/3)) | 0.209257 | – |
| tangent at k ≈ 0.4184 | ? | 0.205282 | L* |

Several one-parameter interpolated bounds and approximations g̃(k)ν_{L∞}(k) + (1 − g̃(k))ν_{U}(k), some of which have closed-form parameters, are listed for comparison. Conjectured bounds are indicated by U* or L*.
Fig 8 shows the ideal interpolator, computed numerically, and compares it to bounding interpolators of the forms g̃(k) = k/(k + b_{0}) and g̃(k) = (2/π) tan^{−1}(k/b), with parameters b_{0} and b chosen to yield tight upper and lower bounds. The effects of these interpolator bounds on the median bounds are shown in Fig 9.
Fig 8. Bounding the ideal interpolator function.
The ideal interpolator g(k) (heavy dotted sigmoid) is compared with upper and lower bounds g̃(k); their margins g̃(k) − g(k) are also plotted, magnified and displaced, with the same curve styles. The curves with largest absolute margins (dashed), which correspond to the interpolated bounds shown in Fig 6, are for the first-order rational-function interpolator k/(k + b_{0}), while the curves with smaller margins (solid) are for arctan interpolators (2/π) tan^{−1}(k/b). In each case, the one parameter (b_{0} or b) is chosen to give a tight bound (analytically in closed form in three of the four cases). Lower bounds of g(k) make upper bounds of ν(k), and vice versa.
Fig 9. Margins of interpolated bounds, premultiplied by 2^{1/k}.
The absolute margins of the interpolated bounds are smaller than those of the bounds they started from. Compare Fig 5. The generally smallest margins are for the arctan interpolators, and the intermediate for the N = 1 rational-function interpolators.
Rational-function interpolators
Consider rational functions as interpolators, of the form
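$$\tilde{g}_N(k) = \frac{\sum_{i=1}^{N-1} a_i k^i + k^N}{\sum_{i=0}^{N-1} b_i k^i + k^N}.$$
For N = 1, the only parameter is b_{0}, so we have a one-parameter family:
$$\tilde{g}_1(k) = \frac{k}{b_0 + k}.$$
For N = 2 we have three parameters:
$$\tilde{g}_2(k) = \frac{a_1 k + k^2}{b_0 + b_1 k + k^2},$$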
and so forth.
We can easily constrain the coefficients to match the properties of the ideal interpolator. At low k:
$$\frac{a_1}{b_0} = P_0.$$
At high k:
$$b_{N-1} - a_{N-1} = P_\infty.$$
And at k = 1:
$$\frac{\sum_{i=1}^{N-1} a_i + 1}{\sum_{i=0}^{N-1} b_i + 1} = P_1.$$
For N = 1, the low-k asymptote is tightly approached with b_{0} = 1/P_{0}, yielding an upper bound for the median (lower bound for the ideal interpolator g(k)). Or the high-k asymptote is tightly approached with b_{0} = P_{∞}, yielding a lower bound for the median (upper bound for g(k)). These bounds are illustrated in Fig 8. For b_{0} between these values, the resulting interpolated function is not a bound, but is exact at one value of k, for example at k = 1 with b_{0} = 1/P_{1} − 1.
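The three N = 1 choices of b_{0} can be tried out directly. In this sketch the interpolated estimate g̃_{1}(k)ν_{L∞}(k) + (1 − g̃_{1}(k))ν_{U}(k) is compared with a numerically computed median on a grid. The function names are hypothetical, the median solver is an assumed standard-library implementation, and agreement on a finite grid supports but does not prove the conjectured bounds.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2 = math.log(2.0)
A_U = math.exp(-EULER_GAMMA)
A_LINF = LOG2 - 1.0 / 3.0
D = A_U - A_LINF
P0 = (1.0 - A_U * math.pi ** 2 / 12.0) / D
P_INF = (8.0 / 405.0 + A_U * LOG2 - LOG2 ** 2 / 2.0) / D - LOG2
P1 = (1.0 + A_U - 2.0 * LOG2) / D

B0_LOW = 1.0 / P0          # best at low k: conjectured upper bound on the median
B0_ONE = 1.0 / P1 - 1.0    # exact at k = 1: not a bound
B0_HIGH = P_INF            # best at high k: conjectured lower bound on the median

def nu_bound(k, a):
    return 2.0 ** (-1.0 / k) * (a + k)

def rational_interpolated(k, b0):
    """Interpolate between nu_U and nu_Linf with g1(k) = k / (k + b0)."""
    g = k / (k + b0)
    return g * nu_bound(k, A_LINF) + (1.0 - g) * nu_bound(k, A_U)

def gamma_cdf(k, x):
    """Regularized lower incomplete gamma function P(k, x), by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / k
    total = term
    n = 0
    while term > 1e-17 * total:
        n += 1
        term *= x / (k + n)
        total += term
    return total * math.exp(k * math.log(x) - x - math.lgamma(k))

def gamma_median(k):
    """Solve P(k, x) = 1/2 by bisection on 0 < x < k."""
    lo, hi = 0.0, k
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(k, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def geometric_grid(lo, hi, n):
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]

print(f"b0 values: {B0_LOW:.6f}, {B0_ONE:.6f}, {B0_HIGH:.6f}")
for k in geometric_grid(0.05, 100.0, 10):
    nu = gamma_median(k)
    rel = [rational_interpolated(k, b0) / nu - 1.0 for b0 in (B0_HIGH, B0_ONE, B0_LOW)]
    print(f"k = {k:8.3f}  relative errors: " + "  ".join(f"{r:+.2e}" for r in rel))
```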
For N = 2, there are enough parameters to use any or all of the three constraints, but it’s not immediately clear which sets of constraints can lead to bounds. Certainly using all three constraints does not lead to a bound, but to an interesting approximation. As we did with the A versus B space, we can investigate the locus of parameter solutions for each k, and examine the edges of these locus-filled areas for tight bounds. Reducing the space to 2D by constraining one asymptote or the other allows a graphical approach, but the results are not better than a one-parameter arctan interpolator.
For N ≥ 3, excellent approximations and bounds are possible, but with so many parameters are not very interesting. For example, we can choose to constrain both asymptotes to a higher-order fit, and to constrain the value at k = 1, minimizing the maximum relative error with the two remaining parameters. It’s an excellent fit in all regions, with maximum relative error of 0.00018, but could be even better in the middle without the constraints:
$$\tilde{g}_3(k) = \frac{0.019983\,k + 0.083933\,k^2 + k^3}{0.0074867 + 0.0359083\,k + 0.227405\,k^2 + k^3}$$
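As a quick consistency check, the three constraints stated earlier (low-k slope, high-k slope, and value at k = 1) can be evaluated for these N = 3 coefficients. The sketch below uses hypothetical variable names; agreement is limited by the number of digits printed for the coefficients.

```python
import math

EULER_GAMMA = 0.5772156649015329
LOG2 = math.log(2.0)
A_U = math.exp(-EULER_GAMMA)
D = A_U - LOG2 + 1.0 / 3.0
P0 = (1.0 - A_U * math.pi ** 2 / 12.0) / D
P_INF = (8.0 / 405.0 + A_U * LOG2 - LOG2 ** 2 / 2.0) / D - LOG2
P1 = (1.0 + A_U - 2.0 * LOG2) / D

A_COEF = {1: 0.019983, 2: 0.083933}                    # a_1, a_2
B_COEF = {0: 0.0074867, 1: 0.0359083, 2: 0.227405}     # b_0, b_1, b_2

def g3(k):
    """N = 3 rational interpolator with the coefficients quoted above."""
    num = sum(a * k ** i for i, a in A_COEF.items()) + k ** 3
    den = sum(b * k ** i for i, b in B_COEF.items()) + k ** 3
    return num / den

print("a1 / b0 =", A_COEF[1] / B_COEF[0], " vs P0    =", P0)
print("b2 - a2 =", B_COEF[2] - A_COEF[2], " vs P_inf =", P_INF)
print("g3(1)   =", g3(1.0), " vs P1    =", P1)
```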
Arctan interpolators
Getting a tighter bound or better approximation from the rational-function family requires at least a handful of parameters. An alternative approach is to find a one-parameter shape that fits better. We have found that the arctan shape with parameter b (like b_{0} in the one-parameter rational-function interpolator, corresponding to the k value at the midpoint, g̃(b) = 0.5) does a good job:
$$\tilde{g}_a(k) = \frac{2}{\pi}\tan^{-1}\frac{k}{b}.$$
As Fig 8 shows, the shapes of the arctan interpolators are imperfect, but are much better than the first-order rational function, fitting better in some regions than in others. Note that both the N = 1 rational function and the arctan interpolators are symmetric about their centers (with log k as the independent variable); they are logistic sigmoid and Gudermannian shapes, respectively. The ideal that they are to bound or approximate, however, is not quite symmetric in log k. So a family of not-quite-symmetric interpolators can perhaps do better.
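The symmetry claim can be checked numerically. The sketch below assumes the first-order rational interpolator is k/(b + k) and uses an arbitrary midpoint b; the function names are illustrative. Point symmetry in log k about k = b means g(bt) + g(b/t) = 1 for every t > 0, and the Gudermannian connection is the identity (2/π) atan(e^u) = 1/2 + gd(u)/π with gd(u) = atan(sinh u).

```python
import math


def g_rational(k, b):
    """First-order (N = 1) rational interpolator, assumed form k / (b + k)."""
    return k / (b + k)


def g_arctan(k, b):
    """One-parameter arctan interpolator (2/pi) * atan(k / b)."""
    return (2.0 / math.pi) * math.atan(k / b)


def asymmetry(g, b, t):
    """Return g(b*t) + g(b/t) - 1; zero means point symmetry in log k about k = b."""
    return g(b * t, b) + g(b / t, b) - 1.0


def gudermannian(u):
    """Gudermannian function gd(u) = atan(sinh(u))."""
    return math.atan(math.sinh(u))


if __name__ == "__main__":
    b = 0.2  # arbitrary illustrative midpoint, not a fitted value
    for t in (1.5, 10.0, 1000.0):
        print(t, asymmetry(g_rational, b, t), asymmetry(g_arctan, b, t))
```

Both residuals vanish to rounding error, so any asymmetry of the ideal interpolator in log k cannot be captured by either one-parameter shape.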
Several special b parameter values for the arctan interpolator can be derived to match each of the ideal properties mentioned above. We can constrain the approximation to pass through the known value ν(1) = log 2, with b = cot(πP_{1}/2), but that does not give a bound. Or we can match the true median at high k to within O(k^{−2}) with b = (π/2)P_{∞}. That also does not give a bound. Or we can match at low k to within O(2^{−1/k}k^{2}), like ν̃_{1} does, with b = (2/π)P_{0}^{−1}, which yields a lower bound to g.
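To make these three special values concrete, the following Python sketch computes them from approximate constants. As an assumption, P_{0}, P_{1}, and P_{∞} are read off the coefficients of the third-order rational fit quoted earlier (low-k slope, value at k = 1, and high-k 1/k coefficient), so the resulting b values are only as accurate as that fit; the variable names are illustrative.

```python
import math

# ASSUMPTION: approximate constants of the ideal interpolator, inferred from
# the third-order rational fit coefficients quoted in the text.
P0 = 0.019983 / 0.0074867                      # low-k slope, g(k) ~ P0 * k
P1 = (0.019983 + 0.083933 + 1.0) / (
    0.0074867 + 0.0359083 + 0.227405 + 1.0)    # value at k = 1
PINF = 0.227405 - 0.083933                     # g(k) ~ 1 - PINF / k at high k


def g_arctan(k, b):
    """One-parameter arctan interpolator (2/pi) * atan(k / b)."""
    return (2.0 / math.pi) * math.atan(k / b)


b_exact_at_1 = 1.0 / math.tan(0.5 * math.pi * P1)  # b = cot(pi * P1 / 2)
b_high_k = 0.5 * math.pi * PINF                    # b = (pi / 2) * P_inf
b_low_k = 2.0 / (math.pi * P0)                     # b = (2 / pi) / P0

if __name__ == "__main__":
    print("exact at k = 1 :", b_exact_at_1)
    print("high-k match   :", b_high_k)
    print("low-k match    :", b_low_k)
```

Under these assumptions the three values come out ordered as b (exact at k = 1) < b (high-k match) < b (low-k match). Since a smaller b gives a larger interpolator value at every k, the low-k match gives the smallest interpolator of the three.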
To find an upper bound to g (lower bound to ν), we decrease b until the margin is nonnegative for all k, which is at about b = 0.205282. Finding an analytic formulation for that tight bound is a challenge left to others.
See Fig 10 for the relative errors of various arctan interpolator versions.
Fig 10. Relative errors of arctan interpolations (doi: 10.1371/journal.pone.0251626.g010).
Relative errors of arctan-interpolated approximations (dash-dot curves) between the upper (dashed) and lower (solid) bounds. These are among the possibly interesting approximations suggested by the tight-bounds approach. The non-bounding approximations, optimized for different criteria, all have maximum relative errors below 1%.
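The accuracy of the arctan approximation that is exact at k = 1 can be spot-checked with the standard library alone. The Python sketch below rests on several assumptions that come from earlier parts of the paper rather than this section: that the upper and lower bounds are 2^{−1/k}(e^{−γ} + k) and 2^{−1/k}(log 2 − 1/3 + k), and that the interpolated approximation is g̃(k) times the lower bound plus (1 − g̃(k)) times the upper bound. The helper names are illustrative, and the gamma CDF is evaluated with the textbook power series rather than any method used by the author.

```python
import math

EULER_GAMMA = 0.5772156649015329
A_UPPER = math.exp(-EULER_GAMMA)      # assumed A of the upper bound (tight at low k)
A_LOWER = math.log(2.0) - 1.0 / 3.0   # assumed A of the lower bound (tight at high k)


def nu_upper(k):
    """Assumed closed-form upper bound 2^(-1/k) * (e^-gamma + k)."""
    return 2.0 ** (-1.0 / k) * (A_UPPER + k)


def nu_lower(k):
    """Assumed closed-form lower bound 2^(-1/k) * (log 2 - 1/3 + k)."""
    return 2.0 ** (-1.0 / k) * (A_LOWER + k)


# Ideal interpolator value at k = 1, where the true median is log 2,
# and the corresponding arctan parameter b = cot(pi * P1 / 2).
P1 = (nu_upper(1.0) - math.log(2.0)) / (nu_upper(1.0) - nu_lower(1.0))
B_EXACT_AT_1 = 1.0 / math.tan(0.5 * math.pi * P1)


def nu_arctan(k, b=B_EXACT_AT_1):
    """Arctan-interpolated approximation to the median (assumed blend form)."""
    g = (2.0 / math.pi) * math.atan(k / b)
    return g * nu_lower(k) + (1.0 - g) * nu_upper(k)


def gamma_cdf(k, x, tol=1e-16, max_terms=100000):
    """Regularized lower incomplete gamma P(k, x) via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0
    total = 1.0
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (k + n)
        total += term
    return math.exp(k * math.log(x) - x - math.lgamma(k + 1.0)) * total


def percentile(k):
    """Percentile of the gamma(k) distribution reached by the approximation."""
    return 100.0 * gamma_cdf(k, nu_arctan(k))


if __name__ == "__main__":
    for k in (0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0):
        print(k, nu_arctan(k), percentile(k))
```

A percentile of exactly 50 would mean the approximation equals the true median, so the printed deviations from 50 give a direct reading of how close the approximation is at each k.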
Conclusions
Tight upper and lower bounds to the median of the gamma distribution are conjectured, based on numerical and asymptotic analyses. The simplest conjectured lower bound is never below the 48th percentile, and the simplest conjectured upper bound, of the same form 2^{−1/k}(A + Bk), is never above the 55th percentile, over the entire range of k > 0. Using arctan and rational-function interpolators between these closed-form bounds, two better one-parameter families of bounds and approximations to the median of a gamma distribution are proposed.
The one-parameter rational-function family has simple closed-form formulae for tightest upper and lower conjectured bounds, staying below 50.85 and above 49.69 percentile, respectively; higher-order rational functions can provide tighter bounds or better approximations.
The one-parameter arctan family of interpolators is a better fit to the ideal interpolator, and includes a version that is most accurate in the low-k tail and provides a closed-form tight conjectured upper bound, staying below 50.18 percentile. With different values of the b parameter, several approximations in the family, including the closed-form version that is exact at k = 1, stay between 49.97 and 50.03 percentile. We have not found an analytic formula for the parameter that gives the tightest lower bound, which stays above 49.96 percentile, but have shown where to find it, graphically or numerically.
The approach of interpolating between tight bounds opens the way to finding tighter bounds and more accurate approximations, and to finding more such families of bounds and approximations via other interpolator forms.
While numerical and graphical techniques were used in finding these bounds, the ones with closed forms are grounded in asymptotic analysis. Proving that they are in fact bounds remains an unmet challenge.
The author gratefully acknowledges helpful comments on previous drafts from Google colleagues Pascal Getreuer, Srinivas Vasudevan, Dan Piponi, and Michael Keselman, and from outside experts D. Andrew Barry, José Antonio Adell, Milan Merkle, Christian Berg, and Henrik L. Pedersen. The anonymous reviewers were also very helpful.
References
Chen J, Rubin H. Bounds for the difference between median and mean of gamma and Poisson distributions.
Choi KP. On the medians of gamma distributions and an equation of Ramanujan.
Berg C, Pedersen HL. The Chen–Rubin conjecture in a continuous setting.
Gaunt RE, Merkle M. On bounds for the mode and median of the generalized hyperbolic and related distributions.
Berg C, Pedersen HL. Convexity of the median in the gamma distribution.
Marsaglia JC. The incomplete gamma function and Ramanujan’s rational approximation to e^{x}.
Adell J, Jodrá P. On a Ramanujan equation connected with the median of the gamma distribution.
You X. Approximation of the median of the gamma distribution.
Chen CP. The median of gamma distribution and a related Ramanujan sequence.
Wilson EB, Hilferty MM. The distribution of chi-square.
Barry D, Parlange JY, Li L. Approximation for the exponential integral (Theis well function).
Steinbrecher G, Shaw WT. Quantile mechanics.
Decision Letter 0
10.1371/journal.pone.0251626.r001
Ivan Kryven, Academic Editor
2021 Ivan Kryven. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Submission Version 0
15 Mar 2021
PONE-D-21-00917
Closed-form tight bounds and approximations for the median of a gamma distribution
PLOS ONE
Dear Dr. Lyon,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Apr 23 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols
We look forward to receiving your revised manuscript.
Kind regards,
Ivan Kryven
Academic Editor
PLOS ONE
Journal Requirements:
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
When submitting your revision, we need you to address these additional requirements.
1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
2. Thank you for stating the following in the Competing Interests section:
"The authors have declared that no competing interests exist."
We note that one or more of the authors are employed by a commercial company: Google Inc.
2.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.
Please also include the following statement within your amended Funding Statement.
“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”
If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.
2.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.
Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests) . If this adherence statement is not accurate and there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.
Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.
Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: N/A
Reviewer #2: N/A
**********
3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: Please see the attached report.
Reviewer #2: The topic of the manuscript is very important in mathematics and , due to the role of median in various problems. The median of Gamma distribution with a fixed shape parameter k can't be represented in terms of elementary functions, so the idea of the author to use approximation based on two parameters A and B, of type as in the page 3, is very convenient.
Some comments:
1) First sentence in Abstract: Delete "The tightest possible", as this is not proved by mathematics. Further, in place of "interesting cases" describe numerically what are those cases.
2) Text under Fig 3. Parameter space detail. It seems that the " (see the next Fig)" is a mistake?
**********
6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Submitted filename: PLOS ONE review 2021.pdf
Author response to Decision Letter 0
10.1371/journal.pone.0251626.r002
Submission Version 1
2 Apr 2021
Rebuttal Letter / Response to Reviewers
by Richard F. Lyon
[PONE-D-21-00917]
31 March 2021
I appreciate the positive reviews, including reviewer 1's "tough love" approach to explaining how I need to improve the presentation for conjectured bounds. I mostly made changes in line with their suggestions, and added a few more bits along the way to help clarify (such as the new section "Toward a proof" that might help others complete the work, and an example of a third-order rational function interpolation with low relative error).
First, just two changes per reviewer 2's two suggestions:
in abstract:
The tightest possible upper and lower bounds of the form $2^{-1/k} (A + Bk)$ to the median of a gamma distribution, over the entire range of shape parameter $k > 0$, have closed-form parameters $A$ and $B$ in interesting cases.
-- changed to -->
Conjectured tight upper and lower bounds of the form $2^{-1/k} (A + Bk)$ to the median of a gamma distribution over the entire range of shape parameter $k > 0$ have closed-form parameters $A$ and $B$ in the case of the lower bound that is tight for high $k$ and the upper bound that is tight for low $k$.
in Fig 4 caption: (previously Fig 3)
(see the next Fig)
--was not wrong, but unclear; changed to -->
(see their errors, or margins, in Fig. 5)
Then back to reviewer 1's lengthier comments:
General Comments
(1) Throughout this report, I have been careful to stress that most of the author’s bounds are conjectured. This is something the author needs to do more clearly. I acknowledge that the author does put an asterisk to denote bounds that have not been rigorously proved and writes in the conclusion “Proving that they [the conjectured bounds] are in fact bounds remains an unmet challenge.” However, the point that the bounds are conjectured needs to be made very clear early in the paper. Indeed, it wasn’t until I reached page 4 of this short paper that I realised that the bounds were not going to be rigorously proven, which is what one expects when reading a paper on mathematics. It is important that the reader is made aware at an early stage that the bounds are conjectured. For example, if another researcher uses the author’s bounds to bound a quantity themselves then the bound they obtain is in turn non-rigorous, and they must be made aware of this. The author therefore needs to make the following edits. The title must change. An example would be “Approximations and conjectured closed-form tight bounds for the median of a gamma distribution”, or if the author prefers not to use the word ‘conjecture’ in the title, something like “On closed-form tight bounds and approximations for the median of a gamma distribution”. The author must clearly state in the Abstract that the bounds are conjectured, and can mention that they are supported by asymptotic, numerical and graphical arguments. It must be mentioned in the Introduction that the bounds are conjectured. The first sentence of the Conclusion should be edited from “Tight upper and lower bounds to the median of the gamma distribution are introduced” to something like “Conjectured tight upper and lower bounds to the median of the gamma distribution are introduced”. Moreover, the author may wish to modify certain sentences throughout the paper to reflect the more modest findings of this research. 
For example, the sentence on page 5 that ends with “this seems reliable enough in concluding that these are bounds (but mathematicians are invited to interpret these as conjectures to be proved)” needs to be edited. There is no doubt that the bounds are conjectures to be proved.
Added "On" in the title, and "conjectured" in abstract, introduction, conclusion, and lots of places in between; took out the bit about "seems reliable enough" and instead wrote a short section "Toward a proof" to discuss a couple of approaches that might work.
(2) On page 2, the author presents the well-known two-sided inequality of Chen and Rubin for the true mean ν of a gamma distribution with rate parameter ν and scale parameter 1: k−1/3 < ν < k. Throughout the paper the author compares his bounds to these bounds. However, for k ≥ 1, the upper bound can be significantly improved: Gaunt and Merkle [1, Theorem 3.1] give the upper bound ν < k − 1 + log(2), k > 1; note that there is equality at k = 1, which corresponds to the case of the exponential distribution. This upper bound extends the range of validity of a previous bound of Choi (reference 3) from positive integer k to all k > 1. As 1 − log(2) ≈ 0.306852 is ‘close’ to 1/3, when this upper bound is combined with the lower bound k − 1/3 it results in a very accurate two-sided inequality for the median. The author should report this bound of [1] in either the ‘Problem formulation’ or ‘Prior work’ sections. The author should also update their numerical results that give comparisons to existing bounds to include a comparison to this bound. As this upper bound is very accurate, some of the author’s conclusions regarding the performance of his bounds relative to others in the literature may change. The paper [1] (see Section 3) also gives conjectured inequalities for the median of the variance-gamma distribution. In particular, the authors conjecture that the median of the variance-gamma distribution can be bounded above and below by the median of certain gamma distributions. Therefore the author’s conjectured accurate bounds may in turn, at least for some parameter values, result in improved conjectured bounds for the median of variance-gamma distribution. I will leave up to the author to decide whether they wish to add such a remark to their paper.
Thanks for that better bound. I have added a brief discussion of the 2021 Gaunt & Merkle bound in the prior art section. Since it is not a bound over the entire domain k > 0, I extended it with a chord to make a complete piecewise linear bound, and made a new Fig. 1 to show it along with linear bounds. I also added that piecewise-linear bound to several other figures, and discussed the fact that it has much lower margin at high k than my new upper bound.
The variance-gamma distribution is outside what I understand, so I'll leave it out of scope for this paper.
Specific Comments
(1) Page 1, Abstract (and elsewhere): The author uses “closed-form” and closed form. Please consistently use just one of these.
The current usage is consistent with the advice of many English style and grammar guides that suggest hyphenating a compound noun when used as a modifier, but not otherwise, as in:
If PLOS One has a preferred hyphenation style, I will follow it; but for now, I think the consistent style that I have followed should be OK. It agrees, for example, with the Cambridge University Press style that I used in my book.
(2) Page 1, Introduction: I suggest the author brings Tables 1 and 2 to the reader’s attention in the introduction. These tables are very useful and efficiently display the main findings of this work. A reader who wants to quickly get to the results would find such a comment in the introduction to be very helpful.
I'm glad the tables were as helpful as I had hoped they would be. I moved Table 1 reference to much earlier (but not in the introduction). And Table 2 somewhat earlier, too, so they summarize what's coming instead of what's done. I mention "the tables" in the introduction, but not by number as that would require me to move them up there.
(3) Page 1, At the bottom of the page, the author introduces the notation ν for the median of the gamma distribution with rate parameter 1 and scale parameter k. I suggest modifying this notation to ν(k) to emphasis the dependence on k. Later in the paper, the author sometimes does write ν(k), and it would be helpful to keep things consistent throughout the paper. The same comment applies to the author’s notation for other bounds and approximations: e.g., νL∞ becomes νL∞(k).
Yes, good idea. I've added the "(k)" throughout to make clear that the median and the bounds are functions of k. Hopefully I didn't miss any.
(4) Page 3: The author neatly derives the bound ν > 2^{−1/k}Γ(k + 1)^{1/k}. I appreciate that the author derives this bound with the intention of motivating further work in the paper. Nevertheless, it would be helpful if the author were to give a statement as to whether they believe this to be a new result.
Yes, the table says it's a new lower bound. But maybe it's anticipated or implied by Quantile Mechanics. I clarified that I'm claiming it's new even though it may be implicit in Steinbrecher and Shaw.
(5) Page 4, line −3: log(x) is not defined at x = 0, so it makes no sense to write log(0). The author therefore may consider formulating “log(0) cusps” differently.
Good point. I meant a cusp that comes from the plot trying to reach the log of nearly 0. Clarifying...
(6) Page 5, line −9, “The slope v'(k)”: Does the author mean \nu'(k)? Please look out for other such typos, e.g., in the displayed equation below. Also, it would be helpful to the reader if the author were to provide an explanation or short calculation to verify how they found the formula for v'(k)|k=1.
Fixed typo. Changed $v^{\prime }(k)$ to $\nu^{\prime }(k)$ in 3 places near there.
Derivation of the slope v'(k)|k=1 has been added.
(7) Page 8: In the final displayed equation, the comma should be a full stop.
I'm not sure I see the right place. In the compiled submission PDF, page 8/11 concludes with "and so forth." after the comma.
Submitted filename: Response to Reviewers.pdf
Decision Letter 1
10.1371/journal.pone.0251626.r003
Ivan Kryven, Academic Editor
2021 Ivan Kryven. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Submission Version 1
26 Apr 2021
PONE-D-21-00917R1
On closed-form tight bounds and approximations for the median of a gamma distribution
PLOS ONE
Dear Dr. Lyon,
Thank you for resubmitting your manuscript to PLOS ONE. The referees have evaluated your manuscript again and are now positive about publishing it in our journal. Note that one of the reviewers has two minor suggestions. Before we can proceed with formal acceptance, we invite you to consider the following editorial comments regarding the style:
**Editorial comments**
1. PLOS one has a broad audience that includes mathematicians as well as scientists from diverse areas. Please consider rewriting abstract to be more comprehensible to non-experts. For example, you may consider the following:
The median of Gamma distribution with a fixed shape parameter k can't be represented in terms of elementary functions. In this work we use numerical simulations and asymptotic analyses to bound the median, suggesting an upper bound that is tight for low k and a lower bound that is tight for large k. These bounds have the form 2^{−1/k}(A + Bk) and are valid over the entire range of k > 0. Furthermore, an interpolation between these bounds yields closed-form expressions that tightly bound the median, with absolute and relative margins approaching zero at both low and high values of k. Some of our results are not supported with analytical proofs but are only confirmed with numerical calculations, and hence should be regarded as conjectures in the strict mathematical sense.
2. The current Introduction section is too short and not much more informative than the abstract. Consider the following: a) Removing the current Introduction, b) Renaming Problem Formulation -> Introduction, and c) joining this section with Prior Work. In which case the paragraph starting with "We seek upper and lower bounds that are tighter..." should be placed at the end of the section.
3. Following up on our previous discussion, you may include the proof of the statement "Γ(k + 1)^{1/k} increases monotonically from e^{−γ}, ..."
Please submit your revised manuscript by Jun 10 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Ivan Kryven
Academic Editor
PLOS ONE
Journal Requirements:
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: N/A
**********
4. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English?
PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: Please see the attached report
Reviewer #2: The topic of the manuscript is very important in mathematics, due to the role of the median in various problems. The median of a gamma distribution with a fixed shape parameter k can't be represented in terms of elementary functions, so the author's idea of using an approximation based on two parameters A and B, of the type given on page 3, is very convenient.
**********
7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.
If you choose “no”, your identity will remain anonymous but your review may still be made public.
Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
Submitted filename: PLOS ONE review rev1 2021.pdf
10.1371/journal.pone.0251626.r004
Author response to Decision Letter 1
Submission Version 2
29 Apr 2021
Rebuttal Letter / Response to Reviewers
PONE-D-21-00917R1
On closed-form tight bounds and approximations for the median of a gamma distribution
PLOS ONE
Richard F. Lyon, author
I thank the reviewers again, and the editor, for the constructive suggestions and positive reaction to my article. I have incorporated the latest suggestions, pretty nearly, and a few minor corrections.
Editor suggests a rewrite of the Abstract. I did that, starting with his suggested opening but then with some changes. Instead of "with a fixed shape parameter k" I said "as a function of shape parameter k". I moved the algebraic form of the bounds up one sentence to near the bounds it most particularly applied to. And I added a few more words to clarify things.
Editor suggests dropping the short Introduction paragraph and merging the next two sections into an Introduction. Done; exactly as suggested, except also moved the sentence saying that results are summarized in tables to come, since the reviewers had asked for that.
Editor says an added proof that "Γ(k + 1)^{1/k} increases monotonically from e^{−γ}, ..." would be welcome, so I added that.
Reviewer 1 pointed out a notation error in a >= relation. I found and fixed 2 of those.
Reviewer 1 pointed out that I failed to add the "(k)" in a few places. I found and fixed 9 of those (including 5 in Table 2).
Reviewer 2 had no specific comments.
Other things I changed:
Fig 2 had an error in the y axis label, so I fixed that.
I modified the comment on the potential use of quantile mechanics partial sums as bounds, as they are not closed form as I had suggested.
I re-ordered the references to match the changed order of citations in the new introduction.
I added an acknowledgement name, for the colleague who helped me with the proof.
That's all.
Submitted filename: Response_to_Reviewers.pdf
10.1371/journal.pone.0251626.r005
Decision Letter 2
Ivan Kryven, Academic Editor
2021 Ivan Kryven. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Submission Version 2
30 Apr 2021
On closed-form tight bounds and approximations for the median of a gamma distribution
PONE-D-21-00917R2
Dear Dr. Lyon,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
*Editorial comment:* When preparing the final version of the paper for production, please note that the paper has multiple one-sentence paragraphs, for example at lines 5, 7, 9, 29. Consider merging these paragraphs with the preceding text.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Ivan Kryven
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
10.1371/journal.pone.0251626.r006
Acceptance letter
Ivan Kryven, Academic Editor
2021 Ivan Kryven. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
4 May 2021
PONE-D-21-00917R2
On closed-form tight bounds and approximations for the median of a gamma distribution
Dear Dr. Lyon:
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.
If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.
If we can help with anything else, please email us at plosone@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.