Coverage Report

Created: 2025-07-23 06:05

/rust/registry/src/index.crates.io-6f17d22bba15001f/jiff-0.2.5/src/tz/posix.rs
Line
Count
Source
1
/*!
2
Provides a parser for [POSIX's `TZ` environment variable][posix-env].
3
4
NOTE: Sadly, at time of writing, the actual parser is in `src/shared/posix.rs`.
5
This is so it can be shared (via simple code copying) with proc macros like
6
the one found in `jiff-tzdb-static`. The parser populates a "lowest common
7
denominator" data type. In normal use in Jiff, this type is converted into
8
the types defined below. This module still does provide the various time zone
9
operations. Only the parsing is written elsewhere.
10
11
The `TZ` environment variable is most commonly used to set a time zone. For
12
example, `TZ=America/New_York`. But it can also be used to tersely define DST
13
transitions. Moreover, the format is not just used as an environment variable,
14
but is also included at the end of TZif files (version 2 or greater). The IANA
15
Time Zone Database project also [documents the `TZ` variable][iana-env] with
16
a little more commentary.
17
18
Note that we (along with pretty much everyone else) don't strictly follow
19
POSIX here. Namely, `TZ=America/New_York` isn't a POSIX compatible usage,
20
and I believe it technically should be `TZ=:America/New_York`. Nevertheless,
21
apparently some group of people (IANA folks?) decided `TZ=America/New_York`
22
should be fine. From the [IANA `theory.html` documentation][iana-env]:
23
24
> It was recognized that allowing the TZ environment variable to take on values
25
> such as 'America/New_York' might cause "old" programs (that expect TZ to have
26
> a certain form) to operate incorrectly; consideration was given to using
27
> some other environment variable (for example, TIMEZONE) to hold the string
28
> used to generate the TZif file's name. In the end, however, it was decided
29
> to continue using TZ: it is widely used for time zone purposes; separately
30
> maintaining both TZ and TIMEZONE seemed a nuisance; and systems where "new"
31
> forms of TZ might cause problems can simply use legacy TZ values such as
32
> "EST5EDT" which can be used by "new" programs as well as by "old" programs
33
> that assume pre-POSIX TZ values.
34
35
Indeed, even [musl subscribes to this behavior][musl-env]. So that's what we do
36
here too.
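
The fallback order described above can be sketched as follows. This is only an illustration under stated assumptions, not Jiff's implementation: `TzInterpretation`, `interpret_tz` and the `is_posix_rule` predicate are hypothetical names, and the predicate stands in for a real POSIX rule parser.

```rust
/// How a `TZ` value was interpreted. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum TzInterpretation<'a> {
    /// The value started with `:`. The rest is implementation defined and is
    /// treated as an IANA time zone name in practice.
    Implementation(&'a str),
    /// The value was accepted by the caller-supplied POSIX rule check.
    PosixRule(&'a str),
    /// The value is not a POSIX rule, so it is tried as an IANA name.
    IanaName(&'a str),
}

/// Classifies a `TZ` value. `is_posix_rule` is an assumed stand-in for a
/// real POSIX time zone parser.
fn interpret_tz<'a>(
    value: &'a str,
    is_posix_rule: impl Fn(&str) -> bool,
) -> TzInterpretation<'a> {
    // Strict POSIX: a leading `:` marks an implementation defined string.
    if let Some(name) = value.strip_prefix(':') {
        return TzInterpretation::Implementation(name);
    }
    // Otherwise try the POSIX rule first and fall back to an IANA name.
    if is_posix_rule(value) {
        TzInterpretation::PosixRule(value)
    } else {
        TzInterpretation::IanaName(value)
    }
}
```

With a predicate that rejects `America/New_York`, both `:America/New_York` and `America/New_York` end up naming the same IANA time zone. Only the first form is strictly POSIX.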
37
38
Note that a POSIX time zone like `EST5` corresponds to the UTC offset `-05:00`,
39
and `GMT-4` corresponds to the UTC offset `+04:00`. Yes, it's backwards. How
40
fun.
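
To make the sign inversion concrete, here is a minimal sketch. It is not Jiff's parser, and `posix_std_offset_seconds` is a hypothetical helper. It assumes a TZ string with no DST portion and an unquoted alphabetic abbreviation, and rejects everything else.

```rust
/// Hypothetical helper, not part of Jiff: returns the UTC offset in seconds
/// implied by a POSIX TZ string such as `EST5` that has no DST portion.
fn posix_std_offset_seconds(tz: &str) -> Option<i32> {
    // Skip the abbreviation. POSIX requires at least three characters.
    let abbrev_len =
        tz.bytes().take_while(|b| b.is_ascii_alphabetic()).count();
    if abbrev_len < 3 {
        return None;
    }
    let rest = &tz[abbrev_len..];
    // An optional sign precedes the hours. No sign means west of UTC.
    let (sign, rest) = match rest.as_bytes().first().copied()? {
        b'-' => (-1, &rest[1..]),
        b'+' => (1, &rest[1..]),
        _ => (1, rest),
    };
    // Parse `hh[:mm[:ss]]`. Anything else is rejected in this sketch.
    let mut seconds = 0i32;
    let mut unit = 3600;
    let mut fields = 0;
    for part in rest.split(':') {
        let all_digits = part.bytes().all(|b| b.is_ascii_digit());
        if fields == 3 || part.is_empty() || part.len() > 2 || !all_digits {
            return None;
        }
        seconds += part.parse::<i32>().ok()? * unit;
        unit /= 60;
        fields += 1;
    }
    // The POSIX value is what gets added to local time to reach UTC, so the
    // UTC offset is its negation.
    Some(-sign * seconds)
}
```

Under these assumptions, `posix_std_offset_seconds("EST5")` returns `Some(-18000)` (that is, `-05:00`) and `posix_std_offset_seconds("GMT-4")` returns `Some(14400)` (that is, `+04:00`).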
41
42
# IANA v3+ Support
43
44
While this module and many of its types are directly associated with POSIX,
45
this module also plays a supporting role for `TZ` strings in the IANA TZif
46
binary format for versions 2 and greater. Specifically, for versions 3 and
47
greater, some minor extensions are supported here via `IanaTz::parse`. But
48
using `PosixTz::parse` is limited to parsing what is specified by POSIX.
49
Nevertheless, we generally use `IanaTz::parse` everywhere, even when parsing
50
the `TZ` environment variable. The reason for this is that it seems to be what
51
other programs do in practice (for example, GNU date).
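
One of those extensions concerns the hour of a DST transition time. The sketch below is an illustration, not Jiff's code: `TzGrammar` and `transition_hour_in_range` are hypothetical names. It assumes the POSIX range of `0..=24` and the signed `-167..=167` range that the TZif specification permits for versions 3 and greater.

```rust
/// Which grammar a transition time is checked against. Hypothetical type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TzGrammar {
    /// Strict POSIX: unsigned hours, at most 24.
    Posix,
    /// TZif version 3 and greater: signed hours, magnitude at most 167.
    IanaV3Plus,
}

/// Reports whether `hour` is a permitted transition-time hour.
fn transition_hour_in_range(hour: i16, grammar: TzGrammar) -> bool {
    match grammar {
        TzGrammar::Posix => (0..=24).contains(&hour),
        TzGrammar::IanaV3Plus => (-167..=167).contains(&hour),
    }
}
```

Every hour accepted under `TzGrammar::Posix` is also accepted under `TzGrammar::IanaV3Plus`, which is why using the more permissive grammar everywhere does not reject valid POSIX input.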
52
53
# `no-std` and `no-alloc` support
54
55
A big part of this module works fine in core-only environments. But because
56
core-only environments provide no means of indirection, and embedding a
57
`PosixTimeZone` into a `TimeZone` without indirection would use up a lot of
58
space (and thereby make `Zoned` quite chunky), we provide core-only support
59
principally through a proc macro. Namely, a `PosixTimeZone` can be parsed by
60
the proc macro and then turned into static data.
61
62
POSIX time zone support isn't explicitly provided directly as a public API
63
for core-only environments, but is implicitly supported via TZif. (Since TZif
64
data contains POSIX time zone strings.)
65
66
[posix-env]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
67
[iana-env]: https://data.iana.org/time-zones/tzdb-2024a/theory.html#functions
68
[musl-env]: https://wiki.musl-libc.org/environment-variables
69
*/
70
71
use crate::{
72
    civil::DateTime,
73
    error::{err, Error, ErrorContext},
74
    shared,
75
    timestamp::Timestamp,
76
    tz::{
77
        timezone::TimeZoneAbbreviation, AmbiguousOffset, Dst, Offset,
78
        TimeZoneOffsetInfo, TimeZoneTransition,
79
    },
80
    util::{array_str::Abbreviation, escape::Bytes, parse},
81
};
82
83
/// The result of parsing the POSIX `TZ` environment variable.
84
///
85
/// A `TZ` variable can either be a time zone string with an optional DST
86
/// transition rule, or it can begin with a `:` followed by an arbitrary set of
87
/// bytes that is implementation defined.
88
///
89
/// In practice, the content following a `:` is treated as an IANA time zone
90
/// name. Moreover, even if the `TZ` string doesn't start with a `:` but
91
/// corresponds to an IANA time zone name, then it is interpreted as such.
92
/// (See the module docs.) However, this type only encapsulates the choices
93
/// strictly provided by POSIX: either a time zone string with an optional DST
94
/// transition rule, or an implementation defined string with a `:` prefix. If,
95
/// for example, `TZ="America/New_York"`, then that case isn't encapsulated by
96
/// this type. Callers needing that functionality will need to handle the error
97
/// returned by parsing this type and layer their own semantics on top.
98
#[cfg(feature = "tz-system")]
99
#[derive(Debug, Eq, PartialEq)]
100
pub(crate) enum PosixTzEnv {
101
    /// A valid POSIX time zone with an optional DST transition rule.
102
    Rule(PosixTimeZoneOwned),
103
    /// An implementation defined string. This occurs when the `TZ` value
104
    /// starts with a `:`. The string returned here does not include the `:`.
105
    Implementation(alloc::boxed::Box<str>),
106
}
107
108
#[cfg(feature = "tz-system")]
109
impl PosixTzEnv {
110
    /// Parse a POSIX `TZ` environment variable string from the given bytes.
111
    fn parse(bytes: impl AsRef<[u8]>) -> Result<PosixTzEnv, Error> {
112
        let bytes = bytes.as_ref();
113
        if bytes.get(0) == Some(&b':') {
114
            let Ok(string) = core::str::from_utf8(&bytes[1..]) else {
115
                return Err(err!(
116
                    "POSIX time zone string with a ':' prefix contains \
117
                     invalid UTF-8: {:?}",
118
                    Bytes(&bytes[1..]),
119
                ));
120
            };
121
            Ok(PosixTzEnv::Implementation(string.into()))
122
        } else {
123
            PosixTimeZone::parse(bytes).map(PosixTzEnv::Rule)
124
        }
125
    }
126
127
    /// Parse a POSIX `TZ` environment variable string from the given `OsStr`.
128
    pub(crate) fn parse_os_str(
129
        osstr: impl AsRef<std::ffi::OsStr>,
130
    ) -> Result<PosixTzEnv, Error> {
131
        PosixTzEnv::parse(parse::os_str_bytes(osstr.as_ref())?)
132
    }
133
}
134
135
#[cfg(feature = "tz-system")]
136
impl core::fmt::Display for PosixTzEnv {
137
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
138
        match *self {
139
            PosixTzEnv::Rule(ref tz) => write!(f, "{tz}"),
140
            PosixTzEnv::Implementation(ref imp) => write!(f, ":{imp}"),
141
        }
142
    }
143
}
144
145
/// An owned POSIX time zone.
146
///
147
/// That is, a POSIX time zone whose abbreviations are inlined into the
148
/// representation. As opposed to a static POSIX time zone whose abbreviations
149
/// are `&'static str`.
150
pub(crate) type PosixTimeZoneOwned = PosixTimeZone<Abbreviation>;
151
152
/// A static POSIX time zone whose abbreviations are `&'static str`.
153
pub(crate) type PosixTimeZoneStatic = PosixTimeZone<&'static str>;
154
155
/// A POSIX time zone.
156
///
157
/// # On "reasonable" POSIX time zones
158
///
159
/// Jiff only supports "reasonable" POSIX time zones. A "reasonable" POSIX time
160
/// zone is a POSIX time zone that has a DST transition rule _when_ it has a
161
/// DST time zone abbreviation. Without the transition rule, it isn't possible
162
/// to know when DST starts and stops.
163
///
164
/// POSIX technically allows a DST time zone abbreviation *without* a
165
/// transition rule, but the behavior is literally unspecified. So Jiff just
166
/// rejects them.
167
///
168
/// Note that if you're confused as to why Jiff accepts `TZ=EST5EDT` (where
169
/// `EST5EDT` is an example of an _unreasonable_ POSIX time zone), that's
170
/// because Jiff rejects `EST5EDT` and instead attempts to use it as an IANA
171
/// time zone identifier. And indeed, the IANA Time Zone Database contains an
172
/// entry for `EST5EDT` (presumably for legacy reasons).
173
///
174
/// Also, we expect `TZ` strings parsed from IANA v2+ formatted `tzfile`s to
175
/// be reasonable or parsing fails. This also seems to be consistent with
176
/// the [GNU C Library]'s treatment of the `TZ` variable: it only documents
177
/// support for reasonable POSIX time zone strings.
178
///
179
/// Note that a V2 `TZ` string is precisely identical to a POSIX `TZ`
180
/// environment variable string. A V3 `TZ` string however supports signed DST
181
/// transition times, and hours in the range `0..=167`. The V2 and V3 here
182
/// reference how `TZ` strings are defined in the TZif format specified by
183
/// [RFC 9636]. V2 is the original version of it straight from POSIX, whereas
184
/// V3+ corresponds to an extension added to V3 (and newer versions) of the
185
/// TZif format. V3 is a superset of V2, so in practice, Jiff just permits
186
/// V3 everywhere.
187
///
188
/// [GNU C Library]: https://www.gnu.org/software/libc/manual/2.25/html_node/TZ-Variable.html
189
/// [RFC 9636]: https://datatracker.ietf.org/doc/rfc9636/
190
#[derive(Clone, Debug, Eq, PartialEq)]
191
// NOT part of Jiff's public API
192
#[doc(hidden)]
193
// This ensures the alignment of this type is always *at least* 8 bytes. This
194
// is required for the pointer tagging inside of `TimeZone` to be sound. At
195
// time of writing (2024-02-24), this explicit `repr` isn't required on 64-bit
196
// systems since the type definition is such that it will have an alignment of
197
// at least 8 bytes anyway. But this *is* required for 32-bit systems, where
198
// the type definition at present only has an alignment of 4 bytes.
199
#[repr(align(8))]
200
pub struct PosixTimeZone<ABBREV> {
201
    inner: shared::PosixTimeZone<ABBREV>,
202
}
203
204
impl PosixTimeZone<Abbreviation> {
205
    /// Parse an IANA tzfile v3+ `TZ` string from the given bytes.
206
    #[cfg(feature = "alloc")]
207
0
    pub(crate) fn parse(
208
0
        bytes: impl AsRef<[u8]>,
209
0
    ) -> Result<PosixTimeZoneOwned, Error> {
210
0
        let bytes = bytes.as_ref();
211
0
        let inner = shared::PosixTimeZone::parse(bytes.as_ref())
212
0
            .map_err(Error::shared)
213
0
            .map_err(|e| {
214
0
                e.context(err!("invalid POSIX TZ string {:?}", Bytes(bytes)))
215
0
            })?;
216
0
        Ok(PosixTimeZone { inner })
217
0
    }
218
219
    /// Like `parse`, but parses a POSIX TZ string from a prefix of the
220
    /// given input. Any remaining input is returned.
221
    #[cfg(feature = "alloc")]
222
0
    pub(crate) fn parse_prefix<'b, B: AsRef<[u8]> + ?Sized + 'b>(
223
0
        bytes: &'b B,
224
0
    ) -> Result<(PosixTimeZoneOwned, &'b [u8]), Error> {
225
0
        let bytes = bytes.as_ref();
226
0
        let (inner, remaining) =
227
0
            shared::PosixTimeZone::parse_prefix(bytes.as_ref())
228
0
                .map_err(Error::shared)
229
0
                .map_err(|e| {
230
0
                    e.context(err!(
231
0
                        "invalid POSIX TZ string {:?}",
232
0
                        Bytes(bytes)
233
0
                    ))
234
0
                })?;
235
0
        Ok((PosixTimeZone { inner }, remaining))
236
0
    }
237
238
    /// Converts from the shared-but-internal API for use in proc macros.
239
    #[cfg(feature = "alloc")]
240
0
    pub(crate) fn from_shared_owned(
241
0
        sh: shared::PosixTimeZone<Abbreviation>,
242
0
    ) -> PosixTimeZoneOwned {
243
0
        PosixTimeZone { inner: sh }
244
0
    }
245
}
246
247
impl PosixTimeZone<&'static str> {
248
    /// Converts from the shared-but-internal API for use in proc macros.
249
    ///
250
    /// This works in a `const` context by requiring that the time zone
251
    /// abbreviations are `static` strings. This is used when converting
252
    /// code generated by a proc macro to this Jiff internal type.
253
0
    pub(crate) const fn from_shared_const(
254
0
        sh: shared::PosixTimeZone<&'static str>,
255
0
    ) -> PosixTimeZoneStatic {
256
0
        PosixTimeZone { inner: sh }
257
0
    }
258
}
259
260
impl<ABBREV: AsRef<str>> PosixTimeZone<ABBREV> {
261
    /// Returns the appropriate time zone offset to use for the given
262
    /// timestamp.
263
    ///
264
    /// If you need information like whether the offset is in DST or not, or
265
    /// the time zone abbreviation, then use `PosixTimeZone::to_offset_info`.
266
    /// But that API may be more expensive to use, so only use it if you need
267
    /// the additional data.
268
0
    pub(crate) fn to_offset(&self, timestamp: Timestamp) -> Offset {
269
0
        Offset::from_ioffset_const(
270
0
            self.inner.to_offset(timestamp.to_itimestamp_const()),
271
0
        )
272
0
    }
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<jiff::shared::util::array_str::ArrayStr<30>>>::to_offset
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<&str>>::to_offset
273
274
    /// Returns the appropriate time zone offset to use for the given
275
    /// timestamp.
276
    ///
277
    /// This also includes whether the offset returned should be considered
278
    /// to be "DST" or not, along with the time zone abbreviation (e.g., EST
279
    /// for standard time in New York, and EDT for DST in New York).
280
0
    pub(crate) fn to_offset_info(
281
0
        &self,
282
0
        timestamp: Timestamp,
283
0
    ) -> TimeZoneOffsetInfo<'_> {
284
0
        let (ioff, abbrev, is_dst) =
285
0
            self.inner.to_offset_info(timestamp.to_itimestamp_const());
286
0
        let offset = Offset::from_ioffset_const(ioff);
287
0
        let abbreviation = TimeZoneAbbreviation::Borrowed(abbrev);
288
0
        TimeZoneOffsetInfo { offset, dst: Dst::from(is_dst), abbreviation }
289
0
    }
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<jiff::shared::util::array_str::ArrayStr<30>>>::to_offset_info
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<&str>>::to_offset_info
290
291
    /// Returns a possibly ambiguous timestamp for the given civil datetime.
292
    ///
293
    /// The given datetime should correspond to the "wall" clock time of what
294
    /// humans use to tell time for this time zone.
295
    ///
296
    /// Note that "ambiguous timestamp" is represented by the possible
297
    /// selection of offsets that could be applied to the given datetime. In
298
    /// general, it is only ambiguous around transitions to-and-from DST. The
299
    /// ambiguity can arise as a "fold" (when a particular wall clock time is
300
    /// repeated) or as a "gap" (when a particular wall clock time is skipped
301
    /// entirely).
302
0
    pub(crate) fn to_ambiguous_kind(&self, dt: DateTime) -> AmbiguousOffset {
303
0
        let iamoff = self.inner.to_ambiguous_kind(dt.to_idatetime_const());
304
0
        AmbiguousOffset::from_iambiguous_offset_const(iamoff)
305
0
    }
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<jiff::shared::util::array_str::ArrayStr<30>>>::to_ambiguous_kind
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<&str>>::to_ambiguous_kind
306
307
    /// Returns the timestamp of the most recent time zone transition prior
308
    /// to the timestamp given. If one doesn't exist, `None` is returned.
309
0
    pub(crate) fn previous_transition(
310
0
        &self,
311
0
        timestamp: Timestamp,
312
0
    ) -> Option<TimeZoneTransition> {
313
0
        let (its, ioff, abbrev, is_dst) =
314
0
            self.inner.previous_transition(timestamp.to_itimestamp_const())?;
315
0
        let timestamp = Timestamp::from_itimestamp_const(its);
316
0
        let offset = Offset::from_ioffset_const(ioff);
317
0
        let dst = Dst::from(is_dst);
318
0
        Some(TimeZoneTransition { timestamp, offset, abbrev, dst })
319
0
    }
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<jiff::shared::util::array_str::ArrayStr<30>>>::previous_transition
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<&str>>::previous_transition
320
321
    /// Returns the timestamp of the soonest time zone transition after the
322
    /// timestamp given. If one doesn't exist, `None` is returned.
323
0
    pub(crate) fn next_transition(
324
0
        &self,
325
0
        timestamp: Timestamp,
326
0
    ) -> Option<TimeZoneTransition> {
327
0
        let (its, ioff, abbrev, is_dst) =
328
0
            self.inner.next_transition(timestamp.to_itimestamp_const())?;
329
0
        let timestamp = Timestamp::from_itimestamp_const(its);
330
0
        let offset = Offset::from_ioffset_const(ioff);
331
0
        let dst = Dst::from(is_dst);
332
0
        Some(TimeZoneTransition { timestamp, offset, abbrev, dst })
333
0
    }
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<jiff::shared::util::array_str::ArrayStr<30>>>::next_transition
Unexecuted instantiation: <jiff::tz::posix::PosixTimeZone<&str>>::next_transition
334
}
335
336
impl<ABBREV: AsRef<str>> core::fmt::Display for PosixTimeZone<ABBREV> {
337
0
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
338
0
        core::fmt::Display::fmt(&self.inner, f)
339
0
    }
340
}
341
342
// The tests below require parsing which requires alloc.
343
#[cfg(feature = "alloc")]
344
#[cfg(test)]
345
mod tests {
346
    use super::*;
347
348
    #[cfg(feature = "tz-system")]
349
    #[test]
350
    fn parse_posix_tz() {
351
        // We used to parse this and then error when we tried to
352
        // convert to a "reasonable" POSIX time zone with a DST
353
        // transition rule. We never actually used unreasonable POSIX
354
        // time zones and it was complicating the type definitions, so
355
        // now we just reject it outright.
356
        assert!(PosixTzEnv::parse("EST5EDT").is_err());
357
358
        let tz = PosixTzEnv::parse(":EST5EDT").unwrap();
359
        assert_eq!(tz, PosixTzEnv::Implementation("EST5EDT".into()));
360
361
        // We require implementation strings to be UTF-8, because we're
362
        // sensible.
363
        assert!(PosixTzEnv::parse(b":EST5\xFFEDT").is_err());
364
    }
365
}