/rust/registry/src/index.crates.io-6f17d22bba15001f/jiff-0.2.5/src/tz/posix.rs
Line | Count | Source |
1 | | /*! |
2 | | Provides a parser for [POSIX's `TZ` environment variable][posix-env]. |
3 | | |
4 | | NOTE: Sadly, at time of writing, the actual parser is in `src/shared/posix.rs`. |
5 | | This is so it can be shared (via simple code copying) with proc macros like |
6 | | the one found in `jiff-tzdb-static`. The parser populates a "lowest common |
7 | | denominator" data type. In normal use in Jiff, this type is converted into |
8 | | the types defined below. This module still does provide the various time zone |
9 | | operations. Only the parsing is written elsewhere. |
10 | | |
11 | | The `TZ` environment variable is most commonly used to set a time zone. For |
12 | | example, `TZ=America/New_York`. But it can also be used to tersely define DST |
13 | | transitions. Moreover, the format is not just used as an environment variable, |
14 | | but is also included at the end of TZif files (version 2 or greater). The IANA |
15 | | Time Zone Database project also [documents the `TZ` variable][iana-env] with |
16 | | a little more commentary. |
17 | | |
18 | | Note that we (along with pretty much everyone else) don't strictly follow |
19 | | POSIX here. Namely, `TZ=America/New_York` isn't a POSIX compatible usage, |
20 | | and I believe it technically should be `TZ=:America/New_York`. Nevertheless, |
21 | | apparently some group of people (IANA folks?) decided `TZ=America/New_York` |
22 | | should be fine. From the [IANA `theory.html` documentation][iana-env]: |
23 | | |
24 | | > It was recognized that allowing the TZ environment variable to take on values |
25 | | > such as 'America/New_York' might cause "old" programs (that expect TZ to have |
26 | | > a certain form) to operate incorrectly; consideration was given to using |
27 | | > some other environment variable (for example, TIMEZONE) to hold the string |
28 | | > used to generate the TZif file's name. In the end, however, it was decided |
29 | | > to continue using TZ: it is widely used for time zone purposes; separately |
30 | | > maintaining both TZ and TIMEZONE seemed a nuisance; and systems where "new" |
31 | | > forms of TZ might cause problems can simply use legacy TZ values such as |
32 | | > "EST5EDT" which can be used by "new" programs as well as by "old" programs |
33 | | > that assume pre-POSIX TZ values. |
34 | | |
35 | | Indeed, even [musl subscribes to this behavior][musl-env]. So that's what we do |
36 | | here too. |
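
As a rough illustration of the behavior described above, the sketch below separates the strictly POSIX `:`-prefixed form from everything else. The names `TzValue` and `classify` are hypothetical and not part of Jiff. It also assumes the whole value must be UTF-8, which is a simplification.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical classification of a raw `TZ` value (illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum TzValue<'a> {
    /// `TZ` started with `:`. The payload excludes the colon and is
    /// implementation defined (in practice, an IANA time zone name).
    Implementation(&'a str),
    /// No `:` prefix. A caller would first try this as a POSIX rule and,
    /// if that fails, fall back to treating it as an IANA time zone name.
    Unprefixed(&'a str),
}

/// Splits off an optional leading `:` and validates the rest as UTF-8.
fn classify(tz: &[u8]) -> Result<TzValue<'_>, Utf8Error> {
    match tz.strip_prefix(b":") {
        Some(rest) => str::from_utf8(rest).map(TzValue::Implementation),
        None => str::from_utf8(tz).map(TzValue::Unprefixed),
    }
}

fn main() {
    println!("{:?}", classify(b":America/New_York"));
    println!("{:?}", classify(b"America/New_York"));
}
```

With this split, `TZ=America/New_York` lands in the `Unprefixed` case, where the fallback to an IANA name lookup is what makes the non-POSIX usage work.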
37 | | |
38 | | Note that a POSIX time zone like `EST5` corresponds to the UTC offset `-05:00`, |
39 | | and `GMT-4` corresponds to the UTC offset `+04:00`. Yes, it's backwards. How |
40 | | fun. |
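
The sign inversion is easy to get wrong, so here is a minimal sketch of the arithmetic. The helper names `utc_offset_seconds` and `format_offset` are hypothetical and only handle whole hours. Jiff's real parser also deals with minutes, seconds and abbreviations.

```rust
/// Converts the hour count written in a POSIX `TZ` string (for example,
/// the `5` in `EST5`) into a UTC offset in seconds. POSIX counts hours
/// *west* of UTC as positive, so the sign is flipped.
fn utc_offset_seconds(posix_hours: i32) -> i32 {
    -posix_hours * 3600
}

/// Formats an offset in seconds as `+HH:MM` or `-HH:MM`.
fn format_offset(seconds: i32) -> String {
    let sign = if seconds < 0 { '-' } else { '+' };
    let abs = seconds.abs();
    format!("{}{:02}:{:02}", sign, abs / 3600, (abs % 3600) / 60)
}

fn main() {
    // `EST5` carries the POSIX hour count 5.
    println!("EST5  => {}", format_offset(utc_offset_seconds(5)));
    // `GMT-4` carries the POSIX hour count -4.
    println!("GMT-4 => {}", format_offset(utc_offset_seconds(-4)));
}
```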
41 | | |
42 | | # IANA v3+ Support |
43 | | |
44 | | While this module and many of its types are directly associated with POSIX, |
45 | | this module also plays a supporting role for `TZ` strings in the IANA TZif |
46 | | binary format for versions 2 and greater. Specifically, for versions 3 and |
47 | | greater, some minor extensions are supported here via `IanaTz::parse`. But |
48 | | using `PosixTz::parse` is limited to parsing what is specified by POSIX. |
49 | | Nevertheless, we generally use `IanaTz::parse` everywhere, even when parsing |
50 | | the `TZ` environment variable. The reason for this is that it seems to be what |
51 | | other programs do in practice (for example, GNU date). |
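
To make the extension concrete: in the TZif specification, version 3+ `TZ` strings allow the hour of a DST transition time to be signed and to reach a magnitude of 167, whereas POSIX limits it to the unsigned range `0..=24`. The sketch below encodes just that range check. `TzVersion` and `transition_hour_allowed` are hypothetical names, not Jiff's API.

```rust
/// Which flavor of `TZ` string is being validated (illustrative only).
#[derive(Clone, Copy, Debug)]
enum TzVersion {
    /// Identical to a POSIX `TZ` string.
    V2,
    /// TZif version 3 or newer, with extended transition times.
    V3Plus,
}

/// Reports whether the hour component of a DST transition time is in
/// range for the given version.
fn transition_hour_allowed(version: TzVersion, hour: i32) -> bool {
    match version {
        // POSIX: unsigned hours from 0 through 24.
        TzVersion::V2 => (0..=24).contains(&hour),
        // TZif v3+: signed hours from -167 through 167.
        TzVersion::V3Plus => (-167..=167).contains(&hour),
    }
}

fn main() {
    for hour in [-1, 2, 25, 167, 168] {
        println!(
            "hour {hour}: v2={} v3+={}",
            transition_hour_allowed(TzVersion::V2, hour),
            transition_hour_allowed(TzVersion::V3Plus, hour),
        );
    }
}
```

Since every V2 value is also a valid V3+ value, accepting the V3+ range everywhere does not reject any POSIX-conforming input.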
52 | | |
53 | | # `no-std` and `no-alloc` support |
54 | | |
55 | | A big part of this module works fine in core-only environments. But because |
56 | | core-only environments provide no means of indirection, and embedding a |
57 | | `PosixTimeZone` into a `TimeZone` without indirection would use up a lot of |
58 | | space (and thereby make `Zoned` quite chunky), we provide core-only support |
59 | | principally through a proc macro. Namely, a `PosixTimeZone` can be parsed by |
60 | | the proc macro and then turned into static data. |
61 | | |
62 | | POSIX time zone support isn't explicitly provided directly as a public API |
63 | | for core-only environments, but is implicitly supported via TZif. (Since TZif |
64 | | data contains POSIX time zone strings.) |
65 | | |
66 | | [posix-env]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 |
67 | | [iana-env]: https://data.iana.org/time-zones/tzdb-2024a/theory.html#functions |
68 | | [musl-env]: https://wiki.musl-libc.org/environment-variables |
69 | | */ |
70 | | |
71 | | use crate::{ |
72 | | civil::DateTime, |
73 | | error::{err, Error, ErrorContext}, |
74 | | shared, |
75 | | timestamp::Timestamp, |
76 | | tz::{ |
77 | | timezone::TimeZoneAbbreviation, AmbiguousOffset, Dst, Offset, |
78 | | TimeZoneOffsetInfo, TimeZoneTransition, |
79 | | }, |
80 | | util::{array_str::Abbreviation, escape::Bytes, parse}, |
81 | | }; |
82 | | |
83 | | /// The result of parsing the POSIX `TZ` environment variable. |
84 | | /// |
85 | | /// A `TZ` variable can either be a time zone string with an optional DST |
86 | | /// transition rule, or it can begin with a `:` followed by an arbitrary set of |
87 | | /// bytes that is implementation defined. |
88 | | /// |
89 | | /// In practice, the content following a `:` is treated as an IANA time zone |
90 | | /// name. Moreover, even if the `TZ` string doesn't start with a `:` but |
91 | | /// corresponds to an IANA time zone name, then it is interpreted as such. |
92 | | /// (See the module docs.) However, this type only encapsulates the choices |
93 | | /// strictly provided by POSIX: either a time zone string with an optional DST |
94 | | /// transition rule, or an implementation defined string with a `:` prefix. If, |
95 | | /// for example, `TZ="America/New_York"`, then that case isn't encapsulated by |
96 | | /// this type. Callers needing that functionality will need to handle the error |
97 | | /// returned by parsing this type and layer their own semantics on top. |
98 | | #[cfg(feature = "tz-system")] |
99 | | #[derive(Debug, Eq, PartialEq)] |
100 | | pub(crate) enum PosixTzEnv { |
101 | | /// A valid POSIX time zone with an optional DST transition rule. |
102 | | Rule(PosixTimeZoneOwned), |
103 | | /// An implementation defined string. This occurs when the `TZ` value |
104 | | /// starts with a `:`. The string returned here does not include the `:`. |
105 | | Implementation(alloc::boxed::Box<str>), |
106 | | } |
107 | | |
108 | | #[cfg(feature = "tz-system")] |
109 | | impl PosixTzEnv { |
110 | | /// Parse a POSIX `TZ` environment variable string from the given bytes. |
111 | | fn parse(bytes: impl AsRef<[u8]>) -> Result<PosixTzEnv, Error> { |
112 | | let bytes = bytes.as_ref(); |
113 | | if bytes.get(0) == Some(&b':') { |
114 | | let Ok(string) = core::str::from_utf8(&bytes[1..]) else { |
115 | | return Err(err!( |
116 | | "POSIX time zone string with a ':' prefix contains \ |
117 | | invalid UTF-8: {:?}", |
118 | | Bytes(&bytes[1..]), |
119 | | )); |
120 | | }; |
121 | | Ok(PosixTzEnv::Implementation(string.into())) |
122 | | } else { |
123 | | PosixTimeZone::parse(bytes).map(PosixTzEnv::Rule) |
124 | | } |
125 | | } |
126 | | |
127 | | /// Parse a POSIX `TZ` environment variable string from the given `OsStr`. |
128 | | pub(crate) fn parse_os_str( |
129 | | osstr: impl AsRef<std::ffi::OsStr>, |
130 | | ) -> Result<PosixTzEnv, Error> { |
131 | | PosixTzEnv::parse(parse::os_str_bytes(osstr.as_ref())?) |
132 | | } |
133 | | } |
134 | | |
135 | | #[cfg(feature = "tz-system")] |
136 | | impl core::fmt::Display for PosixTzEnv { |
137 | | fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result { |
138 | | match *self { |
139 | | PosixTzEnv::Rule(ref tz) => write!(f, "{tz}"), |
140 | | PosixTzEnv::Implementation(ref imp) => write!(f, ":{imp}"), |
141 | | } |
142 | | } |
143 | | } |
144 | | |
145 | | /// An owned POSIX time zone. |
146 | | /// |
147 | | /// That is, a POSIX time zone whose abbreviations are inlined into the |
148 | | /// representation. As opposed to a static POSIX time zone whose abbreviations |
149 | | /// are `&'static str`. |
150 | | pub(crate) type PosixTimeZoneOwned = PosixTimeZone<Abbreviation>; |
151 | | |
152 | | /// A static POSIX time zone whose abbreviations are `&'static str`. |
153 | | pub(crate) type PosixTimeZoneStatic = PosixTimeZone<&'static str>; |
154 | | |
155 | | /// A POSIX time zone. |
156 | | /// |
157 | | /// # On "reasonable" POSIX time zones |
158 | | /// |
159 | | /// Jiff only supports "reasonable" POSIX time zones. A "reasonable" POSIX time |
160 | | /// zone is a POSIX time zone that has a DST transition rule _when_ it has a |
161 | | /// DST time zone abbreviation. Without the transition rule, it isn't possible |
162 | | /// to know when DST starts and stops. |
163 | | /// |
164 | | /// POSIX technically allows a DST time zone abbreviation *without* a |
165 | | /// transition rule, but the behavior is literally unspecified. So Jiff just |
166 | | /// rejects them. |
167 | | /// |
168 | | /// Note that if you're confused as to why Jiff accepts `TZ=EST5EDT` (where |
169 | | /// `EST5EDT` is an example of an _unreasonable_ POSIX time zone), that's |
170 | | /// because Jiff rejects `EST5EDT` as a POSIX string and instead tries it as |
171 | | /// an IANA time zone identifier. And indeed, the IANA Time Zone Database |
172 | | /// contains an entry for `EST5EDT` (presumably for legacy reasons). |
173 | | /// |
174 | | /// We also expect `TZ` strings parsed from IANA v2+ formatted `tzfile`s to |
175 | | /// be reasonable; otherwise, parsing fails. This seems to be consistent with |
176 | | /// the [GNU C Library]'s treatment of the `TZ` variable: it only documents |
177 | | /// support for reasonable POSIX time zone strings. |
178 | | /// |
179 | | /// Note that a V2 `TZ` string is precisely identical to a POSIX `TZ` |
180 | | /// environment variable string. A V3 `TZ` string however supports signed DST |
181 | | /// transition times, and hours in the range `0..=167`. The V2 and V3 here |
182 | | /// reference how `TZ` strings are defined in the TZif format specified by |
183 | | /// [RFC 9636]. V2 is the original version of it straight from POSIX, whereas |
184 | | /// V3+ corresponds to an extension added to V3 (and newer versions) of the |
185 | | /// TZif format. V3 is a superset of V2, so in practice, Jiff just permits |
186 | | /// V3 everywhere. |
187 | | /// |
188 | | /// [GNU C Library]: https://www.gnu.org/software/libc/manual/2.25/html_node/TZ-Variable.html |
189 | | /// [RFC 9636]: https://datatracker.ietf.org/doc/rfc9636/ |
190 | | #[derive(Clone, Debug, Eq, PartialEq)] |
191 | | // NOT part of Jiff's public API |
192 | | #[doc(hidden)] |
193 | | // This ensures the alignment of this type is always *at least* 8 bytes. This |
194 | | // is required for the pointer tagging inside of `TimeZone` to be sound. At |
195 | | // time of writing (2024-02-24), this explicit `repr` isn't required on 64-bit |
196 | | // systems since the type definition is such that it will have an alignment of |
197 | | // at least 8 bytes anyway. But this *is* required for 32-bit systems, where |
198 | | // the type definition at present only has an alignment of 4 bytes. |
199 | | #[repr(align(8))] |
200 | | pub struct PosixTimeZone<ABBREV> { |
201 | | inner: shared::PosixTimeZone<ABBREV>, |
202 | | } |
203 | | |
204 | | impl PosixTimeZone<Abbreviation> { |
205 | | /// Parse an IANA tzfile v3+ `TZ` string from the given bytes. |
206 | | #[cfg(feature = "alloc")] |
207 | 0 | pub(crate) fn parse( |
208 | 0 | bytes: impl AsRef<[u8]>, |
209 | 0 | ) -> Result<PosixTimeZoneOwned, Error> { |
210 | 0 | let bytes = bytes.as_ref(); |
211 | 0 | let inner = shared::PosixTimeZone::parse(bytes.as_ref()) |
212 | 0 | .map_err(Error::shared) |
213 | 0 | .map_err(|e| { |
214 | 0 | e.context(err!("invalid POSIX TZ string {:?}", Bytes(bytes))) |
215 | 0 | })?; |
216 | 0 | Ok(PosixTimeZone { inner }) |
217 | 0 | } |
218 | | |
219 | | /// Like `parse`, but parses a POSIX TZ string from a prefix of the |
220 | | /// given input. Any remaining input is returned. |
221 | | #[cfg(feature = "alloc")] |
222 | 0 | pub(crate) fn parse_prefix<'b, B: AsRef<[u8]> + ?Sized + 'b>( |
223 | 0 | bytes: &'b B, |
224 | 0 | ) -> Result<(PosixTimeZoneOwned, &'b [u8]), Error> { |
225 | 0 | let bytes = bytes.as_ref(); |
226 | 0 | let (inner, remaining) = |
227 | 0 | shared::PosixTimeZone::parse_prefix(bytes.as_ref()) |
228 | 0 | .map_err(Error::shared) |
229 | 0 | .map_err(|e| { |
230 | 0 | e.context(err!( |
231 | 0 | "invalid POSIX TZ string {:?}", |
232 | 0 | Bytes(bytes) |
233 | 0 | )) |
234 | 0 | })?; |
235 | 0 | Ok((PosixTimeZone { inner }, remaining)) |
236 | 0 | } |
237 | | |
238 | | /// Converts from the shared-but-internal API for use in proc macros. |
239 | | #[cfg(feature = "alloc")] |
240 | 0 | pub(crate) fn from_shared_owned( |
241 | 0 | sh: shared::PosixTimeZone<Abbreviation>, |
242 | 0 | ) -> PosixTimeZoneOwned { |
243 | 0 | PosixTimeZone { inner: sh } |
244 | 0 | } |
245 | | } |
246 | | |
247 | | impl PosixTimeZone<&'static str> { |
248 | | /// Converts from the shared-but-internal API for use in proc macros. |
249 | | /// |
250 | | /// This works in a `const` context by requiring that the time zone |
251 | | /// abbreviations are `static` strings. This is used when converting |
252 | | /// code generated by a proc macro to this Jiff internal type. |
253 | 0 | pub(crate) const fn from_shared_const( |
254 | 0 | sh: shared::PosixTimeZone<&'static str>, |
255 | 0 | ) -> PosixTimeZoneStatic { |
256 | 0 | PosixTimeZone { inner: sh } |
257 | 0 | } |
258 | | } |
259 | | |
260 | | impl<ABBREV: AsRef<str>> PosixTimeZone<ABBREV> { |
261 | | /// Returns the appropriate time zone offset to use for the given |
262 | | /// timestamp. |
263 | | /// |
264 | | /// If you need information like whether the offset is in DST or not, or |
265 | | /// the time zone abbreviation, then use `PosixTimeZone::to_offset_info`. |
266 | | /// But that API may be more expensive to use, so only use it if you need |
267 | | /// the additional data. |
268 | 0 | pub(crate) fn to_offset(&self, timestamp: Timestamp) -> Offset { |
269 | 0 | Offset::from_ioffset_const( |
270 | 0 | self.inner.to_offset(timestamp.to_itimestamp_const()), |
271 | 0 | ) |
272 | 0 | } |
273 | | |
274 | | /// Returns the appropriate time zone offset to use for the given |
275 | | /// timestamp. |
276 | | /// |
277 | | /// This also includes whether the offset returned should be considered |
278 | | /// to be "DST" or not, along with the time zone abbreviation (e.g., EST |
279 | | /// for standard time in New York, and EDT for DST in New York). |
280 | 0 | pub(crate) fn to_offset_info( |
281 | 0 | &self, |
282 | 0 | timestamp: Timestamp, |
283 | 0 | ) -> TimeZoneOffsetInfo<'_> { |
284 | 0 | let (ioff, abbrev, is_dst) = |
285 | 0 | self.inner.to_offset_info(timestamp.to_itimestamp_const()); |
286 | 0 | let offset = Offset::from_ioffset_const(ioff); |
287 | 0 | let abbreviation = TimeZoneAbbreviation::Borrowed(abbrev); |
288 | 0 | TimeZoneOffsetInfo { offset, dst: Dst::from(is_dst), abbreviation } |
289 | 0 | } |
290 | | |
291 | | /// Returns a possibly ambiguous timestamp for the given civil datetime. |
292 | | /// |
293 | | /// The given datetime should correspond to the "wall" clock time of what |
294 | | /// humans use to tell time for this time zone. |
295 | | /// |
296 | | /// Note that "ambiguous timestamp" is represented by the possible |
297 | | /// selection of offsets that could be applied to the given datetime. In |
298 | | /// general, it is only ambiguous around transitions to-and-from DST. The |
299 | | /// ambiguity can arise as a "fold" (when a particular wall clock time is |
300 | | /// repeated) or as a "gap" (when a particular wall clock time is skipped |
301 | | /// entirely). |
302 | 0 | pub(crate) fn to_ambiguous_kind(&self, dt: DateTime) -> AmbiguousOffset { |
303 | 0 | let iamoff = self.inner.to_ambiguous_kind(dt.to_idatetime_const()); |
304 | 0 | AmbiguousOffset::from_iambiguous_offset_const(iamoff) |
305 | 0 | } |
306 | | |
307 | | /// Returns the timestamp of the most recent time zone transition prior |
308 | | /// to the timestamp given. If one doesn't exist, `None` is returned. |
309 | 0 | pub(crate) fn previous_transition( |
310 | 0 | &self, |
311 | 0 | timestamp: Timestamp, |
312 | 0 | ) -> Option<TimeZoneTransition> { |
313 | 0 | let (its, ioff, abbrev, is_dst) = |
314 | 0 | self.inner.previous_transition(timestamp.to_itimestamp_const())?; |
315 | 0 | let timestamp = Timestamp::from_itimestamp_const(its); |
316 | 0 | let offset = Offset::from_ioffset_const(ioff); |
317 | 0 | let dst = Dst::from(is_dst); |
318 | 0 | Some(TimeZoneTransition { timestamp, offset, abbrev, dst }) |
319 | 0 | } |
320 | | |
321 | | /// Returns the timestamp of the soonest time zone transition after the |
322 | | /// timestamp given. If one doesn't exist, `None` is returned. |
323 | 0 | pub(crate) fn next_transition( |
324 | 0 | &self, |
325 | 0 | timestamp: Timestamp, |
326 | 0 | ) -> Option<TimeZoneTransition> { |
327 | 0 | let (its, ioff, abbrev, is_dst) = |
328 | 0 | self.inner.next_transition(timestamp.to_itimestamp_const())?; |
329 | 0 | let timestamp = Timestamp::from_itimestamp_const(its); |
330 | 0 | let offset = Offset::from_ioffset_const(ioff); |
331 | 0 | let dst = Dst::from(is_dst); |
332 | 0 | Some(TimeZoneTransition { timestamp, offset, abbrev, dst }) |
333 | 0 | } |
334 | | } |
335 | | |
336 | | impl<ABBREV: AsRef<str>> core::fmt::Display for PosixTimeZone<ABBREV> { |
337 | 0 | fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result { |
338 | 0 | core::fmt::Display::fmt(&self.inner, f) |
339 | 0 | } |
340 | | } |
341 | | |
342 | | // The tests below require parsing which requires alloc. |
343 | | #[cfg(feature = "alloc")] |
344 | | #[cfg(test)] |
345 | | mod tests { |
346 | | use super::*; |
347 | | |
348 | | #[cfg(feature = "tz-system")] |
349 | | #[test] |
350 | | fn parse_posix_tz() { |
351 | | // We used to parse this and then error when we tried to |
352 | | // convert to a "reasonable" POSIX time zone with a DST |
353 | | // transition rule. We never actually used unreasonable POSIX |
354 | | // time zones and it was complicating the type definitions, so |
355 | | // now we just reject it outright. |
356 | | assert!(PosixTzEnv::parse("EST5EDT").is_err()); |
357 | | |
358 | | let tz = PosixTzEnv::parse(":EST5EDT").unwrap(); |
359 | | assert_eq!(tz, PosixTzEnv::Implementation("EST5EDT".into())); |
360 | | |
361 | | // We require implementation strings to be UTF-8, because we're |
362 | | // sensible. |
363 | | assert!(PosixTzEnv::parse(b":EST5\xFFEDT").is_err()); |
364 | | } |
365 | | } |