Coverage Report

Created: 2025-09-27 06:48

/rust/registry/src/index.crates.io-1949cf8c6b5b557f/jiff-0.2.15/src/shared/mod.rs
Line
Count
Source
1
/*!
2
Defines data types shared between `jiff` and `jiff-static`.
3
4
While this module exposes types that can be imported outside of `jiff` itself,
5
there are *no* semver guarantees provided. That is, this module is _not_ part
6
of Jiff's public API. The only guarantee of compatibility that is provided
7
is that `jiff-static x.y.z` works with one and only one version of Jiff,
8
corresponding to `jiff x.y.z` (i.e., the same version number).
9
10
# Design
11
12
This module is really accomplishing two different things at the same time.
13
14
Firstly, it is a way to provide types that can be used to construct a static
15
`TimeZone`. The proc macros in `jiff-static` generate code using these
16
types (and a few routines).
17
18
Secondly, it provides a way to parse TZif data without `jiff-static`
19
depending on `jiff` via a Cargo dependency. This actually requires copying
20
the code in this module (which is why it is kinda sectioned off from the rest
21
of jiff) into the `jiff-static` crate. This can be done automatically with
22
`jiff-cli`:
23
24
```text
25
jiff-cli generate shared
26
```
27
28
The copying of code is pretty unfortunate, because it means both crates have to
29
compile it. However, the alternatives aren't great either.
30
31
One alternative is to have `jiff-static` explicitly depend on `jiff` in its
32
`Cargo.toml`. Then Jiff could expose the parsing routines, as it does here,
33
and `jiff-static` could use them directly. Unfortunately, this means that
34
`jiff` cannot depend on `jiff-static`. And that in turn means that `jiff`
35
cannot re-export the macros. Users will need to explicitly depend on and use
36
`jiff-static`. Moreover, this could result in some potential surprises
37
since `jiff-static` will need to have an `=x.y.z` dependency on Jiff for
38
compatibility reasons. That in turn means that the version of Jiff actually
39
used is not determined by the user's `jiff = "x.y.z"` line, but rather by the
40
user's `jiff-static = "x'.y'.z'"` line. This is overall annoying and not a
41
good user experience. Plus, it inverts the typical relationship between crates
42
and their proc macros (e.g., `serde` and `serde_derive`) and thus could result
43
in other unanticipated surprises.
44
45
Another obvious alternative is to split this code out into a separate crate
46
that both `jiff` and `jiff-static` depend on. However, the API exposed in
47
this module does not provide a coherent user experience. It would either need a
48
ton of work to turn it into a coherent user experience or it would need to be
49
published as a `jiff-internal-use-only` crate that I find to be very annoying
50
and confusing. Moreover, a separate crate introduces a new semver boundary
51
beneath Jiff. I've found these sorts of things to overall increase maintenance
52
burden (see ripgrep and regex for cases where I did this).
53
54
I overall decided that the least bad choice was to copy a little code (under
55
2,000 source lines of code at present I believe). Since the copy is managed
56
automatically via `jiff-cli generate shared`, we remove the downside of the
57
code getting out of sync. The only downside is extra compile time. Since I
58
generally only expect `jiff-static` to be used in niche circumstances, I
59
prefer this trade-off over the other choices.
60
61
More context on how I arrived at this design can be found here:
62
<https://github.com/BurntSushi/jiff/issues/256>
63
64
# Particulars
65
66
When this code is copied to `jiff-static`, the following transformations are
67
done:
68
69
* A header is added to indicate that the copied file is auto-generated.
70
* All `#[cfg(feature = "alloc")]` annotations are removed. The `jiff-static`
71
  proc macro always runs in a context where the standard library is available.
72
* Any code between `// only-jiff-start` and `// only-jiff-end` comments is
73
  removed. Nesting isn't supported.
74
75
Otherwise, this module is specifically organized in a way that doesn't rely on
76
any other part of Jiff. The one exception is the set of routines that convert from these
77
exposed types to other internal types inside of Jiff. This is necessary for
78
building a static `TimeZone`. But these conversion routines are removed when
79
this module is copied to `jiff-static`.
80
*/
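// The three transformations listed under "Particulars" can be sketched as a
// single pass over the source lines. This is only an illustration and is not
// the actual `jiff-cli generate shared` implementation. The function name
// `transform_for_jiff_static` and the header text are assumptions, and the
// sketch only drops `#[cfg(feature = "alloc")]` annotations that sit on a
// line of their own.

```rust
/// Hypothetical sketch of the copy step (not the real `jiff-cli` code).
fn transform_for_jiff_static(source: &str) -> String {
    // 1. Add a header marking the file as generated (wording is made up).
    let mut out = String::from("// auto-generated: do not edit by hand\n");
    let mut skipping = false;
    for line in source.lines() {
        let trimmed = line.trim();
        // 3. Drop everything between the marker comments. Nesting is not
        // supported, so a single boolean is enough.
        if trimmed == "// only-jiff-start" {
            skipping = true;
            continue;
        }
        if trimmed == "// only-jiff-end" {
            skipping = false;
            continue;
        }
        // 2. Drop `alloc` feature gates (whole-line annotations only).
        if skipping || trimmed == "#[cfg(feature = \"alloc\")]" {
            continue;
        }
        out.push_str(line);
        out.push('\n');
    }
    out
}
```

// Passing the text of this module through such a function would keep
// `TzifOwned` (minus its feature gate) and drop `TzifStatic` together with
// the `into_jiff` conversion routines.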
81
82
/// An alias for TZif data whose backing storage has a `'static` lifetime.
83
// only-jiff-start
84
pub type TzifStatic = Tzif<
85
    &'static str,
86
    &'static str,
87
    &'static [TzifLocalTimeType],
88
    &'static [i64],
89
    &'static [TzifDateTime],
90
    &'static [TzifDateTime],
91
    &'static [TzifTransitionInfo],
92
>;
93
// only-jiff-end
94
95
/// An alias for TZif data whose backing storage is on the heap.
96
#[cfg(feature = "alloc")]
97
pub type TzifOwned = Tzif<
98
    alloc::string::String,
99
    self::util::array_str::Abbreviation,
100
    alloc::vec::Vec<TzifLocalTimeType>,
101
    alloc::vec::Vec<i64>,
102
    alloc::vec::Vec<TzifDateTime>,
103
    alloc::vec::Vec<TzifDateTime>,
104
    alloc::vec::Vec<TzifTransitionInfo>,
105
>;
106
107
/// An alias for TZif transition data whose backing storage is on the heap.
108
#[cfg(feature = "alloc")]
109
pub type TzifTransitionsOwned = TzifTransitions<
110
    alloc::vec::Vec<i64>,
111
    alloc::vec::Vec<TzifDateTime>,
112
    alloc::vec::Vec<TzifDateTime>,
113
    alloc::vec::Vec<TzifTransitionInfo>,
114
>;
115
116
#[derive(Clone, Debug)]
117
pub struct Tzif<STR, ABBREV, TYPES, TIMESTAMPS, STARTS, ENDS, INFOS> {
118
    pub fixed: TzifFixed<STR, ABBREV>,
119
    pub types: TYPES,
120
    pub transitions: TzifTransitions<TIMESTAMPS, STARTS, ENDS, INFOS>,
121
}
122
123
#[derive(Clone, Debug)]
124
pub struct TzifFixed<STR, ABBREV> {
125
    pub name: Option<STR>,
126
    /// An ASCII byte corresponding to the version number. So, 0x32 is '2'.
127
    ///
128
    /// This is otherwise unused. It's only used in `test` compilation for emitting
129
    /// diagnostic data about TZif files. If we really need to use this, we
130
    /// should probably just convert it to an actual integer.
131
    pub version: u8,
132
    pub checksum: u32,
133
    pub designations: STR,
134
    pub posix_tz: Option<PosixTimeZone<ABBREV>>,
135
}
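// The `version` field comment suggests converting the ASCII byte to an actual
// integer. A minimal sketch of that conversion follows. The function name
// `tzif_version` is hypothetical. It assumes the encoding from the TZif RFCs,
// where a NUL byte means version 1 and ASCII '2' through '4' mean versions
// 2 through 4.

```rust
/// Hypothetical helper: maps the raw TZif version byte to a version number.
fn tzif_version(byte: u8) -> Option<u8> {
    match byte {
        // Version 1 files store a NUL byte in the version field.
        0 => Some(1),
        // Later versions store the ASCII digit itself.
        b'2'..=b'4' => Some(byte - b'0'),
        _ => None,
    }
}
```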
136
137
#[derive(Clone, Copy, Debug)]
138
pub struct TzifLocalTimeType {
139
    pub offset: i32,
140
    pub is_dst: bool,
141
    pub designation: (u8, u8), // inclusive..exclusive
142
    pub indicator: TzifIndicator,
143
}
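// The `designation` field is a half-open byte range into the `designations`
// string of `TzifFixed`. A small sketch of resolving it to an abbreviation is
// shown here. The helper name `designation` and the sample string used in the
// usage note are assumptions made for illustration.

```rust
/// Hypothetical helper: returns the abbreviation covered by `range`
/// (start inclusive, end exclusive), or `None` if the range is out of
/// bounds or does not fall on character boundaries.
fn designation(designations: &str, range: (u8, u8)) -> Option<&str> {
    designations.get(usize::from(range.0)..usize::from(range.1))
}
```

// For example, `designation("LMTHSTHDT", (3, 6))` yields `Some("HST")`.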
144
145
/// This enum corresponds to the possible indicator values for standard/wall
146
/// and UT/local.
147
///
148
/// Note that UT+Wall is not allowed.
149
///
150
/// I honestly have no earthly clue what they mean. I've read the section about
151
/// them in RFC 8536 several times and I can't make sense of it. I've even
152
/// looked at data files that have these set and still can't make sense of
153
/// them. I've even looked at what other datetime libraries do with these, and
154
/// they all seem to just ignore them. Like, WTF. I've spent the last couple
155
/// months of my life steeped in time, and I just cannot figure this out. Am I
156
/// just dumb?
157
///
158
/// Anyway, we parse them, but otherwise ignore them because that's what all
159
/// the cool kids do.
160
///
161
/// The default is `LocalWall`, which also occurs when no indicators are
162
/// present.
163
///
164
/// I tried again and still don't get it. Here's a dump for `Pacific/Honolulu`:
165
///
166
/// ```text
167
/// $ ./scripts/jiff-debug tzif /usr/share/zoneinfo/Pacific/Honolulu
168
/// TIME ZONE NAME
169
///   /usr/share/zoneinfo/Pacific/Honolulu
170
/// LOCAL TIME TYPES
171
///   000: offset=-10:31:26, is_dst=false, designation=LMT, indicator=local/wall
172
///   001: offset=-10:30, is_dst=false, designation=HST, indicator=local/wall
173
///   002: offset=-09:30, is_dst=true, designation=HDT, indicator=local/wall
174
///   003: offset=-09:30, is_dst=true, designation=HWT, indicator=local/wall
175
///   004: offset=-09:30, is_dst=true, designation=HPT, indicator=ut/std
176
///   005: offset=-10, is_dst=false, designation=HST, indicator=local/wall
177
/// TRANSITIONS
178
///   0000: -9999-01-02T01:59:59 :: -377705023201 :: type=0, -10:31:26, is_dst=false, LMT, local/wall
179
///   0001: 1896-01-13T22:31:26 :: -2334101314 :: type=1, -10:30, is_dst=false, HST, local/wall
180
///   0002: 1933-04-30T12:30:00 :: -1157283000 :: type=2, -09:30, is_dst=true, HDT, local/wall
181
///   0003: 1933-05-21T21:30:00 :: -1155436200 :: type=1, -10:30, is_dst=false, HST, local/wall
182
///   0004: 1942-02-09T12:30:00 :: -880198200 :: type=3, -09:30, is_dst=true, HWT, local/wall
183
///   0005: 1945-08-14T23:00:00 :: -769395600 :: type=4, -09:30, is_dst=true, HPT, ut/std
184
///   0006: 1945-09-30T11:30:00 :: -765376200 :: type=1, -10:30, is_dst=false, HST, local/wall
185
///   0007: 1947-06-08T12:30:00 :: -712150200 :: type=5, -10, is_dst=false, HST, local/wall
186
/// POSIX TIME ZONE STRING
187
///   HST10
188
/// ```
189
///
190
/// See how type 004 has a ut/std indicator? What the fuck does that mean?
191
/// All transitions are defined in terms of UTC. I confirmed this with `zdump`:
192
///
193
/// ```text
194
/// $ zdump -v Pacific/Honolulu | rg 1945
195
/// Pacific/Honolulu  Tue Aug 14 22:59:59 1945 UT = Tue Aug 14 13:29:59 1945 HWT isdst=1 gmtoff=-34200
196
/// Pacific/Honolulu  Tue Aug 14 23:00:00 1945 UT = Tue Aug 14 13:30:00 1945 HPT isdst=1 gmtoff=-34200
197
/// Pacific/Honolulu  Sun Sep 30 11:29:59 1945 UT = Sun Sep 30 01:59:59 1945 HPT isdst=1 gmtoff=-34200
198
/// Pacific/Honolulu  Sun Sep 30 11:30:00 1945 UT = Sun Sep 30 01:00:00 1945 HST isdst=0 gmtoff=-37800
199
/// ```
200
///
201
/// The times match up. All of them. The indicators don't seem to make a
202
/// difference. I'm clearly missing something.
203
#[derive(Clone, Copy, Debug)]
204
pub enum TzifIndicator {
205
    LocalWall,
206
    LocalStandard,
207
    UTStandard,
208
}
209
210
/// The set of transitions in TZif data, laid out in column orientation.
211
///
212
/// The column orientation is used to make TZ lookups faster. Specifically,
213
/// for finding an offset for a timestamp, we do a binary search on
214
/// `timestamps`. For finding an offset for a local datetime, we do a binary
215
/// search on `civil_starts`. By making these two distinct sequences with
216
/// nothing else in them, we make them as small as possible and thus improve
217
/// cache locality.
218
///
219
/// All sequences in this type are in correspondence with one another. They
220
/// are all guaranteed to have the same length.
221
#[derive(Clone, Debug)]
222
pub struct TzifTransitions<TIMESTAMPS, STARTS, ENDS, INFOS> {
223
    /// The timestamp at which this transition begins.
224
    pub timestamps: TIMESTAMPS,
225
    /// The wall clock time for when a transition begins.
226
    pub civil_starts: STARTS,
227
    /// The wall clock time for when a transition ends.
228
    ///
229
    /// This is only non-zero when the transition kind is a gap or a fold.
230
    pub civil_ends: ENDS,
231
    /// Any other relevant data about a transition, such as its local type
232
    /// index and the transition kind.
233
    pub infos: INFOS,
234
}
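// The timestamp lookup described above (a binary search over `timestamps`
// alone) might look roughly like the sketch below. The function names are
// hypothetical, and plain slices stand in for the generic column types. This
// is not necessarily how Jiff implements the search.

```rust
/// Index of the transition in effect at `ts`: the last entry in `timestamps`
/// that is `<= ts`, or `None` when `ts` precedes every transition.
/// `timestamps` must be sorted in ascending order.
fn transition_index(timestamps: &[i64], ts: i64) -> Option<usize> {
    // `partition_point` counts the leading elements satisfying the predicate.
    timestamps.partition_point(|&t| t <= ts).checked_sub(1)
}

/// Looks up the UTC offset (in seconds) in effect at `ts`, going from the
/// transition to its local time type index and from there to the offset.
fn offset_at(
    timestamps: &[i64],
    type_indices: &[u8],
    offsets: &[i32],
    ts: i64,
) -> Option<i32> {
    let i = transition_index(timestamps, ts)?;
    let type_index = usize::from(*type_indices.get(i)?);
    offsets.get(type_index).copied()
}
```

// With the `Pacific/Honolulu` transitions from the dump earlier in this file,
// a lookup for timestamp `0` lands on transition 0007 (type 5), giving an
// offset of -36000 seconds, i.e. `-10`.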
235
236
/// TZif transition info beyond the timestamp and civil datetime.
237
///
238
/// For example, this contains a transition's "local type index," which in
239
/// turn gives access to the offset (among other metadata) for that transition.
240
#[derive(Clone, Copy, Debug)]
241
pub struct TzifTransitionInfo {
242
    /// The index into the sequence of local time type records. This is what
243
    /// provides the correct offset (from UTC) that is active beginning at
244
    /// this transition.
245
    pub type_index: u8,
246
    /// The boundary condition for quickly determining if a given wall clock
247
    /// time is ambiguous (i.e., falls in a gap or a fold).
248
    pub kind: TzifTransitionKind,
249
}
250
251
/// The kind of a transition.
252
///
253
/// This is used when trying to determine the offset for a local datetime. It
254
/// indicates how the corresponding civil datetimes in `civil_starts` and
255
/// `civil_ends` should be interpreted. That is, there are three possible
256
/// cases:
257
///
258
/// 1. The offset of this transition is equivalent to the offset of the
259
/// previous transition. That means there are no ambiguous civil datetimes
260
/// between the transitions. This can occur, e.g., when the time zone
261
/// abbreviation changes.
262
/// 2. The offset of the transition is greater than the offset of the previous
263
/// transition. That means there is a "gap" in local time between the
264
/// transitions. This typically corresponds to entering daylight saving time.
265
/// It is usually, but not always, 1 hour.
266
/// 3. The offset of the transition is less than the offset of the previous
267
/// transition. That means there is a "fold" in local time where time is
268
/// repeated. This typically corresponds to leaving daylight saving time. It
269
/// is usually, but not always, 1 hour.
270
///
271
/// # More explanation
272
///
273
/// This, when combined with `civil_starts` and `civil_ends` in
274
/// `TzifTransitions`, explicitly represents ambiguous wall clock times that
275
/// occur at the boundaries of transitions.
276
///
277
/// The start of the wall clock time is always the earlier possible wall clock
278
/// time that could occur with this transition's corresponding offset. For a
279
/// gap, it's the previous transition's offset. For a fold, it's the current
280
/// transition's offset.
281
///
282
/// For example, DST for `America/New_York` began on `2024-03-10T07:00:00+00`.
283
/// The offset prior to this instant in time is `-05`, corresponding
284
/// to standard time (EST). Thus, in wall clock time, DST began at
285
/// `2024-03-10T02:00:00`. And since this is a DST transition that jumps ahead
286
/// an hour, the start of DST also corresponds to the start of a gap. That is,
287
/// the times `02:00:00` through `02:59:59` never appear on a clock for this
288
/// hour. The question is thus: which offset should we apply to `02:00:00`?
289
/// We could apply the offset from the earlier transition `-05` and get
290
/// `2024-03-10T01:00:00-05` (that's `2024-03-10T06:00:00+00`), or we could
291
/// apply the offset from the later transition `-04` and get
292
/// `2024-03-10T03:00:00-04` (that's `2024-03-10T07:00:00+00`).
293
///
294
/// So in the above, we would have a `Gap` variant where `start` (inclusive) is
295
/// `2024-03-10T02:00:00` and `end` (exclusive) is `2024-03-10T03:00:00`.
296
///
297
/// The fold case is the same idea, but where the same time is repeated.
298
/// For example, in `America/New_York`, standard time began on
299
/// `2024-11-03T06:00:00+00`. The offset prior to this instant in time
300
/// is `-04`, corresponding to DST (EDT). Thus, in wall clock time, DST
301
/// ended at `2024-11-03T02:00:00`. However, since this is a fold, the
302
/// actual set of ambiguous times begins at `2024-11-03T01:00:00` and
303
/// ends at `2024-11-03T01:59:59.999999999`. That is, the wall clock time
304
/// `2024-11-03T02:00:00` is unambiguous.
305
///
306
/// So in the fold case above, we would have a `Fold` variant where
307
/// `start` (inclusive) is `2024-11-03T01:00:00` and `end` (exclusive) is
308
/// `2024-11-03T02:00:00`.
309
///
310
/// Since this gets bundled in with the sorted sequence of transitions, we'll
311
/// use the "start" time in all three cases as our target of binary search.
312
/// Once we land on a transition, we'll know our given wall clock time is
313
/// greater than or equal to its start wall clock time. At that point, to
314
/// determine if there is ambiguity, we merely need to determine if the given
315
/// wall clock time is less than the corresponding `end` time. If it is, then
316
/// it falls in a gap or fold. Otherwise, it's unambiguous.
317
///
318
/// Note that we could compute these datetime values while searching for the
319
/// correct transition, but there's a fair bit of math involved in going
320
/// between timestamps (which is what TZif gives us) and calendar datetimes
321
/// (which is what we're given as input). It is also necessary that we offset
322
/// the timestamp given in TZif at some point, since it is in UTC and the
323
/// datetime given is in wall clock time. So I decided it would be worth
324
/// pre-computing what we need in terms of what the input is. This way, we
325
/// don't need to do any conversions, or indeed, any arithmetic at all, for
326
/// time zone lookups. We *could* store these as transitions, but then the
327
/// input datetime would need to be converted to a timestamp before searching
328
/// the transitions.
329
#[derive(Clone, Copy, Debug)]
330
pub enum TzifTransitionKind {
331
    /// This transition cannot possibly lead to an ambiguous offset because
332
    /// its offset is equivalent to the offset of the previous transition.
333
    ///
334
    /// Has an entry in `civil_starts`, but corresponding entry in `civil_ends`
335
    /// is always zeroes (i.e., meaningless).
336
    Unambiguous,
337
    /// This occurs when this transition's offset is strictly greater than the
338
    /// previous transition's offset. This effectively results in a "gap" of
339
    /// time equal to the difference in the offsets between the two
340
    /// transitions.
341
    ///
342
    /// Has an entry in `civil_starts` for when the gap starts (inclusive) in
343
    /// local time. Also has an entry in `civil_ends` for when the gap ends
344
    /// (exclusive) in local time.
345
    Gap,
346
    /// This occurs when this transition's offset is strictly less than the
347
    /// previous transition's offset. This results in a "fold" of time where
348
    /// the two transitions have an overlap where it is ambiguous which one
349
    /// applies given a wall clock time. In effect, a span of time equal to the
350
    /// difference in the offsets is repeated.
351
    ///
352
    /// Has an entry in `civil_starts` for when the fold starts (inclusive) in
353
    /// local time. Also has an entry in `civil_ends` for when the fold ends
354
    /// (exclusive) in local time.
355
    Fold,
356
}
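// The civil datetime lookup described in the `TzifTransitionKind` docs
// (binary search on the start times, then one comparison against the end
// time) can be sketched as follows. Everything here is illustrative: the
// `Civil` tuple stands in for `TzifDateTime`, and `Kind` and `classify` are
// hypothetical names rather than Jiff's actual routines.

```rust
/// Stand-in for a civil datetime: (year, month, day, hour, minute, second).
/// Tuples compare lexicographically, which matches chronological order.
type Civil = (i16, i8, i8, i8, i8, i8);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Unambiguous,
    Gap,
    Fold,
}

/// Classifies `dt` as unambiguous, in a gap, or in a fold. `starts` must be
/// sorted ascending and all three slices must have the same length.
fn classify(starts: &[Civil], ends: &[Civil], kinds: &[Kind], dt: Civil) -> Kind {
    // Find the last transition whose start is <= dt.
    let Some(i) = starts.partition_point(|&s| s <= dt).checked_sub(1) else {
        // Before the first transition there is nothing to be ambiguous with.
        return Kind::Unambiguous;
    };
    match kinds[i] {
        Kind::Unambiguous => Kind::Unambiguous,
        // `end` is exclusive, so only datetimes strictly before it are
        // inside the gap or fold.
        kind if dt < ends[i] => kind,
        _ => Kind::Unambiguous,
    }
}
```

// Using the `America/New_York` values from the docs above, `02:30:00` on
// 2024-03-10 is classified as `Gap`, `01:59:59` on 2024-11-03 as `Fold`, and
// `02:00:00` on 2024-11-03 as `Unambiguous`.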
357
358
/// The representation we use to represent a civil datetime.
359
///
360
/// We don't use `shared::util::itime::IDateTime` here because we specifically
361
/// do not need to represent fractional seconds. This lets us easily represent
362
/// what we need in 8 bytes instead of the 12 bytes used by `IDateTime`.
363
///
364
/// Moreover, we pack the fields into a single `i64` to make comparisons
365
/// extremely cheap. This is especially useful since we do a binary search on
366
/// `&[TzifDateTime]` when doing a TZ lookup for a civil datetime.
367
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, PartialOrd, Ord)]
368
pub struct TzifDateTime {
369
    bits: i64,
370
}
371
372
impl TzifDateTime {
373
    pub const ZERO: TzifDateTime = TzifDateTime::new(0, 0, 0, 0, 0, 0);
374
375
0
    pub const fn new(
376
0
        year: i16,
377
0
        month: i8,
378
0
        day: i8,
379
0
        hour: i8,
380
0
        minute: i8,
381
0
        second: i8,
382
0
    ) -> TzifDateTime {
383
        // TzifDateTime { year, month, day, hour, minute, second }
384
0
        let mut bits = (year as u64) << 48;
385
0
        bits |= (month as u64) << 40;
386
0
        bits |= (day as u64) << 32;
387
0
        bits |= (hour as u64) << 24;
388
0
        bits |= (minute as u64) << 16;
389
0
        bits |= (second as u64) << 8;
390
        // The least significant 8 bits remain 0.
391
0
        TzifDateTime { bits: bits as i64 }
392
0
    }
393
394
0
    pub const fn year(self) -> i16 {
395
0
        (self.bits as u64 >> 48) as u16 as i16
396
0
    }
397
398
0
    pub const fn month(self) -> i8 {
399
0
        (self.bits as u64 >> 40) as u8 as i8
400
0
    }
401
402
0
    pub const fn day(self) -> i8 {
403
0
        (self.bits as u64 >> 32) as u8 as i8
404
0
    }
405
406
0
    pub const fn hour(self) -> i8 {
407
0
        (self.bits as u64 >> 24) as u8 as i8
408
0
    }
409
410
0
    pub const fn minute(self) -> i8 {
411
0
        (self.bits as u64 >> 16) as u8 as i8
412
0
    }
413
414
0
    pub const fn second(self) -> i8 {
415
0
        (self.bits as u64 >> 8) as u8 as i8
416
0
    }
417
}
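// The reason for packing the fields into one `i64` is that a single integer
// comparison then orders datetimes chronologically, including negative
// years. The standalone sketch below illustrates this. The free functions
// `pack` and `year_of` are hypothetical and, unlike `TzifDateTime::new`, mask
// each field to its own width before shifting.

```rust
/// Packs civil datetime fields into an `i64`, most significant field first.
/// The least significant 8 bits are left as zero.
const fn pack(year: i16, month: i8, day: i8, hour: i8, minute: i8, second: i8) -> i64 {
    let mut bits = (year as u16 as u64) << 48;
    bits |= (month as u8 as u64) << 40;
    bits |= (day as u8 as u64) << 32;
    bits |= (hour as u8 as u64) << 24;
    bits |= (minute as u8 as u64) << 16;
    bits |= (second as u8 as u64) << 8;
    // The year occupies the top 16 bits, so its sign becomes the sign of
    // the whole `i64` and signed comparison orders years correctly.
    bits as i64
}

/// Recovers the year from a packed value.
const fn year_of(bits: i64) -> i16 {
    (bits as u64 >> 48) as u16 as i16
}
```

// Comparing `pack(2024, 3, 10, 2, 0, 0)` with `pack(2024, 3, 10, 3, 0, 0)`
// is one integer comparison, which is what keeps the binary search over
// `civil_starts` cheap.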
418
419
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
420
pub struct PosixTimeZone<ABBREV> {
421
    pub std_abbrev: ABBREV,
422
    pub std_offset: PosixOffset,
423
    pub dst: Option<PosixDst<ABBREV>>,
424
}
425
426
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
427
pub struct PosixDst<ABBREV> {
428
    pub abbrev: ABBREV,
429
    pub offset: PosixOffset,
430
    pub rule: PosixRule,
431
}
432
433
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
434
pub struct PosixRule {
435
    pub start: PosixDayTime,
436
    pub end: PosixDayTime,
437
}
438
439
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
440
pub struct PosixDayTime {
441
    pub date: PosixDay,
442
    pub time: PosixTime,
443
}
444
445
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
446
pub enum PosixDay {
447
    /// Julian day in a year, not counting leap days.
448
    ///
449
    /// Valid range is `1..=365`.
450
    JulianOne(i16),
451
    /// Julian day in a year, counting leap days.
452
    ///
453
    /// Valid range is `0..=365`.
454
    JulianZero(i16),
455
    /// The nth weekday of a month.
456
    WeekdayOfMonth {
457
        /// The month.
458
        ///
459
        /// Valid range is: `1..=12`.
460
        month: i8,
461
        /// The week.
462
        ///
463
        /// Valid range is `1..=5`.
464
        ///
465
        /// One interesting thing to note here (or my interpretation anyway),
466
        /// is that a week of `4` means the "4th weekday in a month" whereas
467
        /// a week of `5` means the "last weekday in a month, even if it's the
468
        /// 4th weekday."
469
        week: i8,
470
        /// The weekday.
471
        ///
472
        /// Valid range is `0..=6`, with `0` corresponding to Sunday.
473
        weekday: i8,
474
    },
475
}
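// The `week` interpretation noted above (a week of `5` means the last such
// weekday in the month, even when there are only four) can be made concrete
// with a short sketch. All function names here are hypothetical, plain `i32`
// values are used instead of the narrower field types, and the weekday
// computation (Sakamoto's method) assumes a positive Gregorian year. This is
// not Jiff's implementation.

```rust
/// Day of the week for a Gregorian date, with 0 = Sunday.
fn weekday(year: i32, month: i32, day: i32) -> i32 {
    const T: [i32; 12] = [0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4];
    let y = if month < 3 { year - 1 } else { year };
    (y + y / 4 - y / 100 + y / 400 + T[(month - 1) as usize] + day).rem_euclid(7)
}

fn days_in_month(year: i32, month: i32) -> i32 {
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        _ if leap => 29,
        _ => 28,
    }
}

/// Resolves a `WeekdayOfMonth`-style rule to a day of the month.
/// `week` is in `1..=5` and `wd` is in `0..=6` with 0 = Sunday.
fn weekday_of_month(year: i32, month: i32, week: i32, wd: i32) -> i32 {
    // Day of the month on which `wd` first occurs (always in 1..=7).
    let first = 1 + (wd - weekday(year, month, 1)).rem_euclid(7);
    let day = first + 7 * (week - 1);
    // Only week 5 can overshoot. In that case the 4th occurrence is the last.
    if day > days_in_month(year, month) {
        day - 7
    } else {
        day
    }
}
```

// For instance, the second Sunday of March 2024 resolves to day 10, and a
// week of `5` for Sundays in October 2024 resolves to day 27, since that
// month has only four Sundays.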
476
477
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
478
pub struct PosixTime {
479
    pub second: i32,
480
}
481
482
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
483
pub struct PosixOffset {
484
    pub second: i32,
485
}
486
487
// only-jiff-start
488
impl TzifStatic {
489
0
    pub const fn into_jiff(self) -> crate::tz::tzif::TzifStatic {
490
0
        crate::tz::tzif::TzifStatic::from_shared_const(self)
491
0
    }
492
}
493
// only-jiff-end
494
495
// only-jiff-start
496
impl PosixTimeZone<&'static str> {
497
0
    pub const fn into_jiff(self) -> crate::tz::posix::PosixTimeZoneStatic {
498
0
        crate::tz::posix::PosixTimeZone::from_shared_const(self)
499
0
    }
500
}
501
// only-jiff-end
502
503
// Does not require `alloc`, but is only used when `alloc` is enabled.
504
#[cfg(feature = "alloc")]
505
pub(crate) mod crc32;
506
pub(crate) mod posix;
507
#[cfg(feature = "alloc")]
508
pub(crate) mod tzif;
509
pub(crate) mod util;