LCOV - cov.lcov - discof/repair/fd_repair

LCOV - code coverage report

Current view:	top level - discof/repair - fd_repair_tile.c (source / functions)		Hit	Total	Coverage
Test:	cov.lcov	Lines:	0	688	0.0 %
Date:	2026-03-19 18:19:27	Functions:	0	56	0.0 %

          Line data    Source code

       1             : /* The repair tile is responsible for repairing missing shreds that were
       2             :    not received via Turbine.  The goal is to ensure that slots we "care"
       3             :    about have their FEC sets inserted into store.
       4             : 
       5             :    Generally there are two distinct traffic patterns:
       6             : 
       7             :    a. Firedancer boots up and fires off a large number of repairs to
       8             :       recover all the blocks between the snapshot on which it is booting
       9             :       and the head of the chain.  In this mode, repair tile utilization
      10             :       is very high along with net and sign utilization.
      11             : 
      12             :    b. Firedancer catches up to the head of the chain and enters steady
      13             :       state where most shred traffic is delivered over turbine.  In this
      14             :       state, repairs are only occasionally needed to recover shreds lost
      15             :       due to anomalies like packet loss, transmitter (leader) never sent
      16             :       them or even a malicious leader etc.  On rare occasion, repair
      17             :       will also need to recover a different version of a block that
      18             :       equivocated.
      19             : 
      20             :    To accomplish the above, repair mainly processes 4 kinds of frags:
      21             : 
      22             :    1. Shred data (from shred tile)
      23             : 
      24             :       Any shred (coding or data) that passes validation and filtering in
      25             :       the shred tile is forwarded to repair.  Repair uses these to track
      26             :       which shreds have been received in `fd_forest`, a tree data
      27             :       structure that mirrors the block/slot ancestry chain.  It also
      28             :       uses these shreds to discover slots or ancestries that were not
      29             :       known. fd_forest tracks metadata for each slot, including slot
      30             :       completion status, merkle roots, and metrics.
      31             : 
      32             :       Any shred that we can correlate with a repair request we made is
      33             :       used to update peer response latency metrics in fd_policy (See
      34             :       fd_policy.h for more details).
      35             : 
      36             :    2. FEC status messages (from shred tile)
      37             : 
      38             :       These fall under two categories: FEC completion and FEC eviction.
      39             :       When all shreds in a FEC set have been recovered, the shred tile
      40             :       sends a completion message. This may trigger chained merkle
      41             :       verification if the slot has a confirmed block_id.  The completed
      42             :       FEC message is always forwarded to replay via repair_out.
      43             : 
      44             :       When an incomplete FEC set is evicted from the shred tile's FEC
      45             :       resolver (e.g. due to capacity), it also notifies repair.  Repair
      46             :       clears the corresponding FEC set entries from the forest so those
      47             :       shred indices can be re-requested if they are necessary.  As
      48             :       mentioned in fd_forest.h, forest needs to maintain a strict subset
      49             :       of shreds that are known by fec_resolver, store, and reasm in
      50             :       order to guarantee forward progress always.
      51             : 
      52             :    3. Pings (from net tile)
      53             : 
      54             :       Repair peers use a ping-pong protocol to verify liveness before
      55             :       serving repair requests.  When a ping arrives over the network,
      56             :       repair validates the message and constructs a pong response. To
      57             :       prevent spam attacks, repair has stopgaps like tracking how many
      58             :       pongs per peer are currently in the sign queue and dropping pings
      59             :       from unknown peers.  These are the only untrusted inputs to
      60             :       repair.
      61             : 
      62             :    4. Sign task responses (from sign tile)
      63             : 
      64             :       Repair requests are signed asynchronously; the repair tile
      65             :       constructs a repair request, dispatches it to a sign tile via the
      66             :       repair_sign output link.  The repair-sign communication is
      67             :       manually managed via credit tracking in the repair tile (see
      68             :       comment in out_ctx_t struct definition).
      69             : 
      70             :       After receiving the signature back from sign tile, repair injects
      71             :       the signature into the pending request and dispatches it to the
      72             :       net tile via the repair_net output link. The behavior depends on
      73             :       the request type:
      74             :       - Pong: the signed pong is sent to the peer that pinged us.
      75             :       - Warmup: a proactive request sent when a new peer's contact info
      76             :         first arrives, prepaying the ping-pong RTT cost.  The signed
      77             :         request is sent to the peer if it is still active.
      78             :       - Regular shred request: the request is recorded in an inflight
      79             :         table (for tracking response latency and timeouts) and the
      80             :         signed packet is sent to the selected peer.
      81             : 
      82             :    Secondary "other" frags that are processed but not part of core
      83             :    repair logic:
      84             : 
      85             :    5. Confirmation messages from tower
      86             : 
      87             :       Tower sends two kinds of messages relevant to repair:
      88             :       - slot_done: indicates a slot has finished replay and may advance
      89             :         the root.  Repair publishes (roots) the forest up to that slot,
      90             :         pruning old ancestry.
      91             :       - slot_confirmed: indicates a slot has reached a confirmation
      92             :         level (e.g. duplicate-confirmed).  If the slot is not yet in the
      93             :         forest, repair creates a sentinel block so it can be repaired.
      94             :         It also stores the confirmed block_id and may trigger chained
      95             :         merkle verification. See fd_forest.h on more details about
      96             :         chained merkle verification.
      97             : 
      98             :    6. Eviction messages from replay (reasm)
      99             : 
     100             :       When the replay tile's reassembly buffer evicts a FEC set (e.g.
     101             :       due to pool capacity) from itself and from store, it notifies
     102             :       repair with the slot and fec_set_idx.  Repair clears those FEC
     103             :       entries from the forest so the shreds can be re-requested.
     104             : 
     105             :    7. Contact info messages from gossip
     106             : 
     107             :       Gossip forwards contact info updates and removals for other
     108             :       validators.  Repair uses these to maintain a list of peers to
     109             :       make requests to.
     110             : 
     111             :    If fd_forest tracks what we know about each shred, fd_policy and
     112             :    fd_inflights is responsible for deciding what next repair request to
     113             :    make. fd_policy and fd_inflights split responsibility: fd_policy
     114             :    makes any new requests, orphan requests, and requests directly off
     115             :    the forest iterator, while fd_inflights re-requests anything that has
     116             :    been requested but not received yet within a timeout window. */
     117             : 
     118             : #define _GNU_SOURCE
     119             : 
     120             : #include "../genesis/fd_genesi_tile.h"
     121             : #include "../../disco/topo/fd_topo.h"
     122             : #include "generated/fd_repair_tile_seccomp.h"
     123             : #include "../../disco/fd_disco.h"
     124             : #include "../../disco/keyguard/fd_keyload.h"
     125             : #include "../../disco/keyguard/fd_keyguard.h"
     126             : #include "../../disco/keyguard/fd_keyswitch.h"
     127             : #include "../../disco/net/fd_net_tile.h"
     128             : #include "../../disco/shred/fd_rnonce_ss.h"
     129             : #include "../../flamenco/gossip/fd_gossip_message.h"
     130             : #include "../tower/fd_tower_tile.h"
     131             : #include "../../discof/restore/utils/fd_ssmsg.h"
     132             : #include "../../util/net/fd_net_headers.h"
     133             : #include "../../util/pod/fd_pod_format.h"
     134             : #include "../../tango/fd_tango_base.h"
     135             : 
     136             : #include "../forest/fd_forest.h"
     137             : #include "fd_repair_metrics.h"
     138             : #include "fd_inflight.h"
     139             : #include "fd_repair.h"
     140             : #include "fd_policy.h"
     141             : #include "../replay/fd_replay_tile.h"
     142             : 
     143             : #define LOGGING       1
     144             : #define DEBUG_LOGGING 0
     145             : 
     146             : #define IN_KIND_CONTACT (0)
     147           0 : #define IN_KIND_NET     (1)
     148           0 : #define IN_KIND_TOWER   (2)
     149           0 : #define IN_KIND_SHRED   (3)
     150           0 : #define IN_KIND_SIGN    (4)
     151           0 : #define IN_KIND_SNAP    (5)
     152           0 : #define IN_KIND_GOSSIP  (6)
     153           0 : #define IN_KIND_GENESIS (7)
     154           0 : #define IN_KIND_REPLAY  (8)
     155             : 
     156             : #define MAX_IN_LINKS    (32)
     157             : #define MAX_SHRED_TILE_CNT ( 16UL )
     158             : #define MAX_SIGN_TILE_CNT  ( 16UL )
     159             : 
     160             : /* Max number of validators that can be actively queried */
     161           0 : #define FD_REPAIR_PEER_MAX (FD_CONTACT_INFO_TABLE_SIZE)
     162             : 
     163             : /* Max number of pending repair requests recently made to keep track of.
     164             :    Calculated generally as we estimate around 50k/s/core to sign
     165             :    requests. Assuming an over-provisioned 4 sign tiles just for repair,
     166             :    this means we can make up to ~200k requests per second.  With a dedup
     167             :    timeout of 80ms, this means we can make up to ~16k requests within
     168             :    the dedup timeout window.  We round up to the next power of two to
     169             :    get the dedup cache max.  Since we are sizing the dedup cache for a
     170             :    generous margin, and this number not particularly fragile or
     171             :    sensitive, we can leave it static. */
     172           0 : #define FD_DEDUP_CACHE_MAX (1<<15)
     173             : 
     174             : /* static map from request type to metric array index */
     175             : static uint metric_index[FD_REPAIR_KIND_ORPHAN + 1] = {
     176             :   [FD_REPAIR_KIND_PONG]          = FD_METRICS_ENUM_REPAIR_SENT_REQUEST_TYPES_V_PONG_IDX,
     177             :   [FD_REPAIR_KIND_SHRED]         = FD_METRICS_ENUM_REPAIR_SENT_REQUEST_TYPES_V_NEEDED_WINDOW_IDX,
     178             :   [FD_REPAIR_KIND_HIGHEST_SHRED] = FD_METRICS_ENUM_REPAIR_SENT_REQUEST_TYPES_V_NEEDED_HIGHEST_WINDOW_IDX,
     179             :   [FD_REPAIR_KIND_ORPHAN]        = FD_METRICS_ENUM_REPAIR_SENT_REQUEST_TYPES_V_NEEDED_ORPHAN_IDX,
     180             : };
     181             : 
     182             : typedef union {
     183             :   struct {
     184             :     fd_wksp_t * mem;
     185             :     ulong       chunk0;
     186             :     ulong       wmark;
     187             :     ulong       mtu;
     188             :   };
     189             :   fd_net_rx_bounds_t net_rx;
     190             : } in_ctx_t;
     191             : 
     192             : struct out_ctx {
     193             :   ulong         idx;
     194             :   fd_wksp_t *   mem;
     195             :   ulong         chunk0;
     196             :   ulong         wmark;
     197             :   ulong         chunk;
     198             : 
     199             :   /* Repair tile directly tracks credit outside of stem for these
     200             :      asynchronous sign links.  In particular, credits tracks the RETURN
     201             :      sign_repair link.  This is because repair_sign is reliable, and
     202             :      sign_repair is unreliable.  If both links were reliable, and the
     203             :      links filled completely, stem would get into a deadlock. Neither
     204             :      repair or sign would have credits, which would prevent frags from
     205             :      getting polled in repair or sign, which would prevent any credits
     206             :      from getting returned back to the tiles.  So the sign_repair return
     207             :      link must be unreliable. credits / max_credits are used by the
     208             :      repair_sign link.  In particular, credits manages the RETURN
     209             :      sign_repair link.
     210             : 
     211             :      Consider the scenario:
     212             : 
     213             :              repair_sign (depth 128)        sign_repair (depth 128)
     214             :      repair  ---------------------->  sign ------------------------> repair
     215             :              [rest free, r130, r129]       [r128, r127, ... , r1] (full)
     216             : 
     217             :      If repair is publishing too many requests too fast(common in
     218             :      catchup), and not polling enough frags from sign, without manual
     219             :      management the sign_repair link would be overrun.  Nothing is
     220             :      stopping repair from publishing more requests, because sign is
     221             :      functioning fast enough to handle the requests. However, nothing is
     222             :      stopping sign from polling the next request and signing it, and
     223             :      PUBLISHING it on the sign_repair link that is already full, because
     224             :      the sign_repair link is unreliable.
     225             : 
     226             :      This is why we need to manually track credits for the sign_repair
     227             :      link. We must ensure that there are never more than 128 items in
     228             :      the ENTIRE repair_sign -> sign tile -> sign_repair work queue, else
     229             :      there is always a possibility of an overrun in the sign_repair
     230             :      link.
     231             : 
     232             :      We can furthermore ensure some nice properties by having the
     233             :      repair_sign link have a greater depth than the sign_repair link.
     234             :      This way, we exclusively use manual credit management to control
     235             :      the rate at which we publish requests to sign.  We can then avoid
     236             :      being stem backpressured, which allows us to keep polling frags and
     237             :      reading incoming shreds, even when the repair sign link is "full."
     238             :      This is a non-necessary property for good performance.
     239             : 
     240             :      To lose a frag to overrun isn't necessarily critical, but in
     241             :      general the repair tile relies on the fact that a signing task
     242             :      published to sign tile will always come back.  If we lose a frag to
     243             :      overrun, then there will be an entry in the pending signs structure
     244             :      that is never removed, and theoretically the map could fill up.
     245             :      Conceptually, with a reliable sign->repair->sign structure, there
     246             :      should be no eviction needed in this pending signs structure. */
     247             : 
     248             :   ulong in_idx;      /* index of the incoming link */
     249             :   ulong credits;     /* available credits for link */
     250             :   ulong max_credits; /* maximum credits (depth) */
     251             : };
     252             : typedef struct out_ctx out_ctx_t;
     253             : 
     254             : /* Data needed to sign and send a pong that is not contained in the
     255             :    pong msg itself. */
     256             : 
     257             : struct pong_data {
     258             :   fd_ip4_port_t  peer_addr;
     259             :   fd_hash_t      hash;
     260             :   uint           daddr;
     261             :   fd_pubkey_t    key;
     262             : };
     263             : typedef struct pong_data pong_data_t;
     264             : 
     265             : struct sign_req {
     266             :   ulong       key;        /* map key, ctx->pending_key_next */
     267             :   ulong       buflen;
     268             :   union {
     269             :     uchar           buf[sizeof(fd_repair_msg_t)];
     270             :     fd_repair_msg_t msg;
     271             :   };
     272             :   pong_data_t  pong_data; /* populated only for pong msgs */
     273             : };
     274             : typedef struct sign_req sign_req_t;
     275             : 
     276             : #define MAP_NAME         fd_signs_map
     277           0 : #define MAP_KEY          key
     278           0 : #define MAP_KEY_NULL     ULONG_MAX
     279           0 : #define MAP_KEY_INVAL(k) (k==ULONG_MAX)
     280           0 : #define MAP_T            sign_req_t
     281             : #define MAP_MEMOIZE      0
     282             : #include "../../util/tmpl/fd_map_dynamic.c"
     283             : 
     284             : /* Because the sign tiles could be all busy when a contact info or a
     285             :    ping arrives, we need to save ping messages to be signed in a queue
     286             :    and dispatched in after_credit when there are sign tiles available.
     287             :    The size of the queue is sized to be the number of warm up
     288             :    requests we might burst to the queue all at once (at most
     289             :    FD_REPAIR_PEER_MAX), then doubled for good measure.
     290             : 
     291             :    There is a possibility that someone could spam pings to block other
     292             :    peers' pings (and prevent us from responding to those pings). To
     293             :    mitigate this, we track the number of pings currently living in the
     294             :    sign queue that belong to each peer.  If a peer already has a pong
     295             :    living in the sign queue, we drop the pings from that peer.
     296             : 
     297             :    The peer could send us a new bogus ping every time we pop their ping
     298             :    from the sign queue, but there would be no way to prevent other
     299             :    peers' pings from getting processed, so the wasted work and impact
     300             :    would be minimal.
     301             : 
     302             :    Typical flow is that a pong will get added to the pong_queue during
     303             :    an after_frag call.  Then on the following after_credit will get
     304             :    popped from the sign_queue and added to sign_map, and then dispatched
     305             :    to the sign tile. */
     306             : 
     307             : struct sign_pending {
     308             :   fd_repair_msg_t msg;
     309             :   pong_data_t     pong_data; /* populated only for pong msgs */
     310             : };
     311             : typedef struct sign_pending sign_pending_t;
     312             : 
     313             : #define QUEUE_NAME       fd_signs_queue
     314           0 : #define QUEUE_T          sign_pending_t
     315           0 : #define QUEUE_MAX        (2*FD_REPAIR_PEER_MAX)
     316             : #include "../../util/tmpl/fd_queue.c"
     317             : 
     318             : struct ctx {
     319             :   long tsdebug; /* timestamp for debug printing */
     320             : 
     321             :   ulong repair_seed;
     322             : 
     323             :   fd_keyswitch_t * keyswitch;
     324             :   int              halt_signing;
     325             : 
     326             :   fd_ip4_port_t repair_intake_addr;
     327             :   fd_ip4_port_t repair_serve_addr;
     328             : 
     329             :   fd_forest_t    * forest;
     330             :   fd_policy_t    * policy;
     331             :   fd_inflights_t * inflights;
     332             :   fd_repair_t    * protocol;
     333             : 
     334             :   ulong enforce_fixed_fec_set; /* min slot where the feature is enforced */
     335             : 
     336             :   fd_pubkey_t identity_public_key;
     337             : 
     338             :   fd_wksp_t * wksp;
     339             : 
     340             :   fd_stem_context_t * stem;
     341             : 
     342             :   uchar    in_kind[ MAX_IN_LINKS ];
     343             :   in_ctx_t in_links[ MAX_IN_LINKS ];
     344             : 
     345             :   int skip_frag;
     346             : 
     347             :   out_ctx_t net_out_ctx[1];
     348             : 
     349             :   out_ctx_t repair_out_ctx[1];
     350             : 
     351             :   /* repair_sign links (to sign tiles 1+) - for round-robin distribution */
     352             : 
     353             :   ulong     repair_sign_cnt;
     354             :   out_ctx_t repair_sign_out_ctx[ MAX_SIGN_TILE_CNT ];
     355             : 
     356             :   ulong     sign_rrobin_idx;
     357             : 
     358             :   /* Pending sign requests for async operations */
     359             : 
     360             :   uint             pending_key_next;
     361             :   sign_req_t     * signs_map;  /* contains any request currently in the repair->sign or sign->repair dcache */
     362             :   sign_pending_t * pong_queue;  /* contains any pong or initial warmup request waiting to be dispatched to repair->sign. Size is 2*FD_REPAIR_PEER_MAX */
     363             : 
     364             :   ushort net_id;
     365             : 
     366             :   /* Buffers for incoming unreliable frags */
     367             :   uchar net_buf[ FD_NET_MTU ];
     368             :   uchar sign_buf[ sizeof(fd_ed25519_sig_t) ];
     369             : 
     370             :   /* Store chunk for incoming reliable frags */
     371             :   ulong chunk;
     372             :   ulong snap_out_chunk; /* store second to last chunk for snap_out */
     373             : 
     374             :   fd_ip4_udp_hdrs_t intake_hdr[1];
     375             :   fd_ip4_udp_hdrs_t serve_hdr [1];
     376             : 
     377             :   fd_rnonce_ss_t repair_nonce_ss[1];
     378             : 
     379             :   ulong manifest_slot;
     380             :   struct {
     381             :     ulong send_pkt_cnt;
     382             :     ulong sent_pkt_types[FD_METRICS_ENUM_REPAIR_SENT_REQUEST_TYPES_CNT];
     383             :     ulong repaired_slots;
     384             :     ulong current_slot;
     385             :     ulong sign_tile_unavail;
     386             :     ulong rerequest;
     387             :     ulong malformed_ping;
     388             :     ulong unknown_peer_ping;
     389             :     ulong fail_sigverify_ping;
     390             :     fd_histf_t slot_compl_time[ 1 ];
     391             :     fd_histf_t response_latency[ 1 ];
     392             :     ulong blk_evicted;
     393             :     ulong blk_failed_insert;
     394             :   } metrics[ 1 ];
     395             : 
     396             :   /* Slot-level metrics */
     397             : 
     398             :   fd_repair_metrics_t * slot_metrics;
     399             :   ulong turbine_slot0;  // catchup considered complete after this slot
     400             :   struct {
     401             :     int   enabled;
     402             :     int   eqvoc;     /* if eqvoc is enabled, the end_slot will first be generated incorrectly, and then confirmed correctly */
     403             :     ulong end_slot;
     404             :     int   complete;
     405             :   } profiler;
     406             : };
     407             : typedef struct ctx ctx_t;
     408             : 
     409             : FD_FN_CONST static inline ulong
     410           0 : scratch_align( void ) {
     411           0 :   return 128UL;
     412           0 : }
     413             : 
     414             : FD_FN_PURE static inline ulong
     415           0 : loose_footprint( fd_topo_tile_t const * tile FD_PARAM_UNUSED ) {
     416           0 :   return 1UL * FD_SHMEM_GIGANTIC_PAGE_SZ;
     417           0 : }
     418             : 
     419             : FD_FN_PURE static inline ulong
     420           0 : scratch_footprint( fd_topo_tile_t const * tile ) {
     421           0 :   ulong total_sign_depth = tile->repair.repair_sign_depth * tile->repair.repair_sign_cnt;
     422           0 :   int   lg_sign_depth    = fd_ulong_find_msb( fd_ulong_pow2_up(total_sign_depth) ) + 1;
     423             : 
     424           0 :   ulong l = FD_LAYOUT_INIT;
     425           0 :   l = FD_LAYOUT_APPEND( l, alignof(ctx_t),            sizeof(ctx_t)                                                      );
     426           0 :   l = FD_LAYOUT_APPEND( l, fd_repair_align(),         fd_repair_footprint     ()                                         );
     427           0 :   l = FD_LAYOUT_APPEND( l, fd_forest_align(),         fd_forest_footprint     ( tile->repair.slot_max )                  );
     428           0 :   l = FD_LAYOUT_APPEND( l, fd_policy_align(),         fd_policy_footprint     ( FD_DEDUP_CACHE_MAX, FD_REPAIR_PEER_MAX ) );
     429           0 :   l = FD_LAYOUT_APPEND( l, fd_inflights_align(),      fd_inflights_footprint  ()                                         );
     430           0 :   l = FD_LAYOUT_APPEND( l, fd_signs_map_align(),      fd_signs_map_footprint  ( lg_sign_depth )                          );
     431           0 :   l = FD_LAYOUT_APPEND( l, fd_signs_queue_align(),    fd_signs_queue_footprint()                                         );
     432           0 :   l = FD_LAYOUT_APPEND( l, fd_repair_metrics_align(), fd_repair_metrics_footprint()                                      );
     433           0 :   return FD_LAYOUT_FINI( l, scratch_align() );
     434           0 : }
     435             : 
     436             : /* Below functions manage the current pending sign requests. */
     437             : 
     438             : static sign_req_t *
     439             : sign_map_insert( ctx_t *                 ctx,
     440             :                  fd_repair_msg_t const * msg,
     441           0 :                  pong_data_t const     * opt_pong_data ) {
     442             : 
     443             :   /* Check if there is any space for a new pending sign request. Should never fail as long as credit management is working. */
     444           0 :   if( FD_UNLIKELY( fd_signs_map_key_cnt( ctx->signs_map )==fd_signs_map_key_max( ctx->signs_map ) ) ) return NULL;
     445             : 
     446           0 :   sign_req_t * pending = fd_signs_map_insert( ctx->signs_map, ctx->pending_key_next++ );
     447           0 :   if( FD_UNLIKELY( !pending ) ) return NULL; /* Not possible, unless the same key is used twice. */
     448           0 :   pending->msg    = *msg;
     449           0 :   pending->buflen = fd_repair_sz( msg );
     450           0 :   if( FD_UNLIKELY( opt_pong_data ) ) pending->pong_data = *opt_pong_data;
     451           0 :   return pending;
     452           0 : }
     453             : 
     454             : static int
     455             : sign_map_remove( ctx_t * ctx,
     456           0 :                  ulong   key ) {
     457           0 :   sign_req_t * pending = fd_signs_map_query( ctx->signs_map, key, NULL );
     458           0 :   if( FD_UNLIKELY( !pending ) ) return -1;
     459           0 :   fd_signs_map_remove( ctx->signs_map, pending );
     460           0 :   return 0;
     461           0 : }
     462             : 
     463             : static void
     464             : send_packet( ctx_t             * ctx,
     465             :              fd_stem_context_t * stem,
     466             :              int                 is_intake,
     467             :              uint                dst_ip_addr,
     468             :              ushort              dst_port,
     469             :              uint                src_ip_addr,
     470             :              uchar const *       payload,
     471             :              ulong               payload_sz,
     472           0 :              ulong               tsorig ) {
     473           0 :   ctx->metrics->send_pkt_cnt++;
     474           0 :   uchar * packet = fd_chunk_to_laddr( ctx->net_out_ctx->mem, ctx->net_out_ctx->chunk );
     475           0 :   fd_ip4_udp_hdrs_t * hdr = (fd_ip4_udp_hdrs_t *)packet;
     476           0 :   *hdr = *(is_intake ? ctx->intake_hdr : ctx->serve_hdr);
     477             : 
     478           0 :   fd_ip4_hdr_t * ip4 = hdr->ip4;
     479           0 :   ip4->saddr       = src_ip_addr;
     480           0 :   ip4->daddr       = dst_ip_addr;
     481           0 :   ip4->net_id      = fd_ushort_bswap( ctx->net_id++ );
     482           0 :   ip4->check       = 0U;
     483           0 :   ip4->net_tot_len = fd_ushort_bswap( (ushort)(payload_sz + sizeof(fd_ip4_hdr_t)+sizeof(fd_udp_hdr_t)) );
     484           0 :   ip4->check       = fd_ip4_hdr_check_fast( ip4 );
     485             : 
     486           0 :   fd_udp_hdr_t * udp = hdr->udp;
     487           0 :   udp->net_dport = dst_port;
     488           0 :   udp->net_len   = fd_ushort_bswap( (ushort)(payload_sz + sizeof(fd_udp_hdr_t)) );
     489           0 :   fd_memcpy( packet+sizeof(fd_ip4_udp_hdrs_t), payload, payload_sz );
     490           0 :   hdr->udp->check = 0U;
     491             : 
     492           0 :   ulong tspub     = fd_frag_meta_ts_comp( fd_tickcount() );
     493           0 :   ulong sig       = fd_disco_netmux_sig( dst_ip_addr, dst_port, dst_ip_addr, DST_PROTO_OUTGOING, sizeof(fd_ip4_udp_hdrs_t) );
     494           0 :   ulong packet_sz = payload_sz + sizeof(fd_ip4_udp_hdrs_t);
     495           0 :   ulong chunk     = ctx->net_out_ctx->chunk;
     496           0 :   fd_stem_publish( stem, ctx->net_out_ctx->idx, sig, chunk, packet_sz, 0UL, tsorig, tspub );
     497           0 :   ctx->net_out_ctx->chunk = fd_dcache_compact_next( chunk, packet_sz, ctx->net_out_ctx->chunk0, ctx->net_out_ctx->wmark );
     498           0 : }
     499             : 
     500             : /* Returns a sign_out context with max available credits.
     501             :    If no sign_out context has available credits, returns NULL. */
     502             : static out_ctx_t *
     503           0 : sign_avail_credits( ctx_t * ctx ) {
     504           0 :   out_ctx_t * sign_out = NULL;
     505           0 :   ulong max_credits = 0;
     506           0 :   for( uint i = 0; i < ctx->repair_sign_cnt; i++ ) {
     507           0 :     if( ctx->repair_sign_out_ctx[i].credits > max_credits ) {
     508           0 :       max_credits =  ctx->repair_sign_out_ctx[i].credits;
     509           0 :       sign_out    = &ctx->repair_sign_out_ctx[i];
     510           0 :     }
     511           0 :   }
     512           0 :   return sign_out;
     513           0 : }
     514             : 
     515             : /* Prepares the signing preimage and publishes a signing request that
     516             :    will be signed asynchronously by the sign tile.  The signed data will
     517             :    be returned via dcache as a frag. */
     518             : static void
     519             : fd_repair_send_sign_request( ctx_t                 * ctx,
     520             :                              out_ctx_t             * sign_out,
     521             :                              fd_repair_msg_t const * msg,
     522           0 :                              pong_data_t     const * opt_pong_data ) {
     523             : 
     524           0 :   if( FD_UNLIKELY( ctx->halt_signing ) ) FD_LOG_CRIT(( "can't dispatch sign requests while halting signing" ));
     525             : 
     526             :   /* New sign request */
     527           0 :   sign_req_t * pending = sign_map_insert( ctx, msg, opt_pong_data );
     528           0 :   if( FD_UNLIKELY( !pending ) ) return;
     529             : 
     530           0 :   ulong   sig         = 0;
     531           0 :   ulong   preimage_sz = 0;
     532           0 :   uchar * dst         = fd_chunk_to_laddr( sign_out->mem, sign_out->chunk );
     533             : 
     534           0 :   if( FD_UNLIKELY( msg->kind == FD_REPAIR_KIND_PONG ) ) {
     535           0 :     uchar pre_image[FD_REPAIR_PONG_PREIMAGE_SZ];
     536           0 :     preimage_pong( &opt_pong_data->hash, pre_image, sizeof(pre_image) );
     537           0 :     preimage_sz = FD_REPAIR_PONG_PREIMAGE_SZ;
     538           0 :     fd_memcpy( dst, pre_image, preimage_sz );
     539           0 :     sig = ((ulong)pending->key << 32) | (uint)FD_KEYGUARD_SIGN_TYPE_SHA256_ED25519;
     540           0 :   } else {
     541             :     /* Sign and prepare the message directly into the pending buffer */
     542           0 :     uchar * preimage = preimage_req( &pending->msg, &preimage_sz );
     543           0 :     fd_memcpy( dst, preimage, preimage_sz );
     544           0 :     sig = ((ulong)pending->key << 32) | (uint)FD_KEYGUARD_SIGN_TYPE_ED25519;
     545           0 :   }
     546             : 
     547           0 :   fd_stem_publish( ctx->stem, sign_out->idx, sig, sign_out->chunk, preimage_sz, 0UL, 0UL, 0UL );
     548           0 :   sign_out->chunk = fd_dcache_compact_next( sign_out->chunk, preimage_sz, sign_out->chunk0, sign_out->wmark );
     549             : 
     550           0 :   ctx->metrics->sent_pkt_types[metric_index[msg->kind]]++;
     551           0 :   sign_out->credits--;
     552           0 : }
     553             : 
     554             : static inline int
     555             : before_frag( ctx_t * ctx,
     556             :              ulong   in_idx,
     557             :              ulong   seq FD_PARAM_UNUSED,
     558           0 :              ulong   sig ) {
     559           0 :   uint in_kind = ctx->in_kind[ in_idx ];
     560           0 :   if( FD_LIKELY  ( in_kind==IN_KIND_NET   ) ) return fd_disco_netmux_sig_proto( sig )!=DST_PROTO_REPAIR;
     561           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_SHRED ) ) return fd_int_if( fd_forest_root_slot( ctx->forest )==ULONG_MAX, -1, 0 ); /* not ready to read frag */
     562           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_GOSSIP ) ) {
     563           0 :     return sig!=FD_GOSSIP_UPDATE_TAG_CONTACT_INFO &&
     564           0 :            sig!=FD_GOSSIP_UPDATE_TAG_CONTACT_INFO_REMOVE;
     565           0 :   }
     566           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_REPLAY ) ) return sig!=REPLAY_SIG_REASM_EVICTED;
     567           0 :   return 0;
     568           0 : }
     569             : 
     570             : static void
     571             : during_frag( ctx_t * ctx,
     572             :              ulong   in_idx,
     573             :              ulong   seq FD_PARAM_UNUSED,
     574             :              ulong   sig,
     575             :              ulong   chunk,
     576             :              ulong   sz,
     577           0 :              ulong   ctl ) {
     578           0 :   ctx->skip_frag = 0;
     579             : 
     580           0 :   uint             in_kind =  ctx->in_kind[ in_idx ];
     581           0 :   in_ctx_t const * in_ctx  = &ctx->in_links[ in_idx ];
     582           0 :   ctx->chunk = chunk;
     583             : 
     584           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_NET ) ) {
     585           0 :     ulong hdr_sz = fd_disco_netmux_sig_hdr_sz( sig );
     586           0 :     FD_TEST( hdr_sz <= sz ); /* Should be ensured by the net tile */
     587           0 :     uchar const * dcache_entry = fd_net_rx_translate_frag( &in_ctx->net_rx, chunk, ctl, sz );
     588           0 :     fd_memcpy( ctx->net_buf, dcache_entry, sz );
     589           0 :     return;
     590           0 :   }
     591             : 
     592           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_GENESIS ) ) {
     593           0 :     FD_TEST( sizeof(fd_genesis_meta_t)<=sig );
     594           0 :     return;
     595           0 :   }
     596             : 
     597           0 :   if( FD_UNLIKELY( sz!=0UL && ( chunk<in_ctx->chunk0 || chunk>in_ctx->wmark || sz>in_ctx->mtu ) ) )
     598           0 :     FD_LOG_ERR(( "chunk %lu %lu corrupt, not in range [%lu,%lu] in kind %u", chunk, sz, in_ctx->chunk0, in_ctx->wmark, in_kind ));
     599             : 
     600           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_SNAP ) ) {
     601           0 :     if( FD_UNLIKELY( fd_ssmsg_sig_message( sig )!=FD_SSMSG_DONE ) ) ctx->snap_out_chunk = chunk;
     602           0 :     return;
     603           0 :   }
     604             : 
     605           0 :   if( FD_UNLIKELY( in_kind==IN_KIND_SIGN ) ) {
     606             :     /* sign_repair is unreliable, so we copy the frag for convention.
     607             :        Theoretically impossible to overrun. */
     608           0 :     uchar const * dcache_entry = fd_chunk_to_laddr_const( in_ctx->mem, chunk );
     609           0 :     fd_memcpy( ctx->sign_buf, dcache_entry, sz );
     610           0 :     return;
     611           0 :   }
     612           0 : }
     613             : 
     614             : static inline void
     615             : after_snap( ctx_t * ctx,
     616             :                  ulong                  sig,
     617           0 :                  uchar const          * chunk ) {
     618           0 :   if( FD_UNLIKELY( fd_ssmsg_sig_message( sig )!=FD_SSMSG_DONE ) ) return;
     619           0 :   fd_snapshot_manifest_t * manifest = (fd_snapshot_manifest_t *)chunk;
     620             : 
     621           0 :   fd_forest_init( ctx->forest, manifest->slot );
     622           0 : }
     623             : 
     624             : static inline void
     625           0 : after_contact( ctx_t * ctx, fd_gossip_update_message_t const * msg ) {
     626           0 :   fd_gossip_contact_info_t const * contact_info = msg->contact_info->value;
     627           0 :   fd_ip4_port_t repair_peer;
     628           0 :   repair_peer.addr = contact_info->sockets[ FD_GOSSIP_CONTACT_INFO_SOCKET_SERVE_REPAIR ].is_ipv6 ? 0U : contact_info->sockets[ FD_GOSSIP_CONTACT_INFO_SOCKET_SERVE_REPAIR ].ip4;
     629           0 :   repair_peer.port = contact_info->sockets[ FD_GOSSIP_CONTACT_INFO_SOCKET_SERVE_REPAIR ].port;
     630           0 :   if( FD_UNLIKELY( !repair_peer.addr || !repair_peer.port ) ) return;
     631           0 :   fd_policy_peer_t const * peer = fd_policy_peer_upsert( ctx->policy, fd_type_pun_const( msg->origin ), &repair_peer );
     632           0 :   if( FD_LIKELY( peer && !fd_signs_queue_full( ctx->pong_queue ) ) ) {
     633             :     /* The repair process uses a Ping-Pong protocol that incurs one
     634             :        round-trip time (RTT) for the initial repair request.  To
     635             :        optimize this, we proactively send a placeholder repair request
     636             :        as soon as we receive a peer's contact information for the first
     637             :        time, effectively prepaying the RTT cost. */
     638           0 :     fd_repair_msg_t * init = fd_repair_shred( ctx->protocol, fd_type_pun_const( msg->origin ), (ulong)fd_log_wallclock()/1000000L, 0, 0, 0 );
     639           0 :     fd_signs_queue_push( ctx->pong_queue, (sign_pending_t){ .msg = *init } );
     640           0 :   }
     641           0 : }
     642             : 
     643             : static inline void
     644             : after_sign( ctx_t             * ctx,
     645             :             ulong               in_idx,
     646             :             ulong               sig,
     647           0 :             fd_stem_context_t * stem ) {
     648           0 :   ulong pending_key = sig >> 32;
     649             :   /* Look up the pending request. Since the repair_sign links are
     650             :      reliable, the incoming sign_repair fragments represent a complete
     651             :      set of the previously sent outgoing messages. However, with
     652             :      multiple sign tiles, the responses may arrive interleaved. */
     653             : 
     654             :   /* Find which sign tile sent this response and increment its credits */
     655           0 :   for( uint i = 0; i < ctx->repair_sign_cnt; i++ ) {
     656           0 :     if( ctx->repair_sign_out_ctx[i].in_idx == in_idx ) {
     657           0 :       if( ctx->repair_sign_out_ctx[i].credits < ctx->repair_sign_out_ctx[i].max_credits ) {
     658           0 :         ctx->repair_sign_out_ctx[i].credits++;
     659           0 :       }
     660           0 :       break;
     661           0 :     }
     662           0 :   }
     663             : 
     664           0 :   sign_req_t * pending_ = fd_signs_map_query( ctx->signs_map, pending_key, NULL );
     665           0 :   if( FD_UNLIKELY( !pending_ ) ) FD_LOG_CRIT(( "No pending request found for key %lu", pending_key )); /* implies either bad programmer error or something happened with sign tile */
     666             : 
     667           0 :   sign_req_t   pending[1] = { *pending_ }; /* Make a copy of the pending request so we can sign_map_remove immediately. */
     668           0 :   sign_map_remove( ctx, pending_key );
     669             : 
     670             :   /* This is a pong message */
     671           0 :   if( FD_UNLIKELY( pending->msg.kind == FD_REPAIR_KIND_PONG ) ) {
     672           0 :     fd_policy_peer_t * peer = fd_policy_peer_query( ctx->policy, &pending->pong_data.key );
     673           0 :     if( FD_LIKELY( peer && peer->ping ) ) peer->ping--; /* prevent underflow if the peer was removed/readded */
     674             : 
     675           0 :     fd_memcpy( pending->msg.pong.sig, ctx->sign_buf, 64UL );
     676           0 :     send_packet( ctx, stem, 1, pending->pong_data.peer_addr.addr, pending->pong_data.peer_addr.port, pending->pong_data.daddr, pending->buf, fd_repair_sz( &pending->msg ), fd_frag_meta_ts_comp( fd_tickcount() ) );
     677           0 :     return;
     678           0 :   }
     679             : 
     680             :   /* Inject the signature into the pending request */
     681           0 :   fd_memcpy( pending->buf + 4, ctx->sign_buf, 64UL );
     682           0 :   uint  src_ip4 = 0U;
     683             : 
     684             :   /* This is a warmup message */
     685           0 :   if( FD_UNLIKELY( pending->msg.kind == FD_REPAIR_KIND_SHRED && pending->msg.shred.slot == 0 ) ) {
     686           0 :     fd_policy_peer_t * peer = fd_policy_peer_query( ctx->policy, &pending->msg.shred.to );
     687           0 :     if( FD_UNLIKELY( peer ) ) send_packet( ctx, stem, 1, peer->ip4, peer->port, src_ip4, pending->buf, pending->buflen, fd_frag_meta_ts_comp( fd_tickcount() ) );
     688           0 :     else { /* This is a warmup request for a peer that is no longer active.  There's no reason to pick another peer for a warmup rq, so just drop it. */ }
     689           0 :     return;
     690           0 :   }
     691             : 
     692             :   /* This is a regular repair shred request
     693             : 
     694             :      We need to ensure we always send out any shred requests we have,
     695             :      because policy_next has no way to revisit a shred.  But the fact
     696             :      that peers can drop out of the peer list makes this complicated.
     697             :      If the peer is still there (common), it's fine.  If the peer is not
     698             :      there, we can add this request to the inflights table, pretend
     699             :      we've sent it and let the inflight timeout request it down the
     700             :      line. */
     701             : 
     702           0 :   fd_policy_peer_t * active         = fd_policy_peer_query( ctx->policy, &pending->msg.shred.to );
     703           0 :   int                is_regular_req = pending->msg.kind == FD_REPAIR_KIND_SHRED && pending->msg.shred.nonce > 0; // not a highest/orphan request
     704             : 
     705           0 :   if( FD_UNLIKELY( !active ) ) {
     706           0 :     if( FD_LIKELY( is_regular_req ) ) {
     707             :       /* Artificially add to the inflights table, pretend we've sent it
     708             :          and let the inflight timeout request it down the line. */
     709           0 :       fd_inflights_request_insert( ctx->inflights, pending->msg.shred.nonce, &pending->msg.shred.to, pending->msg.shred.slot, pending->msg.shred.shred_idx );
     710           0 :     }
     711           0 :     return;
     712           0 :   }
     713             :   /* Happy path - all is well, our peer didn't drop out from beneath us. */
     714           0 :   if( FD_LIKELY( is_regular_req ) ) {
     715           0 :     fd_inflights_request_insert( ctx->inflights, pending->msg.shred.nonce, &pending->msg.shred.to, pending->msg.shred.slot, pending->msg.shred.shred_idx );
     716           0 :     fd_policy_peer_request_update( ctx->policy, &pending->msg.shred.to );
     717           0 :   }
     718           0 :   send_packet( ctx, stem, 1, active->ip4, active->port, src_ip4, pending->buf, pending->buflen, fd_frag_meta_ts_comp( fd_tickcount() ) );
     719           0 : }
     720             : 
     721             : static int
     722           0 : blk_insert_check( ctx_t * ctx, fd_forest_blk_t * new_blk, ulong new_slot, ulong evicted ) {
     723           0 :   if( FD_UNLIKELY( !new_blk ) ) {
     724           0 :     FD_LOG_WARNING(( "fd_forest_blk_insert: ignoring new slot %lu. pool is full and cannot evict", new_slot ));
     725           0 :     ctx->metrics->blk_failed_insert++;
     726           0 :     return 0;
     727           0 :   } else {
     728           0 :     if( FD_UNLIKELY( evicted != ULONG_MAX ) ) {
     729           0 :       FD_LOG_WARNING(( "fd_forest_blk_insert: evicted %lu and inserting new slot %lu", evicted, new_slot ));
     730           0 :       ctx->metrics->blk_evicted++;
     731           0 :     }
     732           0 :     return 1;
     733           0 :   }
     734           0 : }
     735             : 
     736             : static inline void
     737             : after_shred( ctx_t      * ctx,
     738             :              ulong        sig,
     739             :              fd_shred_t * shred,
     740             :              ulong        nonce,
     741             :              fd_hash_t *  mr,
     742           0 :              fd_hash_t *  cmr ) {
     743             :   /* Insert the shred sig (shared by all shred members in the FEC set)
     744             :       into the map. */
     745           0 :   int is_code = fd_shred_is_code( fd_shred_type( shred->variant ) );
     746           0 :   int src     = fd_disco_shred_out_shred_sig_is_turbine( sig ) ? SHRED_SRC_TURBINE : SHRED_SRC_REPAIR;
     747             : 
     748           0 :   if( FD_LIKELY( !is_code ) ) {
     749           0 :     long rtt = 0;
     750           0 :     fd_pubkey_t peer;
     751           0 :     if( FD_UNLIKELY( src == SHRED_SRC_REPAIR && ( rtt = fd_inflights_request_remove( ctx->inflights, nonce, shred->slot, shred->idx, &peer ) ) > 0 ) ) {
     752           0 :       fd_policy_peer_response_update( ctx->policy, &peer, rtt );
     753           0 :       fd_histf_sample( ctx->metrics->response_latency, (ulong)rtt );
     754           0 :     }
     755             : 
     756             :     /* we don't want to add a slot to the forest that chains to a slot
     757             :        older than root, to avoid filling forest up with junk.
     758             :        Especially if we are close to full and we are having trouble
     759             :        rooting, we can't rely on publishing to prune these useless
     760             :        subtrees. TODO: do the same with reasm/store/shred? */
     761           0 :     if( FD_UNLIKELY( shred->slot - shred->data.parent_off < fd_forest_root_slot( ctx->forest ) ) ) return;
     762             : 
     763           0 :     int slot_complete = !!(shred->data.flags & FD_SHRED_DATA_FLAG_SLOT_COMPLETE);
     764           0 :     int ref_tick      = shred->data.flags & FD_SHRED_DATA_REF_TICK_MASK;
     765           0 :     ulong evicted     = ULONG_MAX;
     766           0 :     fd_forest_blk_t * blk = fd_forest_blk_insert( ctx->forest, shred->slot, shred->slot - shred->data.parent_off, &evicted );
     767           0 :     if( FD_LIKELY( blk_insert_check( ctx, blk, shred->slot, evicted ) ) ) {
     768           0 :       fd_forest_data_shred_insert( ctx->forest, shred->slot, shred->slot - shred->data.parent_off, shred->idx, shred->fec_set_idx, slot_complete, ref_tick, src, mr, cmr );
     769           0 :     }
     770           0 :   } else {
     771           0 :     fd_forest_code_shred_insert( ctx->forest, shred->slot, shred->idx );
     772           0 :   }
     773           0 : }
     774             : 
     775             : /* Kicks off the chained merkle verification starting at a slot with
     776             :    a confirmed, canonical block_id.  Either finishes successfully and
     777             :    returns early, or detects an incorrect FEC set and clears it.  In
     778             :    this case the verification is paused and state is saved at where
     779             :    it left off.  Verification can be re-triggered in after_fec as well. */
     780             : static inline void
     781             : check_confirmed( ctx_t           * ctx,
     782             :                  fd_forest_blk_t * blk,
     783           0 :                  fd_hash_t const * confirmed_bid ) {
     784             : 
     785           0 :   if( FD_LIKELY( !blk->chain_confirmed && blk->complete_idx != UINT_MAX && blk->buffered_idx == blk->complete_idx ) ) {
     786             :     /* The above conditions say that all the shreds of the block have arrived. */
     787           0 :     fd_forest_blk_t * bad_blk = fd_forest_fec_chain_verify( ctx->forest, blk, confirmed_bid );
     788           0 :     if( FD_LIKELY( !bad_blk ) ) {
     789             :       /* chain verified successfully from blk to as far as we have fec data */
     790           0 :       return;
     791           0 :     }
     792             : 
     793           0 :     uint bad_fec_idx = fd_forest_merkle_last_incorrect_idx( bad_blk );
     794           0 :     FD_LOG_WARNING(( "slot %lu is complete but has incorrect FECs. bad blk %lu. last verified fec %u", blk->slot, bad_blk->slot, bad_fec_idx ));
     795             :     /* If we have a bad block, we need to dump and repair backwards from
     796             :        the point where the merkle root is incorrect.
     797             :        We start only by dumping the last incorrect FEC. It's possible that
     798             :        this is the only incorrect one.  If it isn't though, when the slot
     799             :        recompletes, this function will trigger again and we will dump the
     800             :        second to last incorrect FEC. */
     801             : 
     802           0 :     fd_forest_fec_clear( ctx->forest, bad_blk->slot, bad_fec_idx, FD_FEC_SHRED_CNT - 1 );
     803           0 :   }
     804           0 : }
     805             : 
     806             : static inline void
     807             : after_fec( ctx_t      * ctx,
     808             :            fd_shred_t * shred,
     809             :            fd_hash_t  * mr,
     810           0 :            fd_hash_t  * cmr ) {
     811             : 
     812             :   /* When this is a FEC completes msg, it is implied that all the
     813             :      other shreds in the FEC set can also be inserted.  Shred inserts
     814             :      into the forest are idempotent so it is fine to insert the same
     815             :      shred multiple times. */
     816             : 
     817           0 :   int slot_complete = !!( shred->data.flags & FD_SHRED_DATA_FLAG_SLOT_COMPLETE );
     818           0 :   int ref_tick      = shred->data.flags & FD_SHRED_DATA_REF_TICK_MASK;
     819             : 
     820             :   /* Similar to after_shred, do not insert a slot that chains to a slot older than root */
     821           0 :   if( FD_UNLIKELY( shred->slot - shred->data.parent_off < fd_forest_root_slot( ctx->forest ) ) ) return;
     822           0 :   ulong evicted  = ULONG_MAX;
     823           0 :   fd_forest_blk_t * ele = fd_forest_blk_insert( ctx->forest, shred->slot, shred->slot - shred->data.parent_off, &evicted );
     824           0 :   if( FD_UNLIKELY( !blk_insert_check( ctx, ele, shred->slot, evicted ) ) ) return;
     825           0 :   fd_forest_fec_insert( ctx->forest, shred->slot, shred->slot - shred->data.parent_off, shred->idx, shred->fec_set_idx, slot_complete, ref_tick, mr, cmr );
     826             : 
     827             :   /* metrics for completed slots */
     828           0 :   if( FD_UNLIKELY( ele->complete_idx != UINT_MAX && ele->buffered_idx==ele->complete_idx ) ) {
     829           0 :     long now = fd_tickcount();
     830           0 :     long start_ts = ele->first_req_ts == 0 || ele->slot >= ctx->turbine_slot0 ? ele->first_shred_ts : ele->first_req_ts;
     831           0 :     ulong duration_ticks = (ulong)(now - start_ts);
     832           0 :     fd_histf_sample( ctx->metrics->slot_compl_time, duration_ticks );
     833           0 :     fd_repair_metrics_add_slot( ctx->slot_metrics, ele->slot, start_ts, now, ele->repair_cnt, ele->turbine_cnt );
     834             :     /* Note: this log now no longer implies that the slot is fully
     835             :        executable, as we don't wait for FEC completion msgs to log this,
     836             :        only that a shred for every index has been received. It's
     837             :        possible that we have an unverified slot that doesn't chain
     838             :        verify, which is technically un-executable. */
     839           0 :     FD_LOG_INFO(( "slot is complete %lu. num_data_shreds: %u, num_repaired: %u, num_turbine: %u, num_recovered: %u, duration: %.2f ms", ele->slot, ele->complete_idx + 1, ele->repair_cnt, ele->turbine_cnt, ele->recovered_cnt, (double)fd_metrics_convert_ticks_to_nanoseconds(duration_ticks) / 1e6 ));
     840           0 :   }
     841             : 
     842             :   /* re-trigger continuation of chained merkle verification if this FEC
     843             :      set enables it */
     844           0 :   if( FD_UNLIKELY( ele->lowest_verified_fec == (shred->fec_set_idx / 32UL) + 1 ) &&
     845           0 :                    ele->buffered_idx == ele->complete_idx ) {
     846           0 :     check_confirmed( ctx, ele, &ele->confirmed_bid /* if lowest_verified_fec is not UINT_MAX, confirmed_bid must be populated */ );
     847           0 :   }
     848             : 
     849           0 :   if( FD_UNLIKELY( ctx->profiler.enabled ) ) {
     850             :     // If turbine slot 0 is in the consumed frontier, and it satisfies the
     851             :     // above conditions for completions, then catchup is complete
     852           0 :     fd_forest_blk_t * turbine0     = fd_forest_query( ctx->forest, ctx->turbine_slot0 );
     853           0 :     ulong             turbine0_idx = fd_forest_pool_idx( fd_forest_pool( ctx->forest ), turbine0 );
     854           0 :     fd_forest_ref_t * consumed     = fd_forest_consumed_ele_query( fd_forest_consumed( ctx->forest ), &turbine0_idx, NULL, fd_forest_conspool( ctx->forest ) );
     855           0 :     if( FD_UNLIKELY( consumed && turbine0->complete_idx != UINT_MAX && turbine0->complete_idx == turbine0->buffered_idx ) ) {
     856           0 :       FD_COMPILER_MFENCE();
     857           0 :       FD_VOLATILE( ctx->profiler.complete ) = 1;
     858           0 :     }
     859           0 :   }
     860           0 : }
     861             : 
     862             : static inline void
     863             : after_net( ctx_t * ctx,
     864           0 :            ulong   sz  ) {
     865           0 :   fd_eth_hdr_t * eth; fd_ip4_hdr_t * ip4; fd_udp_hdr_t * udp;
     866           0 :   uchar * data; ulong data_sz;
     867           0 :   if( FD_UNLIKELY( !fd_ip4_udp_hdr_strip( ctx->net_buf, sz, &data, &data_sz, &eth, &ip4, &udp ) ) ) {
     868           0 :     ctx->metrics->malformed_ping++;
     869           0 :     return;
     870           0 :   }
     871           0 :   fd_ip4_port_t peer_addr = { .addr=ip4->saddr, .port=udp->net_sport };
     872             : 
     873           0 :   fd_repair_ping_t ping[1];
     874           0 :   int err = fd_repair_ping_de( ping, data, data_sz );
     875           0 :   if( FD_UNLIKELY( err ) ) {
     876           0 :     ctx->metrics->malformed_ping++;
     877           0 :     return;
     878           0 :   }
     879             : 
     880           0 :   fd_policy_peer_t * peer = fd_policy_peer_query( ctx->policy, &ping->ping.from );
     881           0 :   if( FD_UNLIKELY( !peer ) ) {
     882           0 :     ctx->metrics->unknown_peer_ping++;
     883           0 :     return;
     884           0 :   }
     885           0 :   if( FD_UNLIKELY( peer->ping ) ) return;
     886           0 :   if( FD_UNLIKELY( fd_signs_queue_full( ctx->pong_queue ) ) ) return;
     887             : 
     888           0 :   fd_sha512_t sha[1];
     889           0 :   if( FD_UNLIKELY( FD_ED25519_SUCCESS != fd_ed25519_verify( ping->ping.hash.uc, 32UL, ping->ping.sig, ping->ping.from.uc, sha ) ) ) {
     890           0 :     ctx->metrics->fail_sigverify_ping++;
     891           0 :     return;
     892           0 :   }
     893             : 
     894             :   /* Any gossip peer can send a ping, but they are bounded to at most
     895             :      one ping in the queue so they can't evict others' pings without
     896             :      multiple gossip identities. */
     897             : 
     898           0 :   fd_repair_msg_t * pong = fd_repair_pong( ctx->protocol, &ping->ping.hash );
     899           0 :   fd_signs_queue_push( ctx->pong_queue, (sign_pending_t){ .msg = *pong, .pong_data = { .peer_addr = peer_addr, .hash = ping->ping.hash, .daddr = ip4->daddr, .key = ping->ping.from } } );
     900           0 :   peer->ping++;
     901           0 : }
     902             : 
     903             : static inline void
     904             : after_evict( ctx_t * ctx,
     905           0 :              ulong   sig ) {
     906           0 :   ulong spilled_slot        = fd_disco_shred_out_shred_sig_slot       ( sig );
     907           0 :   uint  spilled_fec_set_idx = fd_disco_shred_out_shred_sig_fec_set_idx( sig );
     908           0 :   uint  spilled_max_idx     = fd_disco_shred_out_shred_sig_shred_idx  ( sig );
     909             : 
     910           0 :   fd_forest_fec_clear( ctx->forest, spilled_slot, spilled_fec_set_idx, spilled_max_idx );
     911           0 : }
     912             : 
     913             : static void
     914             : after_frag( ctx_t *             ctx,
     915             :             ulong               in_idx,
     916             :             ulong               seq    FD_PARAM_UNUSED,
     917             :             ulong               sig,
     918             :             ulong               sz,
     919             :             ulong               tsorig FD_PARAM_UNUSED,
     920             :             ulong               tspub,
     921           0 :             fd_stem_context_t * stem ) {
     922           0 :   if( FD_UNLIKELY( ctx->skip_frag ) ) return;
     923             : 
     924           0 :   ctx->stem = stem;
     925           0 :   in_ctx_t const * in_ctx  = &ctx->in_links[ in_idx ];
     926           0 :   uint             in_kind = ctx->in_kind[ in_idx ];
     927             : 
     928           0 :   switch( in_kind ) {
     929             :     /* Unreliable frags */
     930           0 :     case IN_KIND_NET:  {
     931           0 :       after_net( ctx, sz );
     932           0 :       break;
     933           0 :     }
     934           0 :     case IN_KIND_SIGN: {
     935           0 :       after_sign( ctx, in_idx, sig, stem );
     936           0 :       break;
     937           0 :     }
     938             :     /* Reliable frags read directly from dcache */
     939           0 :     case IN_KIND_SNAP: {
     940           0 :       after_snap( ctx, sig, fd_chunk_to_laddr( ctx->in_links[ in_idx ].mem, ctx->snap_out_chunk ) );
     941           0 :       break;
     942           0 :     }
     943           0 :     case IN_KIND_GENESIS: {
     944           0 :       fd_genesis_meta_t const * meta = (fd_genesis_meta_t const *)fd_type_pun_const( fd_chunk_to_laddr( in_ctx->mem, ctx->chunk ) );
     945           0 :       if( meta->bootstrap ) fd_forest_init( ctx->forest, 0 );
     946           0 :       break;
     947           0 :     }
     948           0 :     case IN_KIND_GOSSIP: {
     949           0 :       fd_gossip_update_message_t const * msg = (fd_gossip_update_message_t const *)fd_type_pun_const( fd_chunk_to_laddr( in_ctx->mem, ctx->chunk ) );
     950           0 :       if( FD_LIKELY( sig==FD_GOSSIP_UPDATE_TAG_CONTACT_INFO ) ){
     951           0 :         after_contact( ctx, msg );
     952           0 :       } else {
     953           0 :         fd_policy_peer_remove( ctx->policy, fd_type_pun_const( msg->origin ) );
     954           0 :       }
     955           0 :       break;
     956           0 :     }
     957           0 :     case IN_KIND_REPLAY: {
     958           0 :       fd_replay_fec_evicted_t const * msg = (fd_replay_fec_evicted_t const *)fd_type_pun_const( fd_chunk_to_laddr( in_ctx->mem, ctx->chunk ) );
     959           0 :       fd_forest_fec_clear( ctx->forest, msg->slot, msg->fec_set_idx, FD_FEC_SHRED_CNT - 1 );
     960           0 :       break;
     961           0 :     }
     962           0 :     case IN_KIND_TOWER: {
     963           0 :       if( FD_LIKELY( sig==FD_TOWER_SIG_SLOT_DONE ) ) {
     964           0 :         fd_tower_slot_done_t const * msg = (fd_tower_slot_done_t const *)fd_type_pun_const( fd_chunk_to_laddr( in_ctx->mem, ctx->chunk ) );
     965           0 :         if( FD_LIKELY( msg->root_slot!=ULONG_MAX && msg->root_slot > fd_forest_root_slot( ctx->forest ) ) ) fd_forest_publish( ctx->forest, msg->root_slot );
     966           0 :       } else if( FD_LIKELY( sig==FD_TOWER_SIG_SLOT_CONFIRMED ) ) {
     967           0 :         fd_tower_slot_confirmed_t const * msg = (fd_tower_slot_confirmed_t const *)fd_type_pun_const( fd_chunk_to_laddr( in_ctx->mem, ctx->chunk ) );
     968           0 :         if( msg->slot > fd_forest_root_slot( ctx->forest ) && (msg->level >= FD_TOWER_SLOT_CONFIRMED_DUPLICATE ) ) {
     969           0 :           fd_forest_blk_t * blk = fd_forest_query( ctx->forest, msg->slot );
     970           0 :           if( FD_UNLIKELY( !blk ) ) {
     971             : 
     972             :             /* If we receive a confirmation for a slot we don't have,
     973             :                create a sentinel forest block that we can repair from. */
     974             : 
     975           0 :             ulong evicted = ULONG_MAX;
     976           0 :             blk = fd_forest_blk_insert( ctx->forest, msg->slot, msg->slot, &evicted );
     977           0 :             if( FD_LIKELY( blk_insert_check( ctx, blk, msg->slot, evicted ) ) ) {
     978           0 :               blk->confirmed_bid = msg->block_id;
     979           0 :               check_confirmed( ctx, blk, &msg->block_id );
     980           0 :             }
     981           0 :           }
     982           0 :         }
     983           0 :       }
     984           0 :       break;
     985           0 :     }
     986           0 :     case IN_KIND_SHRED: {
     987             : 
     988             :       /* There are 3 message types from shred:
     989             :           1. resolver evict - incomplete FEC set is evicted by resolver
     990             :           2. fec complete   - FEC set is completed by resolver. Also contains a shred.
     991             :           3. shred          - new shred
     992             : 
     993             :           Msgs 2 and 3 have a shred header in the dcache.  Msg 1 is empty. */
     994             : 
     995           0 :       int resolver_evicted = sz == 0;
     996           0 :       int fec_completes    = fd_disco_shred_out_msg_type( sig )==FD_SHRED_OUT_MSG_TYPE_FEC;
     997           0 :       if( FD_UNLIKELY( resolver_evicted ) ) {
     998           0 :         after_evict( ctx, sig );
     999           0 :         return;
    1000           0 :       }
    1001             : 
    1002           0 :       uchar * src = fd_chunk_to_laddr( in_ctx->mem, ctx->chunk );
    1003           0 :       fd_shred_t * shred = (fd_shred_t *)fd_type_pun( src );
    1004           0 :       fd_hash_t  * mr    = (fd_hash_t *)(src + fd_shred_header_sz( shred->variant ));
    1005           0 :       fd_hash_t  * cmr   = (fd_hash_t *)(src + fd_shred_header_sz( shred->variant ) + sizeof(fd_hash_t) );
    1006           0 :       uint         nonce = FD_LOAD(uint, src + fd_shred_header_sz( shred->variant ) + sizeof(fd_hash_t) + sizeof(fd_hash_t) ); /* gibberish if not shred msg */
    1007           0 :       if( FD_UNLIKELY( shred->slot <= fd_forest_root_slot( ctx->forest ) ) ) {
    1008           0 :         FD_LOG_INFO(( "shred %lu %u %u too old, ignoring", shred->slot, shred->idx, shred->fec_set_idx ));
    1009           0 :         return;
    1010           0 :       };
    1011             : 
    1012           0 :       if( FD_UNLIKELY( ctx->profiler.enabled && ctx->turbine_slot0 != ULONG_MAX && ( shred->slot > ctx->turbine_slot0 ) ) ) return;
    1013           0 :   #   if LOGGING
    1014           0 :       if( FD_UNLIKELY( shred->slot > ctx->metrics->current_slot ) ) {
    1015           0 :         FD_LOG_INFO(( "\n\n[Turbine]\n"
    1016           0 :                       "slot:             %lu\n"
    1017           0 :                       "root:             %lu\n",
    1018           0 :                       shred->slot,
    1019           0 :                       fd_forest_root_slot( ctx->forest ) ));
    1020           0 :       }
    1021           0 :   #   endif
    1022           0 :       ctx->metrics->current_slot  = fd_ulong_max( shred->slot, ctx->metrics->current_slot );
    1023           0 :       if( FD_UNLIKELY( ctx->turbine_slot0 == ULONG_MAX ) ) {
    1024             : 
    1025           0 :         if( FD_UNLIKELY( ctx->profiler.enabled ) ) {
    1026             :           /* we wait until the first turbine shred arrives to kick off
    1027             :              the profiler.  This is to let gossip peers accumulate similar
    1028             :              to a regular Firedancer run. */
    1029           0 :           fd_forest_blk_insert( ctx->forest, ctx->profiler.end_slot, ctx->profiler.end_slot, NULL );
    1030           0 :           fd_forest_code_shred_insert( ctx->forest, ctx->profiler.end_slot, 0 );
    1031             : 
    1032           0 :           ctx->turbine_slot0 = ctx->profiler.end_slot;
    1033           0 :           fd_repair_metrics_set_turbine_slot0( ctx->slot_metrics, ctx->profiler.end_slot );
    1034           0 :           fd_policy_set_turbine_slot0( ctx->policy, ctx->profiler.end_slot );
    1035           0 :           return;
    1036           0 :         }
    1037             : 
    1038           0 :         ctx->turbine_slot0 = shred->slot;
    1039           0 :         fd_repair_metrics_set_turbine_slot0( ctx->slot_metrics, shred->slot );
    1040           0 :         fd_policy_set_turbine_slot0( ctx->policy, shred->slot );
    1041           0 :       }
    1042             : 
    1043           0 :       if( FD_UNLIKELY( fec_completes ) ) {
    1044           0 :         after_fec( ctx, shred, mr, cmr );
    1045             : 
    1046             :         /* forward along to replay */
    1047           0 :         memcpy( fd_chunk_to_laddr( ctx->repair_out_ctx->mem, ctx->repair_out_ctx->chunk ), src, sz );
    1048           0 :         fd_stem_publish( ctx->stem, ctx->repair_out_ctx->idx, sig, ctx->repair_out_ctx->chunk, sz, 0UL, 0UL, tspub );
    1049           0 :         ctx->repair_out_ctx->chunk = fd_dcache_compact_next( ctx->repair_out_ctx->chunk, sz, ctx->repair_out_ctx->chunk0, ctx->repair_out_ctx->wmark );
    1050           0 :       } else {
    1051           0 :         after_shred( ctx, sig, shred, nonce, mr, cmr );
    1052           0 :       }
    1053             : 
    1054             :       /* update metrics */
    1055           0 :       ctx->metrics->repaired_slots = fd_forest_highest_repaired_slot( ctx->forest );
    1056           0 :       return;
    1057           0 :     }
    1058           0 :     default: FD_LOG_ERR(( "bad in_kind %u", in_kind )); /* Should never reach here since before_frag should have filtered out any unexpected frags. */
    1059           0 :   }
    1060           0 : }
    1061             : 
    1062             : static inline void
    1063             : after_credit( ctx_t *             ctx,
    1064             :               fd_stem_context_t * stem FD_PARAM_UNUSED,
    1065             :               int *               opt_poll_in FD_PARAM_UNUSED,
    1066           0 :               int *               charge_busy ) {
    1067           0 :   long now = fd_log_wallclock();
    1068             : 
    1069           0 :   if( FD_UNLIKELY( ctx->halt_signing ) ) {
    1070           0 :     *charge_busy = 1;
    1071           0 :     return;
    1072           0 :   }
    1073             : 
    1074             :   /* Verify that there is at least one sign tile with available credits.
    1075             :      If not, we can't send any requests and leave early. */
    1076           0 :   out_ctx_t * sign_out = sign_avail_credits( ctx );
    1077           0 :   if( FD_UNLIKELY( !sign_out ) ) {
    1078           0 :     ctx->metrics->sign_tile_unavail++;
    1079           0 :     return;
    1080           0 :   }
    1081             : 
    1082             :   /* If inflights is at capacity, then the only thing we can send is:
    1083             :      pongs, initial warmup requests, or resend things that are already
    1084             :      inflight.  Any new requests that would cause an inflight to be
    1085             :      added to the queue must be deferred. */
    1086             : 
    1087           0 :   if( FD_UNLIKELY( !fd_signs_queue_empty( ctx->pong_queue ) ) ) {
    1088           0 :     sign_pending_t signable = fd_signs_queue_pop( ctx->pong_queue );
    1089           0 :     fd_repair_send_sign_request( ctx, sign_out, &signable.msg, signable.msg.kind == FD_REPAIR_KIND_PONG ? &signable.pong_data : NULL );
    1090           0 :     *charge_busy = 1;
    1091           0 :     return;
    1092           0 :   }
    1093             : 
    1094           0 :   if( FD_UNLIKELY( fd_inflights_should_drain( ctx->inflights, now ) ) ) {
    1095           0 :     ulong nonce; ulong slot; ulong shred_idx;
    1096           0 :     *charge_busy = 1;
    1097           0 :     fd_inflights_request_pop( ctx->inflights, &nonce, &slot, &shred_idx );
    1098           0 :     fd_forest_blk_t * blk = fd_forest_query( ctx->forest, slot );
    1099           0 :     if( FD_UNLIKELY( blk && !fd_forest_blk_idxs_test( blk->idxs, shred_idx ) ) ) {
    1100           0 :       fd_pubkey_t const * peer = fd_policy_peer_select( ctx->policy );
    1101           0 :       ctx->metrics->rerequest++;
    1102           0 :       nonce = fd_rnonce_ss_compute( ctx->repair_nonce_ss, 1, slot, (uint)shred_idx, now );
    1103           0 :       if( FD_UNLIKELY( !peer ) ) {
    1104             :         /* No peers. But we CANNOT lose this request. */
    1105             :         /* Add this request to the inflights table, pretend we've sent it and let the inflight timeout request it down the line. */
    1106           0 :         fd_hash_t hash = { .ul[0] = 0 };
    1107           0 :         fd_inflights_request_insert( ctx->inflights, nonce, &hash, slot, shred_idx );
    1108           0 :       } else {
    1109           0 :         fd_repair_msg_t * msg = fd_repair_shred( ctx->protocol, peer, (ulong)now/(ulong)1e6, (uint)nonce, slot, shred_idx );
    1110           0 :         fd_repair_send_sign_request( ctx, sign_out, msg, NULL );
    1111           0 :         return;
    1112           0 :       }
    1113           0 :     }
    1114           0 :   }
    1115             : 
    1116           0 :   if( FD_UNLIKELY( fd_inflights_outstanding_free( ctx->inflights ) <= fd_signs_map_key_cnt( ctx->signs_map ) ) ) return; /* no new requests allowed */
    1117             : 
    1118           0 :   fd_repair_msg_t const * cout = fd_policy_next( ctx->policy, ctx->forest, ctx->protocol, now, ctx->metrics->current_slot, charge_busy );
    1119           0 :   if( FD_UNLIKELY( !cout ) ) return;
    1120           0 :   fd_repair_send_sign_request( ctx, sign_out, cout, NULL );
    1121           0 : }
    1122             : 
    1123             : static void
    1124           0 : signs_queue_update_identity( ctx_t * ctx ) {
    1125           0 :   ulong queue_cnt = fd_signs_queue_cnt( ctx->pong_queue );
    1126           0 :   for( ulong i=0UL; i<queue_cnt; i++ ) {
    1127           0 :     sign_pending_t signable = fd_signs_queue_pop( ctx->pong_queue );
    1128           0 :     switch( signable.msg.kind ) {
    1129           0 :       case FD_REPAIR_KIND_PONG:
    1130           0 :         memcpy( signable.msg.pong.from.uc, ctx->identity_public_key.uc, sizeof(fd_pubkey_t) );
    1131           0 :         break;
    1132           0 :       case FD_REPAIR_KIND_SHRED:
    1133           0 :         memcpy( signable.msg.shred.from.uc, ctx->identity_public_key.uc, sizeof(fd_pubkey_t) );
    1134           0 :         break;
    1135           0 :       case FD_REPAIR_KIND_HIGHEST_SHRED:
    1136           0 :         memcpy( signable.msg.highest_shred.from.uc, ctx->identity_public_key.uc, sizeof(fd_pubkey_t) );
    1137           0 :         break;
    1138           0 :       case FD_REPAIR_KIND_ORPHAN:
    1139           0 :         memcpy( signable.msg.orphan.from.uc, ctx->identity_public_key.uc, sizeof(fd_pubkey_t) );
    1140           0 :         break;
    1141           0 :       default:
    1142           0 :         FD_LOG_CRIT(( "Unhandled repair kind %u", signable.msg.kind ));
    1143           0 :         break;
    1144           0 :     }
    1145           0 :     fd_signs_queue_push( ctx->pong_queue, signable );
    1146           0 :   }
    1147           0 : }
    1148             : 
    1149             : static inline void
    1150           0 : during_housekeeping( ctx_t * ctx ) {
    1151             : # if DEBUG_LOGGING
    1152             :   long now = fd_log_wallclock();
    1153             :   if( FD_UNLIKELY( now - ctx->tsdebug > (long)10e9 ) ) {
    1154             :     fd_forest_print( ctx->forest );
    1155             :     ctx->tsdebug = fd_log_wallclock();
    1156             :   }
    1157             : # endif
    1158             : 
    1159           0 :   if( FD_UNLIKELY( fd_keyswitch_state_query( ctx->keyswitch )==FD_KEYSWITCH_STATE_UNHALT_PENDING ) ) {
    1160           0 :     FD_LOG_DEBUG(( "keyswitch: unhalting" ));
    1161           0 :     FD_CRIT( ctx->halt_signing, "state machine corruption" );
    1162           0 :     ctx->halt_signing = 0;
    1163           0 :     fd_keyswitch_state( ctx->keyswitch, FD_KEYSWITCH_STATE_COMPLETED );
    1164           0 :   }
    1165             : 
    1166           0 :   if( FD_UNLIKELY( fd_keyswitch_state_query( ctx->keyswitch )==FD_KEYSWITCH_STATE_SWITCH_PENDING ) ) {
    1167             : 
    1168           0 :     if( !ctx->halt_signing ) {
    1169             :       /* At this point, stop sending new sign requests to the sign tile
    1170             :          and wait for all outstanding sign requests to be received back
    1171             :          from the sign tile.  We also need to update any pending
    1172             :          outgoing sign requests with the new identity key. */
    1173           0 :       FD_LOG_DEBUG(( "keyswitch: halting signing" ));
    1174           0 :       ctx->halt_signing = 1;
    1175           0 :       memcpy( ctx->identity_public_key.uc, ctx->keyswitch->bytes, 32UL );
    1176           0 :       ctx->protocol->identity_key = ctx->identity_public_key;
    1177           0 :       signs_queue_update_identity( ctx );
    1178           0 :     }
    1179             : 
    1180           0 :     if( fd_signs_map_key_cnt( ctx->signs_map )==0UL ) {
    1181             :       /* Once there are no more in flight sign requests, we are ready to
    1182             :          say that the keyswitch is completed. */
    1183           0 :       FD_LOG_DEBUG(( "keyswitch: completed, no more outstanding stale sign requests" ));
    1184           0 :       fd_keyswitch_state( ctx->keyswitch, FD_KEYSWITCH_STATE_COMPLETED );
    1185           0 :     }
    1186           0 :   }
    1187           0 : }
    1188             : 
    1189             : static void
    1190             : privileged_init( fd_topo_t *      topo,
    1191           0 :                  fd_topo_tile_t * tile ) {
    1192           0 :   void * scratch = fd_topo_obj_laddr( topo, tile->tile_obj_id );
    1193             : 
    1194           0 :   FD_SCRATCH_ALLOC_INIT( l, scratch );
    1195           0 :   ctx_t * ctx = FD_SCRATCH_ALLOC_APPEND( l, alignof(ctx_t), sizeof(ctx_t) );
    1196           0 :   fd_memset( ctx, 0, sizeof(ctx_t) );
    1197             : 
    1198           0 :   uchar const * identity_key = fd_keyload_load( tile->repair.identity_key_path, /* pubkey only: */ 1 );
    1199           0 :   fd_memcpy( ctx->identity_public_key.uc, identity_key, sizeof(fd_pubkey_t) );
    1200             : 
    1201           0 :   FD_TEST( fd_rng_secure( &ctx->repair_seed, sizeof(ulong) ) );
    1202             : 
    1203           0 :   FD_LOG_DEBUG(( "Generating rnonce_ss" ));
    1204           0 :   ulong rnonce_ss_id = fd_pod_queryf_ulong( topo->props, ULONG_MAX, "rnonce_ss" );
    1205           0 :   FD_TEST( rnonce_ss_id!=ULONG_MAX );
    1206           0 :   void * shared_rnonce = fd_topo_obj_laddr( topo, rnonce_ss_id );
    1207           0 :   ulong * nonce_initialized = (ulong *)(sizeof(fd_rnonce_ss_t)+(uchar *)shared_rnonce);
    1208           0 :   FD_TEST( fd_rng_secure( shared_rnonce, sizeof(fd_rnonce_ss_t) ) );
    1209           0 :   memcpy( ctx->repair_nonce_ss, shared_rnonce, sizeof(fd_rnonce_ss_t) );
    1210           0 :   FD_COMPILER_MFENCE();
    1211           0 :   FD_VOLATILE( *nonce_initialized ) = 1UL;
    1212           0 : }
    1213             : 
    1214             : static void
    1215             : unprivileged_init( fd_topo_t *      topo,
    1216           0 :                    fd_topo_tile_t * tile ) {
    1217           0 :   void * scratch = fd_topo_obj_laddr( topo, tile->tile_obj_id );
    1218             : 
    1219           0 :   ulong total_sign_depth = tile->repair.repair_sign_depth * tile->repair.repair_sign_cnt;
    1220           0 :   int   lg_sign_depth    = fd_ulong_find_msb( fd_ulong_pow2_up(total_sign_depth) ) + 1;
    1221             : 
    1222           0 :   FD_SCRATCH_ALLOC_INIT( l, scratch );
    1223           0 :   ctx_t * ctx       = FD_SCRATCH_ALLOC_APPEND( l, alignof(ctx_t),            sizeof(ctx_t)                                                 );
    1224           0 :   ctx->protocol     = FD_SCRATCH_ALLOC_APPEND( l, fd_repair_align(),         fd_repair_footprint()                                         );
    1225           0 :   ctx->forest       = FD_SCRATCH_ALLOC_APPEND( l, fd_forest_align(),         fd_forest_footprint( tile->repair.slot_max )                  );
    1226           0 :   ctx->policy       = FD_SCRATCH_ALLOC_APPEND( l, fd_policy_align(),         fd_policy_footprint( FD_DEDUP_CACHE_MAX, FD_REPAIR_PEER_MAX ) );
    1227           0 :   ctx->inflights    = FD_SCRATCH_ALLOC_APPEND( l, fd_inflights_align(),      fd_inflights_footprint()                                      );
    1228           0 :   ctx->signs_map    = FD_SCRATCH_ALLOC_APPEND( l, fd_signs_map_align(),      fd_signs_map_footprint( lg_sign_depth )                       );
    1229           0 :   ctx->pong_queue   = FD_SCRATCH_ALLOC_APPEND( l, fd_signs_queue_align(),    fd_signs_queue_footprint()                                    );
    1230           0 :   ctx->slot_metrics = FD_SCRATCH_ALLOC_APPEND( l, fd_repair_metrics_align(), fd_repair_metrics_footprint()                                 );
    1231           0 :   FD_TEST( FD_SCRATCH_ALLOC_FINI( l, scratch_align() ) == (ulong)scratch + scratch_footprint( tile ) );
    1232             : 
    1233           0 :   ctx->protocol     = fd_repair_join        ( fd_repair_new        ( ctx->protocol, &ctx->identity_public_key                                                      ) );
    1234           0 :   ctx->forest       = fd_forest_join        ( fd_forest_new        ( ctx->forest,   tile->repair.slot_max, ctx->repair_seed                                        ) );
    1235           0 :   ctx->policy       = fd_policy_join        ( fd_policy_new        ( ctx->policy,   FD_DEDUP_CACHE_MAX, FD_REPAIR_PEER_MAX, ctx->repair_seed, ctx->repair_nonce_ss ) );
    1236           0 :   ctx->inflights    = fd_inflights_join     ( fd_inflights_new     ( ctx->inflights, ctx->repair_seed+1234UL                                                       ) );
    1237           0 :   ctx->signs_map    = fd_signs_map_join     ( fd_signs_map_new     ( ctx->signs_map, lg_sign_depth, 0UL                                                            ) );
    1238           0 :   ctx->pong_queue   = fd_signs_queue_join   ( fd_signs_queue_new   ( ctx->pong_queue                                                                               ) );
    1239           0 :   ctx->slot_metrics = fd_repair_metrics_join( fd_repair_metrics_new( ctx->slot_metrics                                                                             ) );
    1240             : 
    1241           0 :   ctx->keyswitch = fd_keyswitch_join( fd_topo_obj_laddr( topo, tile->id_keyswitch_obj_id ) );
    1242           0 :   FD_TEST( ctx->keyswitch );
    1243             : 
    1244           0 :   ctx->halt_signing = 0;
    1245             : 
    1246             :   /* Process in links */
    1247             : 
    1248           0 :   if( FD_UNLIKELY( tile->in_cnt > MAX_IN_LINKS ) ) FD_LOG_ERR(( "repair tile has too many input links" ));
    1249             : 
    1250           0 :   uint  sign_repair_in_idx[ MAX_SIGN_TILE_CNT ] = {0};
    1251           0 :   uint  sign_repair_idx  = 0;
    1252           0 :   ulong sign_link_depth  = 0;
    1253             : 
    1254           0 :   for( uint in_idx=0U; in_idx<(tile->in_cnt); in_idx++ ) {
    1255           0 :     fd_topo_link_t * link = &topo->links[ tile->in_link_id[ in_idx ] ];
    1256           0 :     if( 0==strcmp( link->name, "net_repair" ) ) {
    1257           0 :       ctx->in_kind[ in_idx ] = IN_KIND_NET;
    1258           0 :       fd_net_rx_bounds_init( &ctx->in_links[ in_idx ].net_rx, link->dcache );
    1259           0 :       continue;
    1260           0 :     } else if( 0==strcmp( link->name, "sign_repair" ) ) {
    1261           0 :       ctx->in_kind[ in_idx ]                  = IN_KIND_SIGN;
    1262           0 :       sign_repair_in_idx[ sign_repair_idx++ ] = in_idx;
    1263           0 :       sign_link_depth                         = link->depth;
    1264           0 :     }
    1265           0 :     else if( 0==strcmp( link->name, "gossip_out"   ) ) ctx->in_kind[ in_idx ] = IN_KIND_GOSSIP;
    1266           0 :     else if( 0==strcmp( link->name, "tower_out"    ) ) ctx->in_kind[ in_idx ] = IN_KIND_TOWER;
    1267           0 :     else if( 0==strcmp( link->name, "shred_out"    ) ) ctx->in_kind[ in_idx ] = IN_KIND_SHRED;
    1268           0 :     else if( 0==strcmp( link->name, "snapin_manif" ) ) ctx->in_kind[ in_idx ] = IN_KIND_SNAP;
    1269           0 :     else if( 0==strcmp( link->name, "genesi_out"   ) ) ctx->in_kind[ in_idx ] = IN_KIND_GENESIS;
    1270           0 :     else if( 0==strcmp( link->name, "replay_out"   ) ) ctx->in_kind[ in_idx ] = IN_KIND_REPLAY;
    1271           0 :     else FD_LOG_ERR(( "repair tile has unexpected input link %s", link->name ));
    1272             : 
    1273           0 :     ctx->in_links[ in_idx ].mem    = topo->workspaces[ topo->objs[ link->dcache_obj_id ].wksp_id ].wksp;
    1274           0 :     ctx->in_links[ in_idx ].chunk0 = fd_dcache_compact_chunk0( ctx->in_links[ in_idx ].mem, link->dcache );
    1275           0 :     ctx->in_links[ in_idx ].wmark  = fd_dcache_compact_wmark ( ctx->in_links[ in_idx ].mem, link->dcache, link->mtu );
    1276           0 :     ctx->in_links[ in_idx ].mtu    = link->mtu;
    1277             : 
    1278           0 :     FD_TEST( fd_dcache_compact_is_safe( ctx->in_links[in_idx].mem, link->dcache, link->mtu, link->depth ) );
    1279           0 :   }
    1280             : 
    1281           0 :   ctx->net_out_ctx->idx    = UINT_MAX;
    1282           0 :   ctx->repair_out_ctx->idx = UINT_MAX;
    1283           0 :   ctx->repair_sign_cnt   = 0;
    1284           0 :   ctx->sign_rrobin_idx   = 0;
    1285             : 
    1286           0 :   for( uint out_idx=0U; out_idx<(tile->out_cnt); out_idx++ ) {
    1287           0 :     fd_topo_link_t * link = &topo->links[ tile->out_link_id[ out_idx ] ];
    1288             : 
    1289           0 :     if( 0==strcmp( link->name, "repair_net" ) ) {
    1290             : 
    1291           0 :       if( ctx->net_out_ctx->idx!=UINT_MAX ) continue; /* only use first net link */
    1292           0 :       ctx->net_out_ctx->idx    = out_idx;
    1293           0 :       ctx->net_out_ctx->mem    = topo->workspaces[ topo->objs[ link->dcache_obj_id ].wksp_id ].wksp;
    1294           0 :       ctx->net_out_ctx->chunk0 = fd_dcache_compact_chunk0( ctx->net_out_ctx->mem, link->dcache );
    1295           0 :       ctx->net_out_ctx->wmark  = fd_dcache_compact_wmark( ctx->net_out_ctx->mem, link->dcache, link->mtu );
    1296           0 :       ctx->net_out_ctx->chunk  = ctx->net_out_ctx->chunk0;
    1297             : 
    1298           0 :     } else if( 0==strcmp( link->name, "repair_out" ) ) {
    1299             : 
    1300           0 :       out_ctx_t * replay_out = ctx->repair_out_ctx;
    1301           0 :       replay_out->idx        = out_idx;
    1302           0 :       replay_out->mem        = topo->workspaces[ topo->objs[ link->dcache_obj_id ].wksp_id ].wksp;
    1303           0 :       replay_out->chunk0     = fd_dcache_compact_chunk0( replay_out->mem, link->dcache );
    1304           0 :       replay_out->wmark      = fd_dcache_compact_wmark( replay_out->mem, link->dcache, link->mtu );
    1305           0 :       replay_out->chunk      = replay_out->chunk0;
    1306             : 
    1307           0 :     } else if( 0==strcmp( link->name, "repair_sign" ) ) {
    1308             : 
    1309           0 :       out_ctx_t * repair_sign_out  = &ctx->repair_sign_out_ctx[ ctx->repair_sign_cnt ];
    1310           0 :       repair_sign_out->idx         = out_idx;
    1311           0 :       repair_sign_out->mem         = topo->workspaces[ topo->objs[ link->dcache_obj_id ].wksp_id ].wksp;
    1312           0 :       repair_sign_out->chunk0      = fd_dcache_compact_chunk0( repair_sign_out->mem, link->dcache );
    1313           0 :       repair_sign_out->wmark       = fd_dcache_compact_wmark( repair_sign_out->mem, link->dcache, link->mtu );
    1314           0 :       repair_sign_out->chunk       = repair_sign_out->chunk0;
    1315           0 :       repair_sign_out->in_idx      = sign_repair_in_idx[ ctx->repair_sign_cnt++ ]; /* match to the sign_repair input link */
    1316           0 :       repair_sign_out->max_credits = sign_link_depth;
    1317           0 :       repair_sign_out->credits     = sign_link_depth;
    1318             : 
    1319           0 :     } else {
    1320           0 :       FD_LOG_ERR(( "repair tile has unexpected output link %s", link->name ));
    1321           0 :     }
    1322           0 :   }
    1323           0 :   FD_TEST( ctx->net_out_ctx->idx!=UINT_MAX );
    1324           0 :   FD_TEST( ctx->repair_out_ctx->idx!=UINT_MAX );
    1325           0 :   if( FD_UNLIKELY( ctx->repair_sign_cnt!=sign_repair_idx ) ) {
    1326           0 :     FD_LOG_ERR(( "Mismatch between repair_sign output links (%lu) and sign_repair input links (%u)", ctx->repair_sign_cnt, sign_repair_idx ));
    1327           0 :   }
    1328             : 
    1329             : # if DEBUG_LOGGING
    1330             :   if( fd_signs_map_key_max( ctx->signs_map ) < tile->repair.repair_sign_depth * tile->repair.repair_sign_cnt ) {
    1331             :     FD_LOG_ERR(( "repair pending signs tracking map is too small: %lu < %lu.  Increase the key_max", fd_signs_map_key_max( ctx->signs_map ), tile->repair.repair_sign_depth * tile->repair.repair_sign_cnt ));
    1332             :   }
    1333             : # endif
    1334             : 
    1335           0 :   ctx->wksp = topo->workspaces[ topo->objs[ tile->tile_obj_id ].wksp_id ].wksp;
    1336           0 :   ctx->repair_intake_addr.port = fd_ushort_bswap( tile->repair.repair_intake_listen_port );
    1337           0 :   ctx->repair_serve_addr.port  = fd_ushort_bswap( tile->repair.repair_serve_listen_port  );
    1338             : 
    1339             :   /* TODO clean these up */
    1340           0 :   ctx->net_id = (ushort)0;
    1341           0 :   fd_ip4_udp_hdr_init( ctx->intake_hdr, 0, 0, tile->repair.repair_intake_listen_port );
    1342           0 :   fd_ip4_udp_hdr_init( ctx->serve_hdr,  0, 0, tile->repair.repair_serve_listen_port  );
    1343             : 
    1344             :   /* Repair set up */
    1345             : 
    1346           0 :   ctx->turbine_slot0 = ULONG_MAX;
    1347           0 :   FD_LOG_INFO(( "repair my addr - intake addr: " FD_IP4_ADDR_FMT ":%u, serve_addr: " FD_IP4_ADDR_FMT ":%u",
    1348           0 :     FD_IP4_ADDR_FMT_ARGS( ctx->repair_intake_addr.addr ), fd_ushort_bswap( ctx->repair_intake_addr.port ),
    1349           0 :     FD_IP4_ADDR_FMT_ARGS( ctx->repair_serve_addr.addr ), fd_ushort_bswap( ctx->repair_serve_addr.port ) ));
    1350             : 
    1351           0 :   memset( ctx->metrics, 0, sizeof(ctx->metrics) );
    1352             : 
    1353           0 :   fd_histf_join( fd_histf_new( ctx->metrics->slot_compl_time, FD_MHIST_SECONDS_MIN( REPAIR, SLOT_COMPLETE_TIME ),
    1354           0 :                                                               FD_MHIST_SECONDS_MAX( REPAIR, SLOT_COMPLETE_TIME ) ) );
    1355           0 :   fd_histf_join( fd_histf_new( ctx->metrics->response_latency, FD_MHIST_MIN( REPAIR, RESPONSE_LATENCY ),
    1356           0 :                                                                FD_MHIST_MAX( REPAIR, RESPONSE_LATENCY ) ) );
    1357             : 
    1358           0 :   ctx->tsdebug = fd_log_wallclock();
    1359           0 :   ctx->pending_key_next = 0;
    1360           0 :   ctx->profiler.enabled  = tile->repair.end_slot != 0UL;
    1361           0 :   ctx->profiler.end_slot = tile->repair.end_slot;
    1362           0 :   if( FD_UNLIKELY( ctx->profiler.enabled ) ) {
    1363           0 :     ctx->metrics->current_slot = tile->repair.end_slot + 1; /* +1 to allow the turbine slot 0 to be completed */
    1364           0 :     ctx->profiler.complete     = 0;
    1365           0 :   }
    1366           0 : }
    1367             : 
    1368             : static ulong
    1369             : populate_allowed_seccomp( fd_topo_t const *      topo FD_PARAM_UNUSED,
    1370             :                           fd_topo_tile_t const * tile FD_PARAM_UNUSED,
    1371             :                           ulong                  out_cnt,
    1372           0 :                           struct sock_filter *   out ) {
    1373           0 :   populate_sock_filter_policy_fd_repair_tile( out_cnt, out, (uint)fd_log_private_logfile_fd() );
    1374           0 :   return sock_filter_policy_fd_repair_tile_instr_cnt;
    1375           0 : }
    1376             : 
    1377             : static ulong
    1378             : populate_allowed_fds( fd_topo_t const *      topo FD_PARAM_UNUSED,
    1379             :                       fd_topo_tile_t const * tile FD_PARAM_UNUSED,
    1380             :                       ulong                  out_fds_cnt,
    1381           0 :                       int *                  out_fds ) {
    1382           0 :   if( FD_UNLIKELY( out_fds_cnt<2UL ) ) FD_LOG_ERR(( "out_fds_cnt %lu", out_fds_cnt ));
    1383             : 
    1384           0 :   ulong out_cnt = 0UL;
    1385           0 :   out_fds[ out_cnt++ ] = 2; /* stderr */
    1386           0 :   if( FD_LIKELY( -1!=fd_log_private_logfile_fd() ) )
    1387           0 :     out_fds[ out_cnt++ ] = fd_log_private_logfile_fd(); /* logfile */
    1388           0 :   return out_cnt;
    1389           0 : }
    1390             : 
    1391             : static inline void
    1392           0 : metrics_write( ctx_t * ctx ) {
    1393           0 :   FD_MCNT_SET( REPAIR, CURRENT_SLOT,      ctx->metrics->current_slot );
    1394           0 :   FD_MCNT_SET( REPAIR, REPAIRED_SLOTS,    ctx->metrics->repaired_slots );
    1395           0 :   FD_MCNT_SET( REPAIR, REQUEST_PEERS,     fd_policy_peer_pool_used( ctx->policy->peers.pool ) );
    1396           0 :   FD_MCNT_SET( REPAIR, SIGN_TILE_UNAVAIL, ctx->metrics->sign_tile_unavail );
    1397           0 :   FD_MCNT_SET( REPAIR, REREQUEST_QUEUE,   ctx->metrics->rerequest );
    1398             : 
    1399           0 :   FD_MCNT_SET      ( REPAIR, TOTAL_PKT_COUNT, ctx->metrics->send_pkt_cnt   );
    1400           0 :   FD_MCNT_ENUM_COPY( REPAIR, SENT_PKT_TYPES,  ctx->metrics->sent_pkt_types );
    1401             : 
    1402           0 :   FD_MHIST_COPY( REPAIR, SLOT_COMPLETE_TIME, ctx->metrics->slot_compl_time );
    1403           0 :   FD_MHIST_COPY( REPAIR, RESPONSE_LATENCY,   ctx->metrics->response_latency );
    1404             : 
    1405           0 :   FD_MCNT_SET( REPAIR, BLK_EVICTED,       ctx->metrics->blk_evicted );
    1406           0 :   FD_MCNT_SET( REPAIR, BLK_FAILED_INSERT, ctx->metrics->blk_failed_insert );
    1407             : 
    1408           0 :   FD_MCNT_SET( REPAIR, UNKNOWN_PEER_PING,     ctx->metrics->unknown_peer_ping );
    1409           0 :   FD_MCNT_SET( REPAIR, MALFORMED_PING,        ctx->metrics->malformed_ping );
    1410           0 :   FD_MCNT_SET( REPAIR, FAILED_SIGVERIFY_PING, ctx->metrics->fail_sigverify_ping );
    1411           0 : }
    1412             : 
    1413             : #undef DEBUG_LOGGING
    1414             : 
    1415             : /* At most one sign request is made in after_credit.  Then at most one
    1416             :    message is published in after_frag. */
    1417           0 : #define STEM_BURST (2UL)
    1418             : 
    1419             : /* Sign manual credit management, backpressuring, sign tile count, &
    1420             :    sign speed effect this lazy value. The main goal of repair's highest
    1421             :    workload (catchup) is to have high send packet rate.  Repair is
    1422             :    regularly idle, and mostly waiting for dispatched signs to come
    1423             :    in. Processing shreds from shred tile is a relatively fast operation.
    1424             :    Thus we only worry about fully utilizing the sign tiles' capacity.
    1425             : 
    1426             :    Assuming standard 2 sign tiles & reasonably fast signing rate & if
    1427             :    repair_sign_depth==sign_repair_depth: the lower the LAZY, the less
    1428             :    time is spent in backpressure, and the higher the packet send rate
    1429             :    gets.  As expected, up until a certain point, credit return is slower
    1430             :    than signing. This starts to plateau at ~10k LAZY (for a box that can
    1431             :    sign at ~20k repair pps, but is fully dependent on the sign tile's
    1432             :    speed).
    1433             : 
    1434             :    At this point we start returning credits faster than we actually get
    1435             :    them from the sign tile, so signing becomes the bottleneck.  The
    1436             :    extreme case is when we set it to standard lazy (289 ns);
    1437             :    housekeeping time spikes, but backpressure time drops (to a lower but
    1438             :    inconsistent value). But because we are usually idling in the repair
    1439             :    tile, higher housekeeping doesn't really effect the send packet rate.
    1440             : 
    1441             :    Recall that repair_sign_depth is actually > sign_repair_depth (see
    1442             :    long comment in ctx_t struct).  So repair_sign is NEVER
    1443             :    backpressuring the repair tile.  When we set
    1444             :    repair_sign_depth>sign_repair_depth, we spend very little time in
    1445             :    backpressure (repair_sign always has available credits), and most of
    1446             :    the time idling.  Theoretically, this uncouples repair tile with
    1447             :    credit return and basically sends at rate as close to as we can sign.
    1448             :    This is a small improvement over the first case (low lazy,
    1449             :    repair_sign_depth==sign_repair_depth).
    1450             : 
    1451             :    Since we don't ever fill up repair_sign link, we can set LAZY to any
    1452             :    reasonable value that keeps housekeeping time low. */
    1453           0 : #define STEM_LAZY  (64000)
    1454             : 
    1455           0 : #define STEM_CALLBACK_CONTEXT_TYPE  ctx_t
    1456           0 : #define STEM_CALLBACK_CONTEXT_ALIGN alignof(ctx_t)
    1457             : 
    1458           0 : #define STEM_CALLBACK_AFTER_CREDIT        after_credit
    1459           0 : #define STEM_CALLBACK_BEFORE_FRAG         before_frag
    1460           0 : #define STEM_CALLBACK_DURING_FRAG         during_frag
    1461           0 : #define STEM_CALLBACK_AFTER_FRAG          after_frag
    1462           0 : #define STEM_CALLBACK_DURING_HOUSEKEEPING during_housekeeping
    1463           0 : #define STEM_CALLBACK_METRICS_WRITE       metrics_write
    1464             : 
    1465             : #include "../../disco/stem/fd_stem.c"
    1466             : 
    1467             : fd_topo_run_tile_t fd_tile_repair = {
    1468             :   .name                     = "repair",
    1469             :   .loose_footprint          = loose_footprint,
    1470             :   .populate_allowed_seccomp = populate_allowed_seccomp,
    1471             :   .populate_allowed_fds     = populate_allowed_fds,
    1472             :   .scratch_align            = scratch_align,
    1473             :   .scratch_footprint        = scratch_footprint,
    1474             :   .unprivileged_init        = unprivileged_init,
    1475             :   .privileged_init          = privileged_init,
    1476             :   .run                      = stem_run,
    1477             : };

Generated by: LCOV version 1.14